<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xml:base="http://www.jenitennison.com/blog" xmlns:dc="http://purl.org/dc/elements/1.1/">
<channel>
 <title>xtech</title>
 <link>http://www.jenitennison.com/blog/taxonomy/term/4</link>
 <description>The taxonomy view with a depth of 0.</description>
 <language>en</language>
<item>
 <title>XTech 2007: Thursday 17th May Afternoon</title>
 <link>http://www.jenitennison.com/blog/node/21</link>
 <description>&lt;p&gt;&lt;strong&gt;UPDATE:&lt;/strong&gt; Dare Obasanjo has written &lt;a href=&quot;http://www.25hoursaday.com/weblog/2007/06/09/WhyGDataAPPFailsAsAGeneralPurposeEditingProtocolForTheWeb.aspx&quot; title=&quot;Why GData/APP Fails as a General Purpose Editing Protocol for the Web&quot;&gt;an interesting critique&lt;/a&gt; on using the &lt;a href=&quot;http://bitworking.org/projects/atom/draft-ietf-atompub-protocol-15.html&quot; title=&quot;Atom Publishing Protocol (v15)&quot;&gt;Atom Publishing Protocol&lt;/a&gt; as the basis for general purpose sharing of data in the way that the &lt;a href=&quot;http://code.google.com/apis/gdata/index.html&quot; title=&quot;Google Data API&quot;&gt;Google Data API&lt;/a&gt; does.&lt;/p&gt;

&lt;hr /&gt;

&lt;p&gt;Thursday afternoon had a few really interesting talks. I learned about the Google Data API (no longer called gData); Oracle&amp;#8217;s use of XLink to represent relationships between documents, and the requirements that entails; using XSLT to create JSON to use Exhibit widgets; and using XMPP to enhance instant messaging.&lt;/p&gt;

&lt;!--break--&gt;

&lt;h2&gt;&lt;a href=&quot;http://2007.xtech.org/public/schedule/detail/33&quot; title=&quot;Google Data API (Talk)&quot;&gt;Google Data API&lt;/a&gt;&lt;/h2&gt;

&lt;h3&gt;Frank Mantek&lt;/h3&gt;

&lt;p&gt;The &lt;a href=&quot;http://code.google.com/apis/gdata/index.html&quot; title=&quot;Google Data API&quot;&gt;Google Data API&lt;/a&gt; is the unified API that Google offers to all its services, such as Google Base, Blogger, Google Calendar, Google Spreadsheets and so on.&lt;/p&gt;

&lt;p&gt;Frank talked about how awful SOAP/WSDL is, in particular how two services developed on different platforms can&amp;#8217;t talk to each other (which one might imagine is rather the point of Web Services). (Later, when challenged by a Microsoft guy about this claim, he revealed that he&amp;#8217;d been a major developer of the SOAP/WSDL stuff at Microsoft, so knew exactly what he was talking about from bitter experience.)&lt;/p&gt;

&lt;p&gt;So the Google Data API is a RESTful API, using the &lt;a href=&quot;http://bitworking.org/projects/atom/draft-ietf-atompub-protocol-15.html&quot; title=&quot;Atom Publishing Protocol (v15)&quot;&gt;Atom Publishing Protocol&lt;/a&gt; with a few additions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;extra data model&lt;/li&gt;
&lt;li&gt;querying&lt;/li&gt;
&lt;li&gt;concurrency control&lt;/li&gt;
&lt;li&gt;extra authentication&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;What this basically means is that you can query any of the Google services using HTTP, and get back an Atom document. The URI can contain queries (the precise nature of which depends on the service; &lt;a href=&quot;http://base.google.com/&quot; title=&quot;Google Base&quot;&gt;Google Base&lt;/a&gt;, for example, uses a single URI request parameter that has a complex internal query syntax), and you get back the feed with the items that you&amp;#8217;d requested. The Atom items themselves have the basic Atom elements, plus a bunch of service-specific elements that provide the extra information you need.&lt;/p&gt;
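
&lt;p&gt;To make that concrete, here is a rough sketch of the kind of feed such a query might return. It is illustrative only: the &lt;code&gt;example.org&lt;/code&gt; URIs, the &lt;code&gt;ex:&lt;/code&gt; namespace and the &lt;code&gt;ex:when&lt;/code&gt; and &lt;code&gt;ex:where&lt;/code&gt; elements are invented for this example and are not taken from any real Google service. Only the elements in the Atom namespace are standard.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;!-- Hypothetical response to: GET http://example.org/calendar/feed?q=xtech --&amp;gt;
&amp;lt;feed xmlns=&quot;http://www.w3.org/2005/Atom&quot;
      xmlns:ex=&quot;http://example.org/calendar&quot;&amp;gt;
  &amp;lt;title&amp;gt;Events matching xtech&amp;lt;/title&amp;gt;
  &amp;lt;id&amp;gt;http://example.org/calendar/feed?q=xtech&amp;lt;/id&amp;gt;
  &amp;lt;updated&amp;gt;2007-05-17T14:00:00Z&amp;lt;/updated&amp;gt;
  &amp;lt;entry&amp;gt;
    &amp;lt;!-- Basic Atom elements: any feed reader understands these --&amp;gt;
    &amp;lt;id&amp;gt;http://example.org/calendar/event/42&amp;lt;/id&amp;gt;
    &amp;lt;title&amp;gt;XTech 2007&amp;lt;/title&amp;gt;
    &amp;lt;updated&amp;gt;2007-05-17T14:00:00Z&amp;lt;/updated&amp;gt;
    &amp;lt;author&amp;gt;&amp;lt;name&amp;gt;Example Author&amp;lt;/name&amp;gt;&amp;lt;/author&amp;gt;
    &amp;lt;!-- A category saying what kind of thing this entry represents (made-up scheme) --&amp;gt;
    &amp;lt;category scheme=&quot;http://example.org/kind&quot; term=&quot;http://example.org/kind#event&quot;/&amp;gt;
    &amp;lt;!-- Service-specific extension elements (made-up vocabulary) --&amp;gt;
    &amp;lt;ex:when start=&quot;2007-05-15&quot; end=&quot;2007-05-18&quot;/&amp;gt;
    &amp;lt;ex:where&amp;gt;Paris&amp;lt;/ex:where&amp;gt;
  &amp;lt;/entry&amp;gt;
&amp;lt;/feed&amp;gt;&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;A generic Atom client can still list the entry by its title and timestamp, while a client that knows the service can read the extension elements as well.&lt;/p&gt;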

&lt;p&gt;Listening to this talk I finally got what &lt;a href=&quot;http://www.tbray.org/ongoing/&quot; title=&quot;ongoing&quot;&gt;Tim Bray&lt;/a&gt; was talking about at the &lt;a href=&quot;http://www.xmlsummerschool.com/&quot; title=&quot;XML Summer School, Oxford&quot;&gt;XML Summer School&lt;/a&gt; a couple of years ago: REST gives us verbs and Atom gives us objects and lists of objects. I didn&amp;#8217;t get it before, because, after all, aren&amp;#8217;t all XML documents objects? But I think the point is that Atom has a lot of the mechanics that you need for talking about objects built into it, and the extensibility necessary for adding your own information to it (which is what each of Google&amp;#8217;s services is doing).&lt;/p&gt;

&lt;p&gt;The really interesting part of the talk was where Frank started talking about what the problems (still) are. The problems I noted were:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Atom&amp;#8217;s verbose&lt;/li&gt;
&lt;li&gt;Google have to use &lt;code&gt;&amp;lt;category&amp;gt;&lt;/code&gt; to indicate the kind of thing they&amp;#8217;re representing (as opposed to using the document element which is what you&amp;#8217;d do with normal XML documents)&lt;/li&gt;
&lt;li&gt;the &lt;code&gt;rel&lt;/code&gt; attribute is too vague&lt;/li&gt;
&lt;li&gt;they made up their own markup languages, rather than reusing existing standards&lt;/li&gt;
&lt;li&gt;they should be using &lt;a href=&quot;http://en.wikipedia.org/wiki/HTTP_ETag&quot; title=&quot;Wikipedia: HTTP ETags&quot;&gt;ETags&lt;/a&gt; for concurrency control&lt;/li&gt;
&lt;li&gt;they haven&amp;#8217;t got any versioning (eek)&lt;/li&gt;
&lt;li&gt;incremental updates are a problem; they don&amp;#8217;t want to serve the whole Atom feed (to a mobile device) when only a small amount has changed, so what they do is have several feeds, each of which reveals a different part of the information&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;&lt;a href=&quot;http://2007.xtech.org/public/schedule/detail/81&quot; title=&quot;From Trees to Graphs: Evolving XML for building enterprise applications&quot;&gt;From Trees to Graphs: Evolving XML for building enterprise applications&lt;/a&gt;&lt;/h2&gt;

&lt;h3&gt;Ravi Murthy&lt;/h3&gt;

&lt;p&gt;Ravi Murthy talked about the provision for defining links between documents in &lt;a href=&quot;http://www.oracle.com/&quot; title=&quot;Oracle&quot;&gt;Oracle&lt;/a&gt;&amp;#8217;s database, and their consequent requirements. Oracle&amp;#8217;s XML database has a file system abstraction (every XML &amp;#8216;object&amp;#8217; has a file path) with access control, versioning, metadata and protocol access. Within an XML &amp;#8216;object&amp;#8217; stored in the database, they use XLink to represent the relationships with other objects. When you export the XML, the XLinks get resolved to create the XML document.&lt;/p&gt;

&lt;p&gt;Using XLink to represent relationships between documents brings a whole new set of constraints that you might want to express in a schema language, or annotations that you can use to describe the links (depending on how you look at it):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The &lt;strong&gt;type&lt;/strong&gt; of the linked resource (eg the document element&amp;#8217;s name, substitution group or XSD type)&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;scope&lt;/strong&gt; of a particular reference, similar to the scoping of XSD&amp;#8217;s identity constraints&lt;/li&gt;
&lt;li&gt;That a particular link is &lt;strong&gt;acyclic&lt;/strong&gt; (eg, given an XPath expression, keep evaluating it and make sure you never get back to where you started)&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;kind&lt;/strong&gt; of a link, one of:
&lt;ul&gt;&lt;li&gt;&lt;strong&gt;hard&lt;/strong&gt;: the target of the link must exist, and cannot be deleted while this resource exists (but can be renamed) &amp;#8212; these are similar to links in normal databases&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;symbolic&lt;/strong&gt;: trust the file path specified by the link and only resolve it on demand&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;weak&lt;/strong&gt;: like a hard link, except the target can be deleted, in which case the link becomes symbolic&lt;/li&gt;&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;versioning&lt;/strong&gt; of a link, whether it points to the &amp;#8220;current&amp;#8221; version of a resource or a specific version&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These extra constraints are expressed as annotations on the definitions of &lt;code&gt;xlink:href&lt;/code&gt; attributes in XSD schemas for the documents held in the database.&lt;/p&gt;

&lt;p&gt;Ravi also talked a bit about expressing decomposition rules: how an XML document should be shredded when it gets put into the database. They use XPath to specify rules that indicate that particular elements should be placed at a particular filepath.&lt;/p&gt;

&lt;hr /&gt;

&lt;p&gt;I was really flattered in the tea break. Chatting with a guy called &lt;a href=&quot;http://philwilson.org/blog/&quot; title=&quot;Phil&#039;s Blog&quot;&gt;Phil&lt;/a&gt; working at the University of Bath, who politely asked about my presentation, and after I&amp;#8217;d explained how it was all to do with overlapping markup and that kind of hard-core theory he said: &amp;#8220;You don&amp;#8217;t &lt;em&gt;look&lt;/em&gt; like a markup geek&amp;#8221;. Me: &amp;#8220;What, because I&amp;#8217;m a girl?&amp;#8221;. Him: &amp;#8220;No, no, that&amp;#8217;s not what I meant. You just look more Web 2.0-ey.&amp;#8221; &lt;a href=&quot;http://lapin-bleu.net/riviera/&quot; title=&quot;Max&#039;s Blog&quot;&gt;Max&lt;/a&gt; was there at the time, and labelled me &amp;#8220;the Geekess of XSLT&amp;#8221;, which I think clarified things. (Actually most of the people at XTech this year were Web 2.0-ey rather than markup geeks, but I&amp;#8217;m glad I &lt;em&gt;looked&lt;/em&gt; as though I fitted in.) &lt;/p&gt;

&lt;h2&gt;&lt;a href=&quot;http://2007.xtech.org/public/schedule/detail/155&quot; title=&quot;XML-powered Exhibit: A Case Study of JSON &amp;amp; XML Coexistence&quot;&gt;XML-powered Exhibit: A Case Study of JSON &amp;amp; XML Coexistence&lt;/a&gt;&lt;/h2&gt;

&lt;h3&gt;&lt;a href=&quot;http://metacognition.info/&quot; title=&quot;Chimezie Ogbuji&#039;s Website&quot;&gt;Chimezie Ogbuji&lt;/a&gt;&lt;/h3&gt;

&lt;p&gt;&amp;#8220;What&amp;#8217;s &lt;a href=&quot;http://simile.mit.edu/wiki/Exhibit&quot; title=&quot;Exhibit Wiki&quot;&gt;Exhibit&lt;/a&gt;?&amp;#8221; I hear you ask. Or maybe you&amp;#8217;re more with-it than I am, but that&amp;#8217;s what I was asking. Chimezie never really explained, but I kinda gathered that it&amp;#8217;s a funky AJAX toolset for creating views of data by importing scripts and using magical IDs and extension attributes within web pages. The other phrase that Chimezie dropped in was &lt;a href=&quot;http://www.w3.org/TR/backplane/&quot; title=&quot;Rich Web Application Backplane&quot;&gt;Rich Web Application Backplane&lt;/a&gt;, which again I hadn&amp;#8217;t heard of. Even having read the W3C Note, I still don&amp;#8217;t get it. Ho hum.&lt;/p&gt;

&lt;p&gt;Anyway, Chimezie made the point that while entering data using XForms is great, it&amp;#8217;s too heavy-weight for viewing that data. Exhibit gives a lot more flexibility (take a look at the &lt;a href=&quot;http://simile.mit.edu/exhibit/examples/presidents/presidents.html&quot; title=&quot;US Presidents in Exhibit&quot;&gt;US presidents&lt;/a&gt; example), which enables users to explore data more freely. In Exhibit pages, you provide a JSON schema for your data, a number of lenses/views/widgets that you can use to view the data, then you embed the widgets in the HTML page and point it at the data source. The JSON schema indicates the type of a particular property (eg &amp;#8220;country&amp;#8221;), and gives labels for it (including a plural label (&amp;#8220;countries&amp;#8221;) and a reverse label (&amp;#8220;country of&amp;#8221;)) that it uses in the widgets.&lt;/p&gt;

&lt;p&gt;But that requires JSON, right? Chimezie showed how easy it is (and it&amp;#8217;s &lt;em&gt;really&lt;/em&gt; easy) to transform data-oriented XML into JSON using XSLT.&lt;/p&gt;
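
&lt;p&gt;A minimal sketch of the idea, not Chimezie&amp;#8217;s actual stylesheet. The input vocabulary (&lt;code&gt;countries&lt;/code&gt;, &lt;code&gt;country&lt;/code&gt;, &lt;code&gt;name&lt;/code&gt;, &lt;code&gt;capital&lt;/code&gt;) is made up, and the output assumes an Exhibit-style object with a top-level &lt;code&gt;items&lt;/code&gt; array whose members carry &lt;code&gt;type&lt;/code&gt; and &lt;code&gt;label&lt;/code&gt; properties. It also assumes the values contain no quotes or backslashes, since nothing here escapes them.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;xsl:stylesheet version=&quot;1.0&quot;
                xmlns:xsl=&quot;http://www.w3.org/1999/XSL/Transform&quot;&amp;gt;

  &amp;lt;!-- JSON is plain text, so use the text output method --&amp;gt;
  &amp;lt;xsl:output method=&quot;text&quot;/&amp;gt;
  &amp;lt;xsl:strip-space elements=&quot;*&quot;/&amp;gt;

  &amp;lt;!-- Assumed input:
       &amp;lt;countries&amp;gt;
         &amp;lt;country&amp;gt;&amp;lt;name&amp;gt;France&amp;lt;/name&amp;gt;&amp;lt;capital&amp;gt;Paris&amp;lt;/capital&amp;gt;&amp;lt;/country&amp;gt;
         ...
       &amp;lt;/countries&amp;gt; --&amp;gt;
  &amp;lt;xsl:template match=&quot;/countries&quot;&amp;gt;
    &amp;lt;xsl:text&amp;gt;{&quot;items&quot;: [&amp;lt;/xsl:text&amp;gt;
    &amp;lt;xsl:apply-templates select=&quot;country&quot;/&amp;gt;
    &amp;lt;xsl:text&amp;gt;]}&amp;lt;/xsl:text&amp;gt;
  &amp;lt;/xsl:template&amp;gt;

  &amp;lt;!-- One JSON object per country, comma-separated after the first --&amp;gt;
  &amp;lt;xsl:template match=&quot;country&quot;&amp;gt;
    &amp;lt;xsl:if test=&quot;position() != 1&quot;&amp;gt;, &amp;lt;/xsl:if&amp;gt;
    &amp;lt;xsl:text&amp;gt;{&quot;type&quot;: &quot;country&quot;, &quot;label&quot;: &quot;&amp;lt;/xsl:text&amp;gt;
    &amp;lt;xsl:value-of select=&quot;name&quot;/&amp;gt;
    &amp;lt;xsl:text&amp;gt;&quot;, &quot;capital&quot;: &quot;&amp;lt;/xsl:text&amp;gt;
    &amp;lt;xsl:value-of select=&quot;capital&quot;/&amp;gt;
    &amp;lt;xsl:text&amp;gt;&quot;}&amp;lt;/xsl:text&amp;gt;
  &amp;lt;/xsl:template&amp;gt;

&amp;lt;/xsl:stylesheet&amp;gt;&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Real data would need a string-escaping step for the values, but for regular, data-oriented XML the mapping really is this direct: one template per record type, one property per child element.&lt;/p&gt;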

&lt;p&gt;You know, there are all these cool ways out there for viewing information, I just wish I had some really meaty data to use them on! &lt;a href=&quot;http://simile.mit.edu/timeline/&quot; title=&quot;SIMILE Timelines&quot;&gt;Timelines&lt;/a&gt; are one thing, but I&amp;#8217;d also love to find some data to employ in &lt;a href=&quot;http://www.gapminder.org/&quot; title=&quot;Gapminder&quot;&gt;Gapminder&lt;/a&gt; or even in an interface like the one for &lt;a href=&quot;http://www.philipglass.com/glassengine/&quot; title=&quot;Philip Glass Engine&quot;&gt;the music of Philip Glass&lt;/a&gt;. Perhaps I should just mine &lt;a href=&quot;http://base.google.com/&quot; title=&quot;Google Base&quot;&gt;Google Base&lt;/a&gt;, but I&amp;#8217;d like it to be something personally or collectively useful.&lt;/p&gt;

&lt;h2&gt;&lt;a href=&quot;http://2007.xtech.org/public/schedule/detail/97&quot; title=&quot;Real-time user-to-user web with Mozilla and XMPP&quot;&gt;Real-time user-to-user web with Mozilla and XMPP&lt;/a&gt;&lt;/h2&gt;

&lt;h3&gt;&lt;a href=&quot;http://blog.hyperstruct.net/&quot; title=&quot;Massimiliano Mirra&#039;s Website&quot;&gt;Massimiliano Mirra&lt;/a&gt;&lt;/h3&gt;

&lt;p&gt;This talk was strong on motivation &amp;#8212; the requirement to enhance basic instant messaging functionality &amp;#8212; and strong on demonstration, with Massimiliano chatting and playing with a pre-programmed bot, but really weak on the technical details. It was only through the post-talk questions that we learned that what we&amp;#8217;d seen was based on &lt;a href=&quot;http://www.xmpp.org/&quot; title=&quot;XMPP Standards Foundation&quot;&gt;XMPP (the Extensible Messaging and Presence Protocol)&lt;/a&gt;, which allowed DOM events to be passed between clients. Have to read the paper if you want to learn more.&lt;/p&gt;
</description>
 <comments>http://www.jenitennison.com/blog/node/21#comments</comments>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/18">atom</category>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/19">google</category>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/22">rest</category>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/4">xtech</category>
 <pubDate>Sun, 27 May 2007 23:03:24 +0100</pubDate>
 <dc:creator>Jeni</dc:creator>
 <guid isPermaLink="false">21 at http://www.jenitennison.com/blog</guid>
</item>
<item>
 <title>XTech 2007: Thursday 17th May Morning</title>
 <link>http://www.jenitennison.com/blog/node/20</link>
 <description>&lt;p&gt;On Thursday morning, I was down to chair the first session in the &amp;#8220;Core Technologies&amp;#8221; track. Two interesting papers: one on XForms and one on Google Base. Then I snuck on to the &amp;#8220;Applications&amp;#8221; track to hear about scientific Wikis and the trials of managing schema repositories.&lt;/p&gt;

&lt;!--break--&gt;

&lt;h2&gt;&lt;a href=&quot;http://2007.xtech.org/public/schedule/detail/114&quot; title=&quot;XForms, REST, XQuery... and skimming&quot;&gt;XForms, REST, XQuery&amp;#8230; and skimming&lt;/a&gt;&lt;/h2&gt;

&lt;h3&gt;&lt;a href=&quot;http://internet-apps.blogspot.com/&quot; title=&quot;Mark Birbeck&#039;s Blog&quot;&gt;Mark Birbeck&lt;/a&gt;&lt;/h3&gt;

&lt;p&gt;Mark Birbeck, one of the developers of &lt;a href=&quot;http://www.formsplayer.com/&quot; title=&quot;formsPlayer Website&quot;&gt;formsPlayer&lt;/a&gt; (and an invited expert on the XForms and XHTML WGs), discussed the rationale behind using &lt;a href=&quot;http://www.w3.org/MarkUp/Forms/&quot; title=&quot;XForms W3C Page&quot;&gt;XForms&lt;/a&gt;. The only thing that really stood out for me was the fact that he used an XML document to provide the &lt;em&gt;labels&lt;/em&gt; for the form controls (in just the same way as you can use XML documents to provide the &lt;em&gt;data&lt;/em&gt; in the form controls). That was quite neat, and made me think of the different requirements of data entry and data presentation: a topic that returned in &lt;a href=&quot;http://2007.xtech.org/public/schedule/detail/155&quot; title=&quot;XML-powered Exhibit: A Case Study of JSON &amp;amp; XML Coexistence&quot;&gt;Chimezie Ogbuji&amp;#8217;s talk&lt;/a&gt; later that afternoon.&lt;/p&gt;

&lt;p&gt;Another theme here, for me, was the use of declarative programming: you write a form, which is just some XML and leave all the technical stuff about submitting a PUT HTTP request to the XForms player. Mark talked about using &lt;a href=&quot;http://en.wikipedia.org/wiki/WebDAV&quot; title=&quot;Wikipedia: WebDAV&quot;&gt;WebDAV&lt;/a&gt; and &lt;a href=&quot;http://exist.sourceforge.net/&quot; title=&quot;eXist&quot;&gt;eXist&lt;/a&gt; on the server to store the XML documents, and demonstrated using &lt;a href=&quot;http://www.oxygenxml.com/&quot; title=&quot;oXygen XML editor&quot;&gt;&amp;lt;oXygen/&amp;gt;&lt;/a&gt; to load and save documents. Hmm&amp;#8230; I wonder if I should experiment with XForms and that Unicode database browser I was thinking about&amp;#8230;&lt;/p&gt;

&lt;h2&gt;&lt;a href=&quot;http://2007.xtech.org/public/schedule/detail/104&quot; title=&quot;Google Base, a mashups database for the REST of us&quot;&gt;Google Base, a mashups database for the REST of us&lt;/a&gt;&lt;/h2&gt;

&lt;h3&gt;Jeffrey Scudder&lt;/h3&gt;

&lt;p&gt;A very popular, thought-provoking, and slightly disturbing, talk on &lt;a href=&quot;http://base.google.com/&quot; title=&quot;Google Base&quot;&gt;Google Base&lt;/a&gt;. So Google are asking us to upload data on &lt;em&gt;anything&lt;/em&gt; (jobs, personals, cars, etc.) into their huge databases. And then they&amp;#8217;ll serve us back that information (and other people&amp;#8217;s information) in formats such as &lt;a href=&quot;http://en.wikipedia.org/wiki/Atom_(standard)&quot; title=&quot;Wikipedia: Atom&quot;&gt;Atom&lt;/a&gt;, &lt;a href=&quot;http://en.wikipedia.org/wiki/RSS_(file_format)&quot; title=&quot;Wikipedia: RSS&quot;&gt;RSS&lt;/a&gt; and &lt;a href=&quot;http://www.json.org/&quot; title=&quot;JSON&quot;&gt;JSON&lt;/a&gt;, as well as standard web pages.&lt;/p&gt;

&lt;p&gt;The thought-provoking bit, for me, was the fact that they don&amp;#8217;t have any particular schema for each of these kinds of items. Now, I come from a knowledge engineering background where we&amp;#8217;re very into ontologies and creating conceptual models and all that stuff. But Google don&amp;#8217;t bother: you create categories and structure your data the way you want to, and they&amp;#8217;ll serve it back in that way. But they look at &lt;em&gt;all&lt;/em&gt; the data they have their hands on in order to decide how to display and serve information. So, for example, if I define cars with the property &amp;#8216;shade&amp;#8217; but a hundred other people define them with the property &amp;#8216;colour&amp;#8217; then on a feed that includes all our items, we&amp;#8217;ll see the &amp;#8216;colour&amp;#8217; property.&lt;/p&gt;

&lt;p&gt;This is a kind of bottom-up ontology design: the properties of an item are the properties that other people think are important about an item. One thing that surprised me was that it looks like it&amp;#8217;s not very intelligent yet: simple differences in case (like &amp;#8216;color&amp;#8217; vs. &amp;#8216;Color&amp;#8217;) don&amp;#8217;t seem to be detected, so I guess nothing else is. Time to dig out my old research on automated comparison of ontologies&amp;#8230;&lt;/p&gt;

&lt;p&gt;The slightly disturbing part? Well, Google are trying to get us to upload our data to their servers. And they&amp;#8217;re not putting any limit on how much we upload. One member of the audience asked &amp;#8220;What&amp;#8217;s in it for you?&amp;#8221;; Jeffrey seemed to have a hard time understanding the question and said something like &amp;#8220;Better indexed information means we can give you better information&amp;#8221;, but that doesn&amp;#8217;t really answer the question. Presumably it&amp;#8217;s all about being able to advertise to us better: the more data we upload, the more They know about us, the better targeted Their adverts can be.&lt;/p&gt;

&lt;p&gt;What I found strange was the idea of &lt;em&gt;uploading&lt;/em&gt; data to a &lt;em&gt;central&lt;/em&gt; &lt;em&gt;server&lt;/em&gt;. Surely the whole point of the web is that I put my data on my machine. I don&amp;#8217;t have a problem putting the data together in a nice Atom feed so that Google can index it easily and pointing them at it, but I want to own it, y&amp;#8217;know?&lt;/p&gt;

&lt;p&gt;By the way, one thing that was apparent to me during this talk was how important it is that web pages look good with large font sizes, not just for people with poor eyesight, but also for when you&amp;#8217;re &lt;em&gt;demoing&lt;/em&gt; your cool web applications! The Google Base drop-down menus were impossible to see with increased font size because their height is fixed in pixels.&lt;/p&gt;

&lt;h2&gt;&lt;a href=&quot;http://2007.xtech.org/public/schedule/detail/134&quot; title=&quot;An Augmented Wiki for Interactive Scientific Visualization and Evolutionary Collaboration&quot;&gt;An Augmented Wiki for Interactive Scientific Visualization and Evolutionary Collaboration&lt;/a&gt;&lt;/h2&gt;

&lt;h3&gt;&lt;a href=&quot;http://csis.pace.edu/~marchese&quot; title=&quot;Frank Marchese&#039;s Website&quot;&gt;Frank Marchese&lt;/a&gt;&lt;/h3&gt;

&lt;p&gt;On to the less well-attended &amp;#8220;Applications&amp;#8221; track. This talk was about supporting scientists (specifically biochemists) in providing side-by-side visualisation (of complex molecules) and textual analysis. Frank talked about a Wiki in which &lt;a href=&quot;http://jmol.sourceforge.net/&quot; title=&quot;Jmol molecule viewer&quot;&gt;Jmol&lt;/a&gt; Java applets for visualising molecules are arranged side-by-side with standard journal articles. The articles themselves have links in them that animate the Jmol visualisation: highlighting particular groups of atoms, moving it to show a particular view, and so on.&lt;/p&gt;

&lt;p&gt;It was kind of neat, as pretty pictures of molecules often are, but I didn&amp;#8217;t think the Wikiness of the whole enterprise was really explored: I got the impression that the textual articles were basically static: you could add comments, but not collaboratively create an article about the molecule. Also, the link between the text and the animation of the molecule was through Javascript, as far as I could tell: I&amp;#8217;d expect a declarative method of defining animations would make it a lot more accessible.&lt;/p&gt;

&lt;h2&gt;&lt;a href=&quot;http://2007.xtech.org/public/schedule/detail/176&quot; title=&quot;Real-world metadata registries; sharing concepts, schemas and semantics&quot;&gt;Real-world metadata registries; sharing concepts, schemas and semantics&lt;/a&gt;&lt;/h2&gt;

&lt;h3&gt;&lt;a href=&quot;http://www.ukoln.ac.uk/&quot; title=&quot;UKOLN Website&quot;&gt;Emma Tonkin&lt;/a&gt;&lt;/h3&gt;

&lt;p&gt;This talk took me back to the trials of creating top-down conceptual models, focusing on the definition of metadata schemas. Unfortunately, there was a lot of philosophy and not many practical guidelines in the talk, and I didn&amp;#8217;t get a lot out of it. One thing that Emma touched on, though, was the way that the meaning of a term can change over time, through:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;extension or generalisation&lt;/li&gt;
&lt;li&gt;narrowing or specialisation&lt;/li&gt;
&lt;li&gt;amelioration (when a term gains approval)&lt;/li&gt;
&lt;li&gt;deterioration or pejoration (when a term gains disapproval)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The latter two are particularly demonstrated by political correctness, whereby terms like &amp;#8220;Eskimo&amp;#8221; fall out of favour and &amp;#8220;Inuit&amp;#8221; becomes more acceptable (all highly culture-specific; see the &lt;a href=&quot;http://en.wikipedia.org/wiki/Eskimo&quot; title=&quot;Wikipedia: Eskimo&quot;&gt;Wikipedia Eskimo page&lt;/a&gt; for more discussion on what term to use).&lt;/p&gt;

&lt;p&gt;The advantage of a principled conceptual model is that the concept itself and the term(s) you use for that concept are loosely coupled, so if a given term falls out of favour or becomes inappropriate, you can always decouple it. On the other hand, bottom-up tagging tends (I think) to have a 1:1 relationship between term and concept, so if the use of terminology changes you might be left with inaccurate tagging of legacy data. Maybe.&lt;/p&gt;
</description>
 <comments>http://www.jenitennison.com/blog/node/20#comments</comments>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/18">atom</category>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/19">google</category>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/20">ontologies</category>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/21">wikis</category>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/17">xforms</category>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/4">xtech</category>
 <pubDate>Fri, 25 May 2007 22:34:18 +0100</pubDate>
 <dc:creator>Jeni</dc:creator>
 <guid isPermaLink="false">20 at http://www.jenitennison.com/blog</guid>
</item>
<item>
 <title>XTech 2007: Wednesday 16th May Afternoon</title>
 <link>http://www.jenitennison.com/blog/node/19</link>
 <description>&lt;p&gt;Yes, I&amp;#8217;m determined to write up every talk I attended at XTech 2007, so that &lt;em&gt;I&lt;/em&gt; have a record of it if nothing else. On Wednesday afternoon, I attended sessions on microformats, internationalisation and NVDL (as well as giving my own talk, of course).&lt;/p&gt;

&lt;!--break--&gt;

&lt;h2&gt;&lt;a href=&quot;http://2007.xtech.org/public/schedule/paper/41&quot; title=&quot;Microformats: the nanotechnology of the semantic web&quot;&gt;Microformats: the nanotechnology of the semantic web&lt;/a&gt;&lt;/h2&gt;

&lt;h3&gt;&lt;a href=&quot;http://adactio.com/&quot; title=&quot;Jeremy Keith&#039;s Website&quot;&gt;Jeremy Keith&lt;/a&gt;&lt;/h3&gt;

&lt;p&gt;This was a supremely well-put-together presentation on &lt;a href=&quot;http://microformats.org/&quot; title=&quot;Microformats Website&quot;&gt;microformats&lt;/a&gt;: beautiful slides, drama and humour, and a reference to &lt;a href=&quot;http://en.wikipedia.org/wiki/Neal_Stephenson&quot; title=&quot;Wikipedia: Neal Stephenson&quot;&gt;Neal Stephenson&amp;#8217;s&lt;/a&gt; &lt;a href=&quot;http://www.amazon.com/Diamond-Age-Illustrated-Primer-Spectra/dp/0553380966&quot; title=&quot;Amazon: Diamond Age&quot;&gt;Diamond Age&lt;/a&gt; (was I really one of only three people in the packed room to have read it?). There was a lot about what microformats are, how they&amp;#8217;re designed, what their niche is (Jeremy was very up-front about the fact they don&amp;#8217;t solve every problem), and how they&amp;#8217;re developed. But there weren&amp;#8217;t any demonstrations of microformat-based applications, which I would have really liked to see. The other thing I thought was worth noting was that Jeremy talked about the dangers of &amp;#8220;grey goo&amp;#8221; (he was using a nanotechnology metaphor): the proliferation of microformats. He expressed the strong desire that the set of microformats be kept small, and even said (I paraphrase) &amp;#8220;Do use semantic class names in your HTML, but don&amp;#8217;t call them microformats [unless they&amp;#8217;ve been through the microformats standardisation process]!&amp;#8221;&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;http://www.holoweb.net/~liam/&quot; title=&quot;Liam Quin&#039;s Website&quot;&gt;Liam Quin&lt;/a&gt; gave a paper entitled &lt;a href=&quot;http://www.idealliance.org/papers/extreme/proceedings/html/2006/Quin01/EML2006Quin01.html&quot; title=&quot;Microformats: Contaminants or Ingredients&quot;&gt;Microformats: Contaminants or Ingredients&lt;/a&gt; at &lt;a href=&quot;http://www.extrememarkup.com/&quot; title=&quot;Extreme Markup Languages&quot;&gt;Extreme&lt;/a&gt; last year, asking what we, as traditional markup geeks, should do about them. Some were very sceptical, saying something along the lines of &amp;#8220;They&amp;#8217;re headed for a trainwreck; and we should sit back, watch it happen, and pick up the pieces.&amp;#8221; Others wanted to celebrate: the fact that tagging has become understood is really good news for the semantic web, open data and all that jazz. &lt;/p&gt;

&lt;p&gt;Both the traditional markup and the microformats community have the same goals: they want to make information easier to search for, to query, to integrate and so on. The microformats approach is to minimise the cost to those supplying information, and to target just a few, very common, kinds of data such as contact information, events and social networks. Traditional markup, on the other hand, aims to cover every single kind of information you might want to make available, and has to worry about issues like validating, styling, and distinguishing between tag sets.&lt;/p&gt;

&lt;p&gt;It seems that a fundamental problem is that the benefits of including semantic markup aren&amp;#8217;t immediately obvious to the supplier. Whether you use semantic class names in HTML or use elements in known namespaces, it&amp;#8217;s purely a matter of faith that this will make your information easier to locate or use. You can&amp;#8217;t know that search engines will include that information in their weighting algorithms, or that people reading your page will have the screen-scraping software necessary to pull anything out. With so little (obvious) benefit, authors will only supply semantic data if the cost is low. Adding class names to existing HTML elements is easy whether a web page is generated by hand or automatically. Adding namespaces and authoring special CSS might not be that much more costly to do, but it&amp;#8217;s much more costly to grok.&lt;/p&gt;

&lt;p&gt;So if we want authors to start putting elements in their own namespaces in their web pages, we need an application that immediately cranks up the benefit of doing so. I have no idea what that is.&lt;/p&gt;

&lt;h2&gt;&lt;a href=&quot;http://2007.xtech.org/public/schedule/paper/50&quot; title=&quot;Applying the Internationalization Tag Set&quot;&gt;Applying the Internationalization Tag Set&lt;/a&gt;&lt;/h2&gt;

&lt;h3&gt;&lt;a href=&quot;http://www.translate.com/&quot; title=&quot;Yves Savourel&#039;s Website&quot;&gt;Yves Savourel&lt;/a&gt;&lt;/h3&gt;

&lt;p&gt;This was a good introduction to a standard I only knew about vaguely. It&amp;#8217;s definitely worth knowing about the &lt;code&gt;its:*&lt;/code&gt; attributes for defining i18n features, such as indicating which content should be translated and which are terms, providing comments for localisation and so on, just in case you need to build those into new markup languages.&lt;/p&gt;

&lt;p&gt;I also have much admiration for how the ITS standard doesn&amp;#8217;t expect people to completely rework their markup languages to incorporate ITS data. Instead of using the ITS attributes directly in a document, you can use global rules embedded in the document itself, referenced from the document, or embedded in the schema for the document. I think this approach will prove useful in the development of &lt;a href=&quot;http://www.lmnlwiki.org/index.php/Talk:ECLIX#LIX&quot; title=&quot;LMNL in XML&quot;&gt;LIX&lt;/a&gt;, when we get around to formalising it.&lt;/p&gt;
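
&lt;p&gt;A small sketch of both styles side by side, assuming ITS 1.0. The host vocabulary (&lt;code&gt;doc&lt;/code&gt;, &lt;code&gt;head&lt;/code&gt;, &lt;code&gt;body&lt;/code&gt;, &lt;code&gt;p&lt;/code&gt;, &lt;code&gt;code&lt;/code&gt;, &lt;code&gt;dfn&lt;/code&gt;) and the text content are made up for the example; only the &lt;code&gt;its:&lt;/code&gt; elements and attributes come from the standard.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;doc xmlns:its=&quot;http://www.w3.org/2005/11/its&quot;&amp;gt;
  &amp;lt;head&amp;gt;
    &amp;lt;!-- Global rules: XPath selectors say which existing markup carries ITS meaning --&amp;gt;
    &amp;lt;its:rules version=&quot;1.0&quot;&amp;gt;
      &amp;lt;its:translateRule selector=&quot;//code&quot; translate=&quot;no&quot;/&amp;gt;
      &amp;lt;its:termRule selector=&quot;//dfn&quot; term=&quot;yes&quot;/&amp;gt;
    &amp;lt;/its:rules&amp;gt;
  &amp;lt;/head&amp;gt;
  &amp;lt;body&amp;gt;
    &amp;lt;!-- Local attribute: a note for the localiser on this one paragraph --&amp;gt;
    &amp;lt;p its:locNote=&quot;Shown on the login screen&quot;&amp;gt;Type &amp;lt;code&amp;gt;admin&amp;lt;/code&amp;gt; to log in
       as the &amp;lt;dfn&amp;gt;superuser&amp;lt;/dfn&amp;gt;.&amp;lt;/p&amp;gt;
    &amp;lt;!-- Local attribute: this content should not be translated --&amp;gt;
    &amp;lt;p its:translate=&quot;no&quot;&amp;gt;ACME Widgets Ltd.&amp;lt;/p&amp;gt;
  &amp;lt;/body&amp;gt;
&amp;lt;/doc&amp;gt;&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;With the global rules, the existing &lt;code&gt;code&lt;/code&gt; and &lt;code&gt;dfn&lt;/code&gt; elements pick up their ITS meaning without the markup language itself having to change.&lt;/p&gt;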

&lt;h2&gt;&lt;a href=&quot;http://2007.xtech.org/public/schedule/detail/48&quot; title=&quot;NVDL - a breath of fresh air for compound document validation&quot;&gt;NVDL - a breath of fresh air for compound document validation&lt;/a&gt;&lt;/h2&gt;

&lt;h3&gt;&lt;a href=&quot;http://xmlguru.cz/&quot; title=&quot;Jirka Kosek&#039;s Website&quot;&gt;Jirka Kosek&lt;/a&gt; &amp;amp; &lt;a href=&quot;http://nalevka.com/&quot; title=&quot;Petr Nálevka&#039;s Website&quot;&gt;Petr Nálevka&lt;/a&gt;&lt;/h3&gt;

&lt;p&gt;&lt;a href=&quot;http://www.nvdl.org/&quot; title=&quot;Namespace-based Validation Dispatching Language&quot;&gt;NVDL&lt;/a&gt; is Part 4 of &lt;a href=&quot;http://www.dsdl.org/&quot; title=&quot;Document Schema Definition Languages&quot;&gt;DSDL&lt;/a&gt;, specifically targeted at organising the validation of documents that incorporate multiple namespaces, such as XHTML documents containing islands of SVG, RDF and MathML. NVDL&amp;#8217;s approach is to identify subtrees within the document that need to be validated against a particular schema. The subtrees don&amp;#8217;t need to only hold one namespace, but often that will be the case.&lt;/p&gt;
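
&lt;p&gt;As a minimal sketch, an NVDL script for XHTML containing islands of SVG might look something like this. The schema file names &lt;code&gt;xhtml.rng&lt;/code&gt; and &lt;code&gt;svg.rng&lt;/code&gt; are placeholders, and this is not taken from the talk.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;rules xmlns=&quot;http://purl.oclc.org/dsdl/nvdl/ns/structure/1.0&quot;&amp;gt;
  &amp;lt;!-- Dispatch XHTML sections to one schema... --&amp;gt;
  &amp;lt;namespace ns=&quot;http://www.w3.org/1999/xhtml&quot;&amp;gt;
    &amp;lt;validate schema=&quot;xhtml.rng&quot;/&amp;gt;
  &amp;lt;/namespace&amp;gt;
  &amp;lt;!-- ...and SVG sections to another --&amp;gt;
  &amp;lt;namespace ns=&quot;http://www.w3.org/2000/svg&quot;&amp;gt;
    &amp;lt;validate schema=&quot;svg.rng&quot;/&amp;gt;
  &amp;lt;/namespace&amp;gt;
  &amp;lt;!-- Anything in any other namespace is reported as an error --&amp;gt;
  &amp;lt;anyNamespace&amp;gt;
    &amp;lt;reject/&amp;gt;
  &amp;lt;/anyNamespace&amp;gt;
&amp;lt;/rules&amp;gt;&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Neither of the two schemas needs to know about the other; the script alone decides which parts of the document go where.&lt;/p&gt;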

&lt;p&gt;The XML Schema wonks in the room (Henry Thompson and Michael Sperberg-McQueen) were a bit befuddled, I think, because with XML Schema you just supply a whole bunch of schema documents to the processor, for different namespaces, and as long as the schemas contain wildcards they&amp;#8217;ll do the right thing. The concept of supplying multiple schemas to a validator isn&amp;#8217;t part of RELAX NG&amp;#8217;s validation approach, so you need something like NVDL if you don&amp;#8217;t want to rework your schema for every combination of namespaces.&lt;/p&gt;

&lt;p&gt;Henry and Michael were particularly concerned about the fact that it means you can override the original schema, allowing elements from foreign namespaces in situations where the original schema hasn&amp;#8217;t allowed them. But as Henry said, it just means that the primary schema you use to define what&amp;#8217;s allowed where is actually an NVDL schema: it&amp;#8217;s not auxiliary validation like Schematron is, but a language for the primary schema you use.&lt;/p&gt;

&lt;p&gt;Later, I wondered how much the &lt;a href=&quot;http://www.w3.org/TR/xproc&quot; title=&quot;XProc: An XML Pipeline Language&quot;&gt;XProc&lt;/a&gt; work would render NVDL irrelevant. After all, XProc can invoke validation of subtrees against multiple external schemas. On the other hand, NVDL&amp;#8217;s syntax is going to be easier to use if that&amp;#8217;s all you want to do. Perhaps someone will write a tool to convert NVDL schemas to XProc pipelines&amp;#8230;&lt;/p&gt;

&lt;p&gt;Actually, Jirka &amp;amp; Petr&amp;#8217;s experience with &lt;a href=&quot;http://sourceforge.net/projects/jnvdl/&quot; title=&quot;Java implementation of NVDL&quot;&gt;JNVDL&lt;/a&gt; is interesting from the XProc viewpoint, in particular the problems that they had with reporting meaningful line numbers when validating subtrees. Something that XProc implementers might want to look at in regard to error reporting with &lt;code&gt;&amp;lt;p:viewport&amp;gt;&lt;/code&gt;.&lt;/p&gt;
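
&lt;p&gt;To make the comparison with NVDL concrete, an XProc pipeline that validates SVG islands might look roughly like this. It is only a sketch: the XProc vocabulary is still being drafted, so the namespace, the step name &lt;code&gt;p:validate-with-relax-ng&lt;/code&gt; and the port names are all assumptions that may not match the draft you are reading, and &lt;code&gt;svg.rng&lt;/code&gt; is a placeholder.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;p:pipeline xmlns:p=&quot;http://www.w3.org/ns/xproc&quot;
            xmlns:svg=&quot;http://www.w3.org/2000/svg&quot;
            version=&quot;1.0&quot;&amp;gt;
  &amp;lt;!-- run the subpipeline once for each SVG subtree in the source document --&amp;gt;
  &amp;lt;p:viewport match=&quot;svg:svg&quot;&amp;gt;
    &amp;lt;!-- assumed step name; validates the current subtree against svg.rng --&amp;gt;
    &amp;lt;p:validate-with-relax-ng&amp;gt;
      &amp;lt;p:input port=&quot;schema&quot;&amp;gt;
        &amp;lt;p:document href=&quot;svg.rng&quot; /&amp;gt;
      &amp;lt;/p:input&amp;gt;
    &amp;lt;/p:validate-with-relax-ng&amp;gt;
  &amp;lt;/p:viewport&amp;gt;
&amp;lt;/p:pipeline&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Each subtree is handed to the validator as a document in its own right, which is exactly where the line numbers in error messages are likely to go astray.&lt;/p&gt;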
</description>
 <comments>http://www.jenitennison.com/blog/node/19#comments</comments>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/16">markup</category>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/6">pipelines</category>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/8">schema</category>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/4">xtech</category>
 <pubDate>Sun, 20 May 2007 22:52:14 +0100</pubDate>
 <dc:creator>Jeni</dc:creator>
 <guid isPermaLink="false">19 at http://www.jenitennison.com/blog</guid>
</item>
<item>
 <title>XTech 2007: Wednesday 16th May Morning</title>
 <link>http://www.jenitennison.com/blog/node/18</link>
 <description>&lt;p&gt;Since there&amp;#8217;s next to no &amp;#8216;net connection at XTech 2007 (obviously the Web is not so ubiquitous as all that), I have nothing to do in the sessions but listen! Here are some thoughts about the sessions that I attended on the morning of Wednesday 16th. I haven&amp;#8217;t included the keynotes not because they weren&amp;#8217;t interesting but because I can&amp;#8217;t think of anything to say about them at the moment.&lt;/p&gt;

&lt;!--break--&gt;

&lt;h2&gt;&lt;a href=&quot;http://2007.xtech.org/public/schedule/paper/60&quot; title=&quot;XML and LINQ: What&#039;s New in Orcas and Beyond&quot;&gt;XML and LINQ: What&amp;#8217;s New in Orcas and Beyond&lt;/a&gt;&lt;/h2&gt;

&lt;h3&gt;&lt;a href=&quot;http://research.microsoft.com/~emeijer/&quot; title=&quot;Erik Meijer&#039;s Website&quot;&gt;Erik Meijer&lt;/a&gt; (Microsoft)&lt;/h3&gt;

&lt;p&gt;I thought I&amp;#8217;d better go to this one because I&amp;#8217;m supposed to be talking about XML APIs at this year&amp;#8217;s &lt;a href=&quot;http://www.xmlsummerschool.com/&quot; title=&quot;XML Summer School, Oxford&quot;&gt;XML Summer School&lt;/a&gt; and LINQ, or XLINQ, is one of the hot topics. I&amp;#8217;m not a .NET developer, so it&amp;#8217;s all kinda passed me by thus far, and I&amp;#8217;m not sure I really understand it now. (I&amp;#8217;d welcome corrections and clarifications.) The three things that seemed to be important are:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;You can get at information held in objects, databases or XML using the same syntax. (Erik showed accessing XML with faulty XQuery syntax, which made me and &lt;a href=&quot;http://www.datypic.com/&quot; title=&quot;Priscilla Walmsley&#039;s Website&quot;&gt;Priscilla Walmsley&lt;/a&gt; grimace at each other.) This means you can decide how you want to actually hold your data further down the line. A big distinction from previous attempts to work across paradigms is that the &lt;em&gt;data&lt;/em&gt; doesn&amp;#8217;t get converted, but the &lt;em&gt;queries&lt;/em&gt; do. So you write your LINQ query in LINQ syntax and it gets mapped on to SQL to query your SQL database, or on to XQuery (I guess) to query your XML document. This all seemed to assume data-oriented information: I have no idea, yet, how or whether mixed content gets handled.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;XML is a &amp;#8220;first class datatype&amp;#8221; in LINQ, so to create static XML you just write XML in your program (a bit like in XQuery). The example Erik showed included an XML declaration, which is just plain weird: dunno if that was an error or it&amp;#8217;s a way of indicating what version of XML you&amp;#8217;re using, or what. To create dynamic portions of the XML, you use &lt;code&gt;&amp;lt;%=...%&amp;gt;&lt;/code&gt; &amp;#8220;expression holes&amp;#8221; which can contain .NET code, including calls to a new API for creating XML elements and attributes (a DOM replacement).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Erik talked about writing applications in .NET and then automatically refactoring them (with a click in a context menu) to work in client/server architectures, and refactoring again to work across several clients. Presumably this creates all the code necessary to make the application work with WS* messaging, so you don&amp;#8217;t have to program it. This all sounded really dodgy to me: I don&amp;#8217;t want to rely on a tool to make a language/approach/architecture usable.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;There was an amusing digression into the art of rendering triangles, and thus three-dimensional models, with zero-width, zero-height, bordered &lt;code&gt;&amp;lt;div&amp;gt;&lt;/code&gt;s in XHTML. And a mention of the &amp;#8220;backbutton&amp;#8221; problem that you get when you spawn tabs/windows in your web browser and then go back to your original tab/window and hit submit, which made me think that perhaps a RESTful architecture would make a whole lot of complexity go away.&lt;/p&gt;

&lt;h2&gt;&lt;a href=&quot;http://2007.xtech.org/public/schedule/detail/159&quot; title=&quot;Data Model Perspectives for XML Schema&quot;&gt;Data Model Perspectives for XML Schema&lt;/a&gt;&lt;/h2&gt;

&lt;h3&gt;&lt;a href=&quot;http://www.ee.ethz.ch/&quot; title=&quot;Felix Michel&#039;s Website&quot;&gt;Felix Michel&lt;/a&gt; (ETH Zurich), &lt;a href=&quot;http://dret.net/netdret/&quot; title=&quot;Erik Wilde&#039;s Website&quot;&gt;Erik Wilde&lt;/a&gt; (UC Berkeley)&lt;/h3&gt;

&lt;p&gt;Felix mentioned that I might be interested in his talk in a &lt;a href=&quot;http://www.jenitennison.com/blog/node/2#comment-24&quot; title=&quot;Comment: Re: XTech Preparation&quot;&gt;comment here&lt;/a&gt;, and sure enough I found it fascinating. He&amp;#8217;s created a single-file representation of XML Schemas (consolidating schemas that, by virtue of using different namespaces, must be in different physical documents), and a set of XSLT 2.0 user-defined functions that provide access to and queries on the XML Schema information.&lt;/p&gt;

&lt;p&gt;For example, you can go from an instance element in your document to its type, find out if it&amp;#8217;s an extension or restriction, go to its base type, look at the annotations on it, and so on and so on. And all this in Basic XSLT 2.0 (the functions that work on instance elements traverse the instance document and schema in parallel to locate the element declaration that applies). You could use these functions to do everything you can do in Schema-Aware XSLT 2.0, with more flexibility, at the expense of performance.&lt;/p&gt;
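
&lt;p&gt;To give a flavour of the idea, here is a toy illustration in Basic XSLT 2.0. These are not Felix&amp;#8217;s actual functions: the &lt;code&gt;xsf&lt;/code&gt; prefix, its namespace and both function names are made up, and the lookup only handles global element declarations in a single schema document rather than traversing instance and schema in parallel.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;xsl:stylesheet version=&quot;2.0&quot;
  xmlns:xsl=&quot;http://www.w3.org/1999/XSL/Transform&quot;
  xmlns:xs=&quot;http://www.w3.org/2001/XMLSchema&quot;
  xmlns:xsf=&quot;http://example.org/schema-functions&quot;&amp;gt;

&amp;lt;!-- hypothetical: the global element declaration for an instance element --&amp;gt;
&amp;lt;xsl:function name=&quot;xsf:declaration&quot; as=&quot;element(xs:element)?&quot;&amp;gt;
  &amp;lt;xsl:param name=&quot;element&quot; as=&quot;element()&quot; /&amp;gt;
  &amp;lt;xsl:param name=&quot;schema&quot; as=&quot;document-node()&quot; /&amp;gt;
  &amp;lt;!-- match on target namespace and local name; local declarations are ignored --&amp;gt;
  &amp;lt;xsl:sequence
    select=&quot;$schema/xs:schema[string(@targetNamespace) = string(namespace-uri($element))]
              /xs:element[@name = local-name($element)]&quot; /&amp;gt;
&amp;lt;/xsl:function&amp;gt;

&amp;lt;!-- hypothetical: the declared type name of an instance element, if it has one --&amp;gt;
&amp;lt;xsl:function name=&quot;xsf:type-name&quot; as=&quot;xs:string?&quot;&amp;gt;
  &amp;lt;xsl:param name=&quot;element&quot; as=&quot;element()&quot; /&amp;gt;
  &amp;lt;xsl:param name=&quot;schema&quot; as=&quot;document-node()&quot; /&amp;gt;
  &amp;lt;xsl:sequence select=&quot;xsf:declaration($element, $schema)/@type/string()&quot; /&amp;gt;
&amp;lt;/xsl:function&amp;gt;

&amp;lt;/xsl:stylesheet&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;A template could then call something like &lt;code&gt;xsf:type-name(., doc(&#039;schema.xsd&#039;))&lt;/code&gt; (the file name is a placeholder) without the processor needing to be schema-aware.&lt;/p&gt;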

&lt;p&gt;He also mapped content models onto &lt;code&gt;&amp;lt;occurrence&amp;gt;&lt;/code&gt; elements that encode the &amp;#8220;follow set&amp;#8221; for a particular occurrence, so you can easily answer the question &amp;#8220;what elements could come next?&amp;#8221;. I can&amp;#8217;t immediately think of a way of using that information in a stylesheet, but perhaps he can describe one.&lt;/p&gt;

&lt;p&gt;Anyway, I think Felix&amp;#8217;s point was not to provide XSLT programmers with a set of useful functions, but to demonstrate the kind of standard, fairly light-weight, API that we might use to access XML Schema information. There was some discussion, in the development of XPath 2.0, of providing this kind of API, but getting agreement on XDM was hard enough!&lt;/p&gt;

&lt;p&gt;However, my thoughts were veering off in different directions. To my mind, validation and annotation are separable processes, and the data types, element groups and linking behaviour that you might find useful on a data set are processing-specific. For example, it might make sense for one process to annotate the element &lt;code&gt;&amp;lt;foo&amp;gt;2007-05-17&amp;lt;/foo&amp;gt;&lt;/code&gt; as having the type date, while for another process (such as a transformation that deletes all &lt;code&gt;&amp;lt;foo&amp;gt;&lt;/code&gt; elements) it&amp;#8217;s unnecessary. I really don&amp;#8217;t want to have to define an XSD schema for my entire markup language just to indicate that the &lt;code&gt;&amp;lt;foo&amp;gt;&lt;/code&gt; element is of type &lt;code&gt;xs:date&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Just as it&amp;#8217;s better to define the links between elements using keys, rather than relying on ID annotations made by a DTD, I think type annotations and node groups (why limit it to elements?) could be defined in the stylesheet. To give an idea:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;!-- all date attributes have the type named &#039;xs:date&#039; --&amp;gt;
&amp;lt;ann:type name=&quot;xs:date&quot; match=&quot;@date&quot; /&amp;gt;
&amp;lt;!-- h1, h2, h3, h4, h5, h6 elements are heading elements --&amp;gt;
&amp;lt;ann:group name=&quot;xhtml:heading&quot; 
  match=&quot;xhtml:h1 | xhtml:h2 | xhtml:h3 | xhtml:h4 | xhtml:h5 | xhtml:h6&quot; /&amp;gt;
&amp;lt;!-- oh, and so&#039;s the h element --&amp;gt;
&amp;lt;ann:group name=&quot;xhtml:heading&quot; match=&quot;xhtml:h&quot; /&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;It&amp;#8217;d be reasonably easy to give rudimentary support for &lt;code&gt;ann:type($node)&lt;/code&gt; and &lt;code&gt;ann:group($node)&lt;/code&gt; user-defined functions based on these, but they&amp;#8217;d really have to be built into the XSLT processor to get full pattern support and to work with modularised stylesheets. This all requires more detail than I have time to write right now, but is it even worth pursuing?&lt;/p&gt;
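
&lt;p&gt;As a sketch of what &amp;#8220;rudimentary&amp;#8221; means here: the function below compares the &lt;code&gt;match&lt;/code&gt; attribute with the node&amp;#8217;s name as a plain string rather than evaluating it as a pattern, and only sees declarations in its own stylesheet module. The &lt;code&gt;ann&lt;/code&gt; namespace URI is a placeholder, and letting the last matching declaration win is an assumption.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;xsl:stylesheet version=&quot;2.0&quot;
  xmlns:xsl=&quot;http://www.w3.org/1999/XSL/Transform&quot;
  xmlns:xs=&quot;http://www.w3.org/2001/XMLSchema&quot;
  xmlns:ann=&quot;http://example.org/annotations&quot;&amp;gt;

&amp;lt;ann:type name=&quot;xs:date&quot; match=&quot;@date&quot; /&amp;gt;

&amp;lt;!-- the ann:type declarations held in this stylesheet module only --&amp;gt;
&amp;lt;xsl:variable name=&quot;ann:types&quot; as=&quot;element(ann:type)*&quot;
  select=&quot;document(&#039;&#039;)/*/ann:type&quot; /&amp;gt;

&amp;lt;!-- rudimentary: string comparison on the node name, not real pattern matching --&amp;gt;
&amp;lt;xsl:function name=&quot;ann:type&quot; as=&quot;xs:string?&quot;&amp;gt;
  &amp;lt;xsl:param name=&quot;node&quot; as=&quot;node()&quot; /&amp;gt;
  &amp;lt;xsl:variable name=&quot;test&quot; as=&quot;xs:string&quot;
    select=&quot;if ($node instance of attribute())
            then concat(&#039;@&#039;, name($node))
            else name($node)&quot; /&amp;gt;
  &amp;lt;!-- assumption: the last matching declaration takes precedence --&amp;gt;
  &amp;lt;xsl:sequence select=&quot;$ann:types[@match = $test][last()]/@name/string()&quot; /&amp;gt;
&amp;lt;/xsl:function&amp;gt;

&amp;lt;/xsl:stylesheet&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;This already falls over on unions like the heading group above, on predicates, and on anything imported from another module, which is why proper support would have to live in the processor.&lt;/p&gt;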
</description>
 <comments>http://www.jenitennison.com/blog/node/18#comments</comments>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/5">xslt</category>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/8">schema</category>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/15">xlinq</category>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/4">xtech</category>
 <pubDate>Thu, 17 May 2007 21:50:59 +0100</pubDate>
 <dc:creator>Jeni</dc:creator>
 <guid isPermaLink="false">18 at http://www.jenitennison.com/blog</guid>
</item>
<item>
 <title>XTech Creole presentation fallout</title>
 <link>http://www.jenitennison.com/blog/node/17</link>
 <description>&lt;p&gt;&lt;a href=&quot;http://www.ltg.ed.ac.uk/~ht/&quot; title=&quot;Henry S. Thompson&#039;s Home Page&quot;&gt;Henry Thompson&lt;/a&gt; had a lot to say after &lt;a href=&quot;http://www.jenitennison.com/blog/files/XTech2007CreoleSlides.zip&quot; title=&quot;XTech 2007 Creole presentation&quot;&gt;my Creole presentation&lt;/a&gt; (open takahashi.xul?data=creole.data; requires Firefox) about the benefits of stand-off markup for linguistic information. From his overview, it seems that the &lt;a href=&quot;http://www.ltg.ed.ac.uk/NITE&quot; title=&quot;NITE XML Toolkit&quot;&gt;NITE XML Toolkit&lt;/a&gt; that he&amp;#8217;s been involved with represents overlapping linguistic data by holding atoms (here meaning the &amp;#8220;lowest common denominator&amp;#8221; shared pieces of data) and having multiple trees marking up these atoms. The trees are independently validated (since they are pure XML), with cross-hierarchy validation done through the query language. This is pretty similar to the &lt;a href=&quot;http://www.idealliance.org/papers/extreme/Proceedings/html/2006/Schonefeld01/EML2006Schonefeld01.html&quot; title=&quot;Towards Validation of Concurrent Markup&quot;&gt;XCONCUR&lt;/a&gt; approach, which augments a CONCUR-like multi-grammar validation with a Schematron-like constraint language.&lt;/p&gt;

&lt;!--break--&gt;

&lt;p&gt;Now, I have nothing against using constraint languages (like Schematron) to validate documents, but grammars (like RELAX NG) have big advantages. Most importantly, they are easier to write (if they&amp;#8217;re designed properly), and tools can analyse them to do useful things, such as tell you what element or attribute is expected next. If it&amp;#8217;s possible to write cross-grammar constraints in a grammar (like Creole) then why would you use a constraint language to do it?&lt;/p&gt;

&lt;p&gt;I think the big difference between Henry&amp;#8217;s domain and the one that I think will move overlap into the mainstream is between global and local concurrence. With global concurrence, entirely separate hierarchies are applied to the same data, so the natural validation mechanism is to use entirely separate grammars (with perhaps a few small rules to do cross-grammar validation where that proves necessary). With local concurrence, the vast majority of the document follows a single hierarchy with concurrence happening at a low level.&lt;/p&gt;

&lt;p&gt;Actually, the best example for this doesn&amp;#8217;t even involve overlap. Consider HTML paragraphs, which contain various inline elements such as &lt;code&gt;&amp;lt;strong&amp;gt;&lt;/code&gt;, &lt;code&gt;&amp;lt;em&amp;gt;&lt;/code&gt; and &lt;code&gt;&amp;lt;a&amp;gt;&lt;/code&gt;. It doesn&amp;#8217;t make sense for these elements to contain themselves (strong text is neither made stronger nor negated by appearing in two &lt;code&gt;&amp;lt;strong&amp;gt;&lt;/code&gt; elements, and it&amp;#8217;s not allowed for links to contain other links). So the natural model in Creole is&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;p      = element p { mixed { strong* &amp;amp; em* &amp;amp; a* } }
strong = range strong { text }
em     = range em { text }
a      = range a { attribute href { text }, ..., text }
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;This model allows &lt;code&gt;&amp;lt;a&amp;gt;&lt;/code&gt; elements to appear within &lt;code&gt;&amp;lt;em&amp;gt;&lt;/code&gt; elements, or vice versa, not because of the content model of &lt;code&gt;&amp;lt;em&amp;gt;&lt;/code&gt; but because the two ranges are interleaved (and one arrangement of interleaved ranges is containment). It doesn&amp;#8217;t allow any of these elements to appear inside themselves. It would be a real maintenance headache to have separate grammars for each of these inline elements, when most of each of the grammars (all the hierarchy down to the paragraph level) would be the same.&lt;/p&gt;
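
&lt;p&gt;Reading the grammar that way (this is an untested illustration, not output from a validator, and the URL is a placeholder), the first two paragraphs below should be accepted and the third rejected:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;!-- accepted: a inside em, one arrangement of the interleaved ranges --&amp;gt;
&amp;lt;p&amp;gt;See &amp;lt;em&amp;gt;the &amp;lt;a href=&quot;http://example.org/&quot;&amp;gt;spec&amp;lt;/a&amp;gt;&amp;lt;/em&amp;gt; for details.&amp;lt;/p&amp;gt;

&amp;lt;!-- accepted: em inside a, the other way round --&amp;gt;
&amp;lt;p&amp;gt;See &amp;lt;a href=&quot;http://example.org/&quot;&amp;gt;&amp;lt;em&amp;gt;the spec&amp;lt;/em&amp;gt;&amp;lt;/a&amp;gt; for details.&amp;lt;/p&amp;gt;

&amp;lt;!-- rejected: strong inside strong --&amp;gt;
&amp;lt;p&amp;gt;This is &amp;lt;strong&amp;gt;very &amp;lt;strong&amp;gt;very&amp;lt;/strong&amp;gt; important&amp;lt;/strong&amp;gt;.&amp;lt;/p&amp;gt;
&lt;/code&gt;&lt;/pre&gt;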

&lt;p&gt;Actually, looking at NITE, it seems like it employs a data model that&amp;#8217;s quite like &lt;a href=&quot;http://www.lmnlwiki.org/index.php/LMNL_data_model&quot; title=&quot;LMNL data model&quot;&gt;LMNL&amp;#8217;s&lt;/a&gt;, in that it has the concept of layers over atoms or ranges/elements. (Interestingly it looks like they get around the problem of identifying which ranges belong to which layers purely by using their name.) Another difference here might be that while I&amp;#8217;m talking about supporting overlap in fairly heavily structured documents (like office documents), they&amp;#8217;re really using fairly flat annotations, where there isn&amp;#8217;t much of a grammar anyway. But I might have that wrong: need to do more reading. The other thing to investigate is whether they have any support for self-overlap (&lt;code&gt;&amp;lt;phrase&amp;gt;&lt;/code&gt; elements overlapping other &lt;code&gt;&amp;lt;phrase&amp;gt;&lt;/code&gt; elements): I kinda gather that they don&amp;#8217;t.&lt;/p&gt;

&lt;p&gt;Anyway, Henry also made the points that (a) he doesn&amp;#8217;t want a new syntax for overlap and (b) stand-off markup works very well, thank you. To address the latter point first, I think stand-off markup works very well if you have the tools to support it. It&amp;#8217;s fine if you have an integrated toolkit which can pull together and display the stand-off markup as embedded markup, and let you create ranges by highlighting text with a mouse. But the great power of HTML and other web technologies is that you don&amp;#8217;t need to use a specialised toolkit to write it: you can just use a text editor and it&amp;#8217;s all right there in front of you with no (or minimal) cross-referencing required. Frankly, I&amp;#8217;m not interested in &amp;#8220;core&amp;#8221; technologies that require me to install a particular piece of software in order to make use of them (cf &lt;a href=&quot;http://research.microsoft.com/~emeijer/&quot; title=&quot;Erik Meijer&#039;s Home Page&quot;&gt;Erik Meijer&lt;/a&gt;&amp;#8217;s talk on &lt;a href=&quot;http://msdn.microsoft.com/data/ref/linq/&quot; title=&quot;LINQ&quot;&gt;LINQ&lt;/a&gt;, which I&amp;#8217;ll have to discuss another time). I expect to be able to write a document containing overlap as easily as I can write a normal XML document.&lt;/p&gt;

&lt;p&gt;On Henry&amp;#8217;s point about yet another syntax for overlap, I am more and more coming to the conclusion that overlap will hit the mainstream if we have a simple way of encoding overlap in normal XML documents, namely something along the lines of &lt;a href=&quot;http://www.lmnlwiki.org/index.php/Talk:ECLIX#LIX&quot; title=&quot;LMNL-in-XML&quot;&gt;LIX&lt;/a&gt;. Interestingly, &lt;a href=&quot;http://www.translate.com/&quot; title=&quot;Yves Savourel&#039;s Website&quot;&gt;Yves Savourel&lt;/a&gt;&amp;#8217;s talk on Applying the &lt;a href=&quot;http://www.w3.org/TR/2007/REC-its-20070403/&quot; title=&quot;Internationalization Tag Set (ITS) Version 1.0&quot;&gt;Internationalization Tag Set&lt;/a&gt; was quite inspirational in this regard, since the working group seem to have put together a standard that provides both a set of standard elements and attributes to guide localisation and a method of mapping elements and attributes in existing markup languages onto those ITS elements and attributes. I wonder whether a similar approach could be used with LIX&amp;#8230; but I&amp;#8217;ll have to leave those thoughts for another time.&lt;/p&gt;
</description>
 <comments>http://www.jenitennison.com/blog/node/17#comments</comments>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/7">creole</category>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/4">xtech</category>
 <pubDate>Wed, 16 May 2007 21:48:47 +0100</pubDate>
 <dc:creator>Jeni</dc:creator>
 <guid isPermaLink="false">17 at http://www.jenitennison.com/blog</guid>
</item>
<item>
 <title>XTech Preparation</title>
 <link>http://www.jenitennison.com/blog/node/2</link>
 <description>&lt;p&gt;&lt;a href=&quot;http://2007.xtech.org/&quot; title=&quot;XTech 2007&quot;&gt;XTech 2007&lt;/a&gt; is less than a month away! I&amp;#8217;m going to be presenting on &lt;a href=&quot;http://2007.xtech.org/public/schedule/detail/77&quot; title=&quot;Summary of my XTech 2007 paper&quot;&gt;Creole: Validating Overlapping Markup&lt;/a&gt; in the Core Technologies track on Wednesday 16th May at 16:00. Yes, I know it&amp;#8217;s really an &lt;a href=&quot;http://www.extrememarkup.com/&quot; title=&quot;Extreme Markup Languages Conference&quot;&gt;Extreme&lt;/a&gt; paper, but (a) I&amp;#8217;m not going to Extreme this year, (b) I&amp;#8217;ll be reaching out to a slightly wider audience, and (c) I just cannot wait any longer to talk about &lt;a href=&quot;http://www.lmnlwiki.org/index.php/Creole&quot; title=&quot;Composable regular expressions for overlapping languages etc.&quot;&gt;Creole&lt;/a&gt;!&lt;/p&gt;

&lt;!--break--&gt;

&lt;p&gt;The only trouble is: how am I going to make overlapping markup interesting to all the Web 2.0 geeks? More to the point: how do I explain the elegance of Brzozowski derivatives without losing at least half the audience?&lt;/p&gt;
</description>
 <comments>http://www.jenitennison.com/blog/node/2#comments</comments>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/4">xtech</category>
 <pubDate>Sun, 22 Apr 2007 21:12:16 +0100</pubDate>
 <dc:creator>Jeni</dc:creator>
 <guid isPermaLink="false">2 at http://www.jenitennison.com/blog</guid>
</item>
</channel>
</rss>
