schema

RELAX NG for matching

I’m still thinking about doing automatic markup with XML pipelines, and the kind of components that you might need in such a pipeline. These are the useful ones (list inspired by the components offered by GATE):

  • a tokeniser that uses regular expressions to add markup to plain text
  • a gazetteer that uses a lookup to add markup to plain text
  • an annotator that adds attributes to existing elements based on their context/content
  • a grouper that adds markup around sequences of existing markup
  • a stripper that removes markup
  • a general purpose transformer that uses XSLT to do just about everything else
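
To make the first two components concrete, here is a minimal sketch in Python. It is an illustration only, not how GATE or any particular pipeline implements them: the function names `tokenise` and `gazetteer`, the default `w` element, and the `place` element in the usage note are all hypothetical choices made for this example.

```python
import re
from xml.sax.saxutils import escape


def _wrap_matches(text, pattern, tag_for):
    """Wrap each regex match in an element; escape all text for XML."""
    out, pos = [], 0
    for m in re.finditer(pattern, text):
        out.append(escape(text[pos:m.start()]))  # untouched text before the match
        tag = tag_for(m.group())
        out.append(f"<{tag}>{escape(m.group())}</{tag}>")
        pos = m.end()
    out.append(escape(text[pos:]))  # trailing text after the last match
    return "".join(out)


def tokenise(text, pattern=r"\w+", tag="w"):
    """Tokeniser: a regular expression decides what gets marked up."""
    return _wrap_matches(text, pattern, lambda _match: tag)


def gazetteer(text, lookup):
    """Gazetteer: a lookup table (phrase -> element name) decides the markup."""
    if not lookup:
        return escape(text)
    # Try longer phrases first so "New York City" wins over "New York".
    phrases = sorted(lookup, key=len, reverse=True)
    pattern = r"\b(?:" + "|".join(re.escape(p) for p in phrases) + r")\b"
    return _wrap_matches(text, pattern, lambda match: lookup[match])
```

For example, `tokenise("Hi, Bob & co")` wraps each word in a `w` element and escapes the ampersand, while `gazetteer("I flew to New York City", {"New York": "place", "New York City": "place"})` marks up the longest matching phrase. Both produce an XML fragment rather than a complete document, so a real pipeline stage would still need to wrap the result in a root element.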

Extension primitives in XSDL

Michael Sperberg-McQueen (CMSMcQ) has written a couple of interesting posts about datatypes in W3C’s XML Schema (XSDL). (The second is a response to a comment from John Cowan, and attempts to justify some of the seemingly arbitrary decisions made in the set of datatypes present in XSDL 1.0.) The posts discuss one of the issues raised against XSDL 1.1 by Michael Kay:

Michael proposes: just specify that implementations may provide additional implementation-defined primitive types. In the nature of things, an implementation can do this however it wants. Some implementors will code up email dates and CSS lengths the same way they code the other primitives. Fine. Some implementors will expose the API that their existing primitive types use, so they choose, at the appropriate moment, to link in a set of extension types, or not. Some will allow users to provide implementations of extension types, using that API, and link them at run time. Some may provide extension syntax to allow users to describe new types in some usable way (DTLL, anyone?) without having to write code in Java or C or [name of language here].
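
One way to picture the "link them at run time" option is a registry that maps type names to validation functions. The sketch below is purely hypothetical: `DatatypeRegistry`, its methods, and the `ext:cssLength` name are invented for illustration and do not correspond to the API of any real XSDL processor. The CSS length check is also deliberately simplified.

```python
import re


class DatatypeRegistry:
    """Hypothetical registry of primitive types, keyed by type name."""

    def __init__(self):
        self._validators = {}

    def register(self, name, validator):
        """Add (or replace) a type; validator takes a lexical form, returns a bool."""
        self._validators[name] = validator

    def is_valid(self, name, lexical):
        """Check a lexical form against a registered type."""
        if name not in self._validators:
            raise LookupError(f"unknown datatype: {name}")
        return bool(self._validators[name](lexical))


registry = DatatypeRegistry()

# A built-in primitive, coded the same way as any extension.
registry.register(
    "integer", lambda s: re.fullmatch(r"[+-]?[0-9]+", s) is not None
)

# An implementation-defined extension type: a simplified CSS length
# (a number followed by one of a few units; real CSS allows more).
_CSS_LENGTH = re.compile(r"[+-]?(?:[0-9]+\.?[0-9]*|\.[0-9]+)(?:px|em|ex|pt|pc|in|cm|mm)")
registry.register(
    "ext:cssLength", lambda s: _CSS_LENGTH.fullmatch(s) is not None
)
```

With this in place, `registry.is_valid("ext:cssLength", "1.5em")` is accepted and `registry.is_valid("ext:cssLength", "12")` is rejected for lacking a unit. The point of the design is that built-in and extension types go through the same interface, so whether an extension is compiled in or supplied by the user makes no difference to the validator.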

XTech 2007: Wednesday 16th May Afternoon

Yes, I’m determined to write up every talk I attended at XTech 2007, so that I have a record of it if nothing else. On Wednesday afternoon, I attended sessions on microformats, internationalisation and NVDL (as well as giving my own talk, of course).

XTech 2007: Wednesday 16th May Morning

Since there’s next to no ‘net connection at XTech 2007 (obviously the Web is not so ubiquitous as all that), I have nothing to do in the sessions but listen! Here are some thoughts about the sessions that I attended on the morning of Wednesday 16th. I have left out the keynotes, not because they weren’t interesting, but because I can’t think of anything to say about them at the moment.

A Creole by any other name...

Argh. I’ve been contacted by the guys at WikiCreole who want me to change the name of Creole. What should I do? Not only is “Creole” a great name for a schema language that deals with concurrent markup, but it’s a great acronym too (Composable regular expressions for overlapping languages etc.)

I did Google when I first came up with the name in August 2006, but didn’t discover WikiCreole (unsurprisingly, since it was only coined in July 2006 itself). But now far more people know about, care about and use WikiCreole than Creole grammars. So any suggestions for alternative names?
