creole

And she's back

So first there was the XML Summer School. This year was my sixth, and it was really great to hang out with chums old and new. I love that

  • you get to meet people from all corners of the XML community, even ones you haven’t got the slightest interest in, and learn that they’re human too (even the web services guys)
  • there’s always something to learn; I’ve seen some talks for six years on the trot, others were completely new this year, but they’re all worth attending because the audience, war stories and discussion are always different. Also, because each talk is aimed at newcomers, you get a great overview of topics that you’re not so familiar with, and you can always chat to the speaker later to find out more
  • there are social events laid on every evening that you’re expected to attend, so you’re practically forced to socialise, which is useful for an insecure introvert like me who’d otherwise be sitting in her hotel room getting miserable imagining everyone else having a good time
  • there’s a creche, so despite being inseparable from two small children over the last four years, I’ve still been able to attend without dragging an entourage with me (not that I object to the entourage, just the expense and the dependency)

I left feeling not only invigorated and inspired, but also a part of a fun and friendly community.

Partitioning overlapping markup

Wendell Piez forwarded me an interesting poster by Bert Van Elsacker on automatic fragmentation of overlapping structures. That’s taking something like:

<bold> this is bold <italic> and italic </bold> text </italic>

and turning it into something well-formed, like:

<bold> this is bold <italic> and italic </italic></bold><italic> text </italic>

When you do this, you have to decide which elements can be split and which can’t, and their relative priorities. Wendell suggested that perhaps Creole might help to do this. What I have been thinking about is using Creole to add annotations to markup (something like: you add attributes to the Creole patterns and they get copied on to the matched ranges, or are used to create new ranges), but I haven’t done that yet. Actually, I think you probably want a different kind of language to do it (a new kind of schema language, like James Clark suggested), because the way in which you break up overlapping structures has a lot to do with how you’re going to process them.
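To make the splitting step concrete, here is a minimal sketch in Python. It is not the method from the poster and has nothing to do with Creole itself: it assumes simple tags without attributes, closes and reopens whatever is in the way of an end tag, and uses a hypothetical `unsplittable` parameter to stand in for the decision about which elements may be split. Relative priorities are not modelled at all.

```python
import re

# Assumption: tags are plain <name> / </name> with no attributes.
TOKEN = re.compile(r"<(/?)([A-Za-z][\w.-]*)>|([^<]+)")


def fragment(markup, unsplittable=frozenset()):
    """Rewrite overlapping tags as well-formed nesting by splitting elements."""
    out = []
    open_stack = []  # names of currently open elements, outermost first
    for match in TOKEN.finditer(markup):
        closing, name, text = match.groups()
        if text is not None:
            out.append(text)
        elif not closing:
            open_stack.append(name)
            out.append(f"<{name}>")
        else:
            if name not in open_stack:
                raise ValueError(f"</{name}> has no matching start tag")
            # Close everything opened after `name`, innermost first.
            suspended = []
            while open_stack[-1] != name:
                inner = open_stack.pop()
                if inner in unsplittable:
                    raise ValueError(f"<{inner}> would have to be split")
                suspended.append(inner)
                out.append(f"</{inner}>")
            open_stack.pop()
            out.append(f"</{name}>")
            # Reopen the suspended elements, outermost first.
            for inner in reversed(suspended):
                open_stack.append(inner)
                out.append(f"<{inner}>")
    if open_stack:
        raise ValueError(f"unclosed elements: {open_stack}")
    return "".join(out)


print(fragment("<bold> this is bold <italic> and italic </bold> text </italic>"))
```

Run on the overlapping example above, this prints the well-formed version shown above, with italic split in two. Passing `unsplittable={"italic"}` makes the same input raise an error instead, which is where a real tool would have to fall back on priorities or split a different element.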

XTech Creole presentation fallout

Henry Thompson had a lot to say after my Creole presentation (open takahashi.xul?data=creole.data; requires Firefox) about the benefits of stand-off markup for linguistic information. From his overview, it seems that the NITE XML Toolkit that he’s been involved with represents overlapping linguistic data by holding atoms (here meaning the “lowest common denominator” shared pieces of data) and having multiple trees marking up these atoms. The trees are independently validated (since they are pure XML), with cross-hierarchy validation done through the query language. This is pretty similar to the XCONCUR approach, which augments a CONCUR-like multi-grammar validation with a Schematron-like constraint language.
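As a rough illustration of the atoms-plus-multiple-trees idea (and only that: this is not the NITE XML Toolkit's data model, API or query language), here is a small Python sketch. The `Node` class, the atom list and both function names are made up for the example. Each tree is checked on its own, and a separate function plays the part of a cross-hierarchy query by finding ranges that overlap without nesting.

```python
from dataclasses import dataclass, field

# Hypothetical atoms: the shared, lowest-common-denominator pieces of data.
atoms = ["this", "is", "bold", "and", "italic", "text"]


@dataclass
class Node:
    name: str
    start: int  # index of the first atom covered
    end: int    # index one past the last atom covered
    children: list = field(default_factory=list)


def is_well_nested(node):
    """Check one tree alone: children are ordered, non-empty and inside the parent."""
    position = node.start
    for child in node.children:
        if child.start < position or child.start >= child.end or child.end > node.end:
            return False
        if not is_well_nested(child):
            return False
        position = child.end
    return True


def walk(node):
    yield node
    for child in node.children:
        yield from walk(child)


def crossing_pairs(tree_a, tree_b):
    """Cross-hierarchy check: pairs of ranges that overlap without nesting."""
    pairs = []
    for a in walk(tree_a):
        for b in walk(tree_b):
            if a.start < b.start < a.end < b.end or b.start < a.start < b.end < a.end:
                pairs.append((a.name, b.name))
    return pairs


# Two independent hierarchies over the same atoms.
weight = Node("doc", 0, 6, [Node("bold", 0, 5)])
slant = Node("doc", 0, 6, [Node("italic", 3, 6)])

print(is_well_nested(weight), is_well_nested(slant))
print(crossing_pairs(weight, slant))
```

Both trees pass their own check, and the only crossing pair reported is bold and italic, which is exactly the overlap that a single XML tree can't hold.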

A Creole by any other name...

Argh. I’ve been contacted by the guys at WikiCreole who want me to change the name of Creole. What should I do? Not only is “Creole” a great name for a schema language that deals with concurrent markup, but it’s a great acronym too (Composable regular expressions for overlapping languages etc.).

I did Google when I first came up with the name in August 2006, but didn’t discover WikiCreole (unsurprisingly, since it was only coined in July 2006 itself). But now far more people know, care about and use WikiCreole than Creole grammars. So any suggestions for alternative names?
