I wrote previously about the, to my mind, wrong-headed use of xml:space in WordML (and OOXML), and promised something a bit more positive about how whitespace should be handled in markup languages. So here it is.
A bit of a disclaimer up front: my attitude on this topic is highly skewed by the fact that I use XSLT all the time, and it has particular ways of dealing with whitespace. I happen to think that the way XSLT deals with whitespace is pretty solid, but that might just be because it’s what I’m used to.
The aim of this post is to answer the following question: “When designing a markup language, what should I say about whitespace processing?”
I recently filled in a questionnaire that asked about the use of robots in teaching programming. (You can win a robot!) Some of the questions seemed to be particularly about attracting women into the field; I guess the thinking is that programming something that does something in the real world is more engaging (particularly for women?) than doing artificial exercises in linked list manipulation. Or something.
I like programming robots as much as the next geek, and am the proud owner of two regular Lego Mindstorms kits as well as a less complex, but more evil, Dark Side Developers Kit. Thinking around this, it struck me that there are two classes of projects you can do with robots:
- a directive program, where you tell the robot exactly what to do (go forward for 5 seconds, turn, forward for 2 seconds etc.)
- a facilitative program, where you define the feedback between sensors and motors, then just let the robot go
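To make the distinction concrete, here is a minimal sketch in Python. It uses a made-up, one-dimensional simulated robot (SimRobot and its methods are purely illustrative, not the Mindstorms API), so there is no turning, only driving towards a wall. The directive program replays fixed timings; the facilitative one ties motor speed to a distance sensor and lets the behaviour emerge.

```python
from dataclasses import dataclass


@dataclass
class SimRobot:
    """Hypothetical one-dimensional robot facing a wall (illustration only)."""
    position: float = 0.0
    wall_at: float = 10.0

    def distance_to_wall(self) -> float:
        # Stand-in for a distance sensor reading.
        return self.wall_at - self.position

    def drive(self, speed: float, seconds: float) -> None:
        # Move forward, but never pass through the wall.
        self.position = min(self.wall_at, self.position + speed * seconds)


def directive(robot: SimRobot) -> None:
    """Directive: a fixed script of timed moves, with no sensing at all."""
    robot.drive(speed=1.0, seconds=5.0)
    robot.drive(speed=1.0, seconds=2.0)


def facilitative(robot: SimRobot, steps: int = 100, dt: float = 0.1) -> None:
    """Facilitative: define the sensor-to-motor feedback, then let it run."""
    for _ in range(steps):
        # Speed is proportional to the remaining distance, so the robot
        # slows down as it nears the wall, wherever the wall happens to be.
        speed = 0.5 * robot.distance_to_wall()
        robot.drive(speed, dt)


scripted, reactive = SimRobot(), SimRobot()
directive(scripted)
facilitative(reactive)
print(scripted.position, reactive.distance_to_wall())
```

Move the wall and the directive robot still stops wherever its timings put it, while the facilitative one still ends up just short of the wall.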
I intend to do a series of “things that make me scream” posts. Many of them will be about WordML (as in the markup language used by Word 2003) because that’s what I’m struggling with at the moment and because it’s so goddam awful. I don’t want to get into the whole ODF vs OOXML open standard-or-not debate. My problems with WordML (and OOXML) are mainly about aesthetics rather than process: I look at it and… well, it makes me want to scream. Examining what it is about the language (or implementation thereof) that prompts this visceral reaction might help in designing better languages.
So: did you know that Word 2003 puts a xml:space="preserve" attribute on the <w:wordDocument> document element of the XML that it produces and doesn’t indent its output? This is a nightmare if you ever have to actually look at the documents: auto-indentation programs (like the one in <oXygen/>) quite rightly won’t add whitespace to elements that are in the scope of an xml:space="preserve" attribute, which means you can’t use these programs to indent XML automatically.
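For anyone wondering what "in the scope of" means in practice, here is a rough sketch in Python of the check an indenter has to make. It is my own illustration using the standard library's ElementTree, not how <oXygen/> or any other tool actually does it, and the sample document is invented. The xml:space value is inherited by descendants until another xml:space attribute overrides it.

```python
import xml.etree.ElementTree as ET

# The xml: prefix is always bound to this namespace.
XML_NS = "http://www.w3.org/XML/1998/namespace"
SPACE_ATTR = f"{{{XML_NS}}}space"


def preserved_elements(root):
    """Return the tags of elements that are in scope of xml:space="preserve"."""
    preserved = []

    def walk(elem, inherited):
        # An element's own xml:space wins; otherwise it inherits its parent's.
        mode = elem.get(SPACE_ATTR, inherited)
        if mode == "preserve":
            preserved.append(elem.tag)
        for child in elem:
            walk(child, mode)

    walk(root, "default")
    return preserved


# Invented sample: preserve on the document element, as Word 2003 does,
# with one subtree explicitly switched back to default.
doc = ET.fromstring(
    '<doc xml:space="preserve"><p><r>text</r></p>'
    '<note xml:space="default"><line/></note></doc>'
)
print(preserved_elements(doc))  # ['doc', 'p', 'r']
```

With the attribute on the document element and nothing overriding it lower down, every element lands in the preserved list, so a well-behaved indenter has nowhere it is allowed to add whitespace.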
I learnt two new life-skills today.
First, how to tie my shoelaces using the Ian Knot. It’s very quick, and works just as well with anything with loops, such as supermarket or nappy bags.
Second, how to make playdoh. A standard recipe all over the web is:
- 1 cup flour
- 1/2 cup salt
- 2 teaspoons cream of tartar
- 1 cup water
- 2 tablespoons oil
- 1 teaspoon food colouring
I (“Jeni Tennison”) manage to score 10/10 on the online identity calculator, thanks to having a pretty rare name and there being multiple archives of XSL-List, to which I was a prolific contributor in my early XML days. (I think I can also claim to be “Jenni Tennison”, “Jenny Tennison” less so, “Jenifer Tennison” is obviously the pre-XML me, and “Jennifer Tennison” not me at all, and quite rightly so.)
Anyway, I’ve just registered with claimID to get myself an OpenID, to lower the barrier to accessing certain sites. As well as getting a claimID URL (e.g. http://claimid.com/jenitennison) to use as an OpenID, you can use the URL of your own web page as your OpenID identity URL and have it delegate to the claimID identity URL, by adding links to the claimID server in the head of the web page. (View the source of my home page to see what this looks like.) This provides some flexibility in the event that claimID stops functioning: I can move to another OpenID provider without changing my OpenID.
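As a rough illustration of what those delegation links amount to, here is a small Python sketch that pulls them out of a page's head, much as an OpenID consumer might. The rel values openid.server and openid.delegate are the ones I understand OpenID 1.x to use; the server URL in the sample page is a placeholder I made up, not claimID's real endpoint, and the parser class is just my own helper.

```python
from html.parser import HTMLParser


class OpenIDLinkParser(HTMLParser):
    """Collect <link rel="openid.*" href="..."> elements from a page."""

    def __init__(self):
        super().__init__()
        self.links = {}

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        rel = attr.get("rel") or ""
        href = attr.get("href")
        if rel.startswith("openid.") and href:
            self.links[rel] = href


# Sample page: the server URL is a placeholder assumption; the delegate
# is the claimID identity URL the page hands off to.
PAGE = """
<html><head><title>Home page</title>
<link rel="stylesheet" href="style.css">
<link rel="openid.server" href="http://openid.example/server">
<link rel="openid.delegate" href="http://claimid.com/jenitennison">
</head><body>...</body></html>
"""

parser = OpenIDLinkParser()
parser.feed(PAGE)
print(parser.links)
```

Switching providers then means editing those two href values; the URL of the web page, which is the identity other sites see, stays the same.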