MSXML serialisation of empty elements

The problem: You’re using MSXML (and therefore XSLT 1.0). You’re outputting XHTML (and therefore using the xml method). You want to output an empty <a> element for an anchor, but want to make sure that you get a start tag and an end tag (<a id="foo"></a>) rather than an empty element tag (<a id="foo"/>).

Is the only solution disable-output-escaping? No! MSXML outputs a start tag and an end tag if you create the element using a literal result element, <xsl:copy> or <xsl:element> and its content includes any instruction that could potentially produce content, even if that instruction produces nothing. So whereas

<a id="foo"></a>

produces

<a id="foo"/>

putting something innocuous inside, such as

<a id="foo"><xsl:value-of select="''" /></a>

creates

<a id="foo"></a>
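As a minimal sketch of how this might sit in a complete stylesheet: the source vocabulary here (a <section> element with an id attribute) is a hypothetical example, not something from a real project, and the behaviour described in the comment is the MSXML behaviour noted above rather than anything guaranteed by the XSLT 1.0 specification.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.w3.org/1999/xhtml">

  <!-- XSLT 1.0 has no xhtml method, so XHTML goes out through the xml method -->
  <xsl:output method="xml" encoding="UTF-8"/>

  <!-- Assumption: the source has <section id="..."> elements -->
  <xsl:template match="section">
    <!-- The value-of selects an empty string, so no text is added to the
         result tree, but its presence is what makes MSXML write
         <a id="..."></a> instead of <a id="..."/> -->
    <a id="{@id}"><xsl:value-of select="''"/></a>
    <xsl:apply-templates/>
  </xsl:template>

</xsl:stylesheet>
```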

Disclaimers

  • Of course with another XSLT processor, or XSLT 2.0, you would use an xhtml output method to make sure that empty element syntax was only used for elements that are declared to be EMPTY, such as <br> and <img>.
  • This trick might not work with other XSLT processors; certainly Saxon always outputs an empty element if the element is empty.
  • And anyway, using empty <a> elements for anchors is not good quality XHTML.
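For comparison, the first disclaimer might look something like this with an XSLT 2.0 processor. This is only a sketch: the page content is made up, and the exact output (for example, whether a content-type <meta> element is added to <head>) depends on the processor and its serialization settings.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.w3.org/1999/xhtml">

  <!-- The xhtml method only minimises elements declared EMPTY in XHTML -->
  <xsl:output method="xhtml" encoding="UTF-8" indent="no"/>

  <xsl:template match="/">
    <html>
      <head><title>Example</title></head>
      <body>
        <!-- <a> is not declared EMPTY, so it should be written with a
             start tag and an end tag; <br> is, so it should be written
             as an empty element tag -->
        <p><a id="foo"/>Some text<br/>More text</p>
      </body>
    </html>
  </xsl:template>

</xsl:stylesheet>
```

Note that the elements need to be in the XHTML namespace for the xhtml method to treat them this way.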

Comments

Re: MSXML serialisation of empty elements

I usually output a comment inside the element, which will come out as <a id='foo'><!--empty--></a>
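In stylesheet terms, that suggestion would be written roughly as follows. This is a sketch; the comment text "empty" is arbitrary and any text would do.

```xml
<!-- Inside a template in an XSLT 1.0 stylesheet -->
<a id="foo"><xsl:comment>empty</xsl:comment></a>
```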

Re: MSXML serialisation of empty elements

Yes, good suggestion. When I proposed that to my client, they said they didn’t want the XHTML to be littered with comments. Sigh.

also causes problems with indentation

If you use the ‘indent’ option of the .NET XSLT engine, this often causes the document not to validate, because the schema doesn’t allow whitespace in the content of an empty element, but the indenter generates XML like this

<add key="name" value="value">
</add>

and the whitespace is not stripped during input, since there are no contained elements.

Re: also causes problems with indentation

I haven’t experienced this, but I have experienced whitespace being added by MSXML when a PI was added within an element. Of course, you can’t use indent="yes" when outputting XHTML using the xml method because it adds indentation even where it might affect rendering, so we avoid it.

Try adding a nbsp. :)

Try adding a nbsp. :)

Re: Try adding a nbsp. :)

But that changes the appearance of the document. I wondered if there was a good “no character” character that I could use between the tags… any suggestions?

Re: Try adding a nbsp. :)

I would have thought one of the Unicode zero-width non-breaking space characters would be appropriate.

Re: Try adding a nbsp. :)

Yes, I should have investigated. There are several possibilities:

  • ZERO WIDTH SPACE (U+200B)
  • ZERO WIDTH NON-JOINER (U+200C)
  • ZERO WIDTH JOINER (U+200D)
  • WORD JOINER (U+2060)
  • ZERO WIDTH NO-BREAK SPACE (U+FEFF)

In Firefox, at least in some fonts, you get funny characters with ZERO WIDTH NON-JOINER and ZERO WIDTH JOINER. In IE7, you also get funny characters when you use WORD JOINER. That leaves ZERO WIDTH SPACE or ZERO WIDTH NO-BREAK SPACE, but ZERO WIDTH NO-BREAK SPACE is deprecated for this purpose, since that codepoint is also used for the Byte Order Mark (BOM), so it looks like ZERO WIDTH SPACE is the one to use.
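In the stylesheet, the ZERO WIDTH SPACE option could be written with a character reference, along these lines. This is a sketch; wrapping the character in <xsl:text> is just one way to make the invisible character easy to spot in the source.

```xml
<!-- Inside a template in an XSLT 1.0 stylesheet.
     U+200B is not XML whitespace, so the text node is not stripped
     from the stylesheet and the element is no longer empty. -->
<a id="foo"><xsl:text>&#x200B;</xsl:text></a>
```

Unlike the empty <xsl:value-of>, this does add a character to the result, so the element really has content and the outcome should not depend on MSXML-specific behaviour.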

To see for yourself, here are their names again, but with the spaces replaced by the respective character:

  • ZERO​WIDTH​SPACE​(U+200B)
  • ZERO‌WIDTH‌NON-JOINER‌(U+200C)
  • ZERO‍WIDTH‍JOINER‍(U+200D)
  • WORD⁠JOINER⁠(U+2060)
  • ZEROWIDTHNO-BREAKSPACE(U+FEFF)

Just ID?

Hi,

If you just need an id in the document, why don't you just set it on the parent element and leave out the anchor altogether?

Best regards, Julian

Re: Just ID?

See my final disclaimer. I would never personally use <a> elements to make anchors in an XHTML document. Sometimes my clients have different priorities.