Functions or templates? My rules of thumb

In XSLT 2.0, functions and templates have very similar behaviour: both can accept arguments/parameters and return sequences of any kind. The difference is in how they’re called: functions can be called concisely from within expressions and patterns, and arguments are passed by position; whereas templates have to be called using <xsl:call-template>, and parameters are passed by name. Also, there’s no context node for functions, but there is one for templates. I’ve built up some rules of thumb about which to use when:

  • Use functions to return sequences of atomic values, because their results are often used as arguments to further functions.
  • Use functions to return sequences of existing nodes, because you can call them in the middle of a location path in order to navigate the source document.
  • Use templates (and matching templates as much as possible) to return new nodes.
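As a rough illustration of the three rules, here is a minimal sketch. The source vocabulary (person, given, family, @id, @manager), the eg prefix and its namespace URI are all invented for the example, and it assumes each @id is unique:

```xml
<xsl:stylesheet version="2.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  xmlns:eg="http://example.com/ns/eg"
  exclude-result-prefixes="xs eg">

  <!-- Atomic value: the result feeds straight into other functions -->
  <xsl:function name="eg:full-name" as="xs:string">
    <xsl:param name="person" as="element(person)" />
    <xsl:sequence select="string-join(($person/given, $person/family), ' ')" />
  </xsl:function>

  <!-- Existing node: the call can sit in the middle of a path -->
  <xsl:function name="eg:manager" as="element(person)?">
    <xsl:param name="person" as="element(person)" />
    <xsl:sequence select="root($person)//person[@id = $person/@manager]" />
  </xsl:function>

  <!-- New nodes: a matching template -->
  <xsl:template match="person">
    <li>
      <xsl:value-of select="upper-case(eg:full-name(.))" />
      <xsl:text> reports to </xsl:text>
      <xsl:value-of select="eg:manager(.)/family" />
    </li>
  </xsl:template>

</xsl:stylesheet>
```

Note how eg:full-name(.) is passed directly to upper-case(), and how eg:manager(.) is followed by a further step (/family), which is what makes functions convenient for the first two cases.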

There is one special situation, though, where I use templates to return atomic values or existing nodes, and that’s when the value that’s returned depends on a single node. In this case, I define a function that accepts the node as one of its arguments and applies templates to that node, in a mode named after the function name. Then I have a number of matching templates in that mode, returning the relevant value for the different kinds of node.

A really simple example is an xhtml:is-heading() function, which returns true for <h1>, <h2>, <h3> and so on, and false otherwise. Here’s the definition:

<xsl:function name="xhtml:is-heading" as="xs:boolean">
  <xsl:param name="element" as="element()" />
  <xsl:apply-templates select="$element" mode="xhtml:is-heading" />
</xsl:function>

<xsl:template match="xhtml:h1 | xhtml:h2 | xhtml:h3 | xhtml:h4 | xhtml:h5 | xhtml:h6"
  mode="xhtml:is-heading" as="xs:boolean">
  <xsl:sequence select="true()" />
</xsl:template>

<xsl:template match="*" mode="xhtml:is-heading" as="xs:boolean">
  <xsl:sequence select="false()" />
</xsl:template>
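The function can then be called from both expressions and patterns. The fragment below is only a sketch: it assumes it sits in the same stylesheet as the definitions above, with the xhtml prefix bound to the XHTML namespace, and the toc mode is a hypothetical name chosen for the example:

```xml
<!-- In an expression: select every heading under the body -->
<xsl:template match="xhtml:body" mode="toc">
  <ol>
    <xsl:apply-templates select=".//xhtml:*[xhtml:is-heading(.)]" mode="toc" />
  </ol>
</xsl:template>

<!-- In a pattern: match any element that the function reports as a heading -->
<xsl:template match="xhtml:*[xhtml:is-heading(.)]" mode="toc">
  <li><xsl:value-of select="normalize-space(.)" /></li>
</xsl:template>
```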

A more complex example is this one, which gathers all the function definitions in a given namespace in a stylesheet:

<xsl:function name="xsl:function-definitions" as="element(xsl:function)*">
  <xsl:param name="stylesheet" as="node()" />
  <xsl:param name="namespace" as="xs:string" />
  <xsl:apply-templates select="$stylesheet" mode="xsl:function-definitions">
    <xsl:with-param name="namespace" select="$namespace" tunnel="yes" />
  </xsl:apply-templates>
</xsl:function>

<xsl:template match="xsl:stylesheet | xsl:transform" mode="xsl:function-definitions"
  as="element(xsl:function)*">
  <xsl:apply-templates mode="xsl:function-definitions" />
</xsl:template>

<xsl:template match="xsl:function" mode="xsl:function-definitions" 
  as="element(xsl:function)*">
  <xsl:param name="namespace" tunnel="yes" as="xs:string" />
  <xsl:variable name="qname" as="xs:QName" select="resolve-QName(@name, .)" />
  <xsl:if test="$namespace = namespace-uri-from-QName($qname)">
    <xsl:sequence select="." />
  </xsl:if>
</xsl:template>

<xsl:template match="xsl:include | xsl:import" mode="xsl:function-definitions"
  as="element(xsl:function)*">
  <xsl:param name="namespace" tunnel="yes" as="xs:string" />
  <xsl:sequence select="xsl:function-definitions(document(@href), $namespace)" />
</xsl:template>

<xsl:template match="*" mode="xsl:function-definitions" />

(Note that this code will give you overridden definitions from imported stylesheets: if you don’t want those then you need to modify the template matching xsl:stylesheet | xsl:transform.)

What I like about this pattern is that it maintains the function interface for calling the code while breaking down what would otherwise be a big <xsl:choose> into pieces of code that are individually manageable and testable. Also, it means I can take advantage of some of the useful features of XSLT template matching, such as the built-in templates and priorities. And it means that the function can be customised very easily by adding or overriding templates in an importing stylesheet. On the downside, it’s that much longer than the code would otherwise be, and, as always with XSLT, debugging through matching templates can be challenging.
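To illustrate the customisation point, here is a sketch of an importing stylesheet that extends xhtml:is-heading() without touching the function itself. The file name xhtml-functions.xsl and the class="heading" convention are assumptions made up for the example:

```xml
<xsl:stylesheet version="2.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  xmlns:xhtml="http://www.w3.org/1999/xhtml">

  <!-- Hypothetical file holding the xhtml:is-heading() function and its templates -->
  <xsl:import href="xhtml-functions.xsl" />

  <!-- Also treat <p class="heading"> as a heading. Templates in the importing
       stylesheet have higher import precedence than the imported match="*" rule. -->
  <xsl:template match="xhtml:p[@class = 'heading']"
    mode="xhtml:is-heading" as="xs:boolean">
    <xsl:sequence select="true()" />
  </xsl:template>

</xsl:stylesheet>
```

Every existing call to xhtml:is-heading() picks up the new behaviour, because the function simply applies templates in that mode and the processor chooses the best match across the whole stylesheet.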

This pattern gives you polymorphic functions — in the form of different templates for different node types — at least for one node argument. There are also associations for me here with object-oriented coding (think of the function definition as an abstract method, the mode as a method name and the node as an object instance). Anyway, I thought it worth sharing.

Comments

Re: Functions or templates? My rules of thumb

Hi Jeni,

You start by comparing an xsl:function to a named template (because you say “templates have to be called using <xsl:call-template>”).

However, further in the text you are no longer talking about calling templates with <xsl:call-template>. Instead, you describe how matching templates can be useful with <xsl:apply-templates>.

I think you are talking about two completely different things here and combining them together will be very confusing to your readers.

To be more informative and useful to the readers, I would just point out that yes, in XSLT 2.0 there really isn’t too much to recommend about named templates, callable via <xsl:call-template>.

On the other hand, matched templates, used via <xsl:apply-templates/>, do remain the most powerful feature of XSLT and have their wide uses, one of them being to make possible higher-order functional programming as we know it in FXSL 2.

Cheers,

Dimitre Novatchev

Re: Functions or templates? My rules of thumb

You’re right, I should have said “have to be invoked by <xsl:call-template> or <xsl:apply-templates>”. My rule of thumb is to use templates whenever the result is a sequence of newly generated nodes. I use matching templates when I can (when the output largely depends on a single node) and named templates when I have to (when I have to recurse or when there isn’t a parameter that’s a single node).

For example, I would do

<xsl:template name="makeEmptyCells" as="element(td)*">
  <xsl:param name="n" as="xs:integer" />
  <xsl:for-each select="1 to $n">
    <td>&#xA0;</td>
  </xsl:for-each>
</xsl:template>

rather than

<xsl:function name="eg:makeEmptyCells" as="element(td)*">
  <xsl:param name="n" as="xs:integer" />
  <xsl:for-each select="1 to $n">
    <td>&#xA0;</td>
  </xsl:for-each>
</xsl:function>

mostly because doing

<tr>
  <xsl:apply-templates select="$cells" />
  <xsl:sequence select="eg:makeEmptyCells(5 - count($cells))" />
</tr>

doesn’t feel right. (I struggle to rationalise this gut feeling.)

So I still use named templates, even in XSLT 2.0.
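For comparison, the call to the named template would presumably look something like the fragment below. It is a sketch that assumes the same $cells variable and the makeEmptyCells template shown above:

```xml
<tr>
  <xsl:apply-templates select="$cells" />
  <!-- Pad the row out to five cells; "1 to $n" is empty when $n is below 1 -->
  <xsl:call-template name="makeEmptyCells">
    <xsl:with-param name="n" select="5 - count($cells)" />
  </xsl:call-template>
</tr>
```

It is longer than the function call, but every child of the <tr> is an instruction that constructs nodes rather than an expression that happens to return them.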

Re: Functions or templates? My rules of thumb

 eg:makeEmptyCells(5 - count($cells))

will “feel right” and will be most convenient when the new nodes produced by the function are part of the processing pipeline — that is, when composability is expected and desired.

Cheers,

Dimitre Novatchev

Re: Functions or templates? My rules of thumb

It’s true that if you’re going to immediately query into or re-process the nodes you’ve just created then you’ll want to be able to get at them through a function. I seem to remember that in FXSL you do that with the wrappers you use around sequences-of-sequences?

But in the majority of transformations, new nodes will be composed with element constructors rather than being used in expressions or patterns, so there’s no requirement to use a function to create them. (That’s not to say that you can’t then re-process the nodes, just that the nodes get stored in a temporary tree first.)

Anyway, these are just my rules of thumb, made to be broken (the rules, that is, not my thumbs).

Re: Functions or templates? My rules of thumb

“But in the majority of transformations, new nodes will be composed with element constructors rather than being used in expressions or patterns, so there’s no requirement to use a function to create them. (That’s not to say that you can’t then re-process the nodes, just that the nodes get stored in a temporary tree first.)”

At least in XSLT 2.0 Basic, storing newly produced items in the nodes of temporary trees causes the loss of their type information.

This is completely avoided if the newly produced item is returned by a function and is consumed by another function, without intermediate storing.

Dimitre Novatchev

Re: Functions or templates? My rules of thumb

I don’t understand what you’re saying. I agree that if you store atomic values in nodes, then you lose the type information about the atomic values. But we’re in agreement that atomic values should be generated by functions. It’s nodes that we’re discussing. And in XSLT 2.0 Basic, nodes can’t have any useful type information associated with them (all elements are xs:untyped, all attributes are xs:untypedAtomic).

Re: Functions or templates? My rules of thumb

Yes; however, in XSLT 1.0 it was often the case that we had to wrap atomic results as children of elements, either because the template produced a sequence of values and there are no sequences in the XPath 1 data model, or even when the template produced a single value, but we needed to postpone/defer its processing.

When converting such a template to XSLT 2.0, it would be best to replace it with an xsl:function that returns a (sequence of) typed value(s).
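A small sketch of such a conversion follows. The names eg:word-lengths-wrapped, eg:word-lengths and main, the <length> wrapper element and the eg namespace URI are all hypothetical, chosen only for illustration:

```xml
<xsl:stylesheet version="2.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  xmlns:eg="http://example.com/ns/eg"
  exclude-result-prefixes="xs eg">

  <!-- XSLT 1.0 habit: each value is wrapped in an element -->
  <xsl:template name="eg:word-lengths-wrapped" as="element(length)*">
    <xsl:param name="text" as="xs:string" />
    <xsl:for-each select="tokenize(normalize-space($text), ' ')">
      <length><xsl:value-of select="string-length(.)" /></length>
    </xsl:for-each>
  </xsl:template>

  <!-- XSLT 2.0 replacement: a function returning typed values -->
  <xsl:function name="eg:word-lengths" as="xs:integer*">
    <xsl:param name="text" as="xs:string" />
    <xsl:sequence select="for $word in tokenize(normalize-space($text), ' ')
                          return string-length($word)" />
  </xsl:function>

  <!-- Entry point for trying both versions with a named initial template -->
  <xsl:template name="main">
    <xsl:variable name="wrapped" as="element(length)*">
      <xsl:call-template name="eg:word-lengths-wrapped">
        <xsl:with-param name="text" select="'a bb ccc'" />
      </xsl:call-template>
    </xsl:variable>
    <result>
      <!-- Atomizing untyped wrappers gives xs:untypedAtomic, which max() casts to xs:double -->
      <from-template integer="{max($wrapped) instance of xs:integer}" />
      <!-- The function's items keep their xs:integer type -->
      <from-function integer="{max(eg:word-lengths('a bb ccc')) instance of xs:integer}" />
    </result>
  </xsl:template>

</xsl:stylesheet>
```

In a basic (non-schema-aware) processor the first attribute should come out as false and the second as true, since only the function's results are still integers when they reach max().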

What I want to say is that now that we have a better data model in XPath 2.0, the need to write named templates producing non-text nodes is much less than it was in XSLT 1.0. Therefore, we should recommend that during such a conversion even a named template that produced wrapped values should be converted into an xsl:function producing a typed value or a sequence of typed values. The mere fact that the named template produced element(s) is not in itself sufficient justification to leave it “as is” in the XSLT 2.0 code.

As for using a named template when it produces real elements (not just value wrappers), one can also argue for the benefits of coding this as an xsl:function, following a naming convention (such as make<Name> or gen<Name>) to make the code more readable.

What would we gain if we were coding this as a named template? It would require more lines of code and also would need an intermediate step before being able to pass its result for additional processing.
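The difference can be sketched with the makeEmptyCells example from earlier in the thread. The fragment assumes both the named template and the eg:makeEmptyCells function are defined and that it sits inside a template body; the eg:decorate mode is a hypothetical name for whatever further processing is wanted:

```xml
<!-- Function: the new nodes go straight into the next processing step -->
<xsl:apply-templates select="eg:makeEmptyCells(3)" mode="eg:decorate" />

<!-- Named template: the result has to be captured in a variable first -->
<xsl:variable name="empty" as="element(td)*">
  <xsl:call-template name="makeEmptyCells">
    <xsl:with-param name="n" select="3" />
  </xsl:call-template>
</xsl:variable>
<xsl:apply-templates select="$empty" mode="eg:decorate" />
```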

A benefit could be that a named template will not confuse the XSLT processor into performing certain types of optimisation. However, the XSLT processor still has to analyze all xsl:functions and isolate those that produce new nodes, as it cannot rely completely and solely on the programmer’s discipline.

Cheers,

Dimitre Novatchev