I used to know how to arrange my XSLT modules. Each module had to be self-contained, and any common code imported into all the modules that used it. The reason? Because when you have on-going validation of your XSLT stylesheets, if the module can’t stand alone then you get all sorts of spurious errors. For example, if you define a variable in module A, which includes module B which uses that variable, then although the application as a whole will work fine, when you’re editing module B you’ll get errors because the variable isn’t defined in that module.
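Here's a minimal sketch of that situation. The file names (a.xsl, b.xsl) and the variable name are made up purely for illustration.

a.xsl:

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Global variable declared in the principal module... -->
  <xsl:variable name="site-title" select="'My Site'"/>

  <!-- ...and used by the module included here -->
  <xsl:include href="b.xsl"/>

</xsl:stylesheet>
```

b.xsl:

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/">
    <!-- $site-title is not declared anywhere in this module -->
    <title><xsl:value-of select="$site-title"/></title>
  </xsl:template>

</xsl:stylesheet>
```

Running a.xsl works, because the included templates can see the global variable. Validating b.xsl on its own reports an undeclared variable, even though nothing is actually wrong with the application.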
That rationale just got blown out of the water.
Yes, it’s another <oXygen/> release! Back in the 90s, the release of a new web browser would set my heart racing (yes, I was that much of a geek). Nowadays I get my thrills from new versions of <oXygen/> (yes, I’m still that much of a geek). <oXygen/>’s already packed with features that make my life easier, but somehow each release seems to come up with something that I never knew I needed but find I can’t live without.
It’s as if George Cristian Bina and the team are spying on me. I’ve been dealing recently with some large XSLT applications, written by someone else, that haven’t been designed with standalone modules. And it’s such a headache, not knowing whether an error that’s been reported is a real error (like a mis-spelled variable reference) or something that will be fine when the module is used in context. I’ve had to resort to running the transformation to identify static errors in the stylesheet, which is tedious and time-consuming.
New <oXygen/> release to the rescue! <oXygen/> 8.2 has “Validation Scenarios”, which means you can tell <oXygen/> to validate particular files starting from other modules. Suddenly the only errors that are reported are the ones that you really have to do something about. And the same technique works for schema files, or any document that needs to be validated in context!
There are a couple of other new things that are handy too. Just the other day I was thinking “Hmm, this Outline view is really useful but I wish it would show me the name attribute rather than the id attribute for these elements.” Now I can configure it to show whichever attributes I want. And multi-line search & replace: what a godsend! (There are lots more new features; these are just the ones that I’ve used in the day since I installed it.)
There were a couple of things that came up in the last version of <oXygen/> that I reported and have been fixed. One was a problem opening files with long lines: now if you try to open such a file you get asked if you want to format & indent it on opening. I think that being responsive to users (whether bug reports or requests for features) is a real indication of the development of great software. It makes a huge difference that George Cristian Bina (like Michael Kay) is approachable and active in the community: their applications are better for it, and their users much more loyal!
So now I have to go re-think my rules of thumb on how to organise modules…
[Disclosure: I get a free license for <oXygen/>, and they’ve made it easy for me to provide temporary licenses when I run training courses, so I’m in their debt. But honestly, I’d pay for it if I didn’t get it for free. I really couldn’t live without it.]
Comments
Re: Big XSLT applications just got easier to manage
Hi Jeni,
May I introduce myself: Justin Johansson from Downunder (South Australia).
My adventures with XSLT began in the early days with Instant Saxon which I used to produce my C.V. in multiple formats (txt, html, pdf). Over the years my Googling for XSLT info has often yielded information authored by yourself, among other leaders in the field so I would really respect your comments on the following.
Firstly, I chose to reply on this topic, "Big XSLT applications just got easier to manage", because as my knowledge and use of XSLT has matured, my XSLT apps have become bigger and bigger, to the point of being bigger than Ben Hur. This, in my mind, seeded the question: "What's the best way to factorise and reduce the amount of XSLT code one must write and manage, and how can I achieve this in a reasonable amount of time?"
Okay, that's the sizzle and now the sausage (my thought process and solution) ...
1. Use Functional Programming techniques both syntactically (to achieve conciseness of XSLT code) and algorithmically (develop XSLT solutions within an FP library framework).
2. Ponder an FP language for a model. I chose Haskell.
3. Implement how? Haskell to XSLT translator? Hmm, the impedance mismatch between the languages might just be a bit too much. Preprocess XSLT in some other way to achieve 1.?
4. Consider some use cases and see if preprocessing XSLT is the way to go.
Here's one use case that, whilst easy enough in native XSLT, still lends itself to the preprocessing approach in 3.
Say you want an XSLT function to sum all but the first value in a list of values.
<xsl:function name="sum-rest" as="xs:double">
  <xsl:param name="values" as="xs:double*"/>
  <xsl:value-of select="sum($values[position() > 1])"/>
</xsl:function>
For conciseness, I would prefer to write something like this:
<s:function name="sum-rest">
<s:type>xs:double* -> xs:double</s:type>
<s:definition>($x : $xs) := sum( $xs)</s:definition>
</s:function>
Still the same number of lines but, to me at least, more concise, because the as="" noise is somewhat reduced and the value-of select part is expressed much more succinctly.
Based on a few more complex use cases which could be solved using FP algorithmic metaphors like map, filter, foldl, etc., I concluded that preprocessing Haskell-like XSLT would provide significant advantages over writing in verbose native XSLT.
Whilst aware of the FXSL library, I felt that it did not go far enough in reducing XSLT verbosity.
As a classical musician, I felt that my solution, already largely developed, could aptly be called
Variations on the Haskell Prelude in XPath Major
for the XSLT
May I please have your comments on the idea?
Also, would you be interested in reviewing the package, which has been developed under Eclipse using Saxon 8b?
Thanks,
Justin Johansson
Adelaide, South Australia
+++ A horse with no name is called Lambda +++
Re: Big XSLT applications just got easier to manage
As you probably know, there have been many attempts over the years to make XSLT less verbose, the most recent I saw being Sam Wilmott’s RXSLT, presented at Extreme Markup Languages last year. XSLScript seems to have disappeared, but NiceXSL is still around. These attempts are largely, in my experience, suited to one particular person (their developer) and don’t get much use outside that rather small set.
But the general approach is very useful, and something that all serious XSLT programmers should have in their toolkit: write an abstract, declarative description of what you want the transformation to do; write a program (perhaps in XSLT) to transform that into XSLT; and run. Basically, it’s writing a compiler from your own notation into XSLT: from one extremely high-level programming language into another very high-level programming language. I wouldn’t personally use the approach as a general XSLT replacement, but I have done it for specific processing, to counter the verbosity of having lots of very similar templates in a stylesheet.
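As a rough sketch of the "compile your own notation into XSLT" idea, here is one way to do it with xsl:namespace-alias. The mapping vocabulary (mappings, map, from, to) and the namespace URI bound to out are invented for this example; they aren't from any real tool.

mappings.xml:

```xml
<mappings>
  <map from="para" to="p"/>
  <map from="emph" to="em"/>
</mappings>
```

compile.xsl:

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:out="http://example.com/ns/generated-xslt">

  <!-- Elements written with the out: prefix end up in the XSLT namespace
       in the result, so this stylesheet can emit another stylesheet -->
  <xsl:namespace-alias stylesheet-prefix="out" result-prefix="xsl"/>

  <xsl:output indent="yes"/>

  <xsl:template match="/mappings">
    <out:stylesheet version="1.0">
      <xsl:apply-templates select="map"/>
    </out:stylesheet>
  </xsl:template>

  <!-- Each map element becomes a template that renames @from to @to -->
  <xsl:template match="map">
    <out:template match="{@from}">
      <xsl:element name="{@to}">
        <out:apply-templates/>
      </xsl:element>
    </out:template>
  </xsl:template>

</xsl:stylesheet>
```

Running compile.xsl over mappings.xml should produce a stylesheet with one template per map element (for example, a template matching para that wraps its content in a p element), which you then run over the real source document. The escape hatch in a set-up like this could be as simple as importing the generated stylesheet into a hand-written one.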
There are downsides, of course, in particular difficulties with debugging, and it’s always a good idea to leave an “escape hatch” so that you can write in normal XSLT if your more abstract syntax doesn’t quite manage all the bells and whistles. The approach will get that much easier to use when XProc gets done, because there’ll be an easy way to string the compilation and the transformation together into a single command.
I’m sure that Dimitre will be very interested in what you’re up to, particularly as you’re using Haskell as your notation, and he would probably be in a better position to talk about the specifics.
Re: Big XSLT applications just got easier to manage
Thank you for your comments Jeni, and also for the links. I’m sure to be talking to Dimitre soon.
Justin Johansson
Re: Big XSLT applications just got easier to manage
Jeni,
Do you think that it is good practice to have a stand-alone XSLT module whose compile-time validity depends on global objects (such as xsl:variables) in other modules? Could you provide an example of when doing this is recommended, or when it cannot be avoided?
It seems to be just common sense that the including module should depend on the ones it is including/importing — not the opposite.
Even when we have nice tools such as Oxygen, that help make such practice more bearable, does this make it a good practice?
Dimitre Novatchev
Re: Big XSLT applications just got easier to manage
I’m inclined to agree that usually modules should be self-contained, and only depend on declarations from those it imports or includes. But there are situations where, for whatever reason, this is impractical.
For example, the developer of the stylesheets I’m working with at the moment defines all stylesheet parameters in the principal stylesheet module. One reason for this is that he’s using a tool to run the stylesheet which sets parameters by editing the DOM for the stylesheet document, but he also believes that that’s where stylesheet parameters belong. I have some sympathy with that view; you could define them twice, but there’s always the danger that the definitions will get out of sync.
I think that there are two main reasons for having several modules:
In the first case, you’ll typically import the module, and here I think it makes sense for the module to be self-contained.
In the second case, you’ll typically include the modules into the principal stylesheet module. Here, I think the argument for modules being self-contained is much weaker. You might split up your stylesheet based on the kind of elements it processes, or the kind of processing it does (for example, one module that does all the grouping work), or the namespace the element comes from, or the version it was introduced in, or for any number of other reasons.
It’s not hard to get into a situation where two modules each contain code that relies, in some way, on the other. For example, you might have one module that has code for processing tables, and another module that has code for processing sections and paragraphs. To do its job correctly, the module that processes tables needs to be able to handle paragraphs, but the module that processes sections and paragraphs needs to be able to handle tables (since tables can appear within sections). In that case, it seems wrong to impose an artificial precedence between the two modules.
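A minimal sketch of that arrangement might look like this. The file names and the source element names (section, para, table, row, cell) are assumptions for illustration only.

main.xsl:

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- xsl:include gives both modules the same import precedence -->
  <xsl:include href="blocks.xsl"/>
  <xsl:include href="tables.xsl"/>
</xsl:stylesheet>
```

blocks.xsl:

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="section">
    <div>
      <!-- Tables inside sections are handled by tables.xsl -->
      <xsl:apply-templates select="para | table"/>
    </div>
  </xsl:template>

  <xsl:template match="para">
    <p><xsl:apply-templates/></p>
  </xsl:template>
</xsl:stylesheet>
```

tables.xsl:

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="table">
    <table><xsl:apply-templates select="row"/></table>
  </xsl:template>

  <xsl:template match="row">
    <tr><xsl:apply-templates select="cell"/></tr>
  </xsl:template>

  <xsl:template match="cell">
    <!-- Paragraphs inside cells are handled by blocks.xsl -->
    <td><xsl:apply-templates select="para"/></td>
  </xsl:template>
</xsl:stylesheet>
```

Neither module is "above" the other here. If one imported the other instead, its templates would take precedence over the other's for no reason that reflects the design.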
Re: Big XSLT applications just got easier to manage
Jeni isn’t talking about standalone modules.
Let me give an example. We want to do l10n of a web form. Generally the structure is going to be all the same, but the labels will be different. So we abstract the shared elements into a separate module that will be imported by the individual localised XSL components. E.g.
layout.xsl:
<dl>
  <dt><xsl:call-template name='name.label'/></dt>
  <dd><input type='text' name='name'/></dd>
  <dt><xsl:call-template name='email.label'/></dt>
  <dd><input type='text' name='email'/></dd>
</dl>
form_en.xsl:
form_fr.xsl:
Then you just process using whichever version of form.xsl best fits the locale. layout.xsl isn’t a standalone module, but you still need to edit it.
Re: Big XSLT applications just got easier to manage
The cure for this is not to call templates in the imported module, but to use template references.
So, instead of:
one would write:
There will be no redundant code in the imported module; just the namespace to which “f” is bound needs to be the same in both the imported and the importing modules.
In the importing module, instead of
one would have this:
To summarise, this is a demonstration of a good practice to design modularised XSLT applications, which totally avoids the problem of preventing the imported module from being able to exist stand-alone.
Cheers,
Dimitre Novatchev
Re: Big XSLT applications just got easier to manage
I think that to get this to work, in the imported module you would have something like
and then use
(Or something similar.) This is a technique that Dimitre uses to great effect in FXSL, but it can be a bit hard to follow unless you’re familiar with the pattern.
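For anyone who hasn't seen the pattern, here is a sketch of how I read it, written for XSLT 1.0. The namespace URI bound to f and the refs variable are my own inventions, and this isn't lifted from FXSL.

layout.xsl:

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:f="http://example.com/ns/form-labels"
                exclude-result-prefixes="f">

  <!-- Top-level data elements that act as "template references" -->
  <f:name.label/>
  <f:email.label/>

  <!-- document('') reads this stylesheet module itself -->
  <xsl:variable name="refs" select="document('')/*"/>

  <xsl:template match="/">
    <dl>
      <dt><xsl:apply-templates select="$refs/f:name.label"/></dt>
      <dd><input type="text" name="name"/></dd>
      <dt><xsl:apply-templates select="$refs/f:email.label"/></dt>
      <dd><input type="text" name="email"/></dd>
    </dl>
  </xsl:template>

</xsl:stylesheet>
```

form_en.xsl:

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:f="http://example.com/ns/form-labels"
                exclude-result-prefixes="f">

  <xsl:import href="layout.xsl"/>

  <!-- The importing module supplies the label text by matching
       the reference elements declared in layout.xsl -->
  <xsl:template match="f:name.label">Name</xsl:template>
  <xsl:template match="f:email.label">Email</xsl:template>

</xsl:stylesheet>
```

Because layout.xsl only applies templates to its own f: elements and never calls a named template defined elsewhere, it contains no static errors when validated on its own. Run on its own, the built-in template rules simply produce empty labels.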
Personally, in this scenario, I’d still be tempted to make the imported module “standalone” (by which I mean that it doesn’t contain any static errors when validated as the principal module, even if it doesn’t produce any useful output). In Chris’ set-up, I’d define the named templates in the imported module, and in Dimitre’s I’d provide templates that matched the <f:name.label> and <f:email.label> elements. They could either provide some kind of default return value, or a terminating message that warned that they hadn’t been overridden. (In fact, I think it’s particularly important to do this in Dimitre’s example, since otherwise you’ll get no warning if you forget to define those templates.)
Re: Big XSLT applications just got easier to manage
Personally, in this scenario, I’d still be tempted to make the imported module “standalone” (by which I mean that it doesn’t contain any static errors when validated as the principal module, even if it doesn’t produce any useful output). In Chris’ set-up, I’d define the named templates in the imported module, and in Dimitre’s I’d provide templates that matched the <f:name.label> and <f:email.label> elements. They could either provide some kind of default return value, or a terminating message that warned that they hadn’t been overridden. (In fact, I think it’s particularly important to do this in Dimitre’s example, since otherwise you’ll get no warning if you forget to define those templates.)
Right, Jeni,
This is not necessary in the case of FXSL (for XSLT 2.0), because the code of the imported modules is organised as xsl:functions and is accessed through function calls. If any template reference parameter were not supplied, there would be a static compile-time error noting the wrong number and/or wrong type of the arguments passed to the function.
Cheers,
Dimitre Novatchev
Re: Big XSLT applications just got easier to manage
Agreed. My solution was designed for XSLT 1.0 compatibility, and to avoid any dependency on the structure of the source document(s).
Re: Big XSLT applications just got easier to manage
First off, I absolutely love your blog! You are a legend of XSLT and it shows.
Secondly, it would be awesome if you would consider going into more detail regarding practices for creating modules for XSLT. One issue I have run into in trying to make XSLT modules is that it is difficult for me to break everything up in a way that allows reuse and allows simple use from different applications/libraries. Part of my issue is somewhat specific to using Python, where a library I write that utilizes XSLT for working with XML can be packaged as a standalone egg. Since I can’t depend on a single point of install, it is not trivial to simply include the other XSLT b/c they are essentially within a package that is not a typical filesystem.
I have some ideas of how to get around this and it really shouldn’t be too hard, but if you have any design patterns regarding this sort of thing please consider posting.
Oh yes, Oxygen does rule. The Eclipse plugin with emacs keybindings is awesome. Thanks for blogging!!
Re: Big XSLT applications just got easier to manage
Do you mean that you have several XSLT applications, some of which reuse the same modules, but you can’t refer to a central location for them because you don’t know where they’re going to be installed?
Re: Big XSLT applications just got easier to manage
vim works for me.