Those of you who have been following this blog will know that I’ve been thinking recently about how to handle uncertainty related to RDF triples (specifically in the context of a genealogical web app). Certainty isn’t the only kind of metadata-about-triples that you’d want to keep in an app like this. We need to know things like:
- who made the statement
- when the statement was made
- what evidence led to the statement being made
- licensing information about the reuse of the statement
- (if we go with the rating idea) what ratings the statement has been given
- (if we allow editing of statements) what changes have been made to the statement over time
and so on. In short, all the metadata that you’d want to associate with resources you’d also want to associate with statements.
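As a rough sketch of what that might look like in application code (not in RDF itself), the record below bundles a triple with the kinds of metadata listed above. The class name, field names, `ex:` identifiers and the date are all made up for illustration; a real app would more likely lean on reification or named graphs in the triple store.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional, Tuple

# A triple as (subject, predicate, object); identifiers are illustrative only.
Triple = Tuple[str, str, str]


@dataclass
class StatementRecord:
    """Hypothetical container pairing one statement with its metadata."""
    triple: Triple
    made_by: str                                         # who made the statement
    made_on: date                                        # when it was made
    evidence: List[str] = field(default_factory=list)    # sources supporting it
    licence: Optional[str] = None                        # reuse terms, if known
    ratings: List[int] = field(default_factory=list)     # ratings it has been given
    history: List[Triple] = field(default_factory=list)  # earlier versions

    def revise(self, new_triple: Triple) -> None:
        """Replace the statement, keeping the previous version in the history."""
        self.history.append(self.triple)
        self.triple = new_triple


record = StatementRecord(
    triple=("ex:person1", "ex:bornIn", "ex:york"),
    made_by="ex:alice",
    made_on=date(2008, 1, 1),
    evidence=["ex:parishRegister"],
)
record.ratings.append(4)
record.revise(("ex:person1", "ex:bornIn", "ex:leeds"))
```

Every field here describes the statement rather than the person or place it mentions, which is exactly the resource-style metadata the list above calls for.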
A question about how to refactor some repetitive templates.
The issue is in creating XHTML headings.
For a small DocBook article, I have the following templates in one of my stylesheets:
The Free Our Bills campaign was launched recently in the UK. Some of the comments I’ve seen about the campaign make me think that it might be helpful if people understood more about how Bills and legislation get published in the UK. I thought I’d offer a bit of background based on my experience (though there are many people with more intimate knowledge of the processes involved; perhaps they’ll correct me when I get it wrong).
The XPath for accessing a particular part’s title would be /law/part2/title so the PRESTO URLs would need some kind of convention.
Now, I am not sure I understand the issues well enough to say which system for indexing is absolutely best. But I think the advantage of http://www.eg.com/law/part2/title is that it is probably a more common case that your system is interested in /law/part/title rather than all titles of parts /law/part/title. But it is a matter of the particular use case and the consequent virtual schema.
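To make "some kind of convention" concrete, here is one possible mapping, sketched in Python. It assumes (and this is only an assumption, not something PRESTO defines) that a number at the end of a URL path segment is read as a positional predicate, so that part2 means the second part element. The function name `url_to_xpath` is hypothetical.

```python
import re
from urllib.parse import urlparse

# A path segment is an element name optionally followed by a position,
# e.g. "part2" -> name "part", position "2". This is an assumed convention.
STEP = re.compile(r"^([A-Za-z_][A-Za-z_.-]*)(\d+)?$")


def url_to_xpath(url: str) -> str:
    """Turn the path of a URL into an XPath under the assumed convention."""
    steps = []
    for segment in urlparse(url).path.strip("/").split("/"):
        match = STEP.match(segment)
        if match is None:
            raise ValueError(f"cannot interpret path segment {segment!r}")
        name, position = match.groups()
        # A trailing number becomes a positional predicate; otherwise the
        # step selects every element with that name.
        steps.append(f"{name}[{position}]" if position else name)
    return "/" + "/".join(steps)


particular = url_to_xpath("http://www.eg.com/law/part2/title")
all_titles = url_to_xpath("http://www.eg.com/law/part/title")
```

Under this convention the first URL addresses the title of one particular part and the second addresses the titles of all parts. The obvious cost is that element names ending in a digit (h1, say) can no longer be addressed directly, which is the kind of trade-off that depends on the use case and the virtual schema.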
I’m still thinking about doing automatic markup with XML pipelines, and the kind of components that you might need in such a pipeline. These are the useful ones (list inspired by the components offered by GATE):
- a tokeniser that uses regular expressions to add markup to plain text
- a gazetteer that uses a lookup to add markup to plain text
- an annotator that adds attributes to existing elements based on their context/content
- a grouper that adds markup around sequences of existing markup
- a stripper that removes markup
- a general purpose transformer that uses XSLT to do just about everything else
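A minimal sketch of three of these components (tokeniser, gazetteer and stripper) follows, written in Python rather than as real pipeline steps. All function names are hypothetical, the element names year and place are invented for the example, and the stages work on strings with regular expressions, so they only suit the simple, attribute-free markup they produce themselves.

```python
import re
from xml.sax.saxutils import escape


def tokenise(text, pattern, tag):
    """Wrap each regex match in <tag>...</tag>, escaping the plain text."""
    out, last = [], 0
    for m in re.finditer(pattern, text):
        out.append(escape(text[last:m.start()]))
        out.append(f"<{tag}>{escape(m.group())}</{tag}>")
        last = m.end()
    out.append(escape(text[last:]))
    return "".join(out)


def gazetteer(text, terms, tag):
    """Mark up known terms; a lookup list expressed as a regex alternation."""
    # Longest terms first so that a longer name wins over its own prefix.
    ordered = sorted(terms, key=len, reverse=True)
    pattern = r"\b(?:" + "|".join(re.escape(t) for t in ordered) + r")\b"
    return tokenise(text, pattern, tag)


def strip_markup(xml, tag):
    """Remove the start and end tags of one element, keeping its content."""
    return re.sub(rf"</?{re.escape(tag)}>", "", xml)


def run_pipeline(text, stages):
    """Apply each stage in order, feeding the output of one into the next."""
    for stage in stages:
        text = stage(text)
    return text


source = "Born in York in 1850 & moved to Leeds"
years = tokenise(source, r"\b\d{4}\b", "year")
places = gazetteer(source, ["York", "Leeds"], "place")
round_trip = run_pipeline(
    source,
    [lambda t: tokenise(t, r"\b\d{4}\b", "year"),
     lambda x: strip_markup(x, "year")],
)
```

Here the tokeniser and gazetteer both expect plain text, so in this sketch they cannot be chained one after the other; a real pipeline would have them work on text nodes within existing markup instead.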