There are a couple of additional things that I think are worth drawing out.
This is the second instalment in a series of posts about how to create linked data from existing data sets, using traffic count data as an example. In the last instalment, I talked about analysing and modelling data. This instalment discusses the creation of URIs for the various things that have been identified within the model.
This part of the process is the same as what you’d do if you were simply creating a RESTful API to a website. The principal is that everything has a URI, and if you resolve that URI you get information about the thing.
I’m sure that you’ve noticed that my recent posts have been somewhat obsessed with publishing and using public sector information. It’s because I’ve somehow been sucked into the work going on within the UK government, with Tim Berners-Lee and Nigel Shadbolt advising, to publish its data as linked data.
I’ve been talking about URIs a lot recently. One of the things that has bothered me about some of the conversations is the conflation of the concepts of “opaque URIs” and “non-human-readable URIs”. This is my argument for keeping the concepts separate.
The opacity of URIs is an important axiom in web architecture. It states that web applications must not try to pick apart URIs in order to work out information from them. Applications must not, for example, use the fact that a URI has
.html at the end to infer that it resolves to an HTML document. It’s closely related to hypertext as engine of application state, in that opaque URIs should not be generated by web applications either: they must be discovered through links and the submission of forms.
But this has nothing to do with readability or hackability, both of which are extremely important for human users. Readable URIs help human users understand something about the resource that the URI is pointing to. Hackable URIs (by which I mean ones that people might manipulate by altering or removing portions of the path or query) enable human users to locate other resources that they might be interested in.
Yesterday I went along to a workshop on developing URI guidelines for the UK public sector. Because of the current drive to get more UK public sector information online, and the fact that we have Tim Berners-Lee on board, there’s a growing recognition of the fact that we need URIs for the real-world and conceptual things that we talk about in the public sector: schools, roads, hospitals, services, councils, and so on.
One of the particular points of contention at the meeting was whether URIs for non-information resources (ie for real-world and conceptual things) should contain dates or version numbers, or not.