I’ve been talking about URIs a lot recently. One of the things that has bothered me about some of the conversations is the conflation of the concepts of “opaque URIs” and “non-human-readable URIs”. This is my argument for keeping the concepts separate.
The opacity of URIs is an important axiom in web architecture. It states that web applications must not try to pick apart URIs in order to work out information from them. Applications must not, for example, use the fact that a URI ends in .html to infer that it resolves to an HTML document. It’s closely related to hypertext as the engine of application state, in that opaque URIs should not be generated by web applications either: they must be discovered through links and the submission of forms.
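To make the .html example concrete, here is a minimal Python sketch of an application that treats the URI as opaque and relies on the Content-Type header of the response instead. The function names and the example.org URIs are my own illustrations, and the headers are passed in as a plain dictionary so that the sketch runs without a network connection.

```python
def media_type(content_type_header: str) -> str:
    """Return the bare media type from a Content-Type header value,
    dropping parameters such as charset."""
    return content_type_header.split(";")[0].strip().lower()


def is_html(uri: str, response_headers: dict) -> bool:
    """Decide whether a response is HTML.

    The uri argument is deliberately never inspected: it is opaque.
    Only what the server says about the representation counts.
    """
    return media_type(response_headers.get("Content-Type", "")) == "text/html"


# A URI ending in .html that actually serves a PDF.
looks_like_html = is_html(
    "http://example.org/report.html",
    {"Content-Type": "application/pdf"},
)

# A URI with no suffix at all that serves HTML.
no_suffix = is_html(
    "http://example.org/id/123",
    {"Content-Type": "text/html; charset=utf-8"},
)
```

In this sketch the first call is false and the second is true, which is the opposite of what guessing from the suffix would give.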
But this has nothing to do with readability or hackability, both of which are extremely important for human users. Readable URIs help human users understand something about the resource that the URI is pointing to. Hackable URIs (by which I mean ones that people might manipulate by altering or removing portions of the path or query) enable human users to locate other resources that they might be interested in.
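As a rough illustration of what “hacking” a URI looks like, the following Python sketch lists the URIs a curious person might try by dropping the query and then removing path segments one at a time. It only mimics what a human does in the address bar; it is not something an application should rely on, and nothing guarantees that any of the resulting URIs resolve. The function name and the example.org URI are made up for the example.

```python
from typing import List
from urllib.parse import urlsplit, urlunsplit


def hacked_uris(uri: str) -> List[str]:
    """Return the URIs obtained by dropping the query and fragment,
    then removing trailing path segments one by one."""
    parts = urlsplit(uri)
    segments = [s for s in parts.path.split("/") if s]
    candidates = []
    while segments:
        segments.pop()  # remove the last remaining path segment
        path = "/" + "/".join(segments)
        candidates.append(urlunsplit((parts.scheme, parts.netloc, path, "", "")))
    return candidates


candidates = hacked_uris("http://example.org/schools/nottingham/2008?format=rdf")
```

Here `candidates` holds http://example.org/schools/nottingham, http://example.org/schools and http://example.org/, in that order.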
Yesterday I went along to a workshop on developing URI guidelines for the UK public sector. Because of the current drive to get more UK public sector information online, and the fact that we have Tim Berners-Lee on board, there’s a growing recognition of the fact that we need URIs for the real-world and conceptual things that we talk about in the public sector: schools, roads, hospitals, services, councils, and so on.
One of the particular points of contention at the meeting was whether URIs for non-information resources (i.e. for real-world and conceptual things) should contain dates or version numbers.
This is the talk I prepared for the UKGovWeb Barcamp, in blog form. It’s probably better this way. Most of what’s written here seems blindingly obvious to me, and probably to most readers of this blog, but maybe Google will direct someone here who finds it useful.
Working with public-sector information on the web, one of the things that I take an interest in is making government data freely available for anyone to re-present, mash-up, analyse and generally do whatever they want to do. This post is born out of a feeling that the people who control data don’t realise that the smallest changes can be beneficial: they don’t need to do everything right now, just something.
XTech was subtitled “the mobile web”, but one of the major themes for me was that of the distributed web. The first keynote, by Simon Wardley, gave a vision of a future in which hardware, frameworks and applications are services in the cloud rather than products on machines we own: where we use flickr to store our photographs, Google App Engine to host our applications, and Amazon S3 to store our data. In David Recordon’s keynote (written up by Jeremy Keith), he talked about small, specific services provided by sites that aren’t “destination sites”. The same picture was painted by Gareth Rushgrove in his talk on Design Strategies for a Distributed Web.
I finally have some time to write about XTech. What a great conference! I know that Edd would like it bigger, but its modest size gives it a family feel. Like a family gathering, there are pontificating oldsters whose wisdom goes largely unappreciated by young upstarts who themselves bring energy and innovation to the crowd. And a bunch in the middle trying to translate across the gap: to explain the vision to the old and the reality to the new.
Roughly ten years ago, I was attending KAW’98. I remember that conference as one of the best weeks of my life. I had great company. I saw scenery like I’d never seen before. I presented my PhD work for the first time to people who were (at least politely) interested in it. And I learned a lot, both from the presentations and less formal discussions.
(I remember driving back to Nottingham when we returned; a rainbow appeared in front of us, seeming to arch over our destination in a perfect finale.)
Looking back at that paper is like looking back at my past generally: much of it makes me cringe, but parts of it are surprisingly good. What’s interesting is that if you swap a few terms for modern buzzwords, it’s still a pretty neat idea. It’s also amazing how far we’ve come, and how much has become commonplace, in just ten years.
Rick Jelliffe has been writing recently about PRESTO, most recently about the design of URLs based on the PRESTO system. In his latest post, Rick talks about using XPath as the basis of a URL scheme:
The Xpath for accessing a particular part’s title would be /law/part[2]/title so the PRESTO URLs would need some kind of convention.
[snip]
Now, I am not sure I understand the issues well enough to say which system for indexing is absolutely best. But I think the advantage of http://www.eg.com/law/part2/title over http://www.eg.com/law/part2/title is that it is probably a more common case that your system is interested in /law/part[2]/title rather than all titles of parts /law/part/title. But it is a matter of the particular use case and the consequent virtual schema.
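One possible convention of the kind Rick mentions can be sketched in a few lines of Python: fold each positional predicate into the step name, so that part[2] becomes part2. This is only my reading of the single example quoted above, not a description of how PRESTO actually works, and the function name is hypothetical.

```python
import re


def xpath_to_url_path(xpath: str) -> str:
    """Turn a simple XPath with positional predicates into a URL path
    by folding each [n] into the preceding step name."""
    return re.sub(r"\[(\d+)\]", r"\1", xpath)


url = "http://www.eg.com" + xpath_to_url_path("/law/part[2]/title")
```

With this sketch, `url` is http://www.eg.com/law/part2/title. The mapping is lossy when element names themselves end in digits (part2[1] and part[21] both become part21), which is one reason a real convention would need more thought.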
There were a couple of comments on my previous post about RDF and uncertainty in our Web 2.0 genealogy project concerning the problems of privacy in a genealogy app. My ideas about this aren’t fully thought through, let alone implemented, but I thought I’d share them anyway.
First, the things we could restrict access to are:
Questions of identity and privacy are rather topical at the moment, especially here in Britain where last week a database dump including the names, addresses and bank details of half the country, along with our children’s names and dates of birth, got “lost in the post”.
So what better time to announce a new online identity metric? My PhD supervisor, Nigel Shadbolt, is the CTO of Garlik, so earlier in the week I got an invite to the launch of QDOS.
Like the online identity calculator that I wrote about before, QDOS gives you a score based on your online presence. However, this score isn’t just based on a Google search. It has four components (which are each represented by a different colour, and are combined to give a very pretty pictorial “fingerprint”; check out Tim Berners-Lee’s QDOS, for example).
OK, so I can’t remain a Luddite for long. What’s a technological solution to the posterity problem, in particular in regard to web applications that tuck away all your data in their databases, just waiting to be forgotten?
Well, what if web applications accepted information as feeds rather than through forms? The original data would be distributed rather than centralised. Web applications would use the web as more than a distribution medium: they would be of the web rather than simply on the web.