The new membership of the W3C’s Technical Architecture Group (TAG), and some of the recent discussions on the TAG list about polyglot markup, have made me think about what the TAG should stand for and the role the TAG should play.
Fundamentally, the web is for everyone, whatever gender, whatever race, whatever sexual orientation, whatever visual or mental ability and so on. The web community should fight to keep the web open to all. And it should try to be a community that is open to all.
Disclaimer: As usual, this post contains my personal opinion and does not reflect that of any organisation with which you might associate me.
The other day, I had a lovely conversation with some folks from the BBC about some of their future plans. In the course of the conversation, Michael Smethurst spoke about his frustration when dealing with people involved with particular programmes at the BBC, where every single one of them thinks their programme is a “precious snowflake”, completely unique, that simply can’t be treated in the same way as all the other programmes described on the site.
Michael’s point, of course, is that TV programmes have a hell of a lot of similarities with each other. They all have episodes and cast members and may have trailers or be available on iPlayer. When the BBC models them in the same way, they gain enormous efficiencies in their ability to store and access information about programmes: they can reuse code, share content between programmes, and perform analyses over the aggregated data set. It’s great for users too: they get the same fantastic user experience no matter which programme they are viewing information about, and can apply what they learn navigating pages about one programme when they need to find information about another.
The ability to classify and categorise, to bring order to what seems like chaos, to create a model of the world, is one of the things that sets humans apart from animals. We can look at a hundred people, with different coloured hair and skin; different heights and builds; smiling, talking, crying; and still call them all Person because the essential characteristics that govern how we interact with them are the same.
But if there’s one thing that the last five long, hard years working with legislation has taught me, it’s that in any vaguely interesting domain, this search for order will always fall down in the face of reality.
As you may know, I accepted an appointment to the W3C’s Technical Architecture Group earlier this year. Last week was the first face-to-face meeting that I attended, hosted in the Stata Center at MIT. As you can tell from the agenda (which was in fact revised as we went along), it was a packed three days.
What I intend to do here is briefly report on the major areas that we discussed and give a tiny bit of my own personal take on them. In no way should any of what I write here be judged as revealing the official opinion of the TAG; it’s just me saying what I think. And I’m not going to go into anything in depth, because these are all incredibly gnarly and contentious topics and I’d not only be here all year but would also end up in a tar pit.
I’ve been talking about URIs a lot recently. One of the things that has bothered me about some of the conversations is the conflation of the concepts of “opaque URIs” and “non-human-readable URIs”. This is my argument for keeping the concepts separate.
The opacity of URIs is an important axiom in web architecture. It states that web applications must not try to pick apart URIs in order to work out information from them. Applications must not, for example, use the fact that a URI has .html at the end to infer that it resolves to an HTML document. It’s closely related to hypertext as the engine of application state, in that opaque URIs should not be generated by web applications either: they must be discovered through links and the submission of forms.
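As a concrete illustration, here’s a minimal sketch in Python (with hypothetical URLs of my own invention) of what respecting opacity looks like in practice: the application learns what a URI resolves to from the Content-Type of the response, not from the shape of the URI.

```python
# A minimal sketch of respecting URI opacity, using Python's standard
# library. The URLs are hypothetical. The point: the media type of a
# resource comes from the Content-Type header of the response, never
# from picking apart the URI itself.
from urllib.request import urlopen

def media_type(uri: str) -> str:
    """Ask the server what the resource is, rather than guessing from the URI."""
    with urlopen(uri) as response:
        # The Content-Type header is authoritative; a ".html" suffix is not.
        return response.headers.get_content_type()

# Either, both, or neither of these might resolve to an HTML document;
# only the response can tell us:
print(media_type("http://www.example.org/report.html"))
print(media_type("http://www.example.org/report"))
```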
But this has nothing to do with readability or hackability, both of which are extremely important for human users. Readable URIs help human users understand something about the resource that the URI is pointing to. Hackable URIs (by which I mean ones that people might manipulate by altering or removing portions of the path or query) enable human users to locate other resources that they might be interested in.
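To illustrate hackability, here’s a small sketch (again Python, with a hypothetical programme URL) of the sort of manipulation I mean: trimming trailing path segments to reach related resources.

```python
# A sketch of what I mean by a hackable URI: a human user, seeing a readable
# path, can guess at related resources by trimming segments off the end.
# The programme URL here is hypothetical.
from urllib.parse import urlsplit, urlunsplit

def ancestors(uri: str):
    """Yield the URIs a user might try by removing trailing path segments."""
    parts = urlsplit(uri)
    segments = parts.path.rstrip("/").split("/")
    while len(segments) > 1:
        segments.pop()
        yield urlunsplit((parts.scheme, parts.netloc, "/".join(segments) or "/", "", ""))

# From a page about one episode, a user might hack their way up to the
# series, the programme, and the programme listing as a whole:
for uri in ancestors("http://www.example.org/programmes/spooks/series-4/episode-1"):
    print(uri)
# http://www.example.org/programmes/spooks/series-4
# http://www.example.org/programmes/spooks
# http://www.example.org/programmes
# http://www.example.org/
```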
Yesterday I went along to a workshop on developing URI guidelines for the UK public sector. Because of the current drive to get more UK public sector information online, and the fact that we have Tim Berners-Lee on board, there’s a growing recognition of the fact that we need URIs for the real-world and conceptual things that we talk about in the public sector: schools, roads, hospitals, services, councils, and so on.
One of the particular points of contention at the meeting was whether URIs for non-information resources (ie for real-world and conceptual things) should contain dates or version numbers, or not.
This is the talk I prepared for the UKGovWeb Barcamp, in blog form. It’s probably better this way. Most of what’s written here seems blindingly obvious to me, and probably to most readers of this blog, but maybe Google will direct someone here who finds it useful.
Working with public-sector information on the web, one of the things that I take an interest in is making government data freely available for anyone to re-present, mash-up, analyse and generally do whatever they want to do. This post is born out of a feeling that the people who control data don’t realise that the smallest changes can be beneficial: they don’t need to do everything right now, just something.
XTech was subtitled “the mobile web”, but one of the major themes for me was that of the distributed web. The first keynote, by Simon Wardley, gave a vision of a future in which hardware, frameworks and applications are services in the cloud rather than products on machines we own: where we use Flickr to store our photographs, Google App Engine to host our applications, and Amazon S3 to store our data. In David Recordon’s keynote (written up by Jeremy Keith), he talked about small, specific services provided by sites that aren’t “destination sites”. The same picture was painted by Gareth Rushgrove in his talk on Design Strategies for a Distributed Web.
I finally have some time to write about XTech. What a great conference! I know that Edd would like it bigger, but its modest size gives it a family feel. Like a family gathering, there are pontificating oldsters whose wisdom goes largely unappreciated by young upstarts who themselves bring energy and innovation to the crowd. And a bunch in the middle trying to translate across the gap: to explain the vision to the old and the reality to the new.
Roughly ten years ago, I was attending KAW’98. I remember that conference as one of the best weeks of my life. I had great company. I saw scenery like I’d never seen before. I presented my PhD work for the first time to people who were (at least politely) interested in it. And I learned a lot, both from the presentations and less formal discussions.
(I remember driving back to Nottingham when we returned; a rainbow appeared in front of us, seeming to arch over our destination in a perfect finale.)
Looking back at that paper is like looking back at my past generally: much of it makes me cringe, but parts of it are surprisingly good. What’s interesting is that if you swap a few terms for modern buzzwords, it’s still a pretty neat idea. It’s also amazing how far we’ve come, and how much has become commonplace, in just ten years.
The XPath for accessing a particular part’s title would be /law/part/title, so the PRESTO URLs would need some kind of convention.
Now, I am not sure I understand the issues well enough to say which system for indexing is absolutely best. But I think the advantage of http://www.eg.com/law/part2/title is that it is probably more common for your system to be interested in a particular part’s title (/law/part2/title) than in the titles of all parts (/law/part/title). But it is a matter of the particular use case and the consequent virtual schema.
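For illustration, here’s a tiny sketch of one such convention; the rewrite rule and the paths are my own assumption, not anything fixed by PRESTO itself: an XPath positional predicate like /law/part[2]/title maps onto the URL path /law/part2/title.

```python
# A hypothetical sketch of the kind of convention the PRESTO URLs might use:
# rewrite an XPath positional predicate, as in /law/part[2]/title, into a URL
# path where the index is fused onto the step name, as in /law/part2/title.
# The rule and paths are my own illustration, not a fixed part of PRESTO.
import re

def presto_path(xpath: str) -> str:
    """Turn /law/part[2]/title into /law/part2/title."""
    return re.sub(r"\[(\d+)\]", r"\1", xpath)

assert presto_path("/law/part[2]/title") == "/law/part2/title"
# Without a positional predicate the path stays as it is, which is exactly
# the ambiguity (one part's title versus the titles of all parts) that
# makes a convention necessary:
assert presto_path("/law/part/title") == "/law/part/title"
```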