This week, the Cabinet Office went live with a preview version of hmg.gov.uk/data, available only to those who subscribe to the UK Government Data Developers Google Group. Harry Metcalfe has written a great review, or of course you can check it out yourselves.
Already, though, there are discussions starting on the mailing list about how the data is being made available, and I’m worried that these might distract us from getting things done.
There are a range of ways in which data can be made available on the web:
All of us Open Data advocates agree that we need to encourage government away from the first two methods of making data “available” and towards the last three. But there’s a vast array of opinion within the developer community about which of the latter three are most useful, and precisely which technologies to use in each of those categories.
What’s the real answer? “All of them!”
We need the raw data because it lets us double-check every other interface that is built on top of it. We need RESTful APIs. We need them to serve RDF and XML and JSON and CSV and all the other formats that people ask for. We need the data to be made available in SQL databases and NoSQL databases and triplestores; we need access to SQL queries and SPARQL queries and map/reduce processing. All of that, all of them, and more.
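To make the "all of them" point concrete, here is a minimal sketch of how one raw file can feed several formats at once. Everything in it is an assumption made for illustration: the two-row CSV extract is invented (it is not real edubase data), and the function names are hypothetical. It uses only the Python standard library and covers JSON, CSV and XML; RDF is left out because it would need a vocabulary and a third-party library.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# Hypothetical raw data: a tiny CSV extract invented for this example.
RAW_CSV = "id,name,town\n1,Example Primary School,Leeds\n2,Sample High School,York\n"


def load_records(raw_csv):
    """Parse raw CSV text into a list of dicts, one per row."""
    return list(csv.DictReader(io.StringIO(raw_csv)))


def to_json(records):
    """Serialise the records as a JSON array of objects."""
    return json.dumps(records, indent=2)


def to_csv(records):
    """Serialise the records back to CSV (assumes at least one record)."""
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=list(records[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return out.getvalue()


def to_xml(records):
    """Serialise the records as <records><record>...</record></records>."""
    root = ET.Element("records")
    for rec in records:
        item = ET.SubElement(root, "record")
        for key, value in rec.items():
            # Column names are used as element names, so they must be valid XML names.
            ET.SubElement(item, key).text = value
    return ET.tostring(root, encoding="unicode")


# One raw source, several views of it.
SERIALIZERS = {"json": to_json, "csv": to_csv, "xml": to_xml}


def render(raw_csv, fmt):
    """Render the raw CSV in the requested format ('json', 'csv' or 'xml')."""
    return SERIALIZERS[fmt](load_records(raw_csv))
```

Calling `render(RAW_CSV, "json")` or `render(RAW_CSV, "xml")` gives two views of the same rows, and because the raw file is still there, anyone can check either view against it.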
This is not a zero-sum game. Just because someone makes edubase available on the Talis platform through a SPARQL interface does not prevent someone else making it available on Amazon S3. The more methods of access there are, the more widely available and therefore useful the data is. The more things we try, the more lessons we learn, the better we get.
One thing is certain, though: the government cannot do all of this itself. They simply don’t have the resources or expertise. If we think something’s important, we can help by doing it. And we can help them, and each other, by sharing both the results of our work (so that others can build on it) and how we got them (so that others can follow the same patterns for other datasets). That, as far as I’m concerned, is what hmg.gov.uk/data is for.
Whatever our technology preferences, we can help each other by sharing our results whenever we:
The hmg.gov.uk/data site has a certain bias towards Linked Data, it’s true, and this should come as no surprise given its advisors. But whichever side of that particular argument we’re on, we’re shooting ourselves in the feet if we assert that this is an exclusive choice.
Comments
Re: hmg.gov.uk/data and What We Can Do
Agreed that it’s not a zero-sum game when considering everyone - but within government it certainly is. They have a limited budget / man-hours / number of meetings per week - which means they should stick to doing the things that other people like me can’t. As far as I’m concerned, that’s providing access to the raw data and sorting out the licensing.
So of your “last three” I’d go for downloadable files, since anyone can build the services, APIs etc. on top of such “raw” data (take OpenStreetMap as a shining example). But any effort they put into making RDF / SPARQL etc. is effort taken away from making more data available.