As we encourage linked data adoption within the UK public sector, something we run into again and again is that (unsurprisingly) particular domain areas have pre-existing standard ways of thinking about the data that they care about. There are existing models, often with multiple serialisations, such as in XML and a text-based form, that are supported by existing tool chains.
In contrast, if there is existing RDF in that domain area, it’s usually been designed by people who are more interested in the RDF than in the domain area, and is thus generally more focused on the goals of the typical casual data re-user than on those of the professionals in the area.
As you probably know, I’ve been working quite a lot recently on the UK government’s use of linked data, and in particular on providing guidance for people who want to publish their data as linked data. One of the things that we need to provide guidance about is how to publish linked data that changes over time. I’ve touched on this topic before but things have progressed now to the stage where we have to make some real, practical recommendations.
data.gov.uk was finally launched to the public last week (still in beta, but now a more public beta than the beta that it’s been in for the last few months). It’s a great step forward, and everyone involved should be proud of both the amount of data that’s been made available and the website itself, which (unlike a lot of UK government IT) was developed rapidly by a small team based on open source software (and at low cost).
This is a first step on a long road.
This is the fifth part in this series about creating linked data. I’ve talked previously about analysis and modelling, defining URIs, defining concept schemes and defining a vocabulary. In this instalment I’ll talk about the finishing touches that can make linked data easier to browse, query, locate and trust.
Note that we don’t have to do any of these things; they’re not part of the core data. We shouldn’t beat ourselves up if we don’t have time to do them right now, because we can always add them later, and it might be that we just don’t agree that they should be done. But many of them don’t take a lot of time and can enhance the user’s experience of the data.
This is the fourth instalment in a series about turning an existing dataset into some linked data. I’ve previously talked about analysis and modelling, defining URIs and defining concept schemes. In this instalment, we’ll look at developing a schema in which we define the classes, properties and datatypes that we want to use in the RDF that describes the things in our dataset.
This is the third instalment in a series that I’m writing about turning data into linked data. I’m using traffic count data as the example, since that’s a dataset that I’m currently working on. In the last two instalments, I talked about analysing and modelling the data and about designing URIs for the things in that model.
Within the model, there are three sets of things that are concepts:
This is the second instalment in a series of posts about how to create linked data from existing data sets, using traffic count data as an example. In the last instalment, I talked about analysing and modelling data. This instalment discusses the creation of URIs for the various things that have been identified within the model.
This part of the process is the same as what you’d do if you were simply creating a RESTful API to a website. The principle is that everything has a URI, and if you resolve that URI you get information about the thing.
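To make the idea concrete, here is a minimal sketch in Python of "everything has a URI, and resolving it gives you information about the thing". The base URI, the `/id/{kind}/{identifier}` pattern, the function names and the in-memory dictionary are all assumptions made for illustration; they are not the URI scheme actually used for the traffic count data, and a real service would answer HTTP requests rather than look things up in a dictionary.

```python
from urllib.parse import quote

# Hypothetical base URI and path pattern, chosen only for illustration.
BASE = "http://example.org"

# Stand-in for a data store: maps each URI to a description of the thing.
DESCRIPTIONS = {}


def mint_uri(kind, identifier):
    """Build a URI for a thing from its kind and its local identifier."""
    # Percent-encode both segments so odd characters cannot break the path.
    return f"{BASE}/id/{quote(kind, safe='')}/{quote(identifier, safe='')}"


def publish(kind, identifier, description):
    """Give the thing a URI and record what we know about it."""
    uri = mint_uri(kind, identifier)
    DESCRIPTIONS[uri] = description
    return uri


def resolve(uri):
    """Return the information held about the thing, or None if unknown."""
    return DESCRIPTIONS.get(uri)


# Hypothetical example: a single traffic count point.
uri = publish("count-point", "12345", {"label": "Count point 12345"})
print(uri)
print(resolve(uri))
```

The point of the sketch is only that the same function mints the URI every time, so the identifier for a thing is predictable, and that anything with a URI has something to return when that URI is resolved.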
In the Linked Data world, we talk a lot about having URIs that are identifiers for things, and making them HTTP URIs so that they can be dereferenced and people can find more information about those things.
This raises the question of “what information should you publish?” Let’s make this concrete by using a real example: UK Legislation, which TSO is publishing for OPSI as Linked Data.
Update 2009-11-08: The developers of the Provenance Vocabulary tell me that the pattern I used below isn’t correct, and there doesn’t currently seem to be a method of describing what I want to describe using that vocabulary. But it’s still under development, so hopefully it will become usable soon.
One of my favourite tweets from Rob McKinnon (aka @delineator) is this one:

because it’s one of the things that bugs me on occasion too, and because the issues he mentions are so vitally important when we’re talking about public sector information but (because they’re the hard issues) are easy to de-prioritise in the rush to make data available.
This week, the Cabinet Office went live with a preview version of hmg.gov.uk/data, available only to those who subscribe to the UK Government Data Developers Google Group. Harry Metcalfe has written a great review, or of course you can check it out yourselves.
Already, though, there are discussions starting on the mailing list about how the data is being made available, and I’m worried that these might distract us from getting things done.