This is the third instalment in a series that I’m writing about turning data into linked data. I’m using traffic count data as the example, since that’s a dataset that I’m currently working on. In the last two instalments, I talked about analysing and modelling the data and about designing URIs for the things in that model.
Within the model, there are three sets of things that are concepts:
This is the second instalment in a series of posts about how to create linked data from existing data sets, using traffic count data as an example. In the last instalment, I talked about analysing and modelling data. This instalment discusses the creation of URIs for the various things that have been identified within the model.
This part of the process is the same as what you’d do if you were simply creating a RESTful API for a website. The principle is that everything has a URI, and if you resolve that URI you get information about the thing.
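As a rough illustration of that idea, here is a minimal Python sketch in which each thing gets a URI and resolving the URI returns a description of the thing. The base URI, the `/id/count-point/` path pattern, the record fields and the function names are all invented for this example; they are not the URIs actually used for the traffic count data.

```python
from urllib.parse import urlparse

# Hypothetical base URI and records, made up purely for illustration.
BASE = "http://example.org"

COUNT_POINTS = {
    "1001": {"label": "Count point 1001", "road": "A1"},
    "1002": {"label": "Count point 1002", "road": "M25"},
}


def uri_for(point_id: str) -> str:
    """Build the (assumed) URI that identifies a count point."""
    return f"{BASE}/id/count-point/{point_id}"


def resolve(uri: str):
    """Return the description of the thing a URI identifies, or None.

    Only the path is inspected here; a real server would also
    handle the host, content negotiation and redirects.
    """
    parts = urlparse(uri).path.strip("/").split("/")
    if len(parts) == 3 and parts[:2] == ["id", "count-point"]:
        return COUNT_POINTS.get(parts[2])
    return None


if __name__ == "__main__":
    print(resolve(uri_for("1001")))
```

The point of the sketch is only that the identifier and the lookup go together: anything you can name with `uri_for` can be looked up with `resolve`, and a URI that names nothing yields nothing.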
One of the goals of the government’s Data Project is to equip the people who own data with the capability to publish it as linked data. There’s an overwhelming amount of work to do here, from providing tool support to changing a culture that makes it hard to publish data. But we can start by taking some baby steps that simply explain what’s involved in turning existing data into linked data.
I’m currently reworking the traffic count linked data that I first transformed back in September, and I thought it would be helpful to talk through that process for several reasons:
Rather than creating one massive blog post, I’m going to break it down into several steps. These are:
This is the first instalment.
In the Linked Data world, we talk a lot about having URIs that are identifiers for things, and making them HTTP URIs so that they can be dereferenced and people can find more information about those things.
Update 2009-11-08: The developers of the Provenance Vocabulary tell me that the pattern I used below isn’t correct, and there doesn’t currently seem to be a method of describing what I want to describe using that vocabulary. But it’s still under development, so hopefully it will become usable soon.
One of my favourite tweets from Rob McKinnon (aka @delineator) is this one:
because it’s one of the things that bugs me on occasion too, and because the issues he mentions are so vitally important when we’re talking about public sector information but (because they’re the hard issues) are easy to de-prioritise in the rush to make data available.