As part of the ongoing discussion about how to reconcile RDFa and microdata (if at all), Nathan Rixham has put together a suggested Microdata RDFa Merge which brings together parts of microdata and parts of RDFa, creating a completely new set of attributes, but a parsing model that more or less follows microdata’s.
I want here to put forward another possibility to the debate. I should say that this is just some noodling on my part as a way of exploring options, not any kind of official position on behalf of the W3C or the TAG or any other body that you might associate me with, nor even a decided position on my part.
One of the things that’s been niggling at the back of my mind since the schema.org announcement is how small a role search engine results play in the wider data sharing efforts that I’m more familiar with in my work on legislation.gov.uk, and more generally how my day job experience differs from (what seem to be) more common experiences of development on the web. In this post, I’m going to talk about that experience, and about the particular problems that I see with the coexistence of microdata and RDFa as a result.
My previous post talked about how to install 4store as a triplestore, and use the Ruby library RDF.rb in order to process RDF extracted from that store. This was a response to Richard Pope’s Linked Data/RDF/SPARQL Documentation Challenge which asks for documentation of how to install a triplestore, load data into it, retrieve it using SPARQL and access the results as native structures using Ruby, Python or PHP.
I quite enjoyed writing the last one, so I thought I’d try again. As before, I am on Mac OS X, but this time I’m going to use Python, which I have not programmed in before. I like a challenge. You might not like the results!
Updated to include some of Arto Bendiken’s recommendations.
This post is a response to Richard Pope’s Linked Data/RDF/SPARQL Documentation Challenge. In it, he asks for documentation of the following steps:
- Install an RDF store from a package management system on a computer running either Apple’s OSX or Ubuntu Desktop.
- Install a code library (again from a package management system) for talking to the RDF store in either PHP, Ruby or Python.
- Programmatically load some real-world data into the RDF datastore using either PHP, Ruby or Python.
- Programmatically retrieve data from the datastore with SPARQL using either PHP, Ruby or Python.
- Convert retrieved data into an object or datatype that can be used by the chosen programming language (e.g. a Python dictionary).
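To give a feel for what the last step in that list involves, here is a minimal sketch in Python, assuming the store returns results in the standard SPARQL JSON results format. The sample response, its URIs and the `bindings_to_dicts` helper are all made up for illustration; they are not taken from any particular store or library.

```python
import json

# Hypothetical response body in the SPARQL JSON results format.
# The variables, URIs and labels are invented for illustration.
SAMPLE_RESPONSE = """
{
  "head": {"vars": ["item", "label"]},
  "results": {"bindings": [
    {"item": {"type": "uri", "value": "http://example.org/item/1"},
     "label": {"type": "literal", "value": "First item"}},
    {"item": {"type": "uri", "value": "http://example.org/item/2"}}
  ]}
}
"""


def bindings_to_dicts(response_text):
    """Turn a SPARQL JSON result set into a list of plain dictionaries."""
    data = json.loads(response_text)
    variables = data["head"]["vars"]
    rows = []
    for binding in data["results"]["bindings"]:
        # A variable with no value is simply absent from the binding,
        # so fall back to None rather than raising a KeyError.
        rows.append({var: binding.get(var, {}).get("value") for var in variables})
    return rows


rows = bindings_to_dicts(SAMPLE_RESPONSE)
for row in rows:
    print(row)
```

This throws away the type, datatype and language information attached to each value, which is fine for a quick look at the data but probably not what you would want in anything serious.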
I’ve been told so many times how RDF sucks for mainstream developers that it was the main point of my TPAC talk late last year. I think that this is a great motivating challenge for improving not only the documentation of how to use RDF stores and libraries but also their general installability and usability for developers.
Anyway, I thought I’d try to get as far as I could to see just how bad things really are. I am on Mac OS X, and I’m going to use Ruby (although I don’t really know it all that well, so please forgive my mistakes). I’ll breeze on through as if everything is hunky dory, but there are some caveats at the end.
I got a little bit of pushback on my previous blog post for suggesting that W3C should standardise an API for RDF. (I’m talking here about a programming-interface-kind-of-API to enable developers to extract information out of an RDF document rather than a website-API to enable them to access RDF data in the first place.)
I just wanted to talk about a couple of actual real-life scenarios that make me want a standard RDF API:
A couple of weeks ago I did a talk at the TPAC Plenary Day about why RDF hasn’t had the uptake that it might and what could be done about it.
I felt quite uncomfortable about doing this for many reasons. The predominant one is that I’m well aware that the world is made by the people who turn up. It is far far easier to snipe from the sidelines than it is to put in the effort to attend telcons and face-to-face meetings, to engage on mailing lists, to write specifications and implementations and tutorials.
On the other hand, what I hope is that the perspective of someone who is outside that process, someone who tries to understand and interpret and use the results of that process, might be valuable. And so I aimed to provide that honestly.
In that spirit, I’m going to put my stake in the ground and say that there are three areas where I think W3C should be concentrating its efforts:
and that it should specifically not put its efforts into standardising another syntax for RDF based on JSON.
As we encourage linked data adoption within the UK public sector, something we run into again and again is that (unsurprisingly) particular domain areas have pre-existing standard ways of thinking about the data that they care about. There are existing models, often with multiple serialisations, such as in XML and a text-based form, that are supported by existing tool chains.
In contrast, if there is existing RDF in that domain area, it’s usually been designed by people who are more interested in the RDF than in the domain area, and is thus generally more focused on the goals of the typical casual data re-user rather than the professionals in the area.
This is the fifth part in this series about creating linked data. I’ve talked previously about analysis and modelling, defining URIs, defining concept schemes and defining a vocabulary. In this instalment I’ll talk about the finishing touches that can make linked data easier to browse, query, locate and trust.
Note that we don’t have to do any of these things; they’re not part of the core data. We shouldn’t beat ourselves up if we don’t have time to do them right now, because we can always add them later, and it might be that you just don’t agree that they should be done. But many of them don’t take a lot of time and can enhance the user’s experience of the data.
This is the fourth instalment in a series about turning an existing dataset into some linked data. I’ve previously talked about analysis and modelling, defining URIs and defining concept schemes. In this instalment, we’ll look at developing a schema in which we define the classes, properties and datatypes that we want to use in the RDF that describes the things in our dataset.
This is the third instalment in a series that I’m writing about turning data into linked data. I’m using traffic count data as the example, since that’s a dataset that I’m currently working on. In the last two instalments, I talked about analysing and modelling the data and about designing URIs for the things in that model.
Within the model, there are three sets of things that are concepts: