This post was imported from my old Drupal blog. To see the full thing, including comments, it's best to visit the Internet Archive.
During Guardian Hack Day 2, Leigh ported the Guardian’s MP’s Expenses data into Talis. Most wonderfully, this gives a SPARQL endpoint that can be used to query the data. I thought I’d try to use the same approach as I blogged about recently, using a SPARQL query as a Data Source for a Google Visualisation of the MP’s expenses data.
To cut to the chase, here’s a screenshot of the result (follow the link for the more interactive version):
I created this visualisation with the same general approach as I explained last time.
First, I’ve been working on the visualisation utils.php, which is a reasonably simple PHP script that exposes a SPARQL endpoint as a Google Visualisation Data Source. Requests to a Data Source use a special query language to indicate the information that should be included, how it should be sorted, how many rows of data there should be, and so on.
Previously, utils.php only understood the select portion of the tq parameter, which contains this query; I’ve expanded it to understand (somewhat limited versions of) the select, where, order by, limit and offset parts of the query, which of course have equivalents in SPARQL. Since these parts of the Google Visualisation query language are pretty close to SPARQL, this is actually just a bunch of string munging, which isn’t particularly interesting, so just grab hold of it if you want to use it.
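To give a feel for the kind of string munging involved, here is a minimal sketch in JavaScript. It is not the code in utils.php (which is PHP): the function name toSparql and its handling of clauses are assumptions made for illustration, and it leaves out the where clause entirely.

```javascript
// Hypothetical sketch of the "string munging" idea; not the code in utils.php.
// Handles only: select a, b | order by a [desc] | limit n | offset n.
function toSparql(tq, pattern) {
  // Pull each clause out of the Google Visualisation query, if present.
  const select = /\bselect\s+(.+?)(?=\s+(?:where|order\s+by|limit|offset)\b|$)/i.exec(tq);
  const orderBy = /\border\s+by\s+(.+?)(?=\s+(?:limit|offset)\b|$)/i.exec(tq);
  const limit = /\blimit\s+(\d+)/i.exec(tq);
  const offset = /\boffset\s+(\d+)/i.exec(tq);

  // Column names become SPARQL variables: "lat, long" -> "?lat ?long".
  const columns = select && select[1].trim() !== '*'
    ? select[1].split(',').map((name) => '?' + name.trim()).join(' ')
    : '*';

  const lines = ['SELECT ' + columns, 'WHERE {', pattern, '}'];

  if (orderBy) {
    // "majority desc" -> "DESC(?majority)"; ascending is the SPARQL default.
    const keys = orderBy[1].split(',').map((key) => {
      const [name, direction] = key.trim().split(/\s+/);
      return /^desc$/i.test(direction || '') ? 'DESC(?' + name + ')' : '?' + name;
    });
    lines.push('ORDER BY ' + keys.join(' '));
  }
  if (limit) lines.push('LIMIT ' + limit[1]);
  if (offset) lines.push('OFFSET ' + offset[1]);

  return lines.join('\n');
}

console.log(toSparql(
  'select lat, long, totalTravel, mp order by majority limit 100',
  '  # graph pattern supplied by the Data Source script goes here'
));
```

The second argument stands in for the graph pattern that a Data Source script supplies, so the translation only has to deal with the variable list, ordering and paging.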
Second, I created a PHP script (mp-travel.php) specifically for the MPs’ expenses data that pulls out the parts that I’m interested in and exposes them as variables which can be used in the query language. This is what the file looks like:
<?php
include "utils.php";
proxy('?rMP a <http://guardian.dataincubator.org/ns/MemberOfParliament> .
?rMP <http://xmlns.com/foaf/0.1/name> ?mp .
?rMP <http://guardian.dataincubator.org/ns/mp-expenses/majority> ?majority .
?rMP <http://dbpedia.org/property/constituency> ?rConstituency .
?rConstituency rdfs:label ?constituency .
?rConstituency <http://www.w3.org/2003/01/geo/wgs84_pos#lat> ?lat .
?rConstituency <http://www.w3.org/2003/01/geo/wgs84_pos#long> ?long .
?rMP <http://guardian.dataincubator.org/ns/mp-expenses/total-travel> ?totalTravel .',
'desc(?totalTravel)',
'guardian');
?>
The second argument to the proxy() function is the default ordering (desc(?totalTravel)) and the third is the name of the Talis data store that’s being used (guardian).
The first argument is a query which determines the variables that are exposed by the Data Source. This Data Source exposes the variables:
- mp: the name of the MP
- majority: the majority that they have in their constituency
- constituency: the name of the constituency
- lat, long: the latitude and longitude of the constituency (presumably the centre of it)
- totalTravel: the total amount claimed for travel by the MP
- rMP: the URI used to identify the MP
- rConstituency: the URI used to identify the constituency
Third, I created an HTML document that used the Google Visualisation API to create the map visualisation that I’ve shown above. The really important lines are:
var query = new google.visualization.Query('http://www.jenitennison.com/visualisation/data/mp-travel');
query.setQuery('select lat, long, totalTravel, mp order by majority limit 100');
The first line shows the URL for the Data Source, which is essentially a pointer to the mp-travel.php script. The second line shows the query that’s sent to the Data Source: “select lat, long, totalTravel, mp order by majority limit 100”.
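As a rough illustration of what those two lines amount to on the wire, the sketch below builds a request URL with the query text in the tq parameter. The helper name dataSourceUrl is made up for this example, and the real Google Visualisation library adds further parameters of its own, so treat this as a simplification rather than the exact request.

```javascript
// Sketch: the Data Source request that the query roughly boils down to.
// dataSourceUrl is a hypothetical helper; the query travels in the tq parameter.
function dataSourceUrl(dataSource, query) {
  // Append to an existing query string if the Data Source URL already has one.
  const separator = dataSource.includes('?') ? '&' : '?';
  return dataSource + separator + 'tq=' + encodeURIComponent(query);
}

const url = dataSourceUrl(
  'http://www.jenitennison.com/visualisation/data/mp-travel',
  'select lat, long, totalTravel, mp order by majority limit 100'
);
console.log(url);
```

Building the URL by hand like this can be handy for checking what a Data Source returns before wiring it into a visualisation.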
Put together, when you load http://www.jenitennison.com/visualisation/mp-travel.html, you create a Google Visualisation GeoMap which uses as its data the result of the SPARQL query
SELECT ?lat ?long ?totalTravel ?mp
WHERE {
?rMP a <http://guardian.dataincubator.org/ns/MemberOfParliament> .
?rMP <http://xmlns.com/foaf/0.1/name> ?mp .
?rMP <http://guardian.dataincubator.org/ns/mp-expenses/majority> ?majority .
?rMP <http://dbpedia.org/property/constituency> ?rConstituency .
?rConstituency rdfs:label ?constituency .
?rConstituency <http://www.w3.org/2003/01/geo/wgs84_pos#lat> ?lat .
?rConstituency <http://www.w3.org/2003/01/geo/wgs84_pos#long> ?long .
?rMP <http://guardian.dataincubator.org/ns/mp-expenses/total-travel> ?totalTravel .
}
ORDER BY ?majority
LIMIT 100
on the SPARQL endpoint at http://api.talis.com/stores/guardian/services/sparql.
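For completeness, here is a sketch of how a store name and a SPARQL query might be turned into a request against that endpoint. It assumes the store name simply slots into the URL shown above and that the query is passed in the standard SPARQL protocol query parameter; the helper name sparqlRequestUrl is hypothetical, and this is not necessarily how utils.php makes the request.

```javascript
// Sketch under assumptions: the store name slots into the endpoint URL pattern,
// and the query goes in the standard SPARQL protocol "query" parameter.
function sparqlRequestUrl(store, sparql) {
  const endpoint = 'http://api.talis.com/stores/' +
    encodeURIComponent(store) + '/services/sparql';
  return endpoint + '?query=' + encodeURIComponent(sparql);
}

const requestUrl = sparqlRequestUrl(
  'guardian',
  'SELECT ?mp WHERE { ?rMP <http://xmlns.com/foaf/0.1/name> ?mp . } LIMIT 5'
);
console.log(requestUrl);
```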
Here’s hoping you can reuse the Data Source or the code that was used to make it. Let me know if you do!