<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xml:base="http://www.jenitennison.com/blog" xmlns:dc="http://purl.org/dc/elements/1.1/">
<channel>
 <title>Talis</title>
 <link>http://www.jenitennison.com/blog/taxonomy/term/47</link>
 <description>The taxonomy view with a depth of 0.</description>
 <language>en</language>
<item>
 <title>SPARQL &amp; Visualisation Frustrations: Linked Data</title>
 <link>http://www.jenitennison.com/blog/node/121</link>
 <description>&lt;p&gt;I&amp;#8217;ll start with the problem. To create the graphs I showed in &lt;a href=&quot;http://www.jenitennison.com/blog/node/120&quot;&gt;my last post&lt;/a&gt;, I wanted to split MPs into groups based on their party affiliation. Ideally, I wanted the Google Visualisation query to look like:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;select mp, additionalCosts, totalTravel, totalBasic 
where party = &#039;Conservative&#039; 
order by totalClaim desc 
limit 25
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;because this is reasonably easy to understand and for a developer to create without having to know any magic URIs.&lt;/p&gt;

&lt;p&gt;The party affiliation for an MP is given in the RDF supplied within the &lt;a href=&quot;http://guardian.dataincubator.org/&quot;&gt;Talis store&lt;/a&gt; as a pointer to one of the resources:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;http://dbpedia.org/resource/Labour_Party_(UK)&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;http://dbpedia.org/resource/Conservative_Party_(UK)&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;http://dbpedia.org/resource/Liberal_Democrats&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Now, if you visit &lt;a href=&quot;http://dbpedia.org/resource/Conservative_Party_(UK)&quot;&gt;http://dbpedia.org/resource/Conservative_Party_(UK)&lt;/a&gt; then you&amp;#8217;ll see precious few properties and none of them give you access to the string &amp;#8216;Conservative&amp;#8217;. If you look at &lt;a href=&quot;http://dbpedia.org/resource/Liberal_Democrats&quot;&gt;http://dbpedia.org/resource/Liberal_Democrats&lt;/a&gt;, you&amp;#8217;ll see plenty of properties, one of which is &lt;code&gt;dbpprop:partyName&lt;/code&gt;. But trying to query on &lt;code&gt;dbpprop:partyName&lt;/code&gt; within the Talis data store gives me nothing, because that information hasn&amp;#8217;t been imported into the particular store that this SPARQL query is running on.&lt;/p&gt;

&lt;!--break--&gt;

&lt;p&gt;What I did in &lt;code&gt;utils.php&lt;/code&gt; was extend the parsing of the &lt;code&gt;tq&lt;/code&gt; parameter, which is supposed to be in the Google Visualisation query language, to understand &lt;code&gt;&amp;lt;URI&amp;gt;&lt;/code&gt; as a reference to a resource. In other words, you can create a query like:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;select mp, additionalCosts, totalTravel, totalBasic 
where rParty = &amp;lt;http://dbpedia.org/resource/Conservative_Party_(UK)&amp;gt; 
order by totalClaim desc 
limit 25
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;and this will be mapped to a SPARQL query that looks like:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;SELECT ?mp ?additionalCosts ?totalTravel ?totalBasic 
WHERE {
  ...
  FILTER (?rParty = &amp;lt;http://dbpedia.org/resource/Conservative_Party_(UK)&amp;gt;)
}
ORDER BY desc(?totalClaim)
LIMIT 25
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;I don&amp;#8217;t like having done this, because I don&amp;#8217;t want Data Sources that happen to be SPARQL queries to look any different from other Data Sources. Introducing a new syntax for URI literals isn&amp;#8217;t really on.&lt;/p&gt;

&lt;p&gt;The superficial fix is to &lt;strong&gt;always provide basic labelling information for the resources referenced within a triplestore&lt;/strong&gt;. In this case, Leigh actually did include an &lt;code&gt;rdfs:label&lt;/code&gt; property for each of the party URIs within the Guardian store, so it was possible to use the query I wanted to use after all (though it took some experimentation to find this out).&lt;/p&gt;

&lt;p&gt;But underlying this is a bigger issue. Much is made of linked data &amp;#8212; that you can find out more about a particular thing by resolving the link to that thing &amp;#8212; but the best illustrations of the power and benefits of the semantic web tend to revolve around analysis and visualisations of moderately large amounts of data using SPARQL. And SPARQL (as yet) only runs on individual triplestores, which do not contain the entire semantic web. Every SPARQL query is limited by what has been loaded into the particular triplestore that is queried.&lt;/p&gt;

&lt;p&gt;Now, one of the &amp;#8220;time-permitting&amp;#8221; requirements for SPARQL 1.1 is &lt;a href=&quot;http://www.w3.org/TR/sparql-features/#Basic_federated_query&quot;&gt;Federated Queries&lt;/a&gt;:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Federated query is the ability to take a query and provide solutions based on information from many different sources. It is a hard problem in its most general form and is the subject of continuing (and continuous) research. A building block is the ability to have one query be able to issue a query on another SPARQL endpoint during query execution.&lt;/p&gt;
  
  &lt;p&gt;Time-permitting, the SPARQL Working Group will define the syntax and semantics for handling a basic class of federated queries in which the SPARQL endpoints to use in executing portions of the query are explicitly given by the query author.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That&amp;#8217;s certainly &amp;#8220;a building block&amp;#8221;, but it can&amp;#8217;t be the only method. For many data publishers, it&amp;#8217;s going to be far simpler to publish their data as linked data in RDF/XML than it is to provide a SPARQL endpoint for that data. We can ask organisations like &lt;a href=&quot;http://www.talis.com/platform&quot;&gt;Talis&lt;/a&gt; to crawl our data and provide a SPARQL endpoint for it, and hope that the SPARQL Working Group have time to address federated queries, but really we need tools that make it easy to aggregate, analyse and visualise linked data directly rather than through a triplestore silo.&lt;/p&gt;

&lt;p&gt;So how about it?&lt;/p&gt;
</description>
 <comments>http://www.jenitennison.com/blog/node/121#comments</comments>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/46">linked data</category>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/31">rdf</category>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/51">sparql</category>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/47">Talis</category>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/49">visualisation</category>
 <pubDate>Mon, 03 Aug 2009 20:36:34 +0000</pubDate>
 <dc:creator>Jeni</dc:creator>
 <guid isPermaLink="false">121 at http://www.jenitennison.com/blog</guid>
</item>
<item>
 <title>Map Visualisation of MPs Travel Expenses</title>
 <link>http://www.jenitennison.com/blog/node/119</link>
 <description>&lt;p&gt;During &lt;a href=&quot;http://www.guardian.co.uk/media/pda/2009/jul/31/hacking-opensource1&quot;&gt;Guardian Hack Day 2&lt;/a&gt;, &lt;a href=&quot;http://www.ldodds.com/&quot;&gt;Leigh&lt;/a&gt; ported the &lt;a href=&quot;http://mps-expenses.guardian.co.uk/&quot;&gt;Guardian&amp;#8217;s MP&amp;#8217;s Expenses data&lt;/a&gt; into &lt;a href=&quot;http://guardian.dataincubator.org/&quot;&gt;Talis&lt;/a&gt;. Most wonderfully, this gives a &lt;a href=&quot;http://api.talis.com/stores/guardian/services/sparql&quot;&gt;SPARQL endpoint&lt;/a&gt; that can be used to query the data. I thought I&amp;#8217;d try to use the same approach as I &lt;a href=&quot;http://www.jenitennison.com/blog/node/113&quot;&gt;blogged about recently&lt;/a&gt;, using a SPARQL query as a &lt;a href=&quot;http://code.google.com/apis/visualization/documentation/dev/implementing_data_source.html&quot;&gt;Data Source&lt;/a&gt; for a &lt;a href=&quot;http://code.google.com/apis/visualization/documentation/gallery.html&quot;&gt;Google Visualisation&lt;/a&gt; of the MP&amp;#8217;s expenses data.&lt;/p&gt;

&lt;p&gt;To cut to the chase, here&amp;#8217;s a screenshot of &lt;a href=&quot;http://www.jenitennison.com/visualisation/mp-travel.html&quot;&gt;the result&lt;/a&gt; (follow the link for the more interactive version):&lt;/p&gt;

&lt;p&gt;&lt;img alt=&quot;Map of travel expenses for the 100 MPs with the lowest majorities&quot; src=&quot;/blog/files/mp-travel.jpg&quot; width=&quot;100%&quot; /&gt;&lt;/p&gt;

&lt;!--break--&gt;

&lt;p&gt;I created this visualisation with the same general approach as I &lt;a href=&quot;http://www.jenitennison.com/blog/node/113&quot;&gt;explained last time&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;First, I&amp;#8217;ve been working on the visualisation &lt;code&gt;utils.php&lt;/code&gt;, which is a reasonably simple PHP script that exposes a SPARQL endpoint as a Google Visualisation Data Source. Requests to a Data Source use a special &lt;a href=&quot;http://code.google.com/apis/visualization/documentation/querylanguage.html&quot;&gt;query language&lt;/a&gt; to indicate the information that should be included, how it should be sorted, how many rows of data there should be, and so on.&lt;/p&gt;

&lt;p&gt;Previously, &lt;a href=&quot;/blog/files/utils.php_2.txt&quot;&gt;&lt;code&gt;utils.php&lt;/code&gt;&lt;/a&gt; only understood the &lt;code&gt;select&lt;/code&gt; portion of the &lt;code&gt;tq&lt;/code&gt; parameter which contains this query; I&amp;#8217;ve expanded it to understand (somewhat limited versions of) the &lt;code&gt;select&lt;/code&gt;, &lt;code&gt;where&lt;/code&gt;, &lt;code&gt;order by&lt;/code&gt;, &lt;code&gt;limit&lt;/code&gt; and &lt;code&gt;offset&lt;/code&gt; parts of the query, which of course have equivalents in &lt;a href=&quot;http://www.w3.org/TR/rdf-sparql-query/&quot;&gt;SPARQL&lt;/a&gt;. Since these parts of the Google Visualisation query language are pretty close to SPARQL, this is actually just a bunch of string munging, which isn&amp;#8217;t particularly interesting, so just &lt;a href=&quot;/blog/files/utils.php_2.txt&quot;&gt;grab hold of it&lt;/a&gt; if you want to use it.&lt;/p&gt;
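
&lt;p&gt;To give a flavour of that string munging, here is a minimal JavaScript sketch of the translation. It is not the code in &lt;code&gt;utils.php&lt;/code&gt;: the function name &lt;code&gt;toSparql&lt;/code&gt; and its arguments are hypothetical, and it assumes a single &lt;code&gt;name = &#039;string&#039;&lt;/code&gt; comparison in the &lt;code&gt;where&lt;/code&gt; clause and a single &lt;code&gt;order by&lt;/code&gt; column.&lt;/p&gt;

```javascript
// Hypothetical sketch: turn a small subset of the Google Visualisation
// query language into a SPARQL query around a fixed graph pattern.
function toSparql(tq, pattern) {
  // One case-insensitive expression splits the query into its clauses.
  var clauses = /^select\s+(.+?)(?:\s+where\s+(.+?))?(?:\s+order\s+by\s+(.+?))?(?:\s+limit\s+(\d+))?(?:\s+offset\s+(\d+))?\s*$/i.exec(tq.trim());
  if (clauses === null) {
    throw new Error('Unsupported query: ' + tq);
  }
  // "select a, b" becomes "SELECT ?a ?b".
  var variables = clauses[1].split(',').map(function (name) {
    return '?' + name.trim();
  });
  var lines = ['SELECT ' + variables.join(' '), 'WHERE {', '  ' + pattern];
  if (clauses[2] !== undefined) {
    // Assumption: only the form name = 'string literal' is supported.
    var condition = /^(\w+)\s*=\s*'([^']*)'$/.exec(clauses[2].trim());
    if (condition === null) {
      throw new Error('Unsupported where clause: ' + clauses[2]);
    }
    lines.push('  FILTER (?' + condition[1] + ' = "' + condition[2] + '")');
  }
  lines.push('}');
  if (clauses[3] !== undefined) {
    // "x desc" becomes "desc(?x)"; ascending order needs no wrapper.
    var order = /^(\w+)(?:\s+(asc|desc))?$/i.exec(clauses[3].trim());
    if (order === null) {
      throw new Error('Unsupported order by clause: ' + clauses[3]);
    }
    var descending = (order[2] || 'asc').toLowerCase() === 'desc';
    lines.push('ORDER BY ' + (descending ? 'desc(?' + order[1] + ')' : '?' + order[1]));
  }
  if (clauses[4] !== undefined) {
    lines.push('LIMIT ' + clauses[4]);
  }
  if (clauses[5] !== undefined) {
    lines.push('OFFSET ' + clauses[5]);
  }
  return lines.join('\n');
}

// Example call; the prefixed names in the pattern are placeholders and
// their PREFIX declarations are left out of this sketch.
console.log(toSparql('select mp, majority order by majority limit 100',
                     '?rMP foaf:name ?mp . ?rMP ex:majority ?majority .'));
```

&lt;p&gt;The graph pattern is passed in untouched, so the script that owns the pattern decides which variables exist and the query language only selects, filters and sorts them.&lt;/p&gt;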

&lt;p&gt;Second, I created a PHP script (&lt;a href=&quot;/blog/files/mp-travel.php.txt&quot;&gt;&lt;code&gt;mp-travel.php&lt;/code&gt;&lt;/a&gt;) specifically for the MPs&amp;#8217; expenses data that pulls out the parts that I&amp;#8217;m interested in and exposes them as variables which can be used in the query language. This is what the file looks like:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;?php
  include &quot;utils.php&quot;;
  proxy(&#039;?rMP a &amp;lt;http://guardian.dataincubator.org/ns/MemberOfParliament&amp;gt; .
         ?rMP &amp;lt;http://xmlns.com/foaf/0.1/name&amp;gt; ?mp .
         ?rMP &amp;lt;http://guardian.dataincubator.org/ns/mp-expenses/majority&amp;gt; ?majority .
         ?rMP &amp;lt;http://dbpedia.org/property/constituency&amp;gt; ?rConstituency .
         ?rConstituency rdfs:label ?constituency .
         ?rConstituency &amp;lt;http://www.w3.org/2003/01/geo/wgs84_pos#lat&amp;gt; ?lat .
         ?rConstituency &amp;lt;http://www.w3.org/2003/01/geo/wgs84_pos#long&amp;gt; ?long .
         ?rMP &amp;lt;http://guardian.dataincubator.org/ns/mp-expenses/total-travel&amp;gt; ?totalTravel .&#039;,
        &#039;desc(?totalTravel)&#039;, 
        &#039;guardian&#039;);
?&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The second argument to the &lt;code&gt;proxy()&lt;/code&gt; function is the default ordering (&lt;code&gt;desc(?totalTravel)&lt;/code&gt;) and the third is the name of the Talis data store that&amp;#8217;s being used (&lt;code&gt;guardian&lt;/code&gt;).&lt;/p&gt;

&lt;p&gt;The first argument is a query which determines the variables that are exposed by the Data Source. This Data Source exposes the variables:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;mp&lt;/code&gt;: the name of the MP&lt;/li&gt;
&lt;li&gt;&lt;code&gt;majority&lt;/code&gt;: the majority that they have in their constituency&lt;/li&gt;
&lt;li&gt;&lt;code&gt;constituency&lt;/code&gt;: the name of the constituency&lt;/li&gt;
&lt;li&gt;&lt;code&gt;lat&lt;/code&gt;, &lt;code&gt;long&lt;/code&gt;: the latitude and longitude of the constituency (presumably the centre of it)&lt;/li&gt;
&lt;li&gt;&lt;code&gt;totalTravel&lt;/code&gt;: the total amount claimed for travel by the MP&lt;/li&gt;
&lt;li&gt;&lt;code&gt;rMP&lt;/code&gt;: the URI used to identify the MP&lt;/li&gt;
&lt;li&gt;&lt;code&gt;rConstituency&lt;/code&gt;: the URI used to identify the constituency&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Third, I created an &lt;a href=&quot;/blog/files/mp-travel.html&quot;&gt;HTML document&lt;/a&gt; that used the Google Visualisation API to create the map visualisation that I&amp;#8217;ve shown above. The really important lines are:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;var query = new google.visualization.Query(&#039;http://www.jenitennison.com/visualisation/data/mp-travel&#039;);
query.setQuery(&#039;select lat, long, totalTravel, mp order by majority limit 100&#039;);
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The first line shows the URL for the Data Source, which is essentially a pointer to the &lt;code&gt;mp-travel.php&lt;/code&gt; script. The second line shows the query that&amp;#8217;s sent to the Data Source: &amp;#8220;&lt;code&gt;select lat, long, totalTravel, mp order by majority limit 100&lt;/code&gt;&amp;#8221;.&lt;/p&gt;
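
&lt;p&gt;As a rough illustration of what travels over the wire, the query ends up URL-encoded in the &lt;code&gt;tq&lt;/code&gt; parameter of a request to the Data Source. The helper below is a hypothetical sketch, not part of the Google library, and the real client may add further parameters of its own.&lt;/p&gt;

```javascript
// Hypothetical helper: build the URL a client would request from a Data Source,
// with the query carried in the tq parameter.
function dataSourceRequestUrl(dataSourceUrl, query) {
  return dataSourceUrl + '?tq=' + encodeURIComponent(query);
}

console.log(dataSourceRequestUrl(
  'http://www.jenitennison.com/visualisation/data/mp-travel',
  'select lat, long, totalTravel, mp order by majority limit 100'));
```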

&lt;p&gt;Put together, when you load &lt;a href=&quot;http://www.jenitennison.com/visualisation/mp-travel.html&quot;&gt;http://www.jenitennison.com/visualisation/mp-travel.html&lt;/a&gt;, you create a &lt;a href=&quot;http://code.google.com/apis/visualization/documentation/gallery/geomap.html&quot;&gt;Google Visualisation GeoMap&lt;/a&gt; which uses as its data the result of the SPARQL query&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;PREFIX rdfs: &amp;lt;http://www.w3.org/2000/01/rdf-schema#&amp;gt;
SELECT ?lat ?long ?totalTravel ?mp
WHERE {
  ?rMP a &amp;lt;http://guardian.dataincubator.org/ns/MemberOfParliament&amp;gt; .
  ?rMP &amp;lt;http://xmlns.com/foaf/0.1/name&amp;gt; ?mp .
  ?rMP &amp;lt;http://guardian.dataincubator.org/ns/mp-expenses/majority&amp;gt; ?majority .
  ?rMP &amp;lt;http://dbpedia.org/property/constituency&amp;gt; ?rConstituency .
  ?rConstituency rdfs:label ?constituency .
  ?rConstituency &amp;lt;http://www.w3.org/2003/01/geo/wgs84_pos#lat&amp;gt; ?lat .
  ?rConstituency &amp;lt;http://www.w3.org/2003/01/geo/wgs84_pos#long&amp;gt; ?long .
  ?rMP &amp;lt;http://guardian.dataincubator.org/ns/mp-expenses/total-travel&amp;gt; ?totalTravel .
}
ORDER BY ?majority
LIMIT 100
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;on the SPARQL endpoint at &lt;a href=&quot;http://api.talis.com/stores/guardian/services/sparql&quot;&gt;http://api.talis.com/stores/guardian/services/sparql&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Here&amp;#8217;s hoping you can reuse the Data Source or the code that was used to make it. Let me know if you do!&lt;/p&gt;
</description>
 <comments>http://www.jenitennison.com/blog/node/119#comments</comments>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/31">rdf</category>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/47">Talis</category>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/49">visualisation</category>
 <enclosure url="http://www.jenitennison.com/blog/files/utils.php_2.txt" length="4151" type="text/plain" />
 <pubDate>Fri, 31 Jul 2009 22:33:13 +0000</pubDate>
 <dc:creator>Jeni</dc:creator>
 <guid isPermaLink="false">119 at http://www.jenitennison.com/blog</guid>
</item>
<item>
 <title>The Real Deal: data.gov.uk</title>
 <link>http://www.jenitennison.com/blog/node/115</link>
 <description>&lt;p&gt;I&amp;#8217;m sure that you&amp;#8217;ve noticed that my recent posts have been somewhat obsessed with publishing and using public sector information. It&amp;#8217;s because I&amp;#8217;ve somehow been sucked into the work going on within the UK government, &lt;a href=&quot;http://blogs.cabinetoffice.gov.uk/digitalengagement/post/2009/06/09/Data-So-what-happens-now.aspx&quot;&gt;with Tim Berners-Lee and Nigel Shadbolt advising&lt;/a&gt;, to publish its data as linked data.&lt;/p&gt;

&lt;p&gt;My &lt;a href=&quot;http://www.jenitennison.com/blog/node/109&quot;&gt;recent&lt;/a&gt; &lt;a href=&quot;http://www.jenitennison.com/blog/node/110&quot;&gt;blog&lt;/a&gt; &lt;a href=&quot;http://www.jenitennison.com/blog/node/111&quot;&gt;posts&lt;/a&gt; about publishing data using &lt;a href=&quot;http://www.talis.com/platform/&quot;&gt;Talis&lt;/a&gt; have actually been a front for much more complex work that I&amp;#8217;ve been doing with a different data set.&lt;/p&gt;

&lt;!--break--&gt;

&lt;p&gt;As an early demonstration of how existing government data sets might be turned into linked data, a few weeks ago I was given a CSV file containing road traffic counts; the raw data that lies behind the &lt;a href=&quot;http://www.dft.gov.uk/matrix/&quot;&gt;traffic flow information&lt;/a&gt; available on the Department for Transport website. The data is really interesting and ripe for visualisations and analysis. For each hour of particular days each year, at particular points on many roads within the UK, the Department for Transport measures the number of bicycles, motorbikes, cars, vans, buses and HGVs of various types that roll past in each direction. The data contains information about:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the count of each of the various classes of traffic that pass the point in a particular direction on a particular hour of a particular day&lt;/li&gt;
&lt;li&gt;the points at which these measurements were taken&lt;/li&gt;
&lt;li&gt;the roads on which the points are situated&lt;/li&gt;
&lt;li&gt;the areas in which the points are situated&lt;/li&gt;
&lt;li&gt;the local authority that is in charge of these areas&lt;/li&gt;
&lt;li&gt;the region that the area is in&lt;/li&gt;
&lt;li&gt;the country that the region is in&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The challenge was to turn the 386MB CSV file into linked data. The result is up and available for you to look at; a good starting point is &lt;a href=&quot;http://geo.data.gov.uk/0/country&quot;&gt;http://geo.data.gov.uk/0/country&lt;/a&gt;. Just follow the links from there.&lt;/p&gt;

&lt;p&gt;With a few false starts and mis-steps, this is the process that I went through:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Tidied the CSV file so that it could be processed using awk. That meant replacing the commas that were delimiters with &lt;code&gt;|&lt;/code&gt;s. It also meant removing a couple of stray ^M characters (carriage returns) that had snuck into the file.&lt;/li&gt;
&lt;li&gt;Examined the data and came up with an informal ontology and prototype URI scheme.&lt;/li&gt;
&lt;li&gt;Created a bunch of awk scripts to extract different data from the files and create RDF/XML from it.&lt;/li&gt;
&lt;li&gt;Ran the scripts to create RDF/XML.&lt;/li&gt;
&lt;li&gt;Uploaded the data into a Talis store.&lt;/li&gt;
&lt;li&gt;Created appropriate PHP for the data and put it into a proxy server.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Some of this has been covered by my recent posts, so I&amp;#8217;m just going to talk about a few of these steps in a bit more detail.&lt;/p&gt;

&lt;p&gt;First, the URIs. Frankly, they&amp;#8217;re an experiment to see how it plays. The templates are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;countries: &lt;code&gt;http://geo.data.gov.uk/0/id/country/{name}&lt;/code&gt;, eg &lt;a href=&quot;http://geo.data.gov.uk/0/id/country/england&quot;&gt;http://geo.data.gov.uk/0/id/country/england&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;regions: &lt;code&gt;http://geo.data.gov.uk/0/id/region/{name}&lt;/code&gt;, eg &lt;a href=&quot;http://geo.data.gov.uk/0/id/region/north-west&quot;&gt;http://geo.data.gov.uk/0/id/region/north-west&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;areas: &lt;code&gt;http://geo.data.gov.uk/0/id/area/{ONS code}&lt;/code&gt;, eg &lt;a href=&quot;http://geo.data.gov.uk/0/id/area/00KA&quot;&gt;http://geo.data.gov.uk/0/id/area/00KA&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;local authorities: &lt;code&gt;http://local-government.data.gov.uk/0/id/local-authority/{ONS code for area}&lt;/code&gt;, eg &lt;a href=&quot;http://local-government.data.gov.uk/0/id/local-authority/00KA&quot;&gt;http://local-government.data.gov.uk/0/id/local-authority/00KA&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;roads: &lt;code&gt;http://transport.data.gov.uk/0/id/road/{name}&lt;/code&gt; or &lt;code&gt;http://transport.data.gov.uk/0/id/road/U-{random number}&lt;/code&gt;, eg &lt;a href=&quot;http://transport.data.gov.uk/0/id/road/M5&quot;&gt;http://transport.data.gov.uk/0/id/road/M5&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;traffic count points: &lt;code&gt;http://transport.data.gov.uk/0/id/traffic-count-point/{number}&lt;/code&gt;, eg &lt;a href=&quot;http://transport.data.gov.uk/0/id/traffic-count-point/36195&quot;&gt;http://transport.data.gov.uk/0/id/traffic-count-point/36195&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;traffic counts: &lt;code&gt;http://transport.data.gov.uk/0/id/traffic-count/{point number}/{direction}/{date}/{hour}/{traffic type}&lt;/code&gt;, eg &lt;a href=&quot;http://transport.data.gov.uk/0/id/traffic-count/4/N/2008-06-05/08:00:00/HGVr2&quot;&gt;http://transport.data.gov.uk/0/id/traffic-count/4/N/2008-06-05/08:00:00/HGVr2&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The subdomains are one way of subdividing the vast set of public sector information into vague categories that might be handled by different departments, without using the (highly changeable) department names in the URI. The &lt;code&gt;/0&lt;/code&gt; portion of each URI is a version number: these URIs are experimental and liable to be unsupported in the future so they&amp;#8217;re marked with a version 0. The &lt;code&gt;/id&lt;/code&gt; portion of each URI indicates that these are URIs for non-information resources; the response is a &lt;code&gt;303 See Other&lt;/code&gt; redirect to the same URIs but without the &lt;code&gt;/id&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;After the &lt;code&gt;/id&lt;/code&gt;, the URIs follow a common pattern of naming a class of resource, followed by an appropriate identifier for that resource. The identifiers themselves are designed to be unique, &lt;a href=&quot;http://www.jenitennison.com/blog/node/112&quot;&gt;unlikely to change&lt;/a&gt;, and &lt;a href=&quot;http://www.jenitennison.com/blog/node/114&quot;&gt;human readable&lt;/a&gt;.&lt;/p&gt;
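
&lt;p&gt;The templates and the &lt;code&gt;/id&lt;/code&gt; redirect described above can be sketched in a few lines of JavaScript. The function names are hypothetical and this is only an illustration of the URI pattern, not the code that serves the data.&lt;/p&gt;

```javascript
// Build the identifier URI for one traffic count, following the template
// {point number}/{direction}/{date}/{hour}/{traffic type}.
function trafficCountUri(point, direction, date, hour, trafficType) {
  return 'http://transport.data.gov.uk/0/id/traffic-count/' +
    [point, direction, date, hour, trafficType].join('/');
}

// Work out where the 303 See Other response points:
// the same URI with the /id segment removed.
function documentUri(identifierUri) {
  return identifierUri.replace('/0/id/', '/0/');
}

var count = trafficCountUri(4, 'N', '2008-06-05', '08:00:00', 'HGVr2');
console.log(count);
console.log(documentUri(count));
```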

&lt;p&gt;The ontologies, well, actually they don&amp;#8217;t exist as yet except in my head. It&amp;#8217;s been more important to make the data available than to provide ontologies for it. Triplestores and SPARQL queries work without ontologies; indeed you have to go out of your way to find applications that actually reason with them. Like schemas for XML documents, they&amp;#8217;re not absolutely essential, but useful for documentation purposes and &lt;em&gt;potentially&lt;/em&gt; useful for applications.&lt;/p&gt;

&lt;p&gt;There are, though, a couple of &lt;a href=&quot;http://www.w3.org/2004/02/skos/&quot;&gt;SKOS&lt;/a&gt; schemes for categorising roads and vehicle types. These are available via:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;http://transport.data.gov.uk/0/category/road&lt;/li&gt;
&lt;li&gt;http://transport.data.gov.uk/0/category/vehicle&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;They were informed by the &lt;a href=&quot;http://www.cbrd.co.uk/roadsfaq/&quot;&gt;British Roads FAQ&lt;/a&gt; and the &lt;a href=&quot;http://www.dft.gov.uk/matrix/forms/definitions.aspx&quot;&gt;data definitions from the Department for Transport&lt;/a&gt;. I heartily recommend a read; it&amp;#8217;s scintillating stuff!&lt;/p&gt;

&lt;p&gt;Anyway, with this size of file, and the kind of processing that needed to be done with it, the simple XSLT that I talked about &lt;a href=&quot;http://www.jenitennison.com/blog/node/109&quot;&gt;previously&lt;/a&gt; for extracting data out of CSV files just wasn&amp;#8217;t going to cut it. Awk, on the other hand, is designed for this kind of processing. Most of the RDF/XML could be generated by collecting unique values from the file. For example, to generate the RDF/XML for the regions I used:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;BEGIN { 
  FS = &quot;|&quot;;
  print &quot;&amp;lt;rdf:RDF xmlns:rdf=\&quot;http://www.w3.org/1999/02/22-rdf-syntax-ns#\&quot;&quot;;
  print &quot;  xmlns:rdfs=\&quot;http://www.w3.org/2000/01/rdf-schema#\&quot;&quot;;
  print &quot;  xmlns:g=\&quot;http://geo.data.gov.uk/0/ontology/geo#\&quot;&amp;gt;&quot;;
}
FNR &amp;gt; 1 {
  countries[$2] = substr($1, 2, length($1) - 2);
  regions[$2] = substr($2, 2, length($2) - 2);
  codes[$2] = substr($3, 2, length($3) - 2);
}
END { 
  for (region in regions) {
    country = countries[region];
    name = regions[region];
    code = codes[region];
    path = tolower(name);
    gsub(&quot; &quot;, &quot;-&quot;, path);
    print &quot;&amp;lt;g:Region rdf:about=\&quot;http://geo.data.gov.uk/0/id/region/&quot; path &quot;\&quot;&amp;gt;&quot;;
    print &quot;  &amp;lt;rdfs:label&amp;gt;&quot; name &quot;&amp;lt;/rdfs:label&amp;gt;&quot;;
    print &quot;  &amp;lt;g:isInCountry&amp;gt;&quot;;
    print &quot;    &amp;lt;g:Country rdf:about=\&quot;http://geo.data.gov.uk/0/id/country/&quot; tolower(country) &quot;\&quot;&amp;gt;&quot;;
    print &quot;      &amp;lt;g:hasRegion rdf:resource=\&quot;http://geo.data.gov.uk/0/id/region/&quot; path &quot;\&quot; /&amp;gt;&quot;;
    print &quot;    &amp;lt;/g:Country&amp;gt;&quot;;
    print &quot;  &amp;lt;/g:isInCountry&amp;gt;&quot;;
    if (code != &quot;&quot;) {
      print &quot;  &amp;lt;g:ONScode rdf:datatype=\&quot;http://www.w3.org/2001/XMLSchema#NCName\&quot;&amp;gt;&quot; code &quot;&amp;lt;/g:ONScode&amp;gt;&quot;;
    }
    print &quot;&amp;lt;/g:Region&amp;gt;&quot;;
  }
  print &quot;&amp;lt;/rdf:RDF&amp;gt;&quot;; 
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;This generated RDF/XML that looks like:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;rdf:RDF xmlns:rdf=&quot;http://www.w3.org/1999/02/22-rdf-syntax-ns#&quot;
  xmlns:rdfs=&quot;http://www.w3.org/2000/01/rdf-schema#&quot;
  xmlns:g=&quot;http://geo.data.gov.uk/0/ontology/geo#&quot;&amp;gt;
&amp;lt;g:Region rdf:about=&quot;http://geo.data.gov.uk/0/id/region/london&quot;&amp;gt;
  &amp;lt;rdfs:label&amp;gt;London&amp;lt;/rdfs:label&amp;gt;
  &amp;lt;g:isInCountry&amp;gt;
    &amp;lt;g:Country rdf:about=&quot;http://geo.data.gov.uk/0/id/country/england&quot;&amp;gt;
      &amp;lt;g:hasRegion rdf:resource=&quot;http://geo.data.gov.uk/0/id/region/london&quot; /&amp;gt;
    &amp;lt;/g:Country&amp;gt;
  &amp;lt;/g:isInCountry&amp;gt;
  &amp;lt;g:ONScode rdf:datatype=&quot;http://www.w3.org/2001/XMLSchema#NCName&quot;&amp;gt;H&amp;lt;/g:ONScode&amp;gt;
&amp;lt;/g:Region&amp;gt;
&amp;lt;g:Region rdf:about=&quot;http://geo.data.gov.uk/0/id/region/yorkshire-and-the-humber&quot;&amp;gt;
  &amp;lt;rdfs:label&amp;gt;Yorkshire and The Humber&amp;lt;/rdfs:label&amp;gt;
  &amp;lt;g:isInCountry&amp;gt;
    &amp;lt;g:Country rdf:about=&quot;http://geo.data.gov.uk/0/id/country/england&quot;&amp;gt;
      &amp;lt;g:hasRegion rdf:resource=&quot;http://geo.data.gov.uk/0/id/region/yorkshire-and-the-humber&quot; /&amp;gt;
    &amp;lt;/g:Country&amp;gt;
  &amp;lt;/g:isInCountry&amp;gt;
  &amp;lt;g:ONScode rdf:datatype=&quot;http://www.w3.org/2001/XMLSchema#NCName&quot;&amp;gt;D&amp;lt;/g:ONScode&amp;gt;
&amp;lt;/g:Region&amp;gt;
...
&amp;lt;/rdf:RDF&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;In other cases, I needed to split up the RDF/XML that was generated into several files, because uploads to Talis of more than about 2MB fail. The traffic count point RDF/XML needed to be split into 13 separate files. The traffic counts themselves&amp;#8230; well, I haven&amp;#8217;t managed to do it all yet but to give you an idea, the 2008 data alone generated 1800 RDF/XML files, each about 1.6MB in size and each taking about a minute to upload. What&amp;#8217;s there now is all the 2008 data, and the overall motor vehicle counts from all the years. More will be added gradually.&lt;/p&gt;

&lt;p&gt;The awk script that generates the count data in separate files is:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;BEGIN { 
  FS = &quot;|&quot;;
  fileCount = 0;
  countCount = 99999;
  curlFile = &quot;traffic-counts.curl.sh&quot;;
}
FNR &amp;gt; 1 &amp;amp;&amp;amp; $15 ~ /\/2008 / {
  countCount += 1;
  if (countCount &amp;gt; 200) {
    if (fileCount != 0) {
      print &quot;&amp;lt;/rdf:RDF&amp;gt;&quot; &amp;gt; fileName; 
      close(fileName);
    }
    countCount = 0;
    fileCount += 1;
    fileName = &quot;traffic-counts/traffic-counts.&quot; fileCount &quot;.rdf&quot;;
    print &quot;creating&quot;, fileName;
    print &quot;echo loading&quot;, fileName &amp;gt; curlFile;
    print &quot;curl -H \&quot;Content-type: application/rdf+xml\&quot; -o progress.txt --digest -u username:password --data-binary @&quot; fileName &quot; http://api.talis.com/stores/transport/meta&quot; &amp;gt; curlFile;

    print &quot;&amp;lt;?xml version=\&quot;1.0\&quot; encoding=\&quot;ASCII\&quot;?&amp;gt;&quot; &amp;gt; fileName;
    print &quot;&amp;lt;rdf:RDF xmlns:rdf=\&quot;http://www.w3.org/1999/02/22-rdf-syntax-ns#\&quot;&quot; &amp;gt; fileName;
    print &quot;  xmlns:rdfs=\&quot;http://www.w3.org/2000/01/rdf-schema#\&quot;&quot; &amp;gt; fileName;
    print &quot;  xmlns:xsd=\&quot;http://www.w3.org/2001/XMLSchema#\&quot;&quot; &amp;gt; fileName;
    print &quot;  xmlns:t=\&quot;http://transport.data.gov.uk/0/ontology/traffic#\&quot;&quot; &amp;gt; fileName;
    print &quot;  xml:base=\&quot;http://transport.data.gov.uk/0/id/traffic-count/\&quot;&amp;gt;&quot; &amp;gt; fileName;
  }

  cp = $7;
  date = $15;
  direction = substr($16, 2, length($16) - 2);
  split(date, dateFields, &quot; &quot;);
  date = dateFields[1];
  split(date, dateFields, &quot;/&quot;);
  date = sprintf(&quot;%04d-%02d-%02d&quot;, dateFields[3], dateFields[2], dateFields[1]);
  hour = sprintf(&quot;%02d:00:00&quot;, $17);
  base = &quot;http://transport.data.gov.uk/0/id/traffic-count/&quot; cp &quot;/&quot; direction &quot;/&quot; date &quot;/&quot; hour;

  cycles = $18;
  motorbikes = $19;
  ...

  print &quot;&amp;lt;t:Count rdf:about=\&quot;&quot; base &quot;/cycle\&quot;&amp;gt;&quot; &amp;gt; fileName;
  print &quot;  &amp;lt;t:point&amp;gt;&quot; &amp;gt; fileName;
  print &quot;    &amp;lt;t:CountPoint rdf:about=\&quot;http://transport.data.gov.uk/0/id/traffic-count-point/&quot; cp &quot;\&quot;&amp;gt;&quot; &amp;gt; fileName;
  print &quot;      &amp;lt;t:count rdf:resource=\&quot;&quot; base &quot;/cycle\&quot; /&amp;gt;&quot; &amp;gt; fileName;
  print &quot;    &amp;lt;/t:CountPoint&amp;gt;&quot; &amp;gt; fileName;
  print &quot;  &amp;lt;/t:point&amp;gt;&quot; &amp;gt; fileName;
  print &quot;  &amp;lt;t:hour rdf:datatype=\&quot;http://www.w3.org/2001/XMLSchema#dateTime\&quot;&amp;gt;&quot; date &quot;T&quot; hour &quot;&amp;lt;/t:hour&amp;gt;&quot; &amp;gt; fileName;
  print &quot;  &amp;lt;t:direction&amp;gt;&quot; direction &quot;&amp;lt;/t:direction&amp;gt;&quot; &amp;gt; fileName;
  print &quot;  &amp;lt;t:category rdf:resource=\&quot;http://transport.data.gov.uk/0/category/bicycle\&quot; /&amp;gt;&quot; &amp;gt; fileName;
  print &quot;  &amp;lt;rdf:value  rdf:datatype=\&quot;http://www.w3.org/2001/XMLSchema#integer\&quot;&amp;gt;&quot; cycles &quot;&amp;lt;/rdf:value&amp;gt;&quot; &amp;gt; fileName;
  print &quot;&amp;lt;/t:Count&amp;gt;&quot; &amp;gt; fileName;
  print &quot;&amp;lt;t:Count rdf:about=\&quot;&quot; base &quot;/motorbike\&quot;&amp;gt;&quot; &amp;gt; fileName;
  print &quot;  &amp;lt;t:point&amp;gt;&quot; &amp;gt; fileName;
  print &quot;    &amp;lt;t:CountPoint rdf:about=\&quot;http://transport.data.gov.uk/0/id/traffic-count-point/&quot; cp &quot;\&quot;&amp;gt;&quot; &amp;gt; fileName;
  print &quot;      &amp;lt;t:count rdf:resource=\&quot;&quot; base &quot;/motorbike\&quot; /&amp;gt;&quot; &amp;gt; fileName;
  print &quot;    &amp;lt;/t:CountPoint&amp;gt;&quot; &amp;gt; fileName;
  print &quot;  &amp;lt;/t:point&amp;gt;&quot; &amp;gt; fileName;
  print &quot;  &amp;lt;t:hour rdf:datatype=\&quot;http://www.w3.org/2001/XMLSchema#dateTime\&quot;&amp;gt;&quot; date &quot;T&quot; hour &quot;&amp;lt;/t:hour&amp;gt;&quot; &amp;gt; fileName;
  print &quot;  &amp;lt;t:direction&amp;gt;&quot; direction &quot;&amp;lt;/t:direction&amp;gt;&quot; &amp;gt; fileName;
  print &quot;  &amp;lt;t:category rdf:resource=\&quot;http://transport.data.gov.uk/0/category/motorbike\&quot; /&amp;gt;&quot; &amp;gt; fileName;
  print &quot;  &amp;lt;rdf:value  rdf:datatype=\&quot;http://www.w3.org/2001/XMLSchema#integer\&quot;&amp;gt;&quot; motorbikes &quot;&amp;lt;/rdf:value&amp;gt;&quot; &amp;gt; fileName;
  print &quot;&amp;lt;/t:Count&amp;gt;&quot; &amp;gt; fileName;
  ...
}
END {
  print &quot;&amp;lt;/rdf:RDF&amp;gt;&quot; &amp;gt; fileName; 
  close(fileName);
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;This also generates a shell script that includes the curl instructions to upload the files.&lt;/p&gt;

&lt;p&gt;The original data contained easting/northing information about each point, but latitude/longitude is generally easier for mapping. So I extracted the easting/northings, used the &lt;a href=&quot;http://gps.ordnancesurvey.co.uk/convert.asp&quot;&gt;free (Windows only) software available via the Ordnance Survey&lt;/a&gt; to turn these into latitude/longitude (there is a &lt;a href=&quot;http://gps.ordnancesurvey.co.uk/convertbatch.asp?location=0&quot;&gt;web service&lt;/a&gt; to do the same, but you can only do 200 coordinates at a time), converted those into decimals, then RDF, and uploaded them.&lt;/p&gt;

&lt;p&gt;The PHP scripts that serve the data as linked data are exactly what I&amp;#8217;ve &lt;a href=&quot;http://www.jenitennison.com/blog/node/111&quot;&gt;shown before&lt;/a&gt;. I amended the &lt;code&gt;.htaccess&lt;/code&gt; file to redirect to an appropriate PHP script like this:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;IfModule mod_rewrite.c&amp;gt;
  RewriteEngine on
  RewriteCond %{REQUEST_FILENAME} !-f
  RewriteCond %{REQUEST_FILENAME} !-d

  RewriteRule ^id/(.+)$  id.php [L]

  RewriteCond %{REQUEST_URI} !\.php
  RewriteRule ^([^/]+)(/.+)? $1.php$2 [L,QSA]
&amp;lt;/IfModule&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;and created PHP scripts for each of the types of data being published. For example, &lt;code&gt;region.php&lt;/code&gt; is:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;?php
  include &quot;utils.php&quot;;
  proxy(&#039;http://geo.data.gov.uk/0/ontology/geo#Region&#039;, 50);
?&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;And there we have it. Linked traffic count data on the web.&lt;/p&gt;

&lt;p&gt;(And because this is all published through Talis, there&amp;#8217;s also a &lt;a href=&quot;http://api.talis.com/stores/transport/services/sparql&quot;&gt;SPARQL endpoint&lt;/a&gt; that you could use to run queries and &lt;a href=&quot;http://www.jenitennison.com/blog/node/112&quot;&gt;create visualisations&lt;/a&gt;. Knock yourself out.)&lt;/p&gt;

&lt;p&gt;Please take a look and comment on what we&amp;#8217;ve done. What&amp;#8217;s your opinion of the URI scheme? Is it useful to be able to access the data as linked data? Which other formats would you like to see?&lt;/p&gt;
</description>
 <comments>http://www.jenitennison.com/blog/node/115#comments</comments>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/46">linked data</category>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/50">psi</category>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/31">rdf</category>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/47">Talis</category>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/48">uri</category>
 <pubDate>Sun, 26 Jul 2009 15:38:54 +0000</pubDate>
 <dc:creator>Jeni</dc:creator>
 <guid isPermaLink="false">115 at http://www.jenitennison.com/blog</guid>
</item>
<item>
 <title>Creating Google Visualisations of Linked Data</title>
 <link>http://www.jenitennison.com/blog/node/113</link>
 <description>&lt;p&gt;&lt;em&gt;Update: For the people who couldn&amp;#8217;t read the post because the graph didn&amp;#8217;t have 0 as its x-axis minimum, here is the version of the graph that does. I haven&amp;#8217;t removed the other version, since doing so would make the comments confusing and I think it&amp;#8217;s interesting to compare the two.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/blog/files/LononBoroughBarchart2.jpg&quot; alt=&quot;London Borough Life Expectancy Bar Chart with Y-Axis Minimum at 0&quot; title=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;There are idealists who immediately see the publication of Open Data as a Good Thing, and leap up and down (metaphorically or physically) shouting &amp;#8220;Raw Data Now&amp;#8221;. There are also a whole bunch of people who need to &amp;#8220;see the shiny&amp;#8221;. They need to understand &lt;em&gt;why&lt;/em&gt; publishing Open Data is a Good Thing, and most particularly what the benefit is going to be to &lt;em&gt;them&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;This is understandable. Publishers bear the cost of the development of URI schemes, XML formats, RDF ontologies, and other infrastructure for serving data, and the ongoing maintenance cost of domain resolution, bandwidth usage and user support. Even publishers with a public-service remit (who may not need to see &lt;em&gt;monetary&lt;/em&gt; payback) need to be convinced that there will be some kind of return on the investment.&lt;/p&gt;

&lt;p&gt;One result of making data available is that it enables you and others to easily construct nice visualisations over the data, and maybe spot useful patterns within it. This is particularly useful for public sector information because it can provide feedback on how effective a particular policy has been or where more resources need to be spent.&lt;/p&gt;

&lt;p&gt;So I thought it would be worthwhile trying to explore how to create visualisations of some data, starting with the &lt;a href=&quot;http://spreadsheets.google.com/ccc?key=t3bns85prAbiChLmFhlcB1Q&quot;&gt;London Borough data&lt;/a&gt; that I&amp;#8217;ve &lt;a href=&quot;http://www.jenitennison.com/blog/node/109&quot;&gt;published using Talis&lt;/a&gt;.&lt;/p&gt;

&lt;!--break--&gt;

&lt;p&gt;Nowadays, there are a bunch of visualisation libraries around. My first experiment is going to be using the &lt;a href=&quot;http://code.google.com/apis/visualization/&quot;&gt;Google Visualisation API&lt;/a&gt;, which offers a &lt;a href=&quot;http://code.google.com/apis/visualization/documentation/gallery.html&quot;&gt;range of different, reasonably pretty, visualisations&lt;/a&gt; that work cross-browser using either SVG or VML. In this post, I&amp;#8217;m going to use the &lt;a href=&quot;http://code.google.com/apis/visualization/documentation/gallery/barchart.html&quot;&gt;barchart visualisation&lt;/a&gt;, to create a chart of male and female life expectancy in the London Boroughs. It looks like this:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/blog/files/LondonBoroughBarchart.jpg&quot; alt=&quot;London Borough life expectancy bar chart&quot; title=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;One of the interesting things about the Google Visualisation API is that you can provide the data for every visualisation using a &lt;a href=&quot;http://code.google.com/apis/visualization/documentation/dev/implementing_data_source.html&quot;&gt;Data Source&lt;/a&gt;. A Data Source is any web page that, in the face of particular requests, returns JSON in a particular format. The format basically encodes a table with named (and typed) columns and a number of rows containing cells which have values for each of those columns.&lt;/p&gt;

&lt;p&gt;For the London Borough data, the JSON needs to look something like:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;google.visualization.Query.setResponse({
  version:0.6,
  status:&#039;ok&#039;,
  reqId:0,
  table:{
    cols:[
      {id:&#039;label&#039;, type:&#039;string&#039;},
      {id:&#039;maleLE&#039;, type:&#039;number&#039;},
      {id:&#039;femaleLE&#039;, type:&#039;number&#039;}
    ],
    rows:[
      {c:[
        {v:&#039;Barking &amp;amp; Dagenham&#039;},
        {v:76.3},
        {v:80.3}]},
      {c:[
        {v:&#039;Barnet&#039;},
        {v:79.5},
        {v:83.6}]},
      {c:[
        {v:&#039;Bexley&#039;},
        {v:78.7},
        {v:82.7}]},
      ...
    ]}
})
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Now we already know, from my &lt;a href=&quot;http://www.jenitennison.com/blog/node/110&quot;&gt;last couple&lt;/a&gt; &lt;a href=&quot;http://www.jenitennison.com/blog/node/111&quot;&gt;of blog posts&lt;/a&gt;, how to query the Talis platform through its &lt;a href=&quot;http://n2.talis.com/wiki/Store_Sparql_Service&quot;&gt;SPARQL endpoint&lt;/a&gt;, and how to use &lt;a href=&quot;http://n2.talis.com/wiki/Transformation_Service&quot;&gt;Talis&amp;#8217; transformation service&lt;/a&gt; to invoke an XSLT transformation over the XML that we get back from that. The same principle applies:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;create a SPARQL query that will return the information we want&lt;/li&gt;
&lt;li&gt;create an XSLT transformation that will tidy the result into the format we want&lt;/li&gt;
&lt;li&gt;set up a PHP page to access the relevant URI and pass through the results&lt;/li&gt;
&lt;li&gt;create an .htaccess file so that invoking the PHP can be done without revealing it&amp;#8217;s PHP&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;First, the SPARQL query. As I said, a Data Source is essentially a table of named columns. While we could do a &lt;code&gt;DESCRIBE&lt;/code&gt; or &lt;code&gt;CONSTRUCT&lt;/code&gt; query, a &lt;code&gt;SELECT&lt;/code&gt; query is a much closer match to the tabular layout that the Data Source needs, because its result is also essentially a table of named columns (the variables you select). For the data that we want in the table, an appropriate SPARQL query would be:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;SELECT ?label ?maleLE ?femaleLE
WHERE {
  ?borough a &amp;lt;http://www.jenitennison.com/ontology/data#LondonBorough&amp;gt; .
  ?borough &amp;lt;http://www.w3.org/2000/01/rdf-schema#label&amp;gt; ?label .
  ?borough &amp;lt;http://www.jenitennison.com/ontology/data#maleLifeExpectancy&amp;gt; ?maleLE .
  ?borough &amp;lt;http://www.jenitennison.com/ontology/data#femaleLifeExpectancy&amp;gt; ?femaleLE .
}
ORDER BY ?label
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Plugging this query into the Talis SPARQL endpoint (have a go at &lt;a href=&quot;http://api.talis.com/stores/rdfquery-dev1/services/sparql&quot;&gt;http://api.talis.com/stores/rdfquery-dev1/services/sparql&lt;/a&gt; if you like) gives a response in the &lt;a href=&quot;http://www.w3.org/TR/rdf-sparql-XMLres/&quot;&gt;SPARQL Query Results Format&lt;/a&gt; which looks like:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;sparql xmlns=&quot;http://www.w3.org/2005/sparql-results#&quot;&amp;gt;
  &amp;lt;head&amp;gt;
    &amp;lt;variable name=&quot;label&quot;/&amp;gt;
    &amp;lt;variable name=&quot;maleLE&quot;/&amp;gt;
    &amp;lt;variable name=&quot;femaleLE&quot;/&amp;gt;
  &amp;lt;/head&amp;gt;
  &amp;lt;results&amp;gt;
    &amp;lt;result&amp;gt;
      &amp;lt;binding name=&quot;label&quot;&amp;gt;
        &amp;lt;literal&amp;gt;Barking &amp;amp;amp; Dagenham&amp;lt;/literal&amp;gt;
      &amp;lt;/binding&amp;gt;
      &amp;lt;binding name=&quot;maleLE&quot;&amp;gt;
        &amp;lt;literal datatype=&quot;http://www.w3.org/2001/XMLSchema#decimal&quot;&amp;gt;76.3&amp;lt;/literal&amp;gt;
      &amp;lt;/binding&amp;gt;
      &amp;lt;binding name=&quot;femaleLE&quot;&amp;gt;
        &amp;lt;literal datatype=&quot;http://www.w3.org/2001/XMLSchema#decimal&quot;&amp;gt;80.3&amp;lt;/literal&amp;gt;
      &amp;lt;/binding&amp;gt;
    &amp;lt;/result&amp;gt;
    &amp;lt;result&amp;gt;
      &amp;lt;binding name=&quot;label&quot;&amp;gt;
        &amp;lt;literal&amp;gt;Barnet&amp;lt;/literal&amp;gt;
      &amp;lt;/binding&amp;gt;
      &amp;lt;binding name=&quot;maleLE&quot;&amp;gt;
        &amp;lt;literal datatype=&quot;http://www.w3.org/2001/XMLSchema#decimal&quot;&amp;gt;79.5&amp;lt;/literal&amp;gt;
      &amp;lt;/binding&amp;gt;
      &amp;lt;binding name=&quot;femaleLE&quot;&amp;gt;
        &amp;lt;literal datatype=&quot;http://www.w3.org/2001/XMLSchema#decimal&quot;&amp;gt;83.6&amp;lt;/literal&amp;gt;
      &amp;lt;/binding&amp;gt;
    &amp;lt;/result&amp;gt;
    &amp;lt;result&amp;gt;
      &amp;lt;binding name=&quot;label&quot;&amp;gt;
        &amp;lt;literal&amp;gt;Bexley&amp;lt;/literal&amp;gt;
      &amp;lt;/binding&amp;gt;
      &amp;lt;binding name=&quot;maleLE&quot;&amp;gt;
        &amp;lt;literal datatype=&quot;http://www.w3.org/2001/XMLSchema#decimal&quot;&amp;gt;78.7&amp;lt;/literal&amp;gt;
      &amp;lt;/binding&amp;gt;
      &amp;lt;binding name=&quot;femaleLE&quot;&amp;gt;
        &amp;lt;literal datatype=&quot;http://www.w3.org/2001/XMLSchema#decimal&quot;&amp;gt;82.7&amp;lt;/literal&amp;gt;
      &amp;lt;/binding&amp;gt;
    &amp;lt;/result&amp;gt;
    ...
  &amp;lt;/results&amp;gt;
&amp;lt;/sparql&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The second step, transforming this SPARQL result into JSON, just takes a little bit of XSLT (1.0, remember, because that&amp;#8217;s all that Talis&amp;#8217; Transformation Service can manage). My aim in this post is to show that anyone, even if they don&amp;#8217;t have write access to a Talis data store, can create these visualisations, so I&amp;#8217;ve just put the XSLT on my site at &lt;a href=&quot;http://www.jenitennison.com/visualisation/data/SRXtoGoogleVisData.xsl&quot;&gt;http://www.jenitennison.com/visualisation/data/SRXtoGoogleVisData.xsl&lt;/a&gt;. I&amp;#8217;m not going to duplicate it here; it&amp;#8217;s generic enough for reuse should you want to.&lt;/p&gt;
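
&lt;p&gt;To give a feel for what that transformation has to do, here is a rough JavaScript sketch of the same mapping. It is not the XSLT itself: the &lt;code&gt;toDataTable()&lt;/code&gt; function and the shape of its input (a list of variable names plus one object per result, keyed by variable name) are assumptions made purely for illustration.&lt;/p&gt;

```javascript
// Hypothetical helper (not part of the original stylesheet): turn SPARQL
// SELECT results into the cols/rows table that a Data Source returns.
var XSD = 'http://www.w3.org/2001/XMLSchema#';
var NUMERIC = [XSD + 'decimal', XSD + 'integer', XSD + 'double', XSD + 'float'];

function toDataTable(variables, results) {
  var cols = variables.map(function (name) {
    // Assumption: a column is numeric if its first bound value has a numeric datatype.
    var first = results.length ? results[0][name] : undefined;
    var numeric = first !== undefined ? NUMERIC.indexOf(first.datatype) !== -1 : false;
    return { id: name, type: numeric ? 'number' : 'string' };
  });
  var rows = results.map(function (result) {
    return {
      c: cols.map(function (col) {
        var binding = result[col.id];
        if (binding === undefined) {
          return { v: null }; // variable not bound in this result
        }
        return { v: col.type === 'number' ? Number(binding.value) : binding.value };
      })
    };
  });
  return { cols: cols, rows: rows };
}

var table = toDataTable(['label', 'maleLE'], [
  { label: { value: 'Barnet' }, maleLE: { value: '79.5', datatype: XSD + 'decimal' } }
]);
console.log(JSON.stringify(table));
```

&lt;p&gt;The real response also needs wrapping in the &lt;code&gt;google.visualization.Query.setResponse()&lt;/code&gt; call shown earlier; the sketch only covers the table itself.&lt;/p&gt;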

&lt;p&gt;The third step is to create some PHP that handles a query from the Google Visualisation. The requests will include &lt;a href=&quot;http://code.google.com/apis/visualization/documentation/dev/implementing_data_source.html#requestformat&quot;&gt;two parameters&lt;/a&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;tqx&lt;/code&gt; defines details about how the data should be returned, such as its format&lt;/li&gt;
&lt;li&gt;&lt;code&gt;tq&lt;/code&gt; defines a query in the &lt;a href=&quot;http://code.google.com/apis/visualization/documentation/querylanguage.html&quot;&gt;Google Visualisation API query language&lt;/a&gt; that identifies precisely the data that should be returned&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I want this PHP to be reusable, so I&amp;#8217;ve created a &lt;code&gt;utils.php&lt;/code&gt; that looks like this:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;?php
  $store = &#039;rdfquery-dev1&#039;;

  function proxy($filter, $order) {
    global $store;
    $tqx = $_GET[&#039;tqx&#039;];
    $tq = $_GET[&#039;tq&#039;];

    // Parse tq parameter
    if ($tq) {
      $select = stristr($tq, &#039;select &#039;);
      $select = substr($select, 7);
      $select = explode(&#039;,&#039;, $select);
      foreach ($select as $var) {
        $var = trim($var);
        $vars[] = &quot;?$var&quot;;
      }
      $vars = implode(&#039; &#039;, $vars);
    } else {
      $vars = &#039;*&#039;;
    }
    $sparql = &quot;SELECT $vars WHERE { $filter } ORDER BY $order&quot;;

    $params = array(&#039;query&#039; =&amp;gt; $sparql, &#039;output&#039; =&amp;gt; &#039;xml&#039;);
    $query = http_build_query($params);
    $rdfURL = &quot;http://api.talis.com/stores/$store/services/sparql?$query&quot;;

    // URL for the transformation
    $params = array(&#039;xml-uri&#039; =&amp;gt; $rdfURL, 
      &#039;xsl-uri&#039; =&amp;gt; &quot;http://www.jenitennison.com/visualisation/data/SRXtoGoogleVisData.xsl&quot;,
      &#039;tqx&#039; =&amp;gt; $tqx);
    $query = http_build_query($params);
    $txURL = &quot;http://api.talis.com/tx?$query&quot;;

    $resource = fopen($txURL, &#039;rb&#039;);
    header(&#039;Content-Type: application/json&#039;, true);
    fpassthru($resource);
    return;
  }
?&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The main thing to note here is how the &lt;code&gt;tq&lt;/code&gt; and &lt;code&gt;tqx&lt;/code&gt; parameters, which the Google Visualisation sends to control what data appears and how, are processed. The &lt;code&gt;tqx&lt;/code&gt; parameter gets passed through into the stylesheet as a parameter, and parsed there. The &lt;code&gt;tq&lt;/code&gt; parameter is used to construct the SPARQL query itself, specifically which variables will get included within the result. The rest of the SPARQL query (the filter and the ordering) is set in the code which calls the &lt;code&gt;proxy()&lt;/code&gt; function, which is in &lt;code&gt;london-borough.php&lt;/code&gt; within the same directory, and looks like:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;?php
  include &quot;utils.php&quot;;
  proxy(&#039;?borough a &amp;lt;http://www.jenitennison.com/ontology/data#LondonBorough&amp;gt; .
         ?borough &amp;lt;http://www.w3.org/2000/01/rdf-schema#label&amp;gt; ?label .
         ?borough &amp;lt;http://www.jenitennison.com/ontology/data#maleLifeExpectancy&amp;gt; ?maleLE .
         ?borough &amp;lt;http://www.jenitennison.com/ontology/data#femaleLifeExpectancy&amp;gt; ?femaleLE .&#039;, 
        &#039;?label&#039;);
?&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;This code defines the names of the variables that can be used within the &lt;code&gt;tq&lt;/code&gt; parameter, and therefore selected and displayed within the graph. So for example, if I request:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;http://www.jenitennison.com/visualisation/data/london-borough?tq=select+label,+maleLE
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;this translates into the SPARQL query:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;SELECT ?label ?maleLE
WHERE {
  ?borough a &amp;lt;http://www.jenitennison.com/ontology/data#LondonBorough&amp;gt; .
  ?borough &amp;lt;http://www.w3.org/2000/01/rdf-schema#label&amp;gt; ?label .
  ?borough &amp;lt;http://www.jenitennison.com/ontology/data#maleLifeExpectancy&amp;gt; ?maleLE .
  ?borough &amp;lt;http://www.jenitennison.com/ontology/data#femaleLifeExpectancy&amp;gt; ?femaleLE .
}
ORDER BY ?label
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;At the moment, the PHP just uses the &lt;code&gt;select&lt;/code&gt; portion of the &lt;code&gt;tq&lt;/code&gt; parameter to determine which data to display. It would be possible to map other aspects of the Google Visualisation query language onto SPARQL, but this will do for now.&lt;/p&gt;
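
&lt;p&gt;The mapping from &lt;code&gt;tq&lt;/code&gt; to SPARQL can be sketched outside PHP too. The JavaScript below is a loose illustration of the idea rather than a port of &lt;code&gt;utils.php&lt;/code&gt;: &lt;code&gt;buildSparql()&lt;/code&gt; is a made-up name, and unlike the PHP it stops reading at a following clause keyword and drops anything that is not a plain variable name, both of which are my assumptions about how you might tighten it up.&lt;/p&gt;

```javascript
// Hypothetical sketch: build a SPARQL SELECT from the select clause of tq.
function buildSparql(tq, filter, order) {
  var vars = '*';
  if (tq) {
    // Take the text after "select", up to the next clause keyword if there is one.
    var match = /select\s+(.+?)(?:\s+(?:where|group by|order by|limit)\s.*)?$/i.exec(tq);
    if (match) {
      var names = match[1].split(',').map(function (name) {
        return name.trim();
      }).filter(function (name) {
        // Only plain identifiers become variables; anything else is ignored.
        return /^\w+$/.test(name);
      });
      if (names.length !== 0) {
        vars = names.map(function (name) { return '?' + name; }).join(' ');
      }
    }
  }
  return 'SELECT ' + vars + ' WHERE { ' + filter + ' } ORDER BY ' + order;
}

// Stand-in filter; the real one names the properties by URI.
var filter = '?borough ?labelProperty ?label . ?borough ?maleProperty ?maleLE .';
console.log(buildSparql('select label, maleLE', filter, '?label'));
```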

&lt;p&gt;The final step is to amend the .htaccess to do a basic rewrite to prevent people from having to put .php at the end of the URI because I don&amp;#8217;t like URIs that indicate the technology they&amp;#8217;re using. In this case it looks like:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;IfModule mod_rewrite.c&amp;gt;
  RewriteEngine on
  RewriteCond %{REQUEST_FILENAME} !-f
  RewriteCond %{REQUEST_FILENAME} !-d

  RewriteRule ^(.+) $1.php [L,QSA]
&amp;lt;/IfModule&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;as this gives me flexibility later on to add more PHP files that can do similar things.&lt;/p&gt;

&lt;p&gt;So now we have a Data Source that can provide the label, male life expectancy and female life expectancy for the London Boroughs. Using it requires a copy and tweak of an example from Google:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;html&amp;gt;
  &amp;lt;head&amp;gt;
    &amp;lt;script type=&quot;text/javascript&quot; src=&quot;http://www.google.com/jsapi&quot;&amp;gt;&amp;lt;/script&amp;gt;
    &amp;lt;script type=&quot;text/javascript&quot;&amp;gt;
      google.load(&quot;visualization&quot;, &quot;1&quot;, {packages:[&quot;barchart&quot;]});
      google.setOnLoadCallback(drawChart);
      function drawChart() {
        // Replace the data source URL on next line with your data source URL.
        var query = new google.visualization.Query(&#039;http://www.jenitennison.com/visualisation/data/london-borough&#039;);
        query.setQuery(&#039;select label, maleLE, femaleLE&#039;);
        // Send the query with a callback function.
        query.send(handleQueryResponse);
      };
      function handleQueryResponse(response) {
        if (response.isError()) {
          alert(&#039;Error in query: &#039; + response.getMessage() + &#039; &#039; + response.getDetailedMessage());
          return;
        }

        var data = response.getDataTable();
        data.setColumnLabel(1, &#039;Male Life Expectancy&#039;);
        data.setColumnLabel(2, &#039;Female Life Expectancy&#039;);
        var chart = new google.visualization.BarChart(document.getElementById(&#039;chart_div&#039;));
        chart.draw(data, {
          width: 600, 
          height: 450, 
          is3D: false, 
          title: &#039;Life Expectancy in London Boroughs&#039;,
          axisFontSize: 10,
          colors: [&#039;#4162B5&#039;, &#039;#CF413A&#039;]});
      };
    &amp;lt;/script&amp;gt;
  &amp;lt;/head&amp;gt;

  &amp;lt;body&amp;gt;
    &amp;lt;div id=&quot;chart_div&quot;&amp;gt;&amp;lt;/div&amp;gt;
  &amp;lt;/body&amp;gt;
&amp;lt;/html&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The &lt;code&gt;drawChart()&lt;/code&gt; function is where the URL to the Data Source gets set. It&amp;#8217;s actually done in two parts:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;var query = new google.visualization.Query(&#039;http://www.jenitennison.com/visualisation/data/london-borough&#039;);
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;sets the base URI to be used, and:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;query.setQuery(&#039;select label, maleLE, femaleLE&#039;);
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;sets the value of the &lt;code&gt;tq&lt;/code&gt; parameter without you having to worry about escaping the special characters it may contain.&lt;/p&gt;
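
&lt;p&gt;To see what that escaping involves, here is a small sketch of building the request URI by hand. This is only an illustration: the library constructs the real URI itself (and adds a &lt;code&gt;tqx&lt;/code&gt; parameter as well), so the exact form it sends may differ.&lt;/p&gt;

```javascript
// Illustration only: escape a tq value by hand and append it to the base URI.
var base = 'http://www.jenitennison.com/visualisation/data/london-borough';
var tq = 'select label, maleLE, femaleLE';
var url = base + '?tq=' + encodeURIComponent(tq);
console.log(url);
```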

&lt;p&gt;After getting the data, I set the column labels in the code itself: they aren&amp;#8217;t provided in the Data Source because it&amp;#8217;s easier and more generic this way. Plus I set a bunch of other display options within the HTML page, so it seems like the right place for it.&lt;/p&gt;

&lt;p&gt;The result of all this is a graph that you can see at &lt;a href=&quot;http://www.jenitennison.com/visualisation/london-borough-life-expectancy.html&quot;&gt;http://www.jenitennison.com/visualisation/london-borough-life-expectancy.html&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;If you want to do something similar, feel free to grab hold of &lt;code&gt;utils.php&lt;/code&gt;. You can either reuse my hosted copy of &lt;code&gt;SRXtoGoogleVisData.xsl&lt;/code&gt; or move it onto your own site. Then all you have to do is:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;adjust the &lt;code&gt;$store&lt;/code&gt; variable in &lt;code&gt;utils.php&lt;/code&gt;, and the location of &lt;code&gt;SRXtoGoogleVisData.xsl&lt;/code&gt; if you need to&lt;/li&gt;
&lt;li&gt;create another PHP file similar to &lt;code&gt;london-borough.php&lt;/code&gt; that defines a filter and an ordering over a set of data&lt;/li&gt;
&lt;li&gt;tweak your .htaccess if you want to&lt;/li&gt;
&lt;li&gt;create an HTML page that references your Data Source to create a Google Visualisation&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;My current plan is to continue to refine &lt;code&gt;utils.php&lt;/code&gt; and &lt;code&gt;SRXtoGoogleVisData.xsl&lt;/code&gt; to make it easy to create SPARQL-backed visualisations. More soon.&lt;/p&gt;
</description>
 <comments>http://www.jenitennison.com/blog/node/113#comments</comments>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/31">rdf</category>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/47">Talis</category>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/49">visualisation</category>
 <enclosure url="http://www.jenitennison.com/blog/files/utils.php_1.txt" length="1160" type="text/plain" />
 <pubDate>Thu, 23 Jul 2009 21:37:33 +0000</pubDate>
 <dc:creator>Jeni</dc:creator>
 <guid isPermaLink="false">113 at http://www.jenitennison.com/blog</guid>
</item>
<item>
 <title>Publishing Linked Data on the Talis Platform, Part 3</title>
 <link>http://www.jenitennison.com/blog/node/111</link>
 <description>&lt;p&gt;This is the third in a series of posts about using the &lt;a href=&quot;http://www.talis.com/platform/&quot;&gt;Talis Platform&lt;/a&gt; as a back end for serving linked data. In the &lt;a href=&quot;http://www.jenitennison.com/blog/node/109&quot;&gt;first part&lt;/a&gt;, I showed how to add data to a store. In the &lt;a href=&quot;http://www.jenitennison.com/blog/node/110&quot;&gt;second post&lt;/a&gt;, I showed how to use some PHP scripts to publish the data as Linked Data, at the URLs you use as your identifiers.&lt;/p&gt;

&lt;p&gt;In this post, I&amp;#8217;m going to begin the process of exposing the data in a way that makes it easy to locate and reuse. One of the biggest lessons I learned after the initial publication of the &lt;a href=&quot;http://www.london-gazette.co.uk&quot;&gt;London Gazette&lt;/a&gt; data as RDFa is that the publication of data and metadata about individual items is not enough. To make the data usable, you have to make it discoverable. To make it discoverable there must be an entry point from which you can locate the data. One kind of easy entry point is a list.&lt;/p&gt;

&lt;p&gt;In the case of the data about London Boroughs that I&amp;#8217;ve been using, there aren&amp;#8217;t currently any links to the data, so there is no way to discover it aside from me telling you the URI template (&lt;code&gt;http://www.jenitennison.com/data/id/london-borough/{name}&lt;/code&gt;, where name is hyphenated and in lowercase) and you knowing the name of a London Borough that you want to look up. Discovery via a URI template that I told you relies on out-of-band information, and contradicts the RESTful tenet of &amp;#8220;hypertext as the engine of application state&amp;#8221;.&lt;/p&gt;

&lt;p&gt;Instead, I need to offer an entry point from which you can follow links (or fill in forms) to discover information about the various London Boroughs. Since I&amp;#8217;m dealing with a small set of information here, I&amp;#8217;m going to do this in the straightforward way of having &lt;code&gt;http://www.jenitennison.com/data/london-borough&lt;/code&gt; contain a brief description of each of the known London Boroughs, including (obviously) a link to the URI for the London Borough, from which you can get more information.&lt;/p&gt;

&lt;!--break--&gt;

&lt;p&gt;I don&amp;#8217;t want to return &lt;em&gt;all&lt;/em&gt; the information about each of the London Boroughs within the list, so I&amp;#8217;m going to use a &lt;code&gt;CONSTRUCT&lt;/code&gt; SPARQL query to create triples that include just the type and label (if it has one) of the borough. Here&amp;#8217;s the initial query:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;CONSTRUCT { 
  ?thing a &amp;lt;http://www.jenitennison.com/ontology/data#LondonBorough&amp;gt; . 
  ?thing &amp;lt;http://www.w3.org/2000/01/rdf-schema#label&amp;gt; ?label . } 
WHERE { 
  ?thing a &amp;lt;http://www.jenitennison.com/ontology/data#LondonBorough&amp;gt; . 
  OPTIONAL { ?thing &amp;lt;http://www.w3.org/2000/01/rdf-schema#label&amp;gt; ?label . }}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Now I want to make this request a bit more generic. The type that I&amp;#8217;m looking for here is &lt;code&gt;http://www.jenitennison.com/ontology/data#LondonBorough&lt;/code&gt; but if I (or you!) wanted to use it for resources of different types, you&amp;#8217;d want it to reference a different class. In addition, while the list of London Boroughs is reasonably small, lists of individuals of other types might be much larger, in which case you&amp;#8217;d want a facility to page through them.&lt;/p&gt;

&lt;p&gt;So in the PHP code I&amp;#8217;m going to have three variables: &lt;code&gt;$type&lt;/code&gt; is the URI of the class of things that should be listed, &lt;code&gt;$limit&lt;/code&gt; is the number of those things that should appear on each page, and &lt;code&gt;$start&lt;/code&gt; is the first item in the page. In addition, to ensure a consistent and (hopefully) meaningful order, I&amp;#8217;m going to order the results based on their URI. So this is what the query looks like:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;CONSTRUCT { 
  ?thing a &amp;lt;$type&amp;gt; . 
  ?thing &amp;lt;http://www.w3.org/2000/01/rdf-schema#label&amp;gt; ?label . } 
WHERE { 
  ?thing a &amp;lt;$type&amp;gt; . 
  OPTIONAL { ?thing &amp;lt;http://www.w3.org/2000/01/rdf-schema#label&amp;gt; ?label . }} 
ORDER BY ?thing
LIMIT $limit
OFFSET $start
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;I&amp;#8217;m also going to rejig the PHP that I&amp;#8217;ve been using to use this query when it receives the request &lt;code&gt;http://www.jenitennison.com/data/london-borough&lt;/code&gt;. First, &lt;code&gt;utils.php&lt;/code&gt; is now going to hold a &lt;code&gt;proxy()&lt;/code&gt; function that performs requests based on the URI of the request or the arguments passed to it. Aside from the body of the SPARQL query, the code is pretty similar to the code for the ASK query. (It&amp;#8217;s getting close to the point where I should at least refactor, or maybe start using &lt;a href=&quot;http://code.google.com/p/moriarty&quot;&gt;Moriarty&lt;/a&gt;, though I&amp;#8217;d like to keep this lightweight to provide the minimum overhead for other people wanting to use this mechanism for publishing their data.) Here&amp;#8217;s the function:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;function proxy($type, $limit = 10) {
  global $store;
  $docUri = $_SERVER[&#039;REQUEST_URI&#039;];

  // URL for the RDF
  if ($_SERVER[&#039;PATH_INFO&#039;]) {
    // Request for a specific thing
    $dir = dirname($_SERVER[&#039;SCRIPT_NAME&#039;]);
    $path = substr($docUri, strlen($dir));
    $idUri = &quot;$dir/id$path&quot;;
    if (exists($idUri)) {
      $domain = $_SERVER[&#039;HTTP_HOST&#039;];
      $id = &quot;http://$domain$idUri&quot;;
      $params = array(&#039;about&#039; =&amp;gt; $id, &#039;output&#039; =&amp;gt; &#039;rdf&#039;);
      $query = http_build_query($params);
      $rdfURL = &quot;http://api.talis.com/stores/$store/meta?$query&quot;;
    } else {
      error();
      return;
    }
  } else {
    // Request for a list of $limit individuals of type $type
    $start = (int)$_GET[&#039;start&#039;];
    $sparql = &quot;CONSTRUCT { ?thing a &amp;lt;$type&amp;gt; . ?thing &amp;lt;http://www.w3.org/2000/01/rdf-schema#label&amp;gt; ?label . } WHERE { ?thing a &amp;lt;$type&amp;gt; . OPTIONAL { ?thing &amp;lt;http://www.w3.org/2000/01/rdf-schema#label&amp;gt; ?label . }} ORDER BY ?thing LIMIT $limit OFFSET $start&quot;;
    $params = array(&#039;query&#039; =&amp;gt; $sparql, &#039;output&#039; =&amp;gt; &#039;rdf&#039;);
    $query = http_build_query($params);
    $rdfURL = &quot;http://api.talis.com/stores/$store/services/sparql?$query&quot;;
  }

  // URL for the transformation
  $params = array(&#039;xml-uri&#039; =&amp;gt; $rdfURL, 
    &#039;xsl-uri&#039; =&amp;gt; &quot;http://api.talis.com/stores/$store/items/tidyRDF.xsl&quot;);
  $query = http_build_query($params);
  $txURL = &quot;http://api.talis.com/tx?$query&quot;;

  $resource = fopen($txURL, &#039;rb&#039;);
  header(&quot;Content-Type: application/rdf+xml&quot;, true);
  header(&quot;Content-Location: $docUri.rdf&quot;, true);
  fpassthru($resource);
  return;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Now I need to make sure that requests to &lt;code&gt;http://www.jenitennison.com/data/london-borough&lt;/code&gt; call this function with the correct type and limit. To do this I create a &lt;code&gt;london-borough.php&lt;/code&gt; that looks like:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;?php
  include &quot;utils.php&quot;;
  proxy(&#039;http://www.jenitennison.com/ontology/data#LondonBorough&#039;, 50);
?&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;and adjust &lt;code&gt;.htaccess&lt;/code&gt; to redirect to this PHP script:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;IfModule mod_rewrite.c&amp;gt;
  RewriteEngine on
  RewriteCond %{REQUEST_FILENAME} !-f
  RewriteCond %{REQUEST_FILENAME} !-d

  RewriteRule ^id/(.+)$ id.php [L]
  RewriteRule ^london-borough(/.+)? london-borough.php$1 [L,QSA]
&amp;lt;/IfModule&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Note that in the &lt;code&gt;RewriteRule&lt;/code&gt; for URIs starting with &lt;code&gt;london-borough&lt;/code&gt;, I&amp;#8217;ve included &lt;code&gt;QSA&lt;/code&gt; in the options, which means that query string parameters (such as the &lt;code&gt;start&lt;/code&gt; parameter) are passed through to the script, which enables paging through the results. For example, if I were only reporting 10 London Boroughs at a time, I could use&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;http://www.jenitennison.com/data/london-borough
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;to get the first ten London Boroughs and&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;http://www.jenitennison.com/data/london-borough?start=10
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;to get the next ten.&lt;/p&gt;
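
&lt;p&gt;As a rough sketch of how the URI of the next page could be worked out from the current &lt;code&gt;start&lt;/code&gt; and the page size, here is a hypothetical &lt;code&gt;nextPageUri()&lt;/code&gt; helper. It isn&amp;#8217;t part of &lt;code&gt;utils.php&lt;/code&gt;; the name and arguments are assumptions made for illustration:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;?php
  // Hypothetical helper (not in the original utils.php): builds the URI of
  // the next page of a list from the current offset and the page size.
  function nextPageUri($listUri, $start, $limit) {
    $query = http_build_query(array(&#039;start&#039; =&amp;gt; $start + $limit));
    return &quot;$listUri?$query&quot;;
  }

  // Prints http://www.jenitennison.com/data/london-borough?start=10
  echo nextPageUri(&#039;http://www.jenitennison.com/data/london-borough&#039;, 0, 10);
?&amp;gt;
&lt;/code&gt;&lt;/pre&gt;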

&lt;p&gt;When you request &lt;code&gt;http://www.jenitennison.com/data/london-borough&lt;/code&gt; what you get back is the neatened RDF for the London Boroughs, which looks like:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;rdf:RDF xmlns:rdfs=&quot;http://www.w3.org/2000/01/rdf-schema#&quot;
         xmlns:rdf=&quot;http://www.w3.org/1999/02/22-rdf-syntax-ns#&quot;&amp;gt;
  &amp;lt;LondonBorough xmlns=&quot;http://www.jenitennison.com/ontology/data#&quot;
                  rdf:about=&quot;http://www.jenitennison.com/data/id/london-borough/barking-and-dagenham&quot;&amp;gt;
    &amp;lt;rdfs:label&amp;gt;Barking &amp;amp;amp; Dagenham&amp;lt;/rdfs:label&amp;gt;
  &amp;lt;/LondonBorough&amp;gt;
  &amp;lt;LondonBorough xmlns=&quot;http://www.jenitennison.com/ontology/data#&quot;
                  rdf:about=&quot;http://www.jenitennison.com/data/id/london-borough/barnet&quot;&amp;gt;
    &amp;lt;rdfs:label&amp;gt;Barnet&amp;lt;/rdfs:label&amp;gt;
  &amp;lt;/LondonBorough&amp;gt;
  ...
&amp;lt;/rdf:RDF&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Now this is OK, but I think the best way of serving &lt;em&gt;lists of things&lt;/em&gt; is through Atom or RSS, with RSS 1.0 fitting better with the RDF world because it is RDF. Both formats provide mechanisms to give metadata about the list, including links to the next set of information, to enable pagination through the list. So what I&amp;#8217;d like to do is provide a mechanism for serving back different formats for this information. And not only for the lists, but for the data about the London Boroughs themselves: Talis supports serving data in Turtle and RDF/JSON as well as RDF/XML, so providing those formats should be cheap. This is something to come back to later.&lt;/p&gt;
</description>
 <comments>http://www.jenitennison.com/blog/node/111#comments</comments>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/31">rdf</category>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/47">Talis</category>
 <enclosure url="http://www.jenitennison.com/blog/files/utils.php_0.txt" length="2351" type="text/plain" />
 <pubDate>Tue, 21 Jul 2009 21:39:21 +0000</pubDate>
 <dc:creator>Jeni</dc:creator>
 <guid isPermaLink="false">111 at http://www.jenitennison.com/blog</guid>
</item>
<item>
 <title>Publishing Linked Data on the Talis Platform, Part 2</title>
 <link>http://www.jenitennison.com/blog/node/110</link>
 <description>&lt;p&gt;In &lt;a href=&quot;http://www.jenitennison.com/blog/node/109&quot;&gt;my last post&lt;/a&gt;, I showed how to add data to a &lt;a href=&quot;http://www.talis.com/platform/&quot;&gt;Talis&lt;/a&gt; store. In this post, I&amp;#8217;m going to show how you can use the Talis Platform as a back end for a Linked Data view on the RDF you added to it.&lt;/p&gt;

&lt;p&gt;As you&amp;#8217;ll see, the great thing about this method is that it only takes a couple of PHP files and an &lt;code&gt;.htaccess&lt;/code&gt; file on a server. Assuming that you&amp;#8217;ve got a web server that supports PHP, it&amp;#8217;s an approach you can use without installing anything. The code I&amp;#8217;ve written is pretty generic and should be widely applicable; feel free to reuse and adapt it.&lt;/p&gt;

&lt;!--break--&gt;

&lt;p&gt;One of the principles of Linked Data is that if you make a GET request to a URI that&amp;#8217;s used as an identifier within an RDF triple, you&amp;#8217;ll get back some useful information about that resource. I&amp;#8217;ve created URIs like &lt;code&gt;http://www.jenitennison.com/data/id/london-borough/barnet&lt;/code&gt; and added triples to Talis about those resources, but I haven&amp;#8217;t yet put anything in place such that actually requesting &lt;code&gt;http://www.jenitennison.com/data/id/london-borough/barnet&lt;/code&gt; will provide a useful response. So how do I do that?&lt;/p&gt;

&lt;p&gt;Well, it&amp;#8217;s easy enough with a bit of PHP to do the forwarding. (By the way, this is the first bit of PHP I&amp;#8217;ve ever done, so feel free to point out all the glaring problems with it; I&amp;#8217;d love to learn.)&lt;/p&gt;

&lt;p&gt;Now, the URI &lt;code&gt;http://www.jenitennison.com/data/id/london-borough/barnet&lt;/code&gt; is a URI that I&amp;#8217;ve made up for a London Borough, and obviously when you request that URI you&amp;#8217;re not actually going to get the London Borough delivered to you through your computer screen. Instead, based on &lt;a href=&quot;http://www.w3.org/TR/cooluris/#r303gendocument&quot;&gt;Cool URIs for the Semantic Web&lt;/a&gt;, I want to either respond with a &lt;code&gt;303 See Other&lt;/code&gt; redirection to a document resource &lt;em&gt;describing&lt;/em&gt; the borough, or a &lt;code&gt;404 Not Found&lt;/code&gt; to say that it doesn&amp;#8217;t exist.&lt;/p&gt;

&lt;p&gt;Note that I don&amp;#8217;t just want to blindly respond with a &lt;code&gt;303 See Other&lt;/code&gt;. If someone requests &lt;code&gt;http://www.jenitennison.com/data/id/london-borough/rubbish&lt;/code&gt; I want to tell them that the London Borough of &amp;#8216;Rubbish&amp;#8217; doesn&amp;#8217;t exist. If I redirected them to a document URI which then 404&amp;#8217;ed, it would mean the London Borough of &amp;#8216;Rubbish&amp;#8217; exists, but we have no information about it. So I can&amp;#8217;t use a simple URL rewrite; I have to check for its presence first.&lt;/p&gt;

&lt;h2&gt;Existence Tests&lt;/h2&gt;

&lt;p&gt;The first task, then, is to test whether the resource exists. To do that, I can execute an ASK request on the &lt;a href=&quot;http://n2.talis.com/wiki/Store_Sparql_Service&quot;&gt;SPARQL endpoint&lt;/a&gt; that Talis provides for the store. The ASK request simply looks like:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;ASK { &amp;lt;http://www.jenitennison.com/data/id/london-borough/barnet&amp;gt; ?p ?v . }
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;which asks if there are any triples at all that have that URI as their subject. I request the JSON response using the &lt;code&gt;output=json&lt;/code&gt; parameter. The JSON looks like:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;{&quot;head&quot;:{},&quot;boolean&quot;:true}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;if the store holds any triples about the borough and:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;{&quot;head&quot;:{},&quot;boolean&quot;:false}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;if it doesn&amp;#8217;t. The URI for the request looks like:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;http://api.talis.com/stores/rdfquery-dev1/services/sparql?query=ASK+%7B+%3Chttp%3A%2F%2Fwww.jenitennison.com%2Fdata%2Fid%2Flondon-borough%2Fbarnet%3E+%3Fp+%3Fv+.+%7D&amp;amp;output=json
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;which looks pretty horrendous when you write it out but is easy enough to construct with PHP. Here&amp;#8217;s the &lt;code&gt;exists()&lt;/code&gt; function which does the test based on the server host name used in the request and a path that&amp;#8217;s passed in.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;$store = &#039;rdfquery-dev1&#039;;

function exists($idUri) {
  global $store;
  $host = $_SERVER[&#039;HTTP_HOST&#039;];
  $id = &quot;http://$host$idUri&quot;;
  $sparql = &quot;ASK { &amp;lt;$id&amp;gt; ?p ?v . }&quot;;
  $params = array(&#039;query&#039; =&amp;gt; $sparql, &#039;output&#039; =&amp;gt; &#039;json&#039;);
  $query = http_build_query($params);
  $request = &quot;http://api.talis.com/stores/$store/services/sparql?$query&quot;;
  // file_get_contents() takes no mode argument; its second parameter is use_include_path
  $resource = file_get_contents($request);
  $result = strstr(strstr($resource, &quot;\&quot;boolean\&quot;:&quot;), &quot;:&quot;);
  return !strstr($result, &quot;false&quot;);
}
&lt;/code&gt;&lt;/pre&gt;
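
&lt;p&gt;The string matching above copes with the two responses shown, but it also reports that a resource exists whenever the response doesn&amp;#8217;t contain &lt;code&gt;false&lt;/code&gt;, for example when the request itself fails. A more defensive sketch, assuming PHP 5.2 or later for &lt;code&gt;json_decode()&lt;/code&gt; and using a hypothetical &lt;code&gt;askResult()&lt;/code&gt; helper that isn&amp;#8217;t in &lt;code&gt;utils.php&lt;/code&gt;, might decode the JSON instead. Note that it deliberately treats anything unparseable as &amp;#8216;no triples&amp;#8217;, which is a different choice from the function above:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;?php
  // Hypothetical alternative to the strstr() calls in exists(): decode the
  // SPARQL JSON results and read the boolean member directly.
  function askResult($json) {
    $result = json_decode($json, true);
    // Anything that is not a well-formed ASK response counts as false
    return is_array($result)
      &amp;amp;&amp;amp; isset($result[&#039;boolean&#039;])
      &amp;amp;&amp;amp; $result[&#039;boolean&#039;] === true;
  }

  var_dump(askResult(&#039;{&quot;head&quot;:{},&quot;boolean&quot;:true}&#039;));   // bool(true)
  var_dump(askResult(&#039;{&quot;head&quot;:{},&quot;boolean&quot;:false}&#039;));  // bool(false)
  var_dump(askResult(&#039;&#039;));                             // bool(false)
?&amp;gt;
&lt;/code&gt;&lt;/pre&gt;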

&lt;h2&gt;Handling Identifier URIs&lt;/h2&gt;

&lt;p&gt;With that function in &lt;code&gt;utils.php&lt;/code&gt;, it&amp;#8217;s pretty easy to create an &lt;code&gt;id.php&lt;/code&gt; that does the redirection that I need to do. For my purposes, I&amp;#8217;m using &lt;code&gt;/id/&lt;/code&gt; in all the URIs that identify abstract resources, and removing it for the document URIs that describe them. So the URI for the abstract resource &lt;code&gt;http://www.jenitennison.com/data/id/london-borough/barnet&lt;/code&gt; will redirect to the document resource &lt;code&gt;http://www.jenitennison.com/data/london-borough/barnet&lt;/code&gt;. Here&amp;#8217;s &lt;code&gt;id.php&lt;/code&gt;:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;?php
  include &quot;utils.php&quot;;
  $idUri = $_SERVER[&#039;REQUEST_URI&#039;];
  if (exists($idUri)) {
    $docUri = str_replace(&#039;/id/&#039;, &#039;/&#039;, $idUri);
    header(&quot;Location: $docUri&quot;, true, 303);
  } else {
    error(404);
  }
?&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The &lt;code&gt;error()&lt;/code&gt; function is also in &lt;code&gt;utils.php&lt;/code&gt; and looks like:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;  function error() {
    header(&quot;HTTP/1.1 404 Not Found&quot;);
    echo &amp;lt;&amp;lt;&amp;lt;EOF
&amp;lt;html&amp;gt;
  &amp;lt;head&amp;gt;
    &amp;lt;title&amp;gt;404 Not Found&amp;lt;/title&amp;gt;
  &amp;lt;/head&amp;gt;
  &amp;lt;body&amp;gt;
    &amp;lt;h1&amp;gt;404 Not Found&amp;lt;/h1&amp;gt;
    &amp;lt;p&amp;gt;No such resource&amp;lt;/p&amp;gt;
  &amp;lt;/body&amp;gt;
&amp;lt;/html&amp;gt;
EOF;
  }
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;I have &lt;code&gt;id.php&lt;/code&gt; which will check for the presence of triples about the requested resource, and respond with either a &lt;code&gt;404 Not Found&lt;/code&gt; or a &lt;code&gt;303 See Other&lt;/code&gt;. Now I need to invoke &lt;code&gt;id.php&lt;/code&gt; whenever someone requests an identifier URI like &lt;code&gt;http://www.jenitennison.com/data/id/london-borough/barnet&lt;/code&gt;. To do this, I put &lt;code&gt;id.php&lt;/code&gt; in the &lt;code&gt;/data&lt;/code&gt; directory within my webserver&amp;#8217;s documents and added a &lt;code&gt;.htaccess&lt;/code&gt; file that looks like:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;IfModule mod_rewrite.c&amp;gt;
  RewriteEngine on
  RewriteCond %{REQUEST_FILENAME} !-f
  RewriteCond %{REQUEST_FILENAME} !-d

  RewriteRule ^id/([^/]+)/(.+)$  id.php [L]
&amp;lt;/IfModule&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;This says that any requests that aren&amp;#8217;t for existing files or directories and that start with &lt;code&gt;id&lt;/code&gt; should be redirected to &lt;code&gt;id.php&lt;/code&gt;. Since &lt;code&gt;id.php&lt;/code&gt; picks up on the original request URI, I don&amp;#8217;t need to pass anything extra into it by way of query parameters and what have you.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;To make this &lt;code&gt;.htaccess&lt;/code&gt; file work, you have to have &lt;code&gt;mod_rewrite&lt;/code&gt; enabled and have &lt;code&gt;AllowOverride&lt;/code&gt; include &lt;code&gt;FileInfo&lt;/code&gt; (in &lt;code&gt;httpd.conf&lt;/code&gt;). My ISP allows this, but the Apache installation on my Mac doesn&amp;#8217;t, and Apache generally doesn&amp;#8217;t out of the box, so you may need to do a bit of fiddling with configuration files.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Now, requesting &lt;code&gt;http://www.jenitennison.com/data/id/london-borough/barnet&lt;/code&gt; redirects me with a &lt;code&gt;303 See Other&lt;/code&gt; to &lt;code&gt;http://www.jenitennison.com/data/london-borough/barnet&lt;/code&gt;, while requesting &lt;code&gt;http://www.jenitennison.com/data/id/london-borough/rubbish&lt;/code&gt; gives me a &lt;code&gt;404 Not Found&lt;/code&gt; response.&lt;/p&gt;

&lt;h2&gt;Handling Document URIs&lt;/h2&gt;

&lt;p&gt;The next stage is supporting the document URIs like &lt;code&gt;http://www.jenitennison.com/data/london-borough/barnet&lt;/code&gt;. For them, I need to actually get the data about the resource out of the Talis Platform. Fortunately, there&amp;#8217;s a really easy way of doing that using a simple &lt;a href=&quot;http://n2.talis.com/wiki/Metabox&quot;&gt;request on the metabox&lt;/a&gt; like:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;http://api.talis.com/stores/rdfquery-dev1/meta?about=http%3A%2F%2Fwww.jenitennison.com%2Fdata%2Fid%2Flondon-borough%2Fbarnet&amp;amp;output=rdf
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;In other words, you pass the URI of the resource that you&amp;#8217;re interested in as the value of the &lt;code&gt;about&lt;/code&gt; parameter to the metabox store URI of &lt;code&gt;http://api.talis.com/stores/{store}/meta?about={resource}&amp;amp;output=rdf&lt;/code&gt;. This gives you back some RDF/XML. For the particular request above, the RDF/XML looks like:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;rdf:RDF
    xmlns:rdf=&quot;http://www.w3.org/1999/02/22-rdf-syntax-ns#&quot;
    xmlns:j.0=&quot;http://www.jenitennison.com/ontology/data#&quot;
    xmlns:rdfs=&quot;http://www.w3.org/2000/01/rdf-schema#&quot; &amp;gt; 
  &amp;lt;rdf:Description rdf:about=&quot;http://www.jenitennison.com/data/id/london-borough/barnet&quot;&amp;gt;
    &amp;lt;rdfs:label&amp;gt;Barnet&amp;lt;/rdfs:label&amp;gt;
    &amp;lt;rdf:type rdf:resource=&quot;http://www.jenitennison.com/ontology/data#LondonBorough&quot;/&amp;gt;
    &amp;lt;j.0:maleLifeExpectancy rdf:datatype=&quot;http://www.w3.org/2001/XMLSchema#decimal&quot;&amp;gt;79.5&amp;lt;/j.0:maleLifeExpectancy&amp;gt;
    &amp;lt;j.0:femaleLifeExpectancy rdf:datatype=&quot;http://www.w3.org/2001/XMLSchema#decimal&quot;&amp;gt;83.6&amp;lt;/j.0:femaleLifeExpectancy&amp;gt;
  &amp;lt;/rdf:Description&amp;gt;
&amp;lt;/rdf:RDF&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Now I don&amp;#8217;t know about you, but this RDF/XML really makes me cringe. It&amp;#8217;s very obviously RDF, and it has a horrible &lt;code&gt;j.0&lt;/code&gt; prefix that no one would ever actually write if they were creating it in an editor. Readability matters, even for data that&amp;#8217;s aimed at computers. If I&amp;#8217;m going to use RDF/XML, I&amp;#8217;d really like it to be &lt;a href=&quot;http://www.jenitennison.com/blog/node/74&quot;&gt;sensible XML as well as being RDF&lt;/a&gt; (and &lt;a href=&quot;http://www.jenitennison.com/blog/node/74#comment-4463&quot;&gt;Leigh Dodds has given some good guidelines about how to do it&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;But of course since it&amp;#8217;s XML it&amp;#8217;s amenable to a spot of transformation. So it&amp;#8217;s not hard to transform RDF/XML like that above into something like this (shown here for Sutton rather than Barnet):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;LondonBorough xmlns=&quot;http://www.jenitennison.com/ontology/data#&quot;
               xmlns:rdf=&quot;http://www.w3.org/1999/02/22-rdf-syntax-ns#&quot;
               xmlns:rdfs=&quot;http://www.w3.org/2000/01/rdf-schema#&quot;
               rdf:about=&quot;http://www.jenitennison.com/data/id/london-borough/sutton&quot;&amp;gt;
   &amp;lt;rdfs:label&amp;gt;Sutton&amp;lt;/rdfs:label&amp;gt;
   &amp;lt;femaleLifeExpectancy rdf:datatype=&quot;http://www.w3.org/2001/XMLSchema#decimal&quot;&amp;gt;82.6&amp;lt;/femaleLifeExpectancy&amp;gt;
   &amp;lt;maleLifeExpectancy rdf:datatype=&quot;http://www.w3.org/2001/XMLSchema#decimal&quot;&amp;gt;78.7&amp;lt;/maleLifeExpectancy&amp;gt;
&amp;lt;/LondonBorough&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;which is a little more acceptable. Talis offers a &lt;a href=&quot;http://n2.talis.com/wiki/Transformation_Service&quot;&gt;transformation service&lt;/a&gt; at:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;http://api.talis.com/tx
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;It only supports XSLT 1.0. (There&amp;#8217;s also the &lt;a href=&quot;http://www.w3.org/2005/08/online_xslt/&quot;&gt;W3C XSLT 2.0 Service&lt;/a&gt; based on Saxon, but I get the impression they don&amp;#8217;t like people to use it in anger.)&lt;/p&gt;

&lt;p&gt;Anyway, each Talis store contains a contentbox as well as a metabox. The metabox holds the RDF/XML, and the contentbox can hold anything you like. I can put the XSLT stylesheet (&lt;code&gt;tidyRDF.xsl&lt;/code&gt;) into my store&amp;#8217;s contentbox using the command:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;curl -X PUT -H &quot;Content-type: application/xslt+xml&quot; --digest -u username:password --data-binary @tidyRDF.xsl \
  http://api.talis.com/stores/rdfquery-dev1/items/tidyRDF.xsl
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;which then makes it accessible at:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;http://api.talis.com/stores/rdfquery-dev1/items/tidyRDF.xsl
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;(I could also use my own server of course, but if Talis are offering free hosting, why not?&amp;#8230;)&lt;/p&gt;

&lt;p&gt;And that means that I can get the RDF/XML associated with &lt;code&gt;http://www.jenitennison.com/data/london-borough/barnet&lt;/code&gt; and transform it into some decent XML using a horrendous double-escaped URI that I&amp;#8217;m not going to replicate here. The &lt;code&gt;proxy.php&lt;/code&gt; script does this all nicely behind the scenes:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;?php
  include &quot;utils.php&quot;;
  $docUri = $_SERVER[&#039;REQUEST_URI&#039;];
  $dir = dirname($_SERVER[&#039;SCRIPT_NAME&#039;]);
  $path = substr($docUri, strlen($dir));
  $idUri = &quot;$dir/id$path&quot;;
  if (exists($idUri)) {
    $domain = $_SERVER[&#039;HTTP_HOST&#039;];

    // URL for the RDF
    $id = &quot;http://$domain$idUri&quot;;
    $params = array(&#039;about&#039; =&amp;gt; $id, &#039;output&#039; =&amp;gt; &#039;rdf&#039;);
    $query = http_build_query($params);
    $rdfURL = &quot;http://api.talis.com/stores/$store/meta?$query&quot;;

    // URL for the transformation
    $params = array(&#039;xml-uri&#039; =&amp;gt; $rdfURL, 
      &#039;xsl-uri&#039; =&amp;gt; &quot;http://api.talis.com/stores/$store/items/tidyRDF.xsl&quot;);
    $query = http_build_query($params);
    $txURL = &quot;http://api.talis.com/tx?$query&quot;;

    $resource = fopen($txURL, &#039;rb&#039;);
    header(&quot;Content-Type: application/rdf+xml&quot;);
    header(&quot;Content-Location: $docUri.rdf&quot;);
    fpassthru($resource);
    return;
  } else {
    error(404);
  }
?&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;With &lt;code&gt;proxy.php&lt;/code&gt; in the &lt;code&gt;/data&lt;/code&gt; directory on my server, I need a slight tweak to the &lt;code&gt;.htaccess&lt;/code&gt; to make sure that all non-id requests go to it:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;IfModule mod_rewrite.c&amp;gt;
  RewriteEngine on
  RewriteCond %{REQUEST_FILENAME} !-f
  RewriteCond %{REQUEST_FILENAME} !-d

  RewriteRule ^id/(.+)$  id.php [L]

  RewriteCond %{REQUEST_URI} !\.php
  RewriteRule ^(.+)$ proxy.php [L]
&amp;lt;/IfModule&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;And Bob, as they say, is your uncle.&lt;/p&gt;

&lt;p&gt;Requests to identifier URIs redirect to document URIs. Requests to document URIs return relevant RDF/XML for the resource. Have a look at &lt;a href=&quot;http://www.jenitennison.com/data/id/london-borough/barnet&quot;&gt;http://www.jenitennison.com/data/id/london-borough/barnet&lt;/a&gt; for example.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Updated: fixed the link in the final paragraph so it actually pointed to the right location. Duh.&lt;/em&gt;&lt;/p&gt;
</description>
 <comments>http://www.jenitennison.com/blog/node/110#comments</comments>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/31">rdf</category>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/47">Talis</category>
 <enclosure url="http://www.jenitennison.com/blog/files/id.php.txt" length="214" type="text/plain" />
 <pubDate>Fri, 17 Jul 2009 20:49:40 +0000</pubDate>
 <dc:creator>Jeni</dc:creator>
 <guid isPermaLink="false">110 at http://www.jenitennison.com/blog</guid>
</item>
<item>
 <title>Publishing Linked Data on the Talis Platform</title>
 <link>http://www.jenitennison.com/blog/node/109</link>
 <description>&lt;p&gt;I was at &lt;a href=&quot;http://www.ukuug.org/events/opentech2009/&quot;&gt;OpenTech&lt;/a&gt; a couple of weekends ago, and heard a lot of great talks. I particularly enjoyed the one by &lt;a href=&quot;http://simonwillison.net/&quot;&gt;Simon Willison&lt;/a&gt; in which he talked about the &lt;a href=&quot;http://www.guardian.co.uk/news/datablog&quot;&gt;Guardian Data Blog&lt;/a&gt;. Essentially, the data collected by the journalists at the Guardian, that form the basis of their pretty visualisations and so forth, gets published in Google Spreadsheets.&lt;/p&gt;

&lt;p&gt;Looking through the data blog today, I saw that the &lt;a href=&quot;http://www.london.gov.uk/&quot;&gt;Greater London Authority&lt;/a&gt; have similarly &lt;a href=&quot;http://www.london.gov.uk/focusonlondon/datastore.jsp&quot;&gt;released their data&lt;/a&gt; using Google Spreadsheets.&lt;/p&gt;

&lt;p&gt;Now Google Spreadsheets are just fine &amp;#8212; they&amp;#8217;re easy for end-users to use and it&amp;#8217;s not hard for data nerds to extract data from them. They have real advantages for publishing because they are quick and easy to set up.&lt;/p&gt;

&lt;p&gt;But take a look through the page listing the tables of data and you can see that many of them are about the same areas. The Guardian Data Blog have actually created a &lt;a href=&quot;http://spreadsheets.google.com/ccc?key=t3bns85prAbiChLmFhlcB1Q&quot;&gt;new spreadsheet&lt;/a&gt; that pulls together that information. Even with the aggregated data, in Google Spreadsheets there&amp;#8217;s no way to address the data held in each table about Sutton (say).&lt;/p&gt;

&lt;p&gt;Now, a few months ago, &lt;a href=&quot;http://www.talis.com/platform/&quot;&gt;Talis&lt;/a&gt; announced the &lt;a href=&quot;http://www.talis.com/platform/cc/&quot;&gt;Talis Connected Commons&lt;/a&gt;, which enables anyone to publish public domain data using the Talis Platform for free. It turns out that it&amp;#8217;s really easy to publish addressable data using the Talis Platform as a host.&lt;/p&gt;

&lt;!--break--&gt;

&lt;p&gt;The first stage is to get hold of the data and convert it into some RDF/XML that can be loaded into Talis. You can get hold of a CSV version of a Google Spreadsheet by exporting it as CSV. Converting the CSV into RDF/XML can be done in any number of ways. Given that this is a small dataset and it has commas in some of the fields I&amp;#8217;ve used some simple reusable XSLT rather than awk. Here&amp;#8217;s &lt;code&gt;csv.xsl&lt;/code&gt; for parsing the CSV:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;xsl:stylesheet xmlns:xsl=&quot;http://www.w3.org/1999/XSL/Transform&quot; 
      xmlns:xs=&quot;http://www.w3.org/2001/XMLSchema&quot;
      xmlns:csv=&quot;http://www.jenitennison.com/xslt/csv&quot;
      xmlns:rdf=&quot;http://www.w3.org/1999/02/22-rdf-syntax-ns#&quot;
      xmlns:rdfs=&quot;http://www.w3.org/2000/01/rdf-schema#&quot;
      exclude-result-prefixes=&quot;xs csv&quot;
      version=&quot;2.0&quot;&amp;gt;

&amp;lt;xsl:param name=&quot;filename&quot; as=&quot;xs:string&quot; required=&quot;yes&quot; /&amp;gt;

&amp;lt;xsl:variable name=&quot;csv&quot; as=&quot;xs:string&quot; select=&quot;unparsed-text($filename)&quot; /&amp;gt;
&amp;lt;xsl:variable name=&quot;lines&quot; as=&quot;xs:string+&quot; select=&quot;tokenize($csv, &#039;\n&#039;)[normalize-space(.) != &#039;&#039;]&quot; /&amp;gt;
&amp;lt;xsl:variable name=&quot;fields&quot; as=&quot;xs:string+&quot; select=&quot;csv:values($lines[1])&quot; /&amp;gt;
&amp;lt;xsl:variable name=&quot;data&quot; as=&quot;xs:string+&quot; select=&quot;$lines[position() &amp;gt; 1]&quot; /&amp;gt;

&amp;lt;xsl:template match=&quot;/&quot; name=&quot;main&quot;&amp;gt;
  &amp;lt;rdf:RDF&amp;gt;
    &amp;lt;xsl:for-each select=&quot;$lines[position() &amp;gt; 1]&quot;&amp;gt;
      &amp;lt;xsl:call-template name=&quot;csv:line&quot;&amp;gt;
        &amp;lt;xsl:with-param name=&quot;values&quot; select=&quot;csv:values(.)&quot; /&amp;gt;
      &amp;lt;/xsl:call-template&amp;gt;
    &amp;lt;/xsl:for-each&amp;gt;
  &amp;lt;/rdf:RDF&amp;gt;
&amp;lt;/xsl:template&amp;gt;

&amp;lt;xsl:template name=&quot;csv:line&quot;&amp;gt;
  &amp;lt;xsl:param name=&quot;values&quot; as=&quot;xs:string+&quot; required=&quot;yes&quot; /&amp;gt;
&amp;lt;/xsl:template&amp;gt;

&amp;lt;xsl:function name=&quot;csv:values&quot; as=&quot;xs:string+&quot;&amp;gt;
  &amp;lt;xsl:param name=&quot;line&quot; as=&quot;xs:string&quot; /&amp;gt;
  &amp;lt;xsl:analyze-string select=&quot;$line&quot; regex=&quot;(&amp;amp;quot;([^&amp;amp;quot;]+)&amp;amp;quot;|([^,]+))?,&quot;&amp;gt;
    &amp;lt;xsl:matching-substring&amp;gt;
      &amp;lt;xsl:choose&amp;gt;
        &amp;lt;xsl:when test=&quot;regex-group(2) != &#039;&#039;&quot;&amp;gt;
          &amp;lt;xsl:sequence select=&quot;regex-group(2)&quot; /&amp;gt;
        &amp;lt;/xsl:when&amp;gt;
        &amp;lt;xsl:otherwise&amp;gt;
          &amp;lt;xsl:sequence select=&quot;regex-group(3)&quot; /&amp;gt;
        &amp;lt;/xsl:otherwise&amp;gt;
      &amp;lt;/xsl:choose&amp;gt;
    &amp;lt;/xsl:matching-substring&amp;gt;
    &amp;lt;xsl:non-matching-substring&amp;gt;
      &amp;lt;xsl:variable name=&quot;value&quot; as=&quot;xs:string&quot; select=&quot;normalize-space(.)&quot; /&amp;gt;
      &amp;lt;xsl:choose&amp;gt;
        &amp;lt;xsl:when test=&quot;starts-with($value, &#039;&amp;amp;quot;&#039;) and ends-with($value, &#039;&amp;amp;quot;&#039;)&quot;&amp;gt;
          &amp;lt;xsl:sequence select=&quot;substring($value, 2, string-length($value) - 2)&quot; /&amp;gt;
        &amp;lt;/xsl:when&amp;gt;
        &amp;lt;xsl:otherwise&amp;gt;
          &amp;lt;xsl:sequence select=&quot;$value&quot; /&amp;gt;
        &amp;lt;/xsl:otherwise&amp;gt;
      &amp;lt;/xsl:choose&amp;gt;
    &amp;lt;/xsl:non-matching-substring&amp;gt;
  &amp;lt;/xsl:analyze-string&amp;gt;
&amp;lt;/xsl:function&amp;gt;

&amp;lt;xsl:function name=&quot;csv:field&quot; as=&quot;xs:string&quot;&amp;gt;
  &amp;lt;xsl:param name=&quot;values&quot; as=&quot;xs:string+&quot; /&amp;gt;
  &amp;lt;xsl:param name=&quot;field&quot; as=&quot;xs:string&quot; /&amp;gt;
  &amp;lt;xsl:sequence select=&quot;$values[index-of($fields, $field)]&quot; /&amp;gt;
&amp;lt;/xsl:function&amp;gt;

&amp;lt;/xsl:stylesheet&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;and here&amp;#8217;s the stylesheet that I&amp;#8217;ve used to create some basic RDF data about the boroughs:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;xsl:stylesheet xmlns:xsl=&quot;http://www.w3.org/1999/XSL/Transform&quot; 
      xmlns:xs=&quot;http://www.w3.org/2001/XMLSchema&quot;
      xmlns:csv=&quot;http://www.jenitennison.com/xslt/csv&quot;
      exclude-result-prefixes=&quot;xs csv&quot;
      xmlns:rdf=&quot;http://www.w3.org/1999/02/22-rdf-syntax-ns#&quot;
      xmlns:rdfs=&quot;http://www.w3.org/2000/01/rdf-schema#&quot;
      xmlns:data=&quot;http://www.jenitennison.com/ontology/data#&quot;
      version=&quot;2.0&quot;&amp;gt;

&amp;lt;xsl:import href=&quot;../csv.xsl&quot; /&amp;gt;

&amp;lt;xsl:param name=&quot;filename&quot; select=&quot;resolve-uri(&#039;boroughs.txt&#039;, static-base-uri())&quot; /&amp;gt;

&amp;lt;xsl:template match=&quot;/&quot; name=&quot;main&quot;&amp;gt;
  &amp;lt;rdf:RDF&amp;gt;
    &amp;lt;xsl:for-each select=&quot;$data&quot;&amp;gt;
      &amp;lt;xsl:variable name=&quot;values&quot; as=&quot;xs:string+&quot; select=&quot;csv:values(.)&quot; /&amp;gt;
      &amp;lt;xsl:variable name=&quot;borough&quot; as=&quot;xs:string&quot; select=&quot;csv:field($values, &#039;BOROUGH&#039;)&quot; /&amp;gt;
      &amp;lt;xsl:variable name=&quot;maleLifeExpectancy&quot; as=&quot;xs:string&quot; 
        select=&quot;csv:field($values, &#039;2005-07 Life expectancy at birth (years), Males&#039;)&quot; /&amp;gt;
      &amp;lt;xsl:variable name=&quot;femaleLifeExpectancy&quot; as=&quot;xs:string&quot; 
        select=&quot;csv:field($values, &#039;Life expectancy at birth (years) Females&#039;)&quot; /&amp;gt;
      &amp;lt;data:LondonBorough rdf:about=&quot;http://www.jenitennison.com/data/id/london-borough/{translate(lower-case(replace($borough, &#039;&amp;amp;amp;&#039;, &#039;and&#039;)), &#039; &#039;, &#039;-&#039;)}&quot;&amp;gt;
        &amp;lt;rdfs:label&amp;gt;&amp;lt;xsl:value-of select=&quot;$borough&quot; /&amp;gt;&amp;lt;/rdfs:label&amp;gt;
        &amp;lt;xsl:if test=&quot;$maleLifeExpectancy != &#039;&#039;&quot;&amp;gt;
          &amp;lt;data:maleLifeExpectancy rdf:datatype=&quot;http://www.w3.org/2001/XMLSchema#integer&quot;&amp;gt;
            &amp;lt;xsl:value-of select=&quot;$maleLifeExpectancy&quot; /&amp;gt;
          &amp;lt;/data:maleLifeExpectancy&amp;gt;
        &amp;lt;/xsl:if&amp;gt;
        &amp;lt;xsl:if test=&quot;$femaleLifeExpectancy != &#039;&#039;&quot;&amp;gt;
          &amp;lt;data:femaleLifeExpectancy rdf:datatype=&quot;http://www.w3.org/2001/XMLSchema#integer&quot;&amp;gt;
            &amp;lt;xsl:value-of select=&quot;$femaleLifeExpectancy&quot; /&amp;gt;
          &amp;lt;/data:femaleLifeExpectancy&amp;gt;
        &amp;lt;/xsl:if&amp;gt;
      &amp;lt;/data:LondonBorough&amp;gt;
    &amp;lt;/xsl:for-each&amp;gt;
  &amp;lt;/rdf:RDF&amp;gt;
&amp;lt;/xsl:template&amp;gt;

&amp;lt;/xsl:stylesheet&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The result is some RDF/XML for each borough which looks like:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;rdf:RDF xmlns:rdf=&quot;http://www.w3.org/1999/02/22-rdf-syntax-ns#&quot;
  xmlns:rdfs=&quot;http://www.w3.org/2000/01/rdf-schema#&quot;
  xmlns:data=&quot;http://www.jenitennison.com/ontology/data#&quot;&amp;gt;
  &amp;lt;data:LondonBorough rdf:about=&quot;http://www.jenitennison.com/data/id/london-borough/city-of-london&quot;&amp;gt;
    &amp;lt;rdfs:label&amp;gt;City of London&amp;lt;/rdfs:label&amp;gt;
  &amp;lt;/data:LondonBorough&amp;gt;
  &amp;lt;data:LondonBorough
    rdf:about=&quot;http://www.jenitennison.com/data/id/london-borough/barking-and-dagenham&quot;&amp;gt;
    &amp;lt;rdfs:label&amp;gt;Barking &amp;amp;amp; Dagenham&amp;lt;/rdfs:label&amp;gt;
    &amp;lt;data:maleLifeExpectancy rdf:datatype=&quot;http://www.w3.org/2001/XMLSchema#integer&quot;&amp;gt;76.3&amp;lt;/data:maleLifeExpectancy&amp;gt;
    &amp;lt;data:femaleLifeExpectancy rdf:datatype=&quot;http://www.w3.org/2001/XMLSchema#integer&amp;gt;80.3&amp;lt;/data:femaleLifeExpectancy&amp;gt;
  &amp;lt;/data:LondonBorough&amp;gt;
  &amp;lt;data:LondonBorough rdf:about=&quot;http://www.jenitennison.com/data/id/london-borough/barnet&quot;&amp;gt;
    &amp;lt;rdfs:label&amp;gt;Barnet&amp;lt;/rdfs:label&amp;gt;
    &amp;lt;data:maleLifeExpectancy rdf:datatype=&quot;http://www.w3.org/2001/XMLSchema#integer&quot;&amp;gt;79.5&amp;lt;/data:maleLifeExpectancy&amp;gt;
    &amp;lt;data:femaleLifeExpectancy rdf:datatype=&quot;http://www.w3.org/2001/XMLSchema#integer&quot;&amp;gt;83.6&amp;lt;/data:femaleLifeExpectancy&amp;gt;
  &amp;lt;/data:LondonBorough&amp;gt;
  ...
&amp;lt;/rdf:RDF&amp;gt;
&lt;/code&gt;&lt;/pre&gt;
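
&lt;p&gt;The identifiers in the &lt;code&gt;rdf:about&lt;/code&gt; attributes above are built from the borough names by replacing &amp;#8216;&amp;amp;&amp;#8217; with &amp;#8216;and&amp;#8217;, lower-casing, and turning spaces into hyphens. For anyone doing the conversion outside XSLT, a PHP sketch of the same rule might look like the following; &lt;code&gt;boroughSlug()&lt;/code&gt; and &lt;code&gt;boroughUri()&lt;/code&gt; are hypothetical names introduced here, not part of any of my scripts:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;?php
  // Hypothetical helpers mirroring the replace / lower-case / translate
  // steps used in the stylesheet to build identifier URIs.
  function boroughSlug($name) {
    $withAnd = str_replace(&#039;&amp;amp;&#039;, &#039;and&#039;, $name);
    return str_replace(&#039; &#039;, &#039;-&#039;, strtolower($withAnd));
  }

  function boroughUri($name) {
    return &#039;http://www.jenitennison.com/data/id/london-borough/&#039; . boroughSlug($name);
  }

  // Prints http://www.jenitennison.com/data/id/london-borough/barking-and-dagenham
  echo boroughUri(&#039;Barking &amp;amp; Dagenham&#039;);
?&amp;gt;
&lt;/code&gt;&lt;/pre&gt;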

&lt;p&gt;To load this RDF/XML into a Talis store, you first need to have one set up, which I do (primarily for experimenting with &lt;a href=&quot;http://code.google.com/p/rdfquery&quot;&gt;rdfQuery&lt;/a&gt;). You can get one by filling in the form on &lt;a href=&quot;http://www.talis.com/platform/cc/contact/&quot;&gt;the Talis website&lt;/a&gt;. Loading data into Talis just means a &lt;code&gt;POST&lt;/code&gt; request like this:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;gt; curl -H &quot;Content-type: application/rdf+xml&quot; --digest -u username:password \
  --data-binary @LondonBoroughs/boroughs.rdf http://api.talis.com/stores/rdfquery-dev1/meta
&lt;/code&gt;&lt;/pre&gt;
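
&lt;p&gt;The same upload can be scripted rather than typed. What follows is only a minimal sketch in Python, using nothing beyond the standard library: it sends the same &lt;code&gt;Content-type&lt;/code&gt; header, uses HTTP Digest authentication and posts the file unchanged, as the curl command does. The function name &lt;code&gt;build_upload&lt;/code&gt; is made up for illustration, and the credentials are placeholders.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import urllib.request

# Store URL taken from the curl command above.
STORE_URL = "http://api.talis.com/stores/rdfquery-dev1/meta"


def build_upload(rdf_bytes, username, password, store_url=STORE_URL):
    """Return an (opener, request) pair mirroring the curl command above."""
    # HTTP Digest authentication, the equivalent of curl's --digest -u.
    passwords = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    passwords.add_password(None, store_url, username, password)
    opener = urllib.request.build_opener(
        urllib.request.HTTPDigestAuthHandler(passwords))
    # POST the RDF/XML bytes unchanged, like curl's --data-binary.
    request = urllib.request.Request(
        store_url,
        data=rdf_bytes,
        headers={"Content-Type": "application/rdf+xml"},
        method="POST")
    return opener, request


if __name__ == "__main__":
    # Placeholder credentials; the file path is the one used with curl.
    with open("LondonBoroughs/boroughs.rdf", "rb") as rdf_file:
        opener, request = build_upload(rdf_file.read(), "username", "password")
    with opener.open(request) as response:
        print(response.status)
&lt;/code&gt;&lt;/pre&gt;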

&lt;p&gt;Once that&amp;#8217;s done, you can check whether the data loaded successfully by visiting the URI of the store in an ordinary browser, in this case:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;http://api.talis.com/stores/rdfquery-dev1/meta
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;You can enter the URI of a resource in the form, for example &lt;code&gt;http://www.jenitennison.com/data/id/london-borough/barnet&lt;/code&gt;, and indicate the format you want for the response. Requesting Turtle gets you something like:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;http://www.jenitennison.com/data/id/london-borough/barnet&amp;gt;
  &amp;lt;http://www.w3.org/1999/02/22-rdf-syntax-ns#type&amp;gt;
    &amp;lt;http://www.jenitennison.com/ontology/data#LondonBorough&amp;gt; ;
  &amp;lt;http://www.w3.org/2000/01/rdf-schema#label&amp;gt;
    &quot;Barnet&quot; ;
  &amp;lt;http://www.jenitennison.com/ontology/data#femaleLifeExpectancy&amp;gt;
    &quot;83.6&quot;^^&amp;lt;http://www.w3.org/2001/XMLSchema#integer&amp;gt; ;
  &amp;lt;http://www.jenitennison.com/ontology/data#maleLifeExpectancy&amp;gt;
    &quot;79.5&quot;^^&amp;lt;http://www.w3.org/2001/XMLSchema#integer&amp;gt; .
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;All is well and good, but this isn&amp;#8217;t really linked data. For it to be linked data, this RDF needs to be accessible at the URI &lt;code&gt;http://www.jenitennison.com/data/id/london-borough/barnet&lt;/code&gt;. So how do we do that? Tune in next time to find out.&lt;/p&gt;
</description>
 <comments>http://www.jenitennison.com/blog/node/109#comments</comments>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/31">rdf</category>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/47">Talis</category>
 <pubDate>Wed, 15 Jul 2009 19:53:10 +0000</pubDate>
 <dc:creator>Jeni</dc:creator>
 <guid isPermaLink="false">109 at http://www.jenitennison.com/blog</guid>
</item>
</channel>
</rss>

