<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xml:base="http://www.jenitennison.com/blog" xmlns:dc="http://purl.org/dc/elements/1.1/">
<channel>
 <title>rdfQuery</title>
 <link>http://www.jenitennison.com/blog/taxonomy/term/40</link>
 <description>The taxonomy view with a depth of 0.</description>
 <language>en</language>
<item>
 <title>On Resolvability</title>
 <link>http://www.jenitennison.com/blog/node/126</link>
 <description>&lt;p&gt;In my &lt;a href=&quot;http://www.jenitennison.com/blog/node/124&quot;&gt;last post about RDFa and HTML&lt;/a&gt; I talked about how one of the gulfs that separates the HTML5 and Semantic Web communities is the attitude to the resolvability of property (and class) URIs.&lt;/p&gt;

&lt;p&gt;I&amp;#8217;m currently experimenting with adding to &lt;a href=&quot;http://code.google.com/p/rdfquery&quot;&gt;rdfQuery&lt;/a&gt; the ability to automatically locate information about properties and other resources that are referenced within triples. So now is a good time, as far as I&amp;#8217;m concerned, to look more closely at what the ability to resolve properties gives you, and at how to avoid problems if the property URI is (temporarily or permanently) unresolvable or resolves to something new.&lt;/p&gt;

&lt;p&gt;I&amp;#8217;m going to attempt to answer:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;How do or might applications use property and class URIs?&lt;/li&gt;
&lt;li&gt;How can data and ontology publishers assist them in doing so?&lt;/li&gt;
&lt;li&gt;What should frameworks (such as rdfQuery) do to help application developers?&lt;/li&gt;
&lt;/ul&gt;

&lt;!--break--&gt;

&lt;h2&gt;Application Developers&lt;/h2&gt;

&lt;p&gt;We can divide applications using online data into three general categories:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;data-specific applications&lt;/strong&gt; are constructed around particular data sets that are known to the developer of the application; the &lt;a href=&quot;http://www.jenitennison.com/blog/node/125&quot;&gt;visualisations&lt;/a&gt; &lt;a href=&quot;http://www.jenitennison.com/blog/node/123&quot;&gt;that&lt;/a&gt; &lt;a href=&quot;http://www.jenitennison.com/blog/node/119&quot;&gt;I&amp;#8217;ve been&lt;/a&gt; &lt;a href=&quot;http://www.jenitennison.com/blog/node/113&quot;&gt;doing&lt;/a&gt; are examples of data-specific applications&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;vocabulary-specific applications&lt;/strong&gt; are constructed around particular vocabularies, wherever the data might be found that uses them; &lt;a href=&quot;http://code.google.com/apis/socialgraph/&quot;&gt;Google&amp;#8217;s Social Graph API&lt;/a&gt; and &lt;a href=&quot;http://developer.search.yahoo.com/start&quot;&gt;Yahoo! SearchMonkey&lt;/a&gt; are examples&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;generic applications&lt;/strong&gt; are constructed to navigate through any RDF that they find; &lt;a href=&quot;http://www.w3.org/2005/ajar/tab&quot;&gt;Tabulator&lt;/a&gt; is one example&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Most mashups are data-specific applications. When you, as a developer, create a data-specific application, the thing that you need to know most of all is what information the dataset contains. Part of that is working out the meaning of a particular property (or class). What the data publisher needs to do is make sure that the data they publish is documented.&lt;/p&gt;

&lt;p&gt;There are three ways of locating the documentation about a particular property or class:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;looking through the general documentation the data publisher has provided&lt;/li&gt;
&lt;li&gt;resolving the URI of the class or property&lt;/li&gt;
&lt;li&gt;searching&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For a developer, it&amp;#8217;s very useful to find out about a property by bunging its URI into a browser and hitting return. Want to know what &lt;code&gt;http://xmlns.com/foaf/0.1/name&lt;/code&gt; means? Look up that URI. By comparison, if you want to know what a &lt;code&gt;vevent&lt;/code&gt; is, your best bet is a search engine. In the results I get from Google, the microformat definition of &lt;code&gt;vevent&lt;/code&gt; is currently second on the list. (The Microdata definition of &lt;code&gt;vevent&lt;/code&gt; doesn&amp;#8217;t even feature.) &lt;strong&gt;Even if a property isn&amp;#8217;t available at its URI, its URI gives a more unique identifier to search for than a short term&lt;/strong&gt;: you&amp;#8217;re more likely to find relevant information if you search for &lt;code&gt;http://xmlns.com/foaf/0.1/name&lt;/code&gt; than if you search for &lt;code&gt;name&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;But there&amp;#8217;s no requirement for data-specific applications to use computer-readable information about properties or classes. If you know the data that&amp;#8217;s available in a dataset, you can find out the semantics of the properties and classes it contains and hard-code those within your application. Most applications that reuse data are currently of this type, and it tends to be the only kind that non-Semantic Web people think about.&lt;/p&gt;

&lt;p&gt;Vocabulary-specific and generic applications will have some vocabularies built in but may also operate with unknown vocabularies. For example, an application that cares about FOAF profiles is almost certainly going to want to hard-code information about FOAF rather than download its schema every time it&amp;#8217;s used. &lt;/p&gt;

&lt;p&gt;There are three reasons for building-in knowledge about particular vocabularies:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;some information about a vocabulary simply can&amp;#8217;t be represented in a schema or ontology; if you want special handling for particular properties, you&amp;#8217;re going to want to hard-code it&lt;/li&gt;
&lt;li&gt;downloading, parsing and interpreting a schema that you know you&amp;#8217;re going to need every time you run the application is really inefficient&lt;/li&gt;
&lt;li&gt;relying on the network to provide information about a vocabulary you know you&amp;#8217;re going to need makes your application fragile, especially if you do not have control over the publication of the schema yourself&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
  &lt;p&gt;&lt;em&gt;It&amp;#8217;s worth noting that applications increasingly do rely on the availability of networked resources in order to operate &amp;#8212; that&amp;#8217;s what &lt;a href=&quot;http://en.wikipedia.org/wiki/Cloud_computing&quot;&gt;cloud computing&lt;/a&gt; is all about &amp;#8212; but the resources are usually ones that the application developers have some kind of control over.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;It helps to use URIs for properties and classes for well-known vocabularies only in as much as it means that property and class names from different vocabularies won&amp;#8217;t clash&lt;/strong&gt;, so you don&amp;#8217;t have to worry about your application confusing &lt;code&gt;http://xmlns.com/foaf/0.1/title&lt;/code&gt; with &lt;code&gt;http://purl.org/dc/terms/title&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;On the other hand, if data uses an unknown vocabulary, vocabulary-specific and generic applications would like to get hold of extra information. This falls into three categories:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;human-readable information&lt;/strong&gt; includes things that help with the display of data, such as human-readable labels for properties and classes; the expected datatype of the values of a property might also fall into this category&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;mapping information&lt;/strong&gt; helps applications map unknown properties and classes onto known ones; for example, if &lt;code&gt;http://people.example.org/ontology/fullName&lt;/code&gt; is defined as a sub-property of &lt;code&gt;http://xmlns.com/foaf/0.1/name&lt;/code&gt; then the application can use or display the value of &lt;code&gt;http://people.example.org/ontology/fullName&lt;/code&gt; in exactly the same way as the value of &lt;code&gt;http://xmlns.com/foaf/0.1/name&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;reasoning information&lt;/strong&gt; helps applications draw further conclusions about the resources for which there&amp;#8217;s information based on what they already know; for example, if &lt;code&gt;http://people.example.org/ontology/fullName&lt;/code&gt; has a domain of &lt;code&gt;http://xmlns.com/foaf/0.1/Person&lt;/code&gt; then anything that has the property &lt;code&gt;http://people.example.org/ontology/fullName&lt;/code&gt; must be a &lt;code&gt;http://xmlns.com/foaf/0.1/Person&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
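
&lt;p&gt;As a rough sketch of how the mapping case might look in script: assume the application has already collected &lt;code&gt;rdfs:subPropertyOf&lt;/code&gt; statements into a plain lookup object. The &lt;code&gt;subPropertyOf&lt;/code&gt; table, the &lt;code&gt;known&lt;/code&gt; list and the &lt;code&gt;mapToKnown()&lt;/code&gt; function are all made up for illustration and are not part of rdfQuery. An unknown property can then be walked up to the nearest property the application understands:&lt;/p&gt;

```javascript
// Hypothetical table of rdfs:subPropertyOf statements the application has
// gathered; each key is a property URI, each value its super-property.
var subPropertyOf = {
  'http://people.example.org/ontology/fullName': 'http://xmlns.com/foaf/0.1/name',
  'http://xmlns.com/foaf/0.1/name': 'http://www.w3.org/2000/01/rdf-schema#label'
};

// Properties this (imaginary) application has special handling for.
var known = ['http://xmlns.com/foaf/0.1/name'];

// Walk up the sub-property chain until a known property is reached.
// Returns null if there is no mapping; the seen list guards against cycles.
function mapToKnown(property) {
  var seen = [];
  var current = property;
  while (current !== undefined) {
    if (known.indexOf(current) !== -1) {
      return current;
    }
    if (seen.indexOf(current) !== -1) {
      return null;
    }
    seen.push(current);
    current = subPropertyOf[current];
  }
  return null;
}
```

&lt;p&gt;With that in place, &lt;code&gt;mapToKnown(&#039;http://people.example.org/ontology/fullName&#039;)&lt;/code&gt; gives back the FOAF name property, so the value can be handled by whatever code already deals with FOAF names; a property with no mapping gives &lt;code&gt;null&lt;/code&gt; and can be displayed in a default way.&lt;/p&gt;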

&lt;p&gt;These are in descending order of priority: many applications will want to interact with the user in some way, in which case human-readable information is vital. Applications that have built-in knowledge about one or more vocabularies are likely to have special handling for those vocabularies, so being able to map unknown properties and classes into those known vocabularies will enhance the behaviour of the application, although it adds a bit of complexity in the implementation to do so. Further reasoning has the potential to increase the value of sparse data but again increases the complexity of implementation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Using URIs for classes and properties provides a mechanism for applications to get hold of this extra information about unknown vocabularies&lt;/strong&gt;. They might try four tactics, in order of priority:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;look at the data they already know&lt;/strong&gt;; the information they need about the unknown properties and classes may be included in the files they&amp;#8217;ve already accessed (including those containing data)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;look in an application-specific (possibly cloud-hosted) cache&lt;/strong&gt; of vocabularies that the application has already downloaded&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;resolve the URI&lt;/strong&gt; of the class or property by performing an HTTP GET (and add it to the application-specific cache)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;look in a general-purpose cache&lt;/strong&gt;, such as &lt;a href=&quot;http://www.archive.org/&quot;&gt;the Internet Archive&lt;/a&gt; or an ontology repository such as &lt;a href=&quot;http://swoogle.umbc.edu/&quot;&gt;Swoogle&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
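
&lt;p&gt;The four tactics above can be sketched as a simple chain of lookups. This is only an outline under some assumptions: every function and variable name here is hypothetical, the sources are synchronous for brevity (a real HTTP GET would be asynchronous), and the network and archive steps are stubs rather than real requests.&lt;/p&gt;

```javascript
// Try each source in turn and return the first description found.
// Every source is a function taking a URI and returning a description
// object, or null if it has nothing. All names here are illustrative.
function describe(uri, sources) {
  var found = null;
  sources.some(function (source) {
    try {
      var result = source(uri);
      found = (result === undefined) ? null : result;
    } catch (e) {
      found = null; // a failing source must not break the application
    }
    return found !== null; // stop at the first source with an answer
  });
  return found;
}

// Stand-ins for the four tactics, in priority order.
var loadedData = { 'http://xmlns.com/foaf/0.1/name': { label: 'name' } };
var appCache = {};

function fromLoadedData(uri) {
  return loadedData.hasOwnProperty(uri) ? loadedData[uri] : null;
}
function fromAppCache(uri) {
  return appCache.hasOwnProperty(uri) ? appCache[uri] : null;
}
function fromNetwork(uri) {
  // A real implementation would perform an HTTP GET here and store the
  // result in appCache; this stub simulates an unreachable server.
  throw new Error('network unavailable');
}
function fromArchive(uri) {
  return null; // stub: nothing archived
}

var tactics = [fromLoadedData, fromAppCache, fromNetwork, fromArchive];
```

&lt;p&gt;Calling &lt;code&gt;describe(uri, tactics)&lt;/code&gt; returns the first description any source can supply, or &lt;code&gt;null&lt;/code&gt; if none can. Keeping the list of sources as an ordinary array leaves the application developer in control of which tactics are tried and in what order.&lt;/p&gt;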

&lt;p&gt;Robust applications will not break if they don&amp;#8217;t manage to locate the definition of a property or class. They can certainly continue to parse any data that they come across. To create a human-readable label, they might use the part of the URI after the last &lt;code&gt;#&lt;/code&gt; or &lt;code&gt;/&lt;/code&gt;. It&amp;#8217;s no loss (to the application) if they cannot perform other reasoning: they might display the data in some default way or simply ignore it.&lt;/p&gt;
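
&lt;p&gt;The label fallback is small enough to sketch directly. The &lt;code&gt;fallbackLabel()&lt;/code&gt; name is made up for illustration, and returning the whole URI when nothing follows the separator is an assumption about what a sensible default would be:&lt;/p&gt;

```javascript
// Derive a display label from a URI when no definition can be found:
// take whatever follows the last '#' or '/'. Falls back to the whole
// URI if nothing follows the separator.
function fallbackLabel(uri) {
  var cut = Math.max(uri.lastIndexOf('#'), uri.lastIndexOf('/'));
  var local = uri.slice(cut + 1);
  return local === '' ? uri : local;
}
```

&lt;p&gt;For &lt;code&gt;http://xmlns.com/foaf/0.1/name&lt;/code&gt; this gives &lt;code&gt;name&lt;/code&gt;, which is not pretty but is usually good enough to show to a user.&lt;/p&gt;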

&lt;p&gt;It&amp;#8217;s worth noting, because of the fear of &lt;a href=&quot;http://en.wikipedia.org/wiki/Denial-of-service_attack&quot;&gt;DDoS attacks&lt;/a&gt; that some people have, that the majority of applications won&amp;#8217;t need to actually &lt;code&gt;GET&lt;/code&gt; property or class URIs, either because they are data-specific applications or because they only work with vocabularies that are hard-coded into them. Applications that are good web citizens will avoid DDoS attacks on popular vocabularies by hard-coding knowledge about those vocabularies and/or maintaining a cache, either locally or in the cloud, of vocabularies that have already been resolved.&lt;/p&gt;

&lt;h2&gt;Publishers&lt;/h2&gt;

&lt;p&gt;With what I&amp;#8217;ve said above in mind, what can publishers do to help applications to understand the data that they provide?&lt;/p&gt;

&lt;p&gt;If a publisher is only concerned about data-specific, point-to-point mashups, all they &lt;em&gt;have&lt;/em&gt; to provide is the data itself. It will help developers if there is some documentation of the dataset and the properties and classes used within it. But data publishers who only want their data to be discoverable by &lt;em&gt;people&lt;/em&gt; can rely on human intelligence for locating information, and for them using URIs for properties and classes may seem like overkill.&lt;/p&gt;

&lt;p&gt;But in a linked data world, publishers should really support their data being discovered automatically via the links from other data. Here we&amp;#8217;re talking about making life easier for vocabulary-specific and generic applications to use the data that you provide.&lt;/p&gt;

&lt;p&gt;The vocabularies that you use within your data fall into three general categories:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;well-known vocabularies&lt;/strong&gt; are vocabularies that are commonly enough used that vocabulary-specific and generic applications are likely to have them built-in; these vocabularies tend to be useful across domains, such as FOAF, which is useful whenever you want to express information about people or organisations&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;local vocabularies&lt;/strong&gt; are vocabularies that are specific to the dataset that you are publishing; you have as much control over their publication as you do over the publication of the data itself&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;reused vocabularies&lt;/strong&gt; are vocabularies that you are using that are owned by other people but that do not have the take-up of well-known vocabularies; these are typically domain-specific; one example is &lt;a href=&quot;http://www.metalex.eu/&quot;&gt;Metalex&lt;/a&gt;, which is a vocabulary about legislation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;As a data publisher, the first thing you can do is to &lt;strong&gt;use well-known vocabularies in your data wherever possible&lt;/strong&gt;, even if you also use local or reused vocabularies to express the same properties or classes.&lt;/p&gt;

&lt;p&gt;For example, say you have some data describing a cricket team and use &lt;code&gt;http://cricket.example.org/ontology#name&lt;/code&gt; for the name of a member of a team, and that you mean it to be a sub-property of &lt;code&gt;http://xmlns.com/foaf/0.1/name&lt;/code&gt; (which is itself a sub-property of &lt;code&gt;http://www.w3.org/2000/01/rdf-schema#label&lt;/code&gt;). If you &lt;em&gt;just&lt;/em&gt; publish the &lt;code&gt;http://cricket.example.org/ontology#name&lt;/code&gt; property then the only way that a generic application can know that &lt;code&gt;http://cricket.example.org/ontology#name&lt;/code&gt; can be used as a label for a resource (which is a person) is by attempting to resolve &lt;code&gt;http://cricket.example.org/ontology&lt;/code&gt; and reasoning based on what it finds. On the other hand, if you &lt;em&gt;also&lt;/em&gt; provide &lt;code&gt;http://xmlns.com/foaf/0.1/name&lt;/code&gt; and &lt;code&gt;http://www.w3.org/2000/01/rdf-schema#label&lt;/code&gt; properties, applications are no longer dependent on the network, nor on having the ability to reason, to use that information.&lt;/p&gt;

&lt;p&gt;You &lt;em&gt;could&lt;/em&gt; also provide mappings onto any reused vocabularies that you specialise, but this is less worthwhile given that vocabulary-specific and generic applications are unlikely to understand them either. &lt;/p&gt;

&lt;p&gt;The second thing you can do is to &lt;strong&gt;include information about the properties that you use within the data that you publish&lt;/strong&gt;. This isn&amp;#8217;t important for well-known vocabularies (because they&amp;#8217;re&amp;#8230; uh&amp;#8230; well-known) and it&amp;#8217;s only useful for local vocabularies if you&amp;#8217;re not publishing those vocabularies, because if someone can access your data, odds are they&amp;#8217;re able to access your local vocabulary&amp;#8217;s property URIs as well. But it is useful for reused vocabularies, where you can&amp;#8217;t guarantee access, in just the same way as it&amp;#8217;s useful to provide basic labelling information about any resources you reference.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;&lt;em&gt;If you&amp;#8217;re publishing your data embedded within a web page, as well as marking up the &lt;strong&gt;data&lt;/strong&gt;, you can mark up the &lt;strong&gt;labels&lt;/strong&gt; that you use for those values, which more than likely appear as headings in a table or something similar.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;If you are publishing a schema or ontology that describes your properties and types, there are also things that you can do to help applications. The most important thing is to assist caches in their caching of the ontology, which will reduce the number of times that it needs to be accessed directly and help you avoid DDoS attacks: see &lt;a href=&quot;http://www.mnot.net/cache_docs/&quot;&gt;Mark Nottingham&amp;#8217;s Caching Tutorial&lt;/a&gt;. You can also reduce the number of hits on your server by using hash URIs for your property and class names (the fragment is never sent to the server, so a single request for the ontology document covers every term defined in it), and you can use standard load-balancing techniques to manage the traffic.&lt;/p&gt;

&lt;p&gt;If you&amp;#8217;re referring to reused vocabularies within your own, you can also embed information about the relevant properties and classes from those vocabularies within your own ontology. This can save applications an extra hop, and lessens the risk of the reused vocabulary disappearing (perhaps forever).&lt;/p&gt;

&lt;p&gt;If you want to help people who might reuse your ontology, you can make the process of copying it easier by publishing it as a single file, or broken up into segments that are likely to be reused individually. At a non-technical level, it&amp;#8217;s also a good idea to provide an announcement mailing list or a feed so that people who reuse your vocabulary can be kept up to date with any changes you make to it.&lt;/p&gt;

&lt;h2&gt;Framework Developers&lt;/h2&gt;

&lt;p&gt;Bearing all this in mind, what should I (and other framework developers) do to support the reusers of data? I think I need to make it easy for application developers to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;load in known ontologies from known locations&lt;/li&gt;
&lt;li&gt;hard-code relevant semantics in the script&lt;/li&gt;
&lt;li&gt;create catalogs that map known property or class names onto known locations of documents that contain details about them&lt;/li&gt;
&lt;li&gt;use caching proxies when accessing unknown vocabularies&lt;/li&gt;
&lt;li&gt;access vocabularies directly at the relevant URI&lt;/li&gt;
&lt;li&gt;fall back on archives when the URI cannot be found&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In other words, I need to make it easy for people to use a range of strategies for getting hold of information about a property or class, aside from simply trying to access it at its URI. I think that means that it&amp;#8217;s better to provide a lightweight solution, giving developers the opportunity to be in control of which URIs get resolved rather than automatically downloading extra information from the URI that&amp;#8217;s actually used for the property or class. It also means I need to provide hooks in the code that they can use to trigger that resolution.&lt;/p&gt;

&lt;p&gt;It would also be useful, of course, for developers to be able to use information about properties and classes easily, in particular to reason with it. That kind of support is something I&amp;#8217;ve been working on for rdfQuery. It&amp;#8217;s not quite ready yet.&lt;/p&gt;

&lt;h2&gt;Conclusions&lt;/h2&gt;

&lt;p&gt;My (somewhat contentious) view is that we place too much emphasis on the resolvability of property and class names, and that this can put people off the idea of the Semantic Web. You can do useful things with data without resolving properties or classes. And for a large number of useful applications, being able to actually &lt;em&gt;reason&lt;/em&gt; over the data you get at the end of a property URI would have a high implementation cost without providing a great deal of functional benefit. &lt;/p&gt;

&lt;p&gt;Further, for data publishers, the requirement to enable the resolution of every property and class URI you use within your data just adds to the publishing burden, especially if you&amp;#8217;re made to feel it has to resolve to some kind of grand OWL ontology.&lt;/p&gt;

&lt;p&gt;There&amp;#8217;s a concept in psychology of the &lt;a href=&quot;http://en.wikipedia.org/wiki/Zone_of_proximal_development&quot;&gt;zone of proximal development&lt;/a&gt;. The idea is that if someone is operating at a particular level then as a teacher you should help them to achieve something &lt;em&gt;slightly&lt;/em&gt; above that level, rather than trying to get them to do everything straight away.&lt;/p&gt;

&lt;p&gt;The same is true here. We need to help publishers make the small steps that they can make, one at a time, to gradually get them to full Semantic Web goodness:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;publish a dataset in some kind of open format (CSV, XML etc) so that people can get hold of it&lt;/li&gt;
&lt;li&gt;publish the data with distinct URIs for distinct resources so that people can reference them&lt;/li&gt;
&lt;li&gt;publish the data in a machine-readable format so that people can easily reuse it&lt;/li&gt;
&lt;li&gt;publish the data in a way that can be interpreted as RDF, with URIs for properties and types, to avoid conflicts with other vocabularies and so that the data can be &amp;#8220;understood&amp;#8221; even when discovered automatically&lt;/li&gt;
&lt;li&gt;put some human-readable documentation at the end of the property/type URIs, so that developers can easily discover what your data&amp;#8217;s about&lt;/li&gt;
&lt;li&gt;embed machine-readable labels and descriptions for your properties/types within your data, so that applications can display it&lt;/li&gt;
&lt;li&gt;embed &lt;code&gt;rdfs:subPropertyOf&lt;/code&gt;/&lt;code&gt;rdfs:subClassOf&lt;/code&gt; mappings from your properties/types to well-known properties/types within your data, so that it can be displayed in custom ways&lt;/li&gt;
&lt;li&gt;put the machine-readable information about the properties/types at the end of the property/type URIs, so that you can update your vocabulary easily and so that other people can reuse it&lt;/li&gt;
&lt;li&gt;add other RDFS and OWL statements about the properties/types, so that reasoners can add value to your data&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The biggest leap, the one that requires the most persuasion and the most justification, is probably from simply publishing the data in a machine-readable format to using the RDF model with URIs for properties and types. But if you remove the cost of having to provide anything at the end of the URI and factor in the potential benefits you may reap in the future (as you step further up that ladder), the question becomes less &amp;#8220;why?&amp;#8221; and more &amp;#8220;why not?&amp;#8221;.&lt;/p&gt;
</description>
 <comments>http://www.jenitennison.com/blog/node/126#comments</comments>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/44">html5</category>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/46">linked data</category>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/31">rdf</category>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/40">rdfQuery</category>
 <pubDate>Fri, 28 Aug 2009 22:02:16 +0000</pubDate>
 <dc:creator>Jeni</dc:creator>
 <guid isPermaLink="false">126 at http://www.jenitennison.com/blog</guid>
</item>
<item>
 <title>More Crime</title>
 <link>http://www.jenitennison.com/blog/node/125</link>
 <description>&lt;p&gt;I wrote &lt;a href=&quot;http://www.jenitennison.com/blog/node/123&quot;&gt;previously&lt;/a&gt; about a visualisation using &lt;a href=&quot;http://www.homeoffice.gov.uk/about-us/publications/non-personal-data/&quot;&gt;Home Office data&lt;/a&gt; to navigate around categories of offences. The second interesting set of data from the Home Office that I found, tucked away in a small link on a page about &lt;a href=&quot;http://www.crimereduction.homeoffice.gov.uk/toolkits/dr0202.htm&quot;&gt;Crime Reduction Toolkits&lt;/a&gt; was a &lt;a href=&quot;http://www.homeoffice.gov.uk/rds/pdfs/100years.xls&quot;&gt;spreadsheet of recorded crime statistics&lt;/a&gt; between 1898 and the present day. Each column is a different category of offence (I won&amp;#8217;t say class because they don&amp;#8217;t map onto the Classes from the spreadsheet of notifiable offences).&lt;/p&gt;

&lt;p&gt;This time I wanted to try out the &lt;a href=&quot;http://www.omnipotent.net/jquery.sparkline/&quot;&gt;jQuery sparklines&lt;/a&gt; plug-in to illustrate how crime notifications have changed over time. The resulting page is available at &lt;a href=&quot;http://www.jenitennison.com/visualisation/crime.html&quot;&gt;http://www.jenitennison.com/visualisation/crime.html&lt;/a&gt;; here&amp;#8217;s a screenshot for Bigamy:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/blog/files/crime.jpg&quot; alt=&quot;Summary statistics for rate of Bigamy within the UK&quot; width=&quot;100%&quot; /&gt;&lt;/p&gt;

&lt;!--break--&gt;

&lt;p&gt;(I chose Bigamy because there are some interesting humps in the data, roughly aligned with the two World Wars, which demonstrate the value of looking at timelines.)&lt;/p&gt;

&lt;p&gt;I got this working by:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;cleaning up and processing the spreadsheet into RDF/XML&lt;/li&gt;
&lt;li&gt;putting the RDF/XML on my server in &lt;a href=&quot;http://www.jenitennison.com/data/scheme/crime/&quot;&gt;http://www.jenitennison.com/data/scheme/crime/&lt;/a&gt;, with more or less the same &lt;code&gt;.htaccess&lt;/code&gt; as I &lt;a href=&quot;http://www.jenitennison.com/blog/node/123&quot;&gt;used previously&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;creating a web page that includes jQuery sparklines and &lt;a href=&quot;http://code.google.com/p/rdfquery&quot;&gt;rdfQuery&lt;/a&gt; as libraries and populates the page with details&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You can see the source code for &lt;code&gt;crime.html&lt;/code&gt; if you just go and look at it, but the relevant piece for populating the sparkline is:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;$.get(selected, null, function (rdfXml) {
  var values = {}, sparkline = [],
    ...
    counts = $.rdf()
      .load(rdfXml)
      .prefix(&#039;rdf&#039;, &#039;http://www.w3.org/1999/02/22-rdf-syntax-ns#&#039;)
      .prefix(&#039;crime&#039;, &#039;http://www.jenitennison.com/data/ontology/crime#&#039;)
      .where(&#039;&amp;lt;&#039; + selected + &#039;&amp;gt; crime:count ?count&#039;)
      .where(&#039;?count crime:startYear ?year&#039;)
      .where(&#039;?count rdf:value ?value&#039;)
      .each(function () {
        ...
        values[this.year.value] = this.value.value;
      });
  ...
  for (v in values) {
    sparkline.push([v, values[v]]);
  }
  $(&#039;#sparkline&#039;)
    .sparkline(sparkline, { 
      chartRangeMin: 0,
      lineColor: &#039;#999&#039;,
      fillColor: &#039;#EEE&#039;,
      spotColor: &#039;blue&#039;,
      minSpotColor: &#039;green&#039;,
      maxSpotColor: &#039;red&#039;
    });
}, &#039;xml&#039;);
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;(There is a reason for first creating an object representation of the values and then generating the array-of-arrays that the sparkline needs, but it&amp;#8217;s in the elided code so you&amp;#8217;ll have to look at the original source to see.)&lt;/p&gt;
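
&lt;p&gt;For anyone following along without the original source, the final step of turning the object into an array of pairs can be sketched in isolation. This is a stand-alone illustration rather than the code from &lt;code&gt;crime.html&lt;/code&gt;: the &lt;code&gt;toPairs()&lt;/code&gt; name is made up, and converting the keys and values to numbers and sorting by year are assumptions added for the sketch.&lt;/p&gt;

```javascript
// Turn an object keyed by year into an array of [year, value] pairs,
// ordered by year. Object keys are always strings, so both parts are
// converted to numbers before being handed to a chart.
function toPairs(values) {
  return Object.keys(values)
    .map(function (year) {
      return [Number(year), Number(values[year])];
    })
    .sort(function (a, b) {
      return a[0] - b[0];
    });
}
```

&lt;p&gt;One side effect of going through an object first is that a second value for the same year simply replaces the first, so each year appears at most once in the resulting array.&lt;/p&gt;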

&lt;p&gt;jQuery sparklines are ridiculously easy to create. I&amp;#8217;m looking forward to using more of the great variety of visualisations that they support.&lt;/p&gt;

&lt;p&gt;Now, the big problem with the data is that the ways in which crimes are classified and notified have changed over time. If you look at the &lt;a href=&quot;http://www.homeoffice.gov.uk/rds/pdfs/100years.xls&quot;&gt;original spreadsheet&lt;/a&gt; you&amp;#8217;ll see a bunch of notes that describe four kinds of changes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Changes in the codes used for particular offences: for example Arson changed from classification 51 to 56 in 1934. I&amp;#8217;m not using the Home Office codes at the moment anyway, but this is going to be something to be wary of when I start doing so.&lt;/li&gt;
&lt;li&gt;Changes when a code is split into separate codes: for example &amp;#8220;Other wounding&amp;#8221; (8) split into &amp;#8220;Other wounding&amp;#8221; (8A), &amp;#8220;Possession of weapons&amp;#8221; (8B) and &amp;#8220;Harassment&amp;#8221; (8C) in 1998. In this case, offences stop being recorded under the original category and start being recorded under the new ones.&lt;/li&gt;
&lt;li&gt;Changes in whether a crime is notifiable: for example &amp;#8220;Cruelty to or neglect of children&amp;#8221; (11) became notifiable in 1998 (according to the notes; according to the data, at least some instances were notified up to 1952, but then there was a gap).&lt;/li&gt;
&lt;li&gt;Changes in how crimes are recorded: for example &amp;#8220;The introduction of the Sexual Offences Act 2003 in May 2004 resulted in substantial changes to the sexual offences.  This means that sexual offences data for 2004/05 are not comparable with those for previous years.&amp;#8221;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;There are undoubtedly other effects that aren&amp;#8217;t listed in the notes, either in the legislation that covers a particular crime, or other environmental factors such as being at war (as shown above).&lt;/p&gt;

&lt;p&gt;What I&amp;#8217;d really like to do is to indicate these events in the sparkline and in the data table. Unfortunately it involves the translation of the loose notes from the spreadsheet into handcrafted RDF/XML, which is a little tedious. It&amp;#8217;s also frustrating that there&amp;#8217;s no good means of identification for the categories of offences. I&amp;#8217;ve ended up arbitrarily naming them &amp;#8216;A&amp;#8217; to &amp;#8216;FZ&amp;#8217; which is somewhat unsatisfactory.&lt;/p&gt;

&lt;p&gt;It&amp;#8217;s worth noting that although I have a closed data set I&amp;#8217;m explicitly using the Linked Data paradigm to go from a list of categories of crimes to retrieving information about a particular category (because the identifier for a category is a URI). If I weren&amp;#8217;t using RDF, and wanted to split up the data in the way that I have for manageability, I&amp;#8217;d have to document that particular properties contain pointers to information held at other locations. (Kris Zyp has attempted to &lt;a href=&quot;http://www.json-schema.org/draft-hyperschema-02.txt&quot;&gt;formalise this in a kind of schema for JSON&lt;/a&gt;, but I have no idea how much support for this there is.)&lt;/p&gt;

&lt;p&gt;I&amp;#8217;ve also used &lt;a href=&quot;http://www.w3.org/TR/skos-primer/&quot;&gt;SKOS&lt;/a&gt; to describe the categories, which is nice because all I have to tell you is that &lt;code&gt;http://www.jenitennison.com/data/scheme/crime/&lt;/code&gt; is a SKOS Concept Scheme and if you know SKOS you&amp;#8217;ll know how to locate the top concepts in that scheme, nice human readable labels for them, and so on.&lt;/p&gt;

&lt;p&gt;But if you want to reuse the counts of offences you will still have to actually look at the data to find the name of the property that I&amp;#8217;ve used to go from a category to a count, and for the years and values themselves. These semantics are local to this particular application and the only way you can know them is by being told, just as it would be if I were using JSON.&lt;/p&gt;

&lt;p&gt;So using RDF has bought us some things &amp;#8212; a level of understanding about reaching data and a common vocabulary for organising concept schemes &amp;#8212; but certainly not everything. It should be no surprise to anyone that it is not a magic bullet.&lt;/p&gt;
</description>
 <comments>http://www.jenitennison.com/blog/node/125#comments</comments>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/41">jQuery</category>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/31">rdf</category>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/40">rdfQuery</category>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/49">visualisation</category>
 <pubDate>Sun, 23 Aug 2009 20:26:49 +0000</pubDate>
 <dc:creator>Jeni</dc:creator>
 <guid isPermaLink="false">125 at http://www.jenitennison.com/blog</guid>
</item>
<item>
 <title>Offence Hierarchy Visualisation with rdfQuery and JIT</title>
 <link>http://www.jenitennison.com/blog/node/123</link>
 <description>&lt;p&gt;The &lt;a href=&quot;http://www.homeoffice.gov.uk/&quot;&gt;Home Office&lt;/a&gt; recently &lt;a href=&quot;http://www.homeoffice.gov.uk/about-us/publications/non-personal-data/&quot;&gt;opened up some of its data&lt;/a&gt;, mostly in the form of PDF reports and Excel spreadsheets. Right after, I went on holiday and offline (!) for a week, so I set myself the task of putting together some visualisations of the data using two client-side visualisation libraries that I liked the look of:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;http://www.omnipotent.net/jquery.sparkline/&quot;&gt;jQuery sparklines&lt;/a&gt; which I think look simply gorgeous and which follow the jQuery tradition of being incredibly easy to put on a page&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;http://thejit.org/&quot;&gt;the JavaScript InfoVis Toolkit (JIT)&lt;/a&gt; which can be used to create some very attractive and interactive visualisations for hierarchical information&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;As a quick summary, I ended up with solutions that use an HTML page with &lt;a href=&quot;http://code.google.com/p/rdfquery&quot;&gt;rdfQuery&lt;/a&gt; code that pulls in static RDF/XML files and performs queries on them to create the particular formats that the two client-side libraries require.&lt;/p&gt;

&lt;p&gt;The first one I&amp;#8217;m going to talk about is a &lt;a href=&quot;http://www.jenitennison.com/visualisation/offences.html&quot;&gt;visualisation of types of offences&lt;/a&gt; using JIT. There&amp;#8217;s a screenshot below to give you a flavour, but you&amp;#8217;d be better off actually &lt;a href=&quot;http://www.jenitennison.com/visualisation/offences.html&quot;&gt;visiting the page&lt;/a&gt; because it&amp;#8217;s interactive: mousing over and clicking on the labels enables you to navigate around the hierarchy.&lt;/p&gt;

&lt;p&gt;&lt;img alt=&quot;Visualisation of Criminal Damage offences&quot; src=&quot;/blog/files/offences.jpg&quot; width=&quot;100%&quot; /&gt;&lt;/p&gt;

&lt;!--break--&gt;

&lt;p&gt;The data for this visualisation comes from a &lt;a href=&quot;http://www.homeoffice.gov.uk/rds/pdfs09/countnotif09.xls&quot;&gt;spreadsheet of notifiable offences&lt;/a&gt;, available amongst a bunch of interesting information about &lt;a href=&quot;http://www.homeoffice.gov.uk/rds/countrules.html&quot;&gt;counting rules for recording crime&lt;/a&gt;. The columns are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Home Office code, which is split into a major and minor part (eg 6/4)&lt;/li&gt;
&lt;li&gt;Mode (&amp;#8216;I&amp;#8217; for indictable, &amp;#8216;E&amp;#8217; for triable-either-way or &amp;#8216;S&amp;#8217; for summary)&lt;/li&gt;
&lt;li&gt;Max sentence (eg life, 15, 3m, fine)&lt;/li&gt;
&lt;li&gt;Class (eg &amp;#8220;Violence Against The Person&amp;#8221;)&lt;/li&gt;
&lt;li&gt;Subclass (eg &amp;#8220;Endangering Railway Passengers&amp;#8221;)&lt;/li&gt;
&lt;li&gt;Offence (eg &amp;#8220;Destroying, damaging etc. a Channel Tunnel train or the Tunnel system or committing acts of violence likely to endanger safety of operation&amp;#8221;)&lt;/li&gt;
&lt;li&gt;Act(s) (eg &amp;#8220;Channel Tunnel Act 1987 Sec 1(7)&amp;#8221;)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;There are a couple of things that I find particularly interesting about this data.&lt;/p&gt;

&lt;p&gt;First, it includes references to legislation! Since my day job at &lt;a href=&quot;http://www.tso.co.uk/&quot;&gt;TSO&lt;/a&gt; is currently all about publishing legislation as linked data, I find this really exciting! I haven&amp;#8217;t done anything with those links yet, but I aim to.&lt;/p&gt;

&lt;p&gt;Second, you would have thought that the Home Office code would be tied to a particular Subclass or Offence, but it&amp;#8217;s not. The same Subclass can have multiple codes, and two different Offences can share the same Home Office code. There doesn&amp;#8217;t seem to be a natural way of identifying the Offences, except through their (often long) descriptive name. The terminology for the Offence often comes straight out of a piece of legislation, but sometimes it&amp;#8217;s simply common law.&lt;/p&gt;

&lt;p&gt;On the other hand, the offence Classes have reasonably short labels like &amp;#8220;Burglary&amp;#8221; and &amp;#8220;Drug Offences&amp;#8221; which can be turned into URIs like:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;http://www.jenitennison.com/data/scheme/offence/drug-offences
&lt;/code&gt;&lt;/pre&gt;
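
&lt;p&gt;As a rough sketch of that label-to-URI step: the &lt;code&gt;classUri()&lt;/code&gt; name and the exact slug rules below (lowercase, with runs of non-alphanumeric characters collapsed to a single hyphen) are illustrative assumptions, not necessarily how the published data was generated.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;// Hypothetical helper: turns an offence Class label into a concept URI.
// The slug rules are an assumption made for illustration.
var SCHEME = &#039;http://www.jenitennison.com/data/scheme/offence/&#039;;

function classUri(label) {
  var slug = label
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, &#039;-&#039;)  // collapse spaces and punctuation
    .replace(/^-+|-+$/g, &#039;&#039;);     // trim leading and trailing hyphens
  return SCHEME + slug;
}

classUri(&#039;Drug Offences&#039;);
// http://www.jenitennison.com/data/scheme/offence/drug-offences
&lt;/code&gt;&lt;/pre&gt;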

&lt;p&gt;The fact that Class-Subclass-Offence defines a hierarchy of concepts led me to think that &lt;a href=&quot;http://www.w3.org/TR/skos-primer/&quot;&gt;SKOS&lt;/a&gt; would be a good ontology to use to model it. The Classes and Subclasses can be plain old &lt;code&gt;skos:Concept&lt;/code&gt;s but the Offences need to have their own type so that extra information, such as the maximum sentence that applies to the offence, can be associated with it.&lt;/p&gt;

&lt;p&gt;So if you look at &lt;a href=&quot;http://www.jenitennison.com/data/scheme/offence/drug-offences&quot;&gt;http://www.jenitennison.com/data/scheme/offence/drug-offences&lt;/a&gt; you&amp;#8217;ll see RDF/XML that includes the triples:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;http://www.jenitennison.com/data/scheme/offence/drug-offences&amp;gt;
  a skos:Concept ;
  skos:topConceptOf &amp;lt;http://www.jenitennison.com/data/scheme/offence/&amp;gt; ;
  skos:prefLabel &quot;Drug Offences&quot;@en ;
  skos:narrower
    [ 
      a skos:Concept ;
      skos:inScheme &amp;lt;http://www.jenitennison.com/data/scheme/offence/&amp;gt; ;
      skos:broader &amp;lt;http://www.jenitennison.com/data/scheme/offence/drug-offences&amp;gt; ;
      skos:prefLabel &quot;Trafficking in controlled drugs&quot;@en ;
      skos:narrower
        [
          a crime:Offence ;
          skos:inScheme &amp;lt;http://www.jenitennison.com/data/scheme/offence/&amp;gt; ;
          skos:prefLabel &quot;Manufacturing a scheduled substance&quot;@en ;
          crime:maxSentence &quot;P14Y&quot;^^xsd:yearMonthDuration
        ],
        [
          a crime:Offence ;
          skos:inScheme &amp;lt;http://www.jenitennison.com/data/scheme/offence/&amp;gt; ;
          skos:prefLabel &quot;Supplying a scheduled substance to another person&quot;@en ;
          crime:maxSentence &quot;P14Y&quot;^^xsd:yearMonthDuration
        ],
        ...
    ],
    [
      a skos:Concept ;
      skos:inScheme &amp;lt;http://www.jenitennison.com/data/scheme/offence/&amp;gt; ;
      skos:broader &amp;lt;http://www.jenitennison.com/data/scheme/offence/drug-offences&amp;gt; ;
      skos:prefLabel &quot;Possession of controlled drugs&quot;@en ;
      skos:narrower
        ...
    ],
    ...
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;You&amp;#8217;ll notice that I&amp;#8217;ve used blank nodes in the above rather than constructing an identifier for each Subclass or Offence. This makes things simpler because it means I can easily publish the dataset as a few flat files. An alternative would have been to use hash URIs, I suppose, but anyway this is the way I went. The (big) disadvantage is that it means the individual offences themselves aren&amp;#8217;t referenceable. So I might work on that, especially if I migrate the data over to data.gov.uk rather than just using it to try out a visualisation.&lt;/p&gt;

&lt;p&gt;The URI for the concept scheme is &lt;a href=&quot;http://www.jenitennison.com/data/scheme/offence/&quot;&gt;http://www.jenitennison.com/data/scheme/offence/&lt;/a&gt;. The slash on the end is entirely the result of trying to make Apache serve the static files correctly. As it is, I have one RDF/XML file for each Class of offence, plus an &lt;code&gt;index.rdf&lt;/code&gt; within the same (&lt;code&gt;offence&lt;/code&gt;) directory, with the &lt;code&gt;.htaccess&lt;/code&gt; file:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;AddType application/rdf+xml .rdf
DirectoryIndex index.rdf
RewriteEngine On
RewriteRule ^([^\.]+)$ $1.rdf [L]
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The concept scheme file itself contains a list of the top concepts in the scheme (the Classes) and their labels. This serves as a useful entry point to the data.&lt;/p&gt;

&lt;p&gt;So to the code. To create the visualisation, I needed to construct a Javascript structure that adhered to the &lt;a href=&quot;http://thejit.org/docs/files/Loader-js.html#Loader.loadJSON&quot;&gt;JIT Input JSON Structure&lt;/a&gt;. Basically, each &amp;#8220;node&amp;#8221; within the visualisation needed to have an &lt;code&gt;id&lt;/code&gt;, a &lt;code&gt;name&lt;/code&gt; and a number of &lt;code&gt;children&lt;/code&gt;. This structure needed to be constructed from the RDF/XML for a particular offence Class, ie that held within a particular RDF/XML document. The RDF/XML document can be accessed using the standard &lt;a href=&quot;http://docs.jquery.com/Ajax/jQuery.get#urldatacallbacktype&quot;&gt;&lt;code&gt;$.get()&lt;/code&gt; jQuery method&lt;/a&gt;. This passes the DOM for the document into the callback function passed as the third argument, which can then invoke &lt;a href=&quot;http://www.jenitennison.com/rdfquery/symbols/jQuery.rdf.html#load&quot;&gt;rdfQuery&amp;#8217;s &lt;code&gt;$.rdf.load()&lt;/code&gt; method&lt;/a&gt; to load the triples encoded in the RDF/XML into an rdfQuery object that can then operate over those triples.&lt;/p&gt;
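
&lt;p&gt;To make the target shape concrete, here is a hand-written sketch of a small tree in that &lt;code&gt;id&lt;/code&gt;/&lt;code&gt;name&lt;/code&gt;/&lt;code&gt;children&lt;/code&gt; form. The labels come from the Drug Offences data above; the &lt;code&gt;_:b1&lt;/code&gt; and &lt;code&gt;_:b2&lt;/code&gt; identifiers are made-up placeholders for blank nodes, and &lt;code&gt;countNodes()&lt;/code&gt; is a hypothetical helper included only to show how the structure nests.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;// Illustrative only: the blank node ids are placeholders.
var tree = {
  id: &#039;http://www.jenitennison.com/data/scheme/offence/drug-offences&#039;,
  name: &#039;Drug Offences&#039;,
  data: { type: &#039;class&#039; },
  children: [{
    id: &#039;_:b1&#039;,
    name: &#039;Trafficking in controlled drugs&#039;,
    data: { type: &#039;subclass&#039; },
    children: [{
      id: &#039;_:b2&#039;,
      name: &#039;Manufacturing a scheduled substance&#039;,
      data: { type: &#039;offence&#039; },
      children: []  // leaves still carry an empty children array
    }]
  }]
};

// Count the nodes by walking the children arrays recursively.
function countNodes(node) {
  return node.children.reduce(function (total, child) {
    return total + countNodes(child);
  }, 1);
}
&lt;/code&gt;&lt;/pre&gt;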

&lt;p&gt;Here&amp;#8217;s the relevant part of the code, in which &lt;code&gt;view&lt;/code&gt; is the URI for the particular offence class and &lt;code&gt;ht&lt;/code&gt; is a JIT HyperTree instance:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;$.get(view, null, function (rdfXml) {
  var rdf, offences = {};
  rdf = $.rdf()
    .load(rdfXml)
    .prefix(&#039;skos&#039;, &#039;http://www.w3.org/2004/02/skos/core#&#039;)
    .prefix(&#039;crime&#039;, &#039;http://www.jenitennison.com/data/ontology/crime#&#039;);
  rdf
    .where(&#039;&amp;lt;&#039; + view + &#039;&amp;gt; skos:prefLabel ?label&#039;)
    .each(function () {
      offences.id = view;
      offences.name = this.label.value;
      offences.data = {
        &#039;$color&#039;: &#039;#0CC&#039;,
        &#039;type&#039;: &#039;class&#039;
      };
      offences.children = rdf
        .where(&#039;&amp;lt;&#039; + view + &#039;&amp;gt; skos:narrower ?subclass&#039;)
        .where(&#039;?subclass skos:prefLabel ?label&#039;)
        .map(function () {
          return {
            id: this.subclass.id,
            name: this.label.value,
            data: {
              &#039;type&#039;: &#039;subclass&#039;
            },
            children: rdf
              .where(this.subclass + &#039; skos:narrower ?offence&#039;)
              .where(&#039;?offence skos:prefLabel ?label&#039;)
              .where(&#039;?offence crime:maxSentence ?sentence&#039;)
              .map(function () {
                var sentence;
                if (this.sentence.datatype.toString() === &#039;http://www.w3.org/2001/XMLSchema#token&#039;) {
                  sentence = this.sentence.value;
                } else if (this.sentence.value &amp;gt; 12) {
                  sentence = this.sentence.value / 12 + &#039; years&#039;;
                } else {
                  sentence = this.sentence.value + &#039; months&#039;;
                }
                return {
                  id: this.offence.id,
                  name: this.label.value,
                  data: {
                    &#039;type&#039;: &#039;offence&#039;,
                    &#039;sentence&#039;: sentence
                  },
                  children: []
                };
              })
              .get()
          };
        })
        .get();
    });
  ht.loadJSON(offences);
  ht.refresh();
}, &#039;xml&#039;);
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;You can look at the rest of the code simply by viewing source on &lt;a href=&quot;http://www.jenitennison.com/visualisation/offences.html&quot;&gt;&lt;code&gt;offences.html&lt;/code&gt;&lt;/a&gt; if you want to. It&amp;#8217;s mostly the same as the &lt;a href=&quot;http://thejit.org/Jit/Examples/Hypertree/example1.html&quot;&gt;HyperTree animation example&lt;/a&gt; but with a bit of refactoring particularly to add some jQuery goodness.&lt;/p&gt;

&lt;p&gt;Some random thoughts having done this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;rdfQuery is really good to use, even if I do say so myself. It provides a very flexible way of creating data structures based on RDF accessed from elsewhere, particularly because you have the full power of Javascript at your fingertips.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;JIT itself is OK to work with, though it doesn&amp;#8217;t have the ease of use that it could have. The visualisation&amp;#8217;s reasonably attractive, but my attempts to do clever things with the size of nodes to reflect the severity of the sentence proved fruitless.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The HyperTree visualisation works far better for smallish hierarchies (eg for &lt;a href=&quot;http://www.jenitennison.com/visualisation/offences.html?offences=http%3A%2F%2Fwww.jenitennison.com%2Fdata%2Fscheme%2Foffence%2Fcriminal-damage&quot;&gt;Criminal Damage&lt;/a&gt;) than for large ones (eg &lt;a href=&quot;http://www.jenitennison.com/visualisation/offences.html?offences=http%3A%2F%2Fwww.jenitennison.com%2Fdata%2Fscheme%2Foffence%2Fviolence-against-the-person&quot;&gt;Violence Against The Person&lt;/a&gt; or, if you have the patience, &lt;a href=&quot;http://www.jenitennison.com/visualisation/offences.html?offences=http%3A%2F%2Fwww.jenitennison.com%2Fdata%2Fscheme%2Foffence%2Fother-offences&quot;&gt;Other Offences&lt;/a&gt;).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The Offence hierarchy itself is a bit of a mess. There are 738 &amp;#8216;Other Offences&amp;#8217; compared with 453 offences categorised within the other Classes, some of which contain only a handful of Offences. If this visualisation shows anything, it&amp;#8217;s how disorganised the offences are. Even more so if you take into account some of the other data that&amp;#8217;s been made available, which shows a completely different classification and which I&amp;#8217;ll post about another time. I wonder if there&amp;#8217;s data or other visualisations that would help identify where it could be rationalised.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The data is also out of date. I was surprised to see that it said that under the Piracy Act 1837 Section 2, Piracy with violence (one of the many &amp;#8216;Other Offences&amp;#8217;) still attracted the death penalty. But looking at the &lt;a href=&quot;http://www.statutelaw.gov.uk/documents/1837/88/ukpga/c88/2&quot;&gt;relevant Section on the Statute Law Database&lt;/a&gt; it appears that the death penalty was replaced with life imprisonment by &lt;a href=&quot;http://www.statutelaw.gov.uk/content.aspx?LegType=All+Legislation&amp;amp;searchEnacted=0&amp;amp;extentMatchOnly=0&amp;amp;confersPower=0&amp;amp;blanketAmendment=0&amp;amp;sortAlpha=0&amp;amp;PageNumber=0&amp;amp;NavFrom=0&amp;amp;parentActiveTextDocId=1570287&amp;amp;ActiveTextDocId=1570337&amp;amp;filesize=10394&quot;&gt;Section 36 of the Crime and Disorder Act 1998&lt;/a&gt;. Getting better links into the legislation itself might help identify similar problems with the offence data.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;I don&amp;#8217;t know how easy it would have been to create this visualisation if I hadn&amp;#8217;t been hosting the data myself. Danny Ayers put together a helpful post recently in which he listed the various ways of &lt;a href=&quot;http://blogs.talis.com/n2/archives/770&quot;&gt;getting around the restrictions in doing cross-domain Ajax&lt;/a&gt;, which I&amp;#8217;ll no doubt draw on if and when I need to do that.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
</description>
 <comments>http://www.jenitennison.com/blog/node/123#comments</comments>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/31">rdf</category>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/40">rdfQuery</category>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/49">visualisation</category>
 <pubDate>Sun, 16 Aug 2009 20:00:50 +0000</pubDate>
 <dc:creator>Jeni</dc:creator>
 <guid isPermaLink="false">123 at http://www.jenitennison.com/blog</guid>
</item>
<item>
 <title>rdfQuery Dazzle, Oxford, 11-12 July</title>
 <link>http://www.jenitennison.com/blog/node/106</link>
 <description>&lt;p&gt;If you&amp;#8217;re anywhere near Oxford on the weekend of the 11-12th July, and are interested in parsing, querying and manipulating RDF(a) in a browser, come along to the &lt;a href=&quot;http://code.google.com/p/rdfquery&quot;&gt;rdfQuery&lt;/a&gt; Dazzle (hack days). The &lt;a href=&quot;http://swig.networkedplanet.com/dazzle.html&quot;&gt;official page&lt;/a&gt; lists some of the things we might work on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Applications: widgets, adding to existing RDFa sites&lt;/li&gt;
&lt;li&gt;Core development: documentation, packaging, microformats, named graphs, and ontologies&lt;/li&gt;
&lt;li&gt;Interfaces: Talis change markup, N3 and Turtle, and SPARQL&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It&amp;#8217;s free to attend, you can come for either or both days, and refreshments, entertainment and wifi will be provided, so &lt;a href=&quot;http://rdfquery.eventbrite.com/&quot;&gt;register now&lt;/a&gt;!&lt;/p&gt;

&lt;!--break--&gt;
</description>
 <comments>http://www.jenitennison.com/blog/node/106#comments</comments>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/40">rdfQuery</category>
 <pubDate>Sat, 13 Jun 2009 19:14:28 +0000</pubDate>
 <dc:creator>Jeni</dc:creator>
 <guid isPermaLink="false">106 at http://www.jenitennison.com/blog</guid>
</item>
<item>
 <title>rdfQuery: Progressive Enhancement with RDFa</title>
 <link>http://www.jenitennison.com/blog/node/94</link>
 <description>&lt;p&gt;Earlier this week I presented at &lt;a href=&quot;http://swig.networkedplanet.com/november2008.html&quot; title=&quot;Semantic Web Interest Group Community Event&quot;&gt;SWIG-UK&lt;/a&gt; about &lt;a href=&quot;http://code.google.com/p/rdfquery&quot; title=&quot;rdfQuery: RDF plugins for jQuery&quot;&gt;rdfQuery&lt;/a&gt;. rdfQuery is a set of plugins that I&amp;#8217;ve developed for &lt;a href=&quot;http://www.jquery.com&quot; title=&quot;jQuery: The Write Less, Do More, Javascript Library&quot;&gt;jQuery&lt;/a&gt; in order to support RDFa parsing, querying and generation. There are a bunch of other Javascript libraries for RDFa around, such as Mark Birbeck&amp;#8217;s &lt;a href=&quot;http://code.google.com/p/ubiquity-rdfa/&quot; title=&quot;Ubiquity RDFa&quot;&gt;Ubiquity RDFa&lt;/a&gt; and Ben Adida&amp;#8217;s &lt;a href=&quot;http://www.w3.org/2006/07/SWD/RDFa/impl/js/&quot; title=&quot;RDFa Javascript Library&quot;&gt;RDFa library&lt;/a&gt;. What I&amp;#8217;ve really tried to do with rdfQuery is tie it in with the &amp;#8220;Write Less, Do More&amp;#8221; philosophy of jQuery and provide a neat, elegant API. At least that&amp;#8217;s the aim!&lt;/p&gt;

&lt;!--break--&gt;

&lt;p&gt;So what does it do? Well, I&amp;#8217;ve just added the demo that I used on Tuesday into &lt;a href=&quot;http://code.google.com/p/rdfquery/source/checkout&quot; title=&quot;rdfQuery: SVN repository&quot;&gt;the repository&lt;/a&gt;, so if you grab hold of that you can take a look. Here&amp;#8217;s a screenshot of the demo.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/blog/files/markup-demo.jpg&quot; alt=&quot;Screenshot of rdfQuery demo&quot; width=&quot;100%&quot; /&gt;&lt;/p&gt;

&lt;p&gt;The demo shows the overall concept of rdfQuery, namely that semantic markup can be useful not only to the crawlers that extract data from your pages to pump into massive triplestores, but also for you as a developer. In this case, which is a simple genealogy-type application, I want to have the people and places that are relevant to this particular extract highlighted within the text. I also want them listed on the left, with their details summarised.&lt;/p&gt;

&lt;p&gt;So the demo illustrates three things that rdfQuery does to help:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;gleaning triples from a section of the page (not just the whole page); in this case the triples are marked up with RDFa&lt;/li&gt;
&lt;li&gt;querying the data to construct objects that represent the results of those queries, then doing things with those results&lt;/li&gt;
&lt;li&gt;automatically adding RDFa to elements within the page, to update the data that it holds&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It also demonstrates rough versions of a couple of things that rdfQuery could and should do that I aim to work on soon:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;reasoning based on the data that&amp;#8217;s found in the page&lt;/li&gt;
&lt;li&gt;using ontologies to decide how to handle the data you find&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;By the by, it shows a simple, natural-language interface for making statements based on some text, but don&amp;#8217;t let that distract you. It&amp;#8217;s just regular expression processing.&lt;/p&gt;

&lt;p&gt;There are four parts of this page. The main part contains some text about Charles Darwin. If you look at the source, you&amp;#8217;ll see that it&amp;#8217;s been marked up with some RDFa like this:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;span about=&quot;#CharlesRobertDarwin&quot; typeof=&quot;foaf:Person&quot; 
      property=&quot;rdfs:label&quot; datatype=&quot;&quot;&amp;gt;
  &amp;lt;span property=&quot;foaf:firstName&quot;&amp;gt;Charles&amp;lt;/span&amp;gt; Robert 
  &amp;lt;span property=&quot;foaf:surname&quot;&amp;gt;Darwin&amp;lt;/span&amp;gt;
&amp;lt;/span&amp;gt; was &amp;lt;span about=&quot;#CharlesRobertDarwin&quot; rel=&quot;biografr:hasBirthPlace&quot;&amp;gt;born in 
&amp;lt;span about=&quot;#Shrewsbury&quot; typeof=&quot;vcard:Address&quot; 
      property=&quot;rdfs:label&quot; datatype=&quot;&quot;&amp;gt;
  &amp;lt;span property=&quot;vcard:locality&quot;&amp;gt;Shrewsbury&amp;lt;/span&amp;gt;, 
  &amp;lt;span property=&quot;vcard:region&quot;&amp;gt;Shropshire&amp;lt;/span&amp;gt;, 
  &amp;lt;span property=&quot;vcard:country&quot;&amp;gt;England&amp;lt;/span&amp;gt;
&amp;lt;/span&amp;gt;&amp;lt;/span&amp;gt; on &amp;lt;span about=&quot;#CharlesRobertDarwin&quot; property=&quot;biografr:bornOn&quot; 
  content=&quot;1809-02-12&quot; datatype=&quot;xsd:date&quot;&amp;gt;12 February 1809&amp;lt;/span&amp;gt; at his 
  family home, the Mount. ...
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;This source is also shown in the textarea at the bottom of the page. Obviously this wouldn&amp;#8217;t be visible in a real application, but it enables you to see what&amp;#8217;s going on in the demo. Changing it won&amp;#8217;t do anything; you can imagine that string being POSTed back to the server. I&amp;#8217;ve also added some CSS that gives a border to any elements with a &lt;code&gt;typeof&lt;/code&gt; attribute. Elements that have a &lt;code&gt;property&lt;/code&gt; change their colour when you mouse over them.&lt;/p&gt;

&lt;p&gt;So the page contains some RDFa. But what it doesn&amp;#8217;t contain (in the source) is any information in the menu on the left. This gets populated based on the RDFa. If you click on Charles Robert Darwin, you&amp;#8217;ll see the data that&amp;#8217;s been gleaned about him, including (at the bottom), the derived fact that Robert Darwin was Charles Darwin&amp;#8217;s father. Anything in black is information pulled from the RDFa; anything in orange is derived.&lt;/p&gt;

&lt;p&gt;Next, try typing &amp;#8220;Susannah Darwin was a person&amp;#8221; into the text input and hit return. You should get Susannah Darwin added to the list of people. More importantly, if you look at the new source of the page, you&amp;#8217;ll see that the phrase &amp;#8220;Susannah Darwin&amp;#8221; has been marked up with some RDFa to indicate that she was, indeed, a person and that &amp;#8220;Susannah Darwin&amp;#8221; can be used as a label for her.&lt;/p&gt;

&lt;p&gt;You can try typing a few more facts into the box if you like. I suggest:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&amp;#8220;Charles Robert Darwin was also known as Darwin&amp;#8221;&lt;/li&gt;
&lt;li&gt;&amp;#8220;Susannah Darwin was Darwin&amp;#8217;s mother&amp;#8221;&lt;/li&gt;
&lt;li&gt;&amp;#8220;Susannah Darwin&amp;#8217;s surname was Darwin&amp;#8221;&lt;/li&gt;
&lt;li&gt;&amp;#8220;Josiah Wedgwood was a person&amp;#8221;&lt;/li&gt;
&lt;li&gt;&amp;#8220;Susannah Darwin&amp;#8217;s father was Josiah Wedgwood&amp;#8221;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;So this is all very cool, but the point was to show the code. So here we go.&lt;/p&gt;

&lt;h2&gt;Gleaning and Querying&lt;/h2&gt;

&lt;p&gt;Let&amp;#8217;s look at how the lists are populated:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;457: populateLists = function () {
458:   var rdf = $(&#039;#content&#039;).rdf();
459:   people.empty();
460:   places.empty();
461:   rdf
462:     .prefix(&#039;rdfs&#039;, ns.rdfs)
463:     .prefix(&#039;foaf&#039;, ns.foaf)
464:     .where(&#039;?person a foaf:Person&#039;)
465:     .where(&#039;?person rdfs:label ?label&#039;)
466:     .each(function () {
467:       addIndividual(people, this.person, this.label.value);
468:     })
469:     .reset()
470:     .where(&#039;?place a vcard:Address&#039;)
471:     .where(&#039;?place rdfs:label ?label&#039;)
472:     .each(function () {
473:       addIndividual(places, this.place, this.label.value);
474:     });
475: },
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Line 458 parses the RDFa within the element with the id &lt;code&gt;content&lt;/code&gt; (this is a &lt;code&gt;&amp;lt;div&amp;gt;&lt;/code&gt; that wraps around the paragraphs about Charles Darwin). The &lt;code&gt;rdf&lt;/code&gt; variable holds an rdfQuery object, which is very similar to a jQuery object except that it queries over RDF triples rather than DOM nodes. The rdfQuery object holds a pointer to a databank, which holds the triples that have been collected.&lt;/p&gt;

&lt;p&gt;Lines 459 and 460 empty out the existing lists of people and places if there are any. The &lt;code&gt;people&lt;/code&gt; and &lt;code&gt;places&lt;/code&gt; variables are set earlier in the script and are jQuery objects.&lt;/p&gt;

&lt;p&gt;Now the fun begins. I first set some prefixes on the rdfQuery object so that I can use those prefixes in &lt;a href=&quot;http://www.w3.org/TR/curie/&quot; title=&quot;CURIE Syntax 1.0&quot;&gt;CURIEs&lt;/a&gt; within the queries. In fact, these prefixes will have been set up by default anyway, because they&amp;#8217;re declared in the HTML page, but it doesn&amp;#8217;t hurt.&lt;/p&gt;

&lt;p&gt;Lines 464 and 465 locate triples in the databank based on simple &lt;a href=&quot;http://www.w3.org/TR/rdf-sparql-query/&quot; title=&quot;SPARQL Query Language for RDF&quot;&gt;SPARQL&lt;/a&gt;-based queries. The first &lt;code&gt;where()&lt;/code&gt; call creates a new rdfQuery object that, like a jQuery object, looks a bit like an array. The array contains objects, one for each triple that matches the pattern &lt;code&gt;?person a foaf:Person&lt;/code&gt;. Each of the objects has a &lt;code&gt;person&lt;/code&gt; property containing the resource that represents the person. So the rdfQuery that results from this &lt;code&gt;where()&lt;/code&gt; call looks like:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;{ length: 2,
  0: { person: $.rdf.resource(&#039;&amp;lt;#CharlesRobertDarwin&amp;gt;&#039;) },
  1: { person: $.rdf.resource(&#039;&amp;lt;#RobertDarwin&amp;gt;&#039;) },
  ... bunch of other properties and methods that aren&#039;t important here ... }
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The second &lt;code&gt;where()&lt;/code&gt; call creates another new rdfQuery object by combining the previous query results with any triples that match the pattern &lt;code&gt;?person rdfs:label ?label&lt;/code&gt;. This holds objects with &lt;code&gt;person&lt;/code&gt; and &lt;code&gt;label&lt;/code&gt; properties, one for each person and their label. So the result of this looks like:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;{ length: 2,
  0: { person: $.rdf.resource(&#039;&amp;lt;#CharlesRobertDarwin&amp;gt;&#039;),
       label:  $.rdf.literal(&#039;&quot;Charles Robert Darwin&quot;&#039;) },
  1: { person: $.rdf.resource(&#039;&amp;lt;#RobertDarwin&amp;gt;&#039;),
       label:  $.rdf.literal(&#039;&quot;Robert Darwin&quot;&#039;) },
  ... bunch of other properties and methods that aren&#039;t important here ... }
&lt;/code&gt;&lt;/pre&gt;
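
&lt;p&gt;The way chained patterns narrow and extend the results can be sketched in plain JavaScript. This is not rdfQuery&amp;#8217;s real implementation: the &lt;code&gt;joinLabels()&lt;/code&gt; name and the simplified triple objects are assumptions made purely to illustrate the join on the shared &lt;code&gt;?person&lt;/code&gt; variable.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;// Plain-JavaScript sketch of the join; not how rdfQuery stores triples.
var triples = [
  { s: &#039;#CharlesRobertDarwin&#039;, p: &#039;rdfs:label&#039;, o: &#039;Charles Robert Darwin&#039; },
  { s: &#039;#RobertDarwin&#039;, p: &#039;rdfs:label&#039;, o: &#039;Robert Darwin&#039; },
  { s: &#039;#Shrewsbury&#039;, p: &#039;rdfs:label&#039;, o: &#039;Shrewsbury, Shropshire, England&#039; }
];
var people = [
  { person: &#039;#CharlesRobertDarwin&#039; },
  { person: &#039;#RobertDarwin&#039; }
];

// Keep each earlier result that has a matching label triple, extended
// with a label binding; earlier results with no match drop out.
function joinLabels(bindings, triples) {
  var results = [];
  bindings.forEach(function (binding) {
    triples.forEach(function (triple) {
      if (triple.p !== &#039;rdfs:label&#039;) { return; }
      if (triple.s !== binding.person) { return; }
      results.push({ person: binding.person, label: triple.o });
    });
  });
  return results;
}

var labelled = joinLabels(people, triples);
// Two results: Shrewsbury has a label but was never bound to ?person
&lt;/code&gt;&lt;/pre&gt;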

&lt;p&gt;The &lt;code&gt;each()&lt;/code&gt; on lines 466-468 then iterates over the objects that it&amp;#8217;s constructed and calls the &lt;code&gt;addIndividual()&lt;/code&gt; function (which is specific to this demo), passing in the list in the HTML, the person resource and the value of the label literal.&lt;/p&gt;

&lt;p&gt;Line 469 uses the &lt;code&gt;reset()&lt;/code&gt; method to go back to the original rdfQuery object. If I didn&amp;#8217;t do this, any further queries would simply add to the objects that I already have, or remove them if nothing matched.&lt;/p&gt;

&lt;p&gt;Lines 470-474 do the same thing for the places that are marked up within the text.&lt;/p&gt;

&lt;p&gt;There are various other places within &lt;code&gt;markup.js&lt;/code&gt; that glean and query RDF. One example is the &lt;code&gt;addDescription()&lt;/code&gt; function, which populates the list items on the left with data about particular people and places. That function demonstrates the use of the &lt;code&gt;about()&lt;/code&gt; method, which gives you all the triples about a particular subject, and shows how to use arguments with the &lt;code&gt;each()&lt;/code&gt; method when you want to use the index of the query result or the triples that were used to create the query result.&lt;/p&gt;

&lt;h2&gt;Updating RDFa&lt;/h2&gt;

&lt;p&gt;So how easy is it to update the RDFa on the web page? Well, if you know what you want to add, then it&amp;#8217;s dead easy. Here&amp;#8217;s the code that does it on lines 452 and 532:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;span.rdfa(triple);
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The variable &lt;code&gt;span&lt;/code&gt; here is a jQuery object. The triple itself is a &lt;code&gt;$.rdf.triple&lt;/code&gt;. I&amp;#8217;d tell you more about them but I think I&amp;#8217;ve gone on long enough.&lt;/p&gt;

&lt;h2&gt;Final Words&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;http://code.google.com/p/rdfquery&quot; title=&quot;rdfQuery: RDF plugins for jQuery&quot;&gt;rdfQuery&lt;/a&gt; is a Google Code project, released under an MIT license. If you&amp;#8217;re interested in contributing, send me an email and I&amp;#8217;ll add you as a member, or an owner if you&amp;#8217;re really keen. I&amp;#8217;ve also set up a &lt;a href=&quot;http://groups.google.com/group/rdfquery&quot; title=&quot;rdfQuery Discussion Group&quot;&gt;discussion group&lt;/a&gt; where you can post any questions, although of course if you find bugs, do &lt;a href=&quot;http://code.google.com/p/rdfquery/issues/entry&quot; title=&quot;rdfQuery: Add Issue&quot;&gt;add an issue&lt;/a&gt;.&lt;/p&gt;
</description>
 <comments>http://www.jenitennison.com/blog/node/94#comments</comments>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/34">genealogy</category>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/41">jQuery</category>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/31">rdf</category>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/42">rdfa</category>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/40">rdfQuery</category>
 <enclosure url="http://www.jenitennison.com/blog/files/markup-demo.jpg" length="286620" type="image/jpeg" />
 <pubDate>Sat, 15 Nov 2008 16:46:02 +0000</pubDate>
 <dc:creator>Jeni</dc:creator>
 <guid isPermaLink="false">94 at http://www.jenitennison.com/blog</guid>
</item>
</channel>
</rss>

