Re: Your Website is Your API: Quick Wins for Government Data

Thanks for the questions.

  1. The problem with URNs (meaning uniform resource names, as opposed to the unique reference numbers used for schools as described in the post) is that they aren’t resolvable. You can’t plug a URN into a browser to find out more about whatever resource is named by the URN.

    http://www.companieshouse.co.uk/id/company/00445790 really is (I assert!) an identifier for the company, and should be used to mean the company as opposed to particular information about the company. But when you request it, you should get redirected to some information about the company (the document URI), and that is a particular view.

    If I make up a set of other identifier URIs for companies (as we’ve done at http://www.gazettes-online.co.uk/id/proxy/company/{companyNumber}) then I can assert separately that my identifier URIs mean the same thing as the better or more official identifier URIs provided by Companies House.
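One possible way to make that assertion is an owl:sameAs statement linking the two identifiers. The sketch below is only an illustration: the choice of owl:sameAs and of N-Triples as the output syntax are my assumptions, and same_as_triple is a made-up helper name.

```python
# The OWL property commonly used to say two URIs identify the same thing.
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def same_as_triple(company_number):
    """Build one N-Triples statement linking a proxy URI to the official URI."""
    proxy = f"http://www.gazettes-online.co.uk/id/proxy/company/{company_number}"
    official = f"http://www.companieshouse.co.uk/id/company/{company_number}"
    return f"<{proxy}> <{OWL_SAME_AS}> <{official}> ."


print(same_as_triple("00445790"))
```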

  2. All the URIs should be hackable. Continuing the example of what Companies House might do, the identifier URI http://www.companieshouse.co.uk/id/company/00445790 might redirect (via 303 See Other) to http://www.companieshouse.co.uk/company/00445790 (note that the /id part of the path is now gone). The response to this depends on the client doing the requesting:

    • A web browser will have an HTML page returned to it. This is actually the representation http://www.companieshouse.co.uk/company/00445790.htm.
    • A feed reader will have an Atom feed returned to it. This is actually the representation http://www.companieshouse.co.uk/company/00445790.feed.
    • A semantic web crawler will have an RDF/XML document returned to it. This is actually the representation http://www.companieshouse.co.uk/company/00445790.rdf.
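The redirect and the choice of representation could be sketched roughly as follows. This is a minimal illustration, not how Companies House actually works: the respond and parse_accept functions, the suffix table, the default to HTML, and the Content-Location and Vary headers are all my assumptions, and a real server would also return the document body.

```python
# Hypothetical mapping from media type to representation suffix,
# following the .htm / .feed / .rdf convention in the examples above.
SUFFIX_BY_MEDIA_TYPE = {
    "text/html": ".htm",
    "application/atom+xml": ".feed",
    "application/rdf+xml": ".rdf",
}
DEFAULT_MEDIA_TYPE = "text/html"


def parse_accept(header):
    """Return the media types in an Accept header, most preferred first."""
    entries = []
    for position, part in enumerate(header.split(",")):
        pieces = [piece.strip() for piece in part.split(";")]
        media_type = pieces[0].lower()
        if not media_type:
            continue
        quality = 1.0
        for param in pieces[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        if quality > 0:  # q=0 means "not acceptable"
            entries.append((-quality, position, media_type))
    return [media_type for _, _, media_type in sorted(entries)]


def respond(path, accept="*/*"):
    """Return (status, headers) for a request path and Accept header."""
    if path.startswith("/id/"):
        # Identifier URI: it names the company itself, so redirect
        # with 303 See Other to the document URI (drop the /id part).
        return 303, {"Location": path[len("/id"):]}
    for media_type, suffix in SUFFIX_BY_MEDIA_TYPE.items():
        if path.endswith(suffix):
            # Representation URI: serve that one format directly.
            return 200, {"Content-Type": media_type}
    # Document URI: pick a representation based on what the client accepts.
    chosen = DEFAULT_MEDIA_TYPE
    for media_type in parse_accept(accept):
        if media_type in SUFFIX_BY_MEDIA_TYPE:
            chosen = media_type
            break
    return 200, {
        "Content-Type": chosen,
        "Content-Location": path + SUFFIX_BY_MEDIA_TYPE[chosen],
        "Vary": "Accept",
    }


print(respond("/id/company/00445790"))
print(respond("/company/00445790", "application/rdf+xml"))
```

The second call stands in for a semantic web crawler: the document URI stays the same, and only the Accept header decides which representation comes back.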

    The representations themselves should have links to other representations. The HTML page in particular should contain links in the <head> that crawlers can pick up:

    <link rel="alternate" type="application/atom+xml" href="/company/00445790.feed" />
    <link rel="alternate" type="application/rdf+xml" href="/company/00445790.rdf" />
    

    and may also contain explicit links in the content of the page so that humans can use them.
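From the crawler's side, picking up those links might look something like this. It is a sketch using only the Python standard library; AlternateLinkParser and find_alternates are names I have made up for illustration, and the sample page is invented.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class AlternateLinkParser(HTMLParser):
    """Collect <link rel="alternate"> elements, keyed by media type."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.alternates = {}

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        rel = (attributes.get("rel") or "").lower().split()
        if tag == "link" and "alternate" in rel:
            media_type = attributes.get("type")
            href = attributes.get("href")
            if media_type and href:
                # Resolve relative hrefs against the page's own URI.
                self.alternates[media_type] = urljoin(self.base_url, href)


def find_alternates(html, base_url):
    """Return a dict mapping media type to absolute representation URI."""
    parser = AlternateLinkParser(base_url)
    parser.feed(html)
    return parser.alternates


page = """<html><head>
<link rel="alternate" type="application/atom+xml" href="/company/00445790.feed" />
<link rel="alternate" type="application/rdf+xml" href="/company/00445790.rdf" />
</head><body></body></html>"""

links = find_alternates(page, "http://www.companieshouse.co.uk/company/00445790")
print(links["application/rdf+xml"])
```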

    The point is that these are all different representations of the same set of information. The document URI identifies the information itself; the representation URIs identify particular formats of that same information.

Hope that makes sense?
