Another question to answer:
I’ve been reading about RDF, and I’m not sure in what situations it is more appropriate to use RDF over straight XML. I usually see RDF expressed as XML, but sometimes I see it written as language-independent functions (or methods).
Part of me is wondering if RDF is more appropriate for this project. What might the benefits be? And if it is, how difficult would it be to refactor it?
(Note that the person asking the question is talking about a small data-oriented project.) There’s a huge amount that could be said about this, so I might well post about some of it again. Here, I’m going to cut to the chase. This is what I’d recommend:
Model your application in RDF terms: Create a description of what classes of resources your application needs to deal with, and which properties link those together. You can call this description an RDF schema or conceptual model or ontology, depending on how impressive you want to sound. This modelling activity is useful in itself, largely because it helps you understand what information you’re dealing with and how it fits together.
Create a markup language that can be mapped to RDF: An XML version of your data allows you to make your data more generally available and reusable than locking it away in a triple store. Do one of the following:
Define a subset of RDF/XML for your application: The full flexibility of RDF/XML is complicated for plain XML processors to handle, so subset it to, for example, always use typed elements (such as <my:Course>) rather than rdf:type properties, and to use referencing or nesting in a consistent way.
Design markup languages that use RDFa attributes to reflect the semantics of the data: This gives you a standard way of mapping your markup language into RDF triples without having to adopt the “striped” design of RDF/XML in your markup language. A lot of the attributes can be defaulted to leave the markup language fairly streamlined.
Design markup languages exactly as you like, and define GRDDL mappings from them into RDF/XML: This gives you the most flexibility in your markup language design (though not complete flexibility — you still need to be able to identify the statements that you want to make from the XML), at the expense of having to write some XSLT.
The point of doing this is to put you in a position where you can just use XML if you want, but you also have the flexibility of using RDF either now or in the future.
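To make the "subset of RDF/XML" option a little more concrete, here is a minimal sketch of how a plain XML parser might read such a document as triples. It assumes a very restricted profile (one typed element identified by rdf:about, with flat child properties) and a hypothetical vocabulary at http://example.org/vocab#; the element names, URIs and function names are illustrative, not part of any real vocabulary. Literals and resource references both end up as plain strings here, which a fuller version would want to distinguish.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Hypothetical document in a restricted RDF/XML subset: one typed element,
# identified by rdf:about, with flat child properties.
DOC = """\
<my:Course xmlns:my="http://example.org/vocab#"
           xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
           rdf:about="http://example.org/courses/xml101">
  <my:title>Introduction to XML</my:title>
  <my:taughtBy rdf:resource="http://example.org/people/jane"/>
</my:Course>
"""


def to_uri(tag):
    """Convert ElementTree's '{namespace}local' name into a plain URI."""
    namespace, local = tag[1:].split("}", 1)
    return namespace + local


def read_triples(xml_text):
    """Read one typed element as a list of (subject, predicate, object)."""
    root = ET.fromstring(xml_text)
    subject = root.get("{%s}about" % RDF_NS)
    # The typed element itself stands in for an rdf:type statement.
    triples = [(subject, RDF_NS + "type", to_uri(root.tag))]
    for prop in root:
        target = prop.get("{%s}resource" % RDF_NS)
        # A property either references another resource or holds literal text.
        value = target if target is not None else (prop.text or "").strip()
        triples.append((subject, to_uri(prop.tag), value))
    return triples


for triple in read_triples(DOC):
    print(triple)
```

Because the profile is so constrained, the same file can be handled by XSLT or any other XML tooling without knowing anything about RDF.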
The benefits of using RDF are partly to do with the ease with which you can do certain kinds of processing (specifically combining “facts” together to draw conclusions) and partly to do with the potential for reuse of your data. In the same way that XML gives people a common syntax and thus aids interchange of information, RDF allows others to draw some conclusions (more than they would with a random mess of elements and attributes) about what your data means.
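As a small illustration of "combining facts to draw conclusions", the sketch below applies one rule in the style of RDFS: if a resource has a type, and that type is a subclass of another class, then the resource also has the other class as a type. The rdf:type and rdfs:subClassOf URIs are the standard ones; the classes and the course URI under http://example.org/ are hypothetical, and the function is a hand-rolled loop rather than a real reasoner.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
SUBCLASS_OF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"
EX = "http://example.org/vocab#"  # hypothetical vocabulary

FACTS = {
    ("http://example.org/courses/xml101", RDF_TYPE, EX + "Course"),
    (EX + "Course", SUBCLASS_OF, EX + "Offering"),
    (EX + "Offering", SUBCLASS_OF, EX + "Resource"),
}


def infer_types(facts):
    """Repeat the subclass rule until no new type statements appear."""
    inferred = set(facts)
    changed = True
    while changed:
        changed = False
        subclass_pairs = [(s, o) for s, p, o in inferred if p == SUBCLASS_OF]
        for s, p, o in list(inferred):
            if p != RDF_TYPE:
                continue
            for sub, sup in subclass_pairs:
                new_fact = (s, RDF_TYPE, sup)
                if o == sub and new_fact not in inferred:
                    inferred.add(new_fact)
                    changed = True
    return inferred


# Print only the statements that were derived rather than stated.
for fact in sorted(infer_types(FACTS) - FACTS):
    print(fact)
```

Nothing in the rule mentions courses: it works on any data whose statements use the same two properties, which is where the reuse comes from.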
I don’t think that using RDF triple stores, SPARQL and all that jazz gives you a great return for a small-scale, personal project — you’re better off sticking to flat files and some XSLT — but it doesn’t hurt to build in some of the formality of RDF anyway.
Comments
Re: RDF and XML Q&A: Which should I use?
Basically the message is: use the technology when it gives you benefits. That’s a reasonable take. Only geeks ;) like to use it just because it’s the state of the art.
In my personal experience, I have used RDF (N3) because it accommodates my laziness. I’m using it for managing the QA Matrix. I don’t have to respect a specific order in the document, and this is cool.
PS: the captcha stops working when you block cookies. :/
Re: RDF and XML Q&A: Which should I use?
“You can call this description an RDF schema or conceptual model or ontology, depending on how impressive you want to sound”
And if you want to sound really, like, old school(!) you can refer to it as plain old entity-relationship modelling, as really that’s all it is. My current view is that designing RDF vocabularies is like E-R modelling, only without the need to eventually make the model fit into a physical database schema. Decide on the things, properties and their relationships and you’re done.
I’ve been using the “Define a subset of RDF/XML for your application” approach quite successfully for the past 6 months or so on a client project. I’ve been creating an RDF vocabulary that the client is shipping to us as XML documents. The RDF profile is validated with XML Schema as a first pass. This provides the up-front validation that you get with XML, but the content is entirely valid RDF/XML too, so we can just throw it into a triple store if it passes validation. It works really well.
With this approach there’s also surprisingly little RDF/XML syntax that creeps through into your XML vocabulary. For us this has amounted to:
rdf:about, rdf:resource: easy to grok from a “primary key” & “foreign key” standpoint
rdf:parseType="Collection": for lists of stuff (we’ve avoided Seq, Bag, etc.)
rdf:parseType="Literal": for XML literals. A bit funky, but preferable to CDATA.
Other than that we use namespaces everywhere, xml:base to shorten URIs wherever possible, and avoid using rdf:RDF as the root element of the document.
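For readers who want to picture a profile like this, here is a rough sketch of a document that uses only rdf:about, rdf:resource, rdf:parseType="Collection" and xml:base, with no rdf:RDF root, read with a plain XML parser. The ex: vocabulary, the URIs and the read_profile function are all hypothetical, and relative references are resolved against xml:base with urljoin as a simplification of full xml:base handling (only the root element's xml:base is considered).

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

RDF = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}"
XML = "{http://www.w3.org/XML/1998/namespace}"

# Hypothetical document: typed root element, no rdf:RDF wrapper.
DOC = """\
<ex:Module xmlns:ex="http://example.org/vocab#"
           xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
           xml:base="http://example.org/modules/"
           rdf:about="m1">
  <ex:partOf rdf:resource="../courses/xml101"/>
  <ex:lessons rdf:parseType="Collection">
    <ex:Lesson rdf:about="m1/lesson1"/>
    <ex:Lesson rdf:about="m1/lesson2"/>
  </ex:lessons>
</ex:Module>
"""


def read_profile(xml_text):
    """Return the primary resource, its links and its ordered lists."""
    root = ET.fromstring(xml_text)
    base = root.get(XML + "base", "")

    def resolve(ref):
        # Shortened URIs are resolved against xml:base.
        return urljoin(base, ref)

    record = {"about": resolve(root.get(RDF + "about")), "links": {}, "lists": {}}
    for prop in root:
        name = prop.tag.split("}", 1)[1]  # local name of the property
        if prop.get(RDF + "parseType") == "Collection":
            # Document order of the children is the order of the list.
            record["lists"][name] = [
                resolve(item.get(RDF + "about")) for item in prop
            ]
        elif prop.get(RDF + "resource") is not None:
            # rdf:resource behaves like a foreign key to another resource.
            record["links"][name] = resolve(prop.get(RDF + "resource"))
    return record


print(read_profile(DOC))
```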
I’ll try and find the time to write this up better, but thought I’d leave a comment here to begin with!
Re: RDF and XML Q&A: Which should I use?
Thanks for the post Leigh, that can act as a mini cheatsheet for me!
Re: RDF and XML Q&A: Which should I use?
XML is better. Go with it
Re: RDF and XML Q&A: Which should I use?
Thanks Leigh. I’m curious about avoiding using rdf:RDF as the root element of the document. I guess that means that there’s a single resource that everything else is related to in some way?
Re: RDF and XML Q&A: Which should I use?
Hi Jeni,
Yes, that’s correct, everything in the document is related in some form to the primary resource.
Our system takes a collection of XML documents, each of which describes a single resource, its relationships to other primary resources, and any “subordinate” resources that can be fully described alongside its primary resource (e.g. blank nodes).
Combining this RDF into a single triple store builds a large interconnected graph. But outside of that store, there are small chunks of XML that can be easily manipulated.
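A minimal sketch of that merging step, assuming two hypothetical single-resource documents and using a Python set in place of a real triple store: each document is read on its own, the triples are unioned, and a question that spans both documents can then be answered. The ex: vocabulary, the URIs and the function names are made up for illustration.

```python
import xml.etree.ElementTree as ET

RDF = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}"
EX = "http://example.org/vocab#"  # hypothetical vocabulary

COURSE_DOC = """\
<ex:Course xmlns:ex="http://example.org/vocab#"
           xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
           rdf:about="http://example.org/courses/xml101">
  <ex:taughtBy rdf:resource="http://example.org/people/jane"/>
</ex:Course>
"""

PERSON_DOC = """\
<ex:Person xmlns:ex="http://example.org/vocab#"
           xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
           rdf:about="http://example.org/people/jane">
  <ex:name>Jane Example</ex:name>
</ex:Person>
"""


def doc_triples(xml_text):
    """Read the properties of one document's primary resource as triples."""
    root = ET.fromstring(xml_text)
    subject = root.get(RDF + "about")
    triples = set()
    for prop in root:
        predicate = prop.tag[1:].replace("}", "", 1)  # '{ns}local' -> URI
        target = prop.get(RDF + "resource")
        value = target if target is not None else (prop.text or "").strip()
        triples.add((subject, predicate, value))
    return triples


def merge(documents):
    """Union the triples of every document into one graph."""
    graph = set()
    for document in documents:
        graph |= doc_triples(document)
    return graph


def teacher_names(graph, course):
    """Follow taughtBy from the course, then name from each teacher."""
    teachers = {o for s, p, o in graph if s == course and p == EX + "taughtBy"}
    return {o for s, p, o in graph if s in teachers and p == EX + "name"}


graph = merge([COURSE_DOC, PERSON_DOC])
print(teacher_names(graph, "http://example.org/courses/xml101"))
```

The link between the two files is nothing more than the shared URI, so the documents can be written, validated and transformed separately and only meet in the merged graph.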
Re: RDF and XML Q&A: Which should I use?
rdf:RDF is just an optional container for RDF triples: you never have to use it if you have your own container, or if you extend an existing container like html:head.