Update: This post has been translated to Italian on the Linked Open Data Italia blog.
In this post about schema.org I’m going to speculate about the economic drivers that affect how search engines use structured metadata on the web. I discuss how the technical features and choices within schema.org may cause wider long-term harm, and the role of open standards as a method for responsible companies to avoid the pitfalls of monopoly.
I can’t reply to Henri Sivonen
@JeniT What’s wrong with http://rdf.data-vocabulary.org/rdf.xml ?
in 140 characters.
http://rdf.data-vocabulary.org/rdf.xml is the the RDF schema that describes the classes and properties recognised by Google’s rich snippets, which promises to provide richer information about search results than is available currently, in the manner of SearchMonkey.
So what’s so bad about this RDF schema?
A Google Search Appliance (GSA) is a box that you plug into your network which crawls and indexes your data, and serves up the results of searches. Search results come in an XML format, and there’s a built in XSLT engine which means you can convert that XML into as many different views as you like. So you can have HTML-based search results, summaries, feeds, and so on.
My task recently was to debug some XSLT that transformed the GSA XML into an Atom feed. Easy enough, right? The GSA XML format is pretty hideous — most of the elements max out at three capital letters in length (whatever happened to human-readability) — but logical enough, and the mapping is hardly complex.
But all was not as it seemed. The GSA’s XSLT implementation is… how can I put this politely?… “non-standard”. This post describes some of the problems and workarounds.
Thursday afternoon had a few really interesting talks. I learned about the Google Data API (no longer called gData); Oracle’s use of XLink to represent relationships between documents, and the requirements that entails; using XSLT to create JSON to use Exhibit widgets; and using XMPP to enhance instant messaging.
On Thursday morning, I was down to chair the first session in the “Core Technologies” track. Two interesting papers: one on XForms and one on Google Base. Then I snuck on to the “Applications” track to hear about scientific Wikis and the trials of managing schema repositories.