Re: Web 2.0 project: RDF and uncertainty

Well, the first thing is that your description of evidence-based assertions is exactly how the GENTECH data model works, and that’s what we’re using as the basis of our application. In the GENTECH data model, if a person is mentioned in a piece of evidence, a persona is created for that person. If they are mentioned again in a different piece of evidence, another persona is created. If you want to say that those individuals are actually one and the same, that is a separate assertion, with all the uncertainty that that entails.

(The same goes for groups and events, by the way.)
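
To make that concrete, here is a rough sketch of how I picture it in Python. It is not the GENTECH schema itself: the class names, fields, the `confidence` number and the sample records are all made up for illustration.

```python
from dataclasses import dataclass
from itertools import count

_ids = count(1)  # hypothetical id generator, just for this sketch


@dataclass(frozen=True)
class Evidence:
    id: int
    description: str


@dataclass(frozen=True)
class Persona:
    id: int
    name: str
    evidence_id: int  # every persona belongs to exactly one piece of evidence


@dataclass(frozen=True)
class SameAs:
    """A separate, uncertain assertion that several personas are one person."""
    persona_ids: frozenset
    asserted_by: str
    confidence: float  # assumed scale: 0.0 (wild guess) to 1.0 (certain)


def persona_from_mention(name, evidence):
    # Each mention always creates a NEW persona, even if the name looks familiar.
    return Persona(next(_ids), name, evidence.id)


# Invented sample data: two documents that each mention someone.
census = Evidence(next(_ids), "census page")
parish = Evidence(next(_ids), "parish register entry")

p1 = persona_from_mention("John Smith", census)
p2 = persona_from_mention("Jno. Smith", parish)

# Saying they are the same person is its own assertion, with its own uncertainty.
link = SameAs(frozenset({p1.id, p2.id}), asserted_by="alice", confidence=0.8)
```

The point of keeping `SameAs` separate is that the personas themselves never change: withdrawing the link leaves both evidence-level records exactly as they were.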

I think what we’ll end up with is a kind of hierarchy of personas: the bottom layer will be personas that are directly generated based on the evidence. Higher layers will be created through “these are the same person” assertions. At any point it should be possible to snip off a subtree by saying “actually, these aren’t the same person after all”.
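
Something like the following is what I have in mind for the hierarchy, as a minimal sketch only. `PersonaNode`, `leaves` and `snip` are names I have invented here, and a real version would need to record who made and who withdrew each assertion.

```python
class PersonaNode:
    """A leaf is an evidence-level persona; an inner node is a 'same person' assertion."""

    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)
        self.parent = None
        for child in self.children:
            child.parent = self

    def leaves(self):
        """Return the evidence-level personas underneath this node."""
        if not self.children:
            return [self]
        found = []
        for child in self.children:
            found.extend(child.leaves())
        return found

    def snip(self):
        """Detach this node and its subtree: 'these aren't the same person after all'."""
        if self.parent is not None:
            self.parent.children.remove(self)
            self.parent = None
        return self


# Bottom layer: personas generated directly from evidence (labels are invented).
a = PersonaNode("census mention")
b = PersonaNode("parish mention")
c = PersonaNode("will mention")

# Higher layers: each node is a "same person" assertion over its children.
ab = PersonaNode("a+b", [a, b])
abc = PersonaNode("a+b+c", [ab, c])

before = [n.label for n in abc.leaves()]

# Withdraw the claim that the a+b person is also the person in the will.
ab.snip()
after_top = [n.label for n in abc.leaves()]
after_ab = [n.label for n in ab.leaves()]
```

Note that snipping leaves the lower-level assertion (a+b) intact, which is the behaviour I would want: only the disputed link goes away.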

Another, more webby, way of viewing this is to say that the higher levels are aggregations (like feed aggregations). What you see about a persona are the assertions made directly about that persona. If an assertion is made that two personas are the same, you get a new persona that aggregates the two it’s based on: viewing it shows the combined assertions about both.
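
In code, the aggregation view might look roughly like this. It is only a sketch: the `aggregate` function, the two dictionaries and the sample assertions are my own invention, not anything from the GENTECH model.

```python
def aggregate(persona, direct, merged_from):
    """Collect the assertions visible on a persona.

    direct:      persona id -> assertions made about that persona directly
    merged_from: persona id -> the personas it was asserted to be the same as
    """
    visible = list(direct.get(persona, []))
    for source in merged_from.get(persona, []):
        # Recurse, so aggregations of aggregations work too.
        visible.extend(aggregate(source, direct, merged_from))
    return visible


# Invented sample data: two evidence-level personas with their own assertions.
direct = {
    "p1": [("birth_year", 1820)],
    "p2": [("birth_year", 1821), ("occupation", "weaver")],
}

# "m1" exists only because someone asserted that p1 and p2 are the same person.
merged_from = {"m1": ["p1", "p2"]}

merged_view = aggregate("m1", direct, merged_from)
single_view = aggregate("p1", direct, merged_from)
```

Nothing is copied into the merged persona here; its view is computed from its sources each time, much as a feed aggregator reads from the feeds it follows.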

We could do some clever things when aggregating assertions, such as comparing the statements to see if they’re contradictory, but (at least to begin with) I think we might just rely on people-power to correct mistaken assertions.
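
The simplest version of that comparison would be something like the sketch below, which flags any property with more than one distinct value. That is a deliberately naive assumption on my part: plenty of properties (occupation, residence) can legitimately have several values, so a real check would need to know which properties are single-valued.

```python
from collections import defaultdict


def conflicts(assertions):
    """Return the properties that have more than one distinct value.

    assertions is a list of (property, value) pairs; the shape is an
    assumption made for this sketch.
    """
    values = defaultdict(set)
    for prop, value in assertions:
        values[prop].add(value)
    return {prop: vals for prop, vals in values.items() if len(vals) > 1}


# Invented sample: assertions gathered from two personas claimed to be one person.
combined = [
    ("birth_year", 1820),
    ("birth_year", 1821),
    ("occupation", "weaver"),
]

flagged = conflicts(combined)
```

A flag like this would be a prompt for a human to look at the merge, not an automatic verdict that it is wrong.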

Versioning is going to be interesting. I’ll have to think about that some more.
