Re: Partitioning overlapping markup

Hey Jeni,

I’d offer a small clarification, which doesn’t affect the gist of your post but perhaps shows a bit more of the implications of Bert’s work. His routine (as I understand it) transforms one well-formed-but-inadequate representation of overlap, namely “milestone” notation (it could be “Trojan” milestones, as here), into a different representation: aligned-but-segmented XML elements.

In Bert’s example, we start with

<p>Dorothy said: <q sID="q1"/>...</p>
<p>...<q eID="q1"/> and that was all.</p>

and we get

<p>Dorothy said:
  <q id="q1.0" next="q1.1">...</q>
</p>
<p>
  <q id="q1.1" prev="q1.0">...</q>
  and that was all.
</p>

(with apologies for the cosmetic whitespace.)
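
To make the direction of the transformation concrete, here is a minimal Python sketch of the milestone-to-segment step. It is not Bert’s routine, only my own illustration under some loud assumptions: the function name `segment_milestones` is hypothetical, the block-level elements (the `p`s) are direct children of the root and are treated as first class, the milestones are direct children of those blocks, and only one range is open at a time.

```python
import xml.etree.ElementTree as ET


def _append_text(parent, text):
    """Append text to parent, after its last child if it has one."""
    if not text:
        return
    if len(parent):
        last = parent[-1]
        last.tail = (last.tail or "") + text
    else:
        parent.text = (parent.text or "") + text


def segment_milestones(root):
    """Rewrite sID/eID milestone pairs as linked segments, in place.

    Assumption (not from the original post): blocks are direct children
    of root, and at most one milestone range is open at any point.
    """
    open_range = None  # (tag, range id, segment index, segment element)
    for block in list(root):
        # Flatten the block into an ordered run of text and child elements.
        items = [block.text]
        for child in list(block):
            items.extend([child, child.tail])
            child.tail = None
            block.remove(child)
        block.text = None

        target = block
        if open_range is not None:
            # A range opened in an earlier block continues in this one.
            tag, rid, n, prev_seg = open_range
            seg = ET.SubElement(
                block, tag, {"id": f"{rid}.{n + 1}", "prev": f"{rid}.{n}"}
            )
            prev_seg.set("next", f"{rid}.{n + 1}")
            open_range = (tag, rid, n + 1, seg)
            target = seg

        for item in items:
            if item is None or isinstance(item, str):
                _append_text(target, item)
            elif "sID" in item.attrib:
                # Start milestone: open the first segment of a new range.
                rid = item.get("sID")
                seg = ET.SubElement(block, item.tag, {"id": f"{rid}.0"})
                open_range = (item.tag, rid, 0, seg)
                target = seg
            elif "eID" in item.attrib:
                # End milestone: later content goes back into the block.
                open_range = None
                target = block
            else:
                target.append(item)
    return root


if __name__ == "__main__":
    doc = ET.fromstring(
        '<doc><p>Dorothy said: <q sID="q1"/>...</p>'
        '<p>...<q eID="q1"/> and that was all.</p></doc>'
    )
    print(ET.tostring(segment_milestones(doc), encoding="unicode"))
```

Note that the sketch hard-codes the decision that the blocks win and the quotation gets chopped up. Picking which hierarchy is dominant is exactly the part a general solution cannot assume.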

The reverse direction, of course, is much easier to manage … although in both cases, you have the problem of knowing which elements are first class and which are subordinate.
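
For comparison, a rough Python sketch of that reverse direction, again my own illustration rather than anyone’s published code. The name `milestones_from_segments` is hypothetical, and I am assuming that a segment is any element carrying a `prev` or `next` attribute, that its `id` has the form `rangeid.n`, and that segments are not nested directly inside one another.

```python
import xml.etree.ElementTree as ET


def _append_text(parent, text):
    """Append text to parent, after its last child if it has one."""
    if not text:
        return
    if len(parent):
        last = parent[-1]
        last.tail = (last.tail or "") + text
    else:
        parent.text = (parent.text or "") + text


def _is_segment(elem):
    # Assumption: only linked segments carry prev/next. An element whose
    # range fits inside one block has neither and is left untouched.
    return "prev" in elem.attrib or "next" in elem.attrib


def milestones_from_segments(root):
    """Replace linked segments with sID/eID milestones, in place."""
    for block in list(root.iter()):
        if not any(_is_segment(child) for child in block):
            continue
        # Flatten the block into an ordered run of text and child elements.
        items = [block.text]
        for child in list(block):
            items.extend([child, child.tail])
            child.tail = None
            block.remove(child)
        block.text = None

        for item in items:
            if item is None or isinstance(item, str):
                _append_text(block, item)
            elif _is_segment(item):
                rid = item.get("id").rsplit(".", 1)[0]
                if "prev" not in item.attrib:
                    # First segment of the range: emit the start milestone.
                    ET.SubElement(block, item.tag, {"sID": rid})
                # Splice the segment's content directly into the block.
                _append_text(block, item.text)
                for inner in list(item):
                    tail, inner.tail = inner.tail, None
                    block.append(inner)
                    _append_text(block, tail)
                if "next" not in item.attrib:
                    # Last segment of the range: emit the end milestone.
                    ET.SubElement(block, item.tag, {"eID": rid})
            else:
                block.append(item)
    return root
```

Each segment can be handled locally, with no state carried from one block to the next, which is one way of seeing why this direction is the easier one.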

Since milestones seem to be the current favorite kluge for overlap, but (as you note) the segmented-aligned form is what we need for display in SGML/XML rendering systems, a solution to this should be particularly useful.

I’ve invited Bert to provide updates at http://www.lmnlwiki.org.
