<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xml:base="http://www.jenitennison.com/blog" xmlns:dc="http://purl.org/dc/elements/1.1/">
<channel>
 <title>overlapping markup</title>
 <link>http://www.jenitennison.com/blog/taxonomy/term/9</link>
 <description>The taxonomy view with a depth of 0.</description>
 <language>en</language>
<item>
 <title>Partitioning overlapping markup</title>
 <link>http://www.jenitennison.com/blog/node/27</link>
 <description>&lt;p&gt;&lt;a href=&quot;http://www.piez.org/&quot; title=&quot;Wendell&#039;s Home Page&quot;&gt;Wendell Piez&lt;/a&gt; forwarded me an interesting poster by &lt;a href=&quot;http://www.huygensinstituut.knaw.nl/index.php?option=com_content&amp;amp;task=view&amp;amp;id=120&amp;amp;Itemid=57&quot; title=&quot;Bert Van Elsacker&quot;&gt;Bert Van Elsacker&lt;/a&gt; on automatic fragmentation of overlapping structures. That&amp;#8217;s taking something like:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;bold&amp;gt; this is bold &amp;lt;italic&amp;gt; and italic &amp;lt;/bold&amp;gt; text &amp;lt;/italic&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;and turning it into something well-formed, like:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;bold&amp;gt; this is bold &amp;lt;italic&amp;gt; and italic &amp;lt;/italic&amp;gt;&amp;lt;/bold&amp;gt;&amp;lt;italic&amp;gt; text &amp;lt;/italic&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;When you do this, you have to decide which elements can be split and which can&amp;#8217;t, and their relative priorities. Wendell suggested that perhaps Creole might help to do this. What I have been thinking about is using Creole to add annotations to markup (something like: you add attributes to the Creole patterns and they get copied on to the matched ranges, or are used to create new ranges), but I haven&amp;#8217;t done that yet. Actually, I think you probably want a different kind of language to do it (&lt;a href=&quot;http://blog.jclark.com/2007/04/do-we-need-new-kind-of-schema-language.html&quot; title=&quot;James Clark: Do we need a new kind of schema language?&quot;&gt;a new kind of schema language&lt;/a&gt;, as James Clark suggested), because the way in which you break up overlapping structures has a lot to do with how you&amp;#8217;re going to process them.&lt;/p&gt;

&lt;!--break--&gt;

&lt;p&gt;I&amp;#8217;m reminded of the paper&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Sperberg-McQueen, C. M., David Dubin, Claus Huitfeldt and Allen Renear. “&lt;a href=&quot;http://www.idealliance.org/papers/extreme/proceedings/html/2002/CMSMcQ01/EML2002CMSMcQ01.html&quot;&gt;Drawing inferences on the basis of markup.&lt;/a&gt;” In Proceedings of Extreme Markup Languages 2002. &lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;in which (based on my memory of the talk) they discuss how different elements allow you to make different assertions about the text they contain, and consequently can be split in different ways. For example, a &lt;code&gt;&amp;lt;paragraph&amp;gt;&lt;/code&gt; element can&amp;#8217;t be split into two &lt;code&gt;&amp;lt;paragraph&amp;gt;&lt;/code&gt; elements without changing the meaning of the document, whereas a &lt;code&gt;&amp;lt;bold&amp;gt;&lt;/code&gt; element can be split into two &lt;code&gt;&amp;lt;bold&amp;gt;&lt;/code&gt; elements with no problems because it&amp;#8217;s really indicating &amp;#8220;these characters are bold&amp;#8221; rather than &amp;#8220;this is a bold phrase&amp;#8221;.&lt;/p&gt;

&lt;p&gt;You can take a purist view (which would usually entail splitting hardly any elements, since most elements &lt;em&gt;do&lt;/em&gt; mark up a range of text rather than the individual characters they contain), but I think the main reason you want to do this fragmentation is for presentation. And in that context, the notional semantics of the elements don&amp;#8217;t really matter: what matters is how they&amp;#8217;re styled. For example, a &lt;code&gt;&amp;lt;comment&amp;gt;&lt;/code&gt; element, marking up a range of text that has been commented on, might not be splittable at a theoretical level, but if you&amp;#8217;re going to render it simply by turning the background yellow, then in fact you &lt;em&gt;can&lt;/em&gt; split it for that purpose.&lt;/p&gt;
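
&lt;p&gt;To make that concrete with a made-up overlap (the text and element names here are only for illustration), something like:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;comment&amp;gt; this is commented &amp;lt;italic&amp;gt; and italic &amp;lt;/comment&amp;gt; text &amp;lt;/italic&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;could, for display purposes, be fragmented into:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;comment&amp;gt; this is commented &amp;lt;/comment&amp;gt;&amp;lt;italic&amp;gt;&amp;lt;comment&amp;gt; and italic &amp;lt;/comment&amp;gt; text &amp;lt;/italic&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Assuming the comment is styled with nothing more than a background colour, both fragments are highlighted and the rendering is the same as if the comment had never been split.&lt;/p&gt;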

&lt;p&gt;Since it&amp;#8217;s related to presentation, I wonder whether you could use a (simplified) CSS stylesheet to provide both the fragmentation and the style. Block-level elements (&lt;code&gt;display: block;&lt;/code&gt;) couldn&amp;#8217;t be split whereas inline elements could. Elements that have the box model properties (margin, padding &amp;amp; borders) can&amp;#8217;t be split, or, if they are, you need to mark the fragments as &amp;#8220;left&amp;#8221;, &amp;#8220;middle&amp;#8221; and &amp;#8220;right&amp;#8221;, and only apply the &lt;em&gt;left&lt;/em&gt; margin/padding/border to the &amp;#8220;left&amp;#8221; fragment, and similarly with the right.&lt;/p&gt;
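
&lt;p&gt;As a rough sketch of the kind of stylesheet I have in mind (the &lt;code&gt;&amp;lt;note&amp;gt;&lt;/code&gt; element and the &lt;code&gt;part&lt;/code&gt; attribute are made up for illustration, and I&amp;#8217;m assuming the fragmenter would add &lt;code&gt;part&lt;/code&gt; to each fragment of a split element):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;/* Block-level: never split */
paragraph { display: block; }

/* Inline with no box properties: split freely */
bold    { font-weight: bold; }
italic  { font-style: italic; }
comment { background-color: yellow; }

/* Inline with box properties (hypothetical element): splittable,
   but each fragment keeps only its outer edges */
note { border: 1px solid gray; padding: 0 0.25em; }
note[part=left], note[part=middle]  { border-right: none; padding-right: 0; }
note[part=right], note[part=middle] { border-left: none; padding-left: 0; }
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The same rules would then tell the fragmenter which elements it may split and tell the renderer how to style the resulting fragments.&lt;/p&gt;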

&lt;p&gt;It wouldn&amp;#8217;t be a general purpose transformation mechanism, but it would be darned useful!&lt;/p&gt;
</description>
 <comments>http://www.jenitennison.com/blog/node/27#comments</comments>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/7">creole</category>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/9">overlapping markup</category>
 <pubDate>Mon, 11 Jun 2007 21:36:09 +0100</pubDate>
 <dc:creator>Jeni</dc:creator>
 <guid isPermaLink="false">27 at http://www.jenitennison.com/blog</guid>
</item>
<item>
 <title>A Creole by any other name...</title>
 <link>http://www.jenitennison.com/blog/node/6</link>
 <description>&lt;p&gt;Argh. I&amp;#8217;ve been contacted by the guys at &lt;a href=&quot;http://www.wikicreole.org&quot; title=&quot;Creole Wiki Markup language&quot;&gt;WikiCreole&lt;/a&gt; who want me to change the name of &lt;a href=&quot;http://www.lmnlwiki.org&quot; title=&quot;Creole schema language&quot;&gt;Creole&lt;/a&gt;. What should I do? Not only is &amp;#8220;Creole&amp;#8221; a great name for a schema language that deals with concurrent markup, but it&amp;#8217;s a great acronym too (Composable regular expressions for overlapping languages etc.)&lt;/p&gt;

&lt;p&gt;I did Google when I first came up with the name in August 2006, but didn&amp;#8217;t discover WikiCreole (unsurprisingly, since it was only coined in July 2006 itself). But now far more people know, care about and use WikiCreole than Creole grammars. So any suggestions for alternative names?&lt;/p&gt;

&lt;!--break--&gt;
</description>
 <comments>http://www.jenitennison.com/blog/node/6#comments</comments>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/7">creole</category>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/9">overlapping markup</category>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/8">schema</category>
 <pubDate>Wed, 25 Apr 2007 21:09:28 +0100</pubDate>
 <dc:creator>Jeni</dc:creator>
 <guid isPermaLink="false">6 at http://www.jenitennison.com/blog</guid>
</item>
</channel>
</rss>
