Multiple Hierarchic Structures


Date:       Fri, 11 Jun 1999 15:18:54 +0100 (BST)
From:       Richard Tobin <richard@cogsci.ed.ac.uk>
To:         xml-dev@ic.ac.uk
Subject:    RE: question for a friend



[Kay Michael <Michael.Kay@icl.com>]
> But yes, there are
> situations where you want to impose multiple hierarchic structures over the
> same linear data. An obvious one is logical structure vs layout structure
> (e.g. sections and pages); another is logical structure vs change history and
> authorship. 

> I'm only aware of two answers:

----------- viz.,

1. Represent one of the hierarchies using empty tags, e.g. <PAGEBREAK/>.
2. Split the units of one of the hierarchies to achieve proper nesting:

<SPEECH>
<SPEAKER>OTHELLO</SPEAKER>
<LINE>Most potent, grave, and reverend signiors,</LINE>
<LINE>My very noble and approved good masters,</LINE>
<LINE>That I have ta'en away this old man's daughter,</LINE>
<LINE>It is most true; true, I have married her:</LINE>
<LINE>The very head and front of my offending</LINE>
<LINE>Hath this extent, no more. Rude am I in my speech,</LINE>
<LINE>And little bless'd with the soft phrase of peace:</LINE>
<LINE>For since these arms of mine had seven years' pith,</LINE>
<LINE>Till now some nine moons wasted, they have used</LINE>
<LINE>Their dearest action in the tented field,</LINE>
<LINE>And little of this great world can I speak,</LINE>
<LINE>More than pertains to feats of broil and battle,</LINE>
<LINE>And therefore little shall I grace my cause</LINE>
<LINE>In speaking for myself. Yet, by your gracious patience,</LINE>
<LINE>I will a round unvarnish'd tale deliver</LINE>
<LINE>Of my whole course of love; what drugs, what charms,</LINE>
<LINE>What conjuration and what mighty magic,</LINE>
<LINE>For such proceeding I am charged withal,</LINE>
<LINE PART="1">I won his daughter.</LINE>
</SPEECH>

<SPEECH>
<SPEAKER>BRABANTIO</SPEAKER>
<LINE PART="2">A maiden never bold;</LINE>
<LINE>Of spirit so still and quiet, that her motion</LINE>
<LINE>Blush'd at herself; and she, in spite of nature,</LINE>
<LINE>Of years, of country, credit, every thing,</LINE>
<LINE>To fall in love with what she fear'd to look on!</LINE>
<LINE>It is a judgment maim'd and most imperfect</LINE>
<LINE>That will confess perfection so could err</LINE>
<LINE>Against all rules of nature, and must be driven</LINE>
<LINE>To find out practises of cunning hell,</LINE>
<LINE>Why this should be. I therefore vouch again</LINE>
<LINE>That with some mixtures powerful o'er the blood,</LINE>
<LINE>Or with some dram conjured to this effect,</LINE>
<LINE>He wrought upon her.</LINE>
</SPEECH>

---------end ----
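
A consumer of the split-unit encoding has to stitch the pieces back
together when it wants the verse lines rather than the speeches.  A
minimal sketch of that step follows, under stated assumptions: the
speeches are wrapped in a hypothetical SCENE element (not shown in the
excerpt), PART="1" is taken to open a shared line, and any other PART
value is taken to continue it.  The function name verse_lines is
invented for illustration.

```python
import xml.etree.ElementTree as ET

# Abridged copy of the excerpt above; the SCENE wrapper is an assumption.
SAMPLE = """<SCENE>
<SPEECH>
<SPEAKER>OTHELLO</SPEAKER>
<LINE>For such proceeding I am charged withal,</LINE>
<LINE PART="1">I won his daughter.</LINE>
</SPEECH>
<SPEECH>
<SPEAKER>BRABANTIO</SPEAKER>
<LINE PART="2">A maiden never bold;</LINE>
<LINE>Of spirit so still and quiet, that her motion</LINE>
</SPEECH>
</SCENE>"""


def verse_lines(root):
    """Yield each verse line as a list of (speaker, text) pieces.

    Assumed reading of PART: "1" starts a line shared between speeches,
    any other value continues the line currently being assembled.
    """
    pending = None  # pieces of a shared line collected so far
    for speech in root.iter("SPEECH"):
        speaker = speech.findtext("SPEAKER")
        for line in speech.findall("LINE"):
            piece = (speaker, line.text)
            part = line.get("PART")
            if part is None:
                # An unsplit line closes any shared line in progress.
                if pending:
                    yield pending
                    pending = None
                yield [piece]
            elif part == "1":
                if pending:
                    yield pending
                pending = [piece]
            else:
                if pending is None:
                    pending = []
                pending.append(piece)
    if pending:
        yield pending


if __name__ == "__main__":
    for pieces in verse_lines(ET.fromstring(SAMPLE)):
        print(" / ".join(f"{who}: {text}" for who, text in pieces))
```

The second hierarchy is recoverable here only because of the PART
convention; the XML nesting itself records the speeches alone.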

Another solution, for some purposes, is to have two documents, one for
each hierarchy.  Of course, you don't want to duplicate the data
itself.  We avoid this by using "standoff markup", which we implement
with XLinks (we have our own software to perform the transclusion
process).

The context we use this in is markup of linguistic corpora, and we
often have an ID on every word.  For the Shakespeare example it might
be something like this:

fragment of base file (othello.xml):

<w id="w20">charged</w>
<w id="w21">withal</w>
<punct id="p5">,</punct>
<w id="w22">I</w>
<w id="w23">won</w>
<w id="w24">his</w>
<w id="w25">daughter</w>
<punct id="p6">.</punct>
<w id="w26">A</w>
<w id="w27">maiden</w>
<w id="w28">never</w>
<w id="w29">bold</w>
<punct id="p7">;</punct>
<w id="w30">Of</w>
<w id="w31">spirit</w>

fragment of speech file:

<!ENTITY o SYSTEM "othello.xml">
<!ATTLIST speech 
          href      CDATA    #IMPLIED
          xml:link  CDATA    #FIXED "simple"
          show      CDATA    #FIXED "embed"
          actuate   CDATA    #FIXED "auto">

<speech id="s5" href="&o;#id(w5)..id(p6)"/>
<speech id="s6" href="&o;#id(w26)..id(p9)"/>

fragment of line file:

<line id="l10" href="&o;#id(w22)..id(p7)"/>
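
To make the range notation concrete, here is a rough sketch of how a
fragment such as id(w22)..id(p7) could be resolved against the base
file.  It is not the transclusion software mentioned above, only an
illustration under assumptions: the word and punctuation elements are
taken to be siblings under a hypothetical wrapper element named text,
a range is taken to include both endpoints in document order, and the
function name resolve is invented.

```python
import re
import xml.etree.ElementTree as ET

# Copy of the base-file fragment; the <text> wrapper is an assumption.
BASE = """<text>
<w id="w20">charged</w>
<w id="w21">withal</w>
<punct id="p5">,</punct>
<w id="w22">I</w>
<w id="w23">won</w>
<w id="w24">his</w>
<w id="w25">daughter</w>
<punct id="p6">.</punct>
<w id="w26">A</w>
<w id="w27">maiden</w>
<w id="w28">never</w>
<w id="w29">bold</w>
<punct id="p7">;</punct>
<w id="w30">Of</w>
<w id="w31">spirit</w>
</text>"""

# Matches the part of the href after '#', e.g. "id(w22)..id(p7)".
RANGE = re.compile(r"id\((\w+)\)\.\.id\((\w+)\)")


def resolve(fragment, base_root):
    """Return the sibling elements from the first id to the last, inclusive."""
    match = RANGE.fullmatch(fragment)
    if match is None:
        raise ValueError(f"unsupported fragment: {fragment!r}")
    first, last = match.groups()
    selected = []
    inside = False
    for element in base_root:
        if element.get("id") == first:
            inside = True
        if inside:
            selected.append(element)
            if element.get("id") == last:
                return selected
    raise ValueError(f"range {first}..{last} not found in base document")


if __name__ == "__main__":
    root = ET.fromstring(BASE)
    tokens = resolve("id(w22)..id(p7)", root)
    print(" ".join(token.text for token in tokens))
```

Because each hierarchy file holds only pointers, the line ranges and
the speech ranges are free to overlap without either document being
ill-formed.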

-- Richard

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 
981-02-3594-1