[This local archive copy is from the official and canonical URL, http://journals.ecs.soton.ac.uk/xml4j/xlinkexperience.html; please refer to the canonical source document if possible. Some font characteristics have been changed in this copy to enable viewing on low-end browsers.]
An initial objective was to re-create the Knit application, designed by the Language Technology Group at Edinburgh as part of the LTXML package, which performed ‘transclusion’ operations to replace inline, simple id()-based links in an XML file with the data objects to which they were linked. The ExtremeKnit class uses the XLink package to extend that functionality to extended links in XML documents using (almost) the full XPointer semantics.
Beyond these direct aims, it is hoped to use this package as a testbed and demonstrator for ideas about open hypermedia. In particular, it is intended to address the current debate about separating behaviours into logically distinct components [Nurnberg], to explore the extent to which semantics can pass between a general link processor and an application [Carr], and to create a model for the life cycle of the followlink process [Trigg].
This work is at an early stage, and is currently being tested in house.
To obtain the software, please contact the author on the email above.
The original XML file

<?xml version="1.0"?>
<!DOCTYPE XLinkTest SYSTEM "http://journals.ecs.soton.ac.uk/xml4j/generic.dtd">
<XLinkTest>
  <body>
    <div>
      <h1>Reasons to be Cheerful</h1>
      <p>One, two three. Well, the hypertext conference is a good one!</p>
      <p>I think that open hypertext has a lot going for it. There are still a lot of people in the WWW community that don't understand hypertext very well. I could become a bit of a salesman!</p>
    </div>
  </body>
  <linkbases>
    <linkbase href="http://journals.ecs.soton.ac.uk/xml4j/linkbase.xml"/>
  </linkbases>
</XLinkTest>
The linkbase

<?xml version="1.0"?>
<!DOCTYPE linkbase SYSTEM "http://journals.ecs.soton.ac.uk/xml4j/linkbase.dtd">
<linkbase>
  <link type='generic'>
    <locator href="#root().string(all,'hypertext',1,9)" role="source" title="hypertext gloss"/>
    <locator href="http://journals.ecs.soton.ac.uk/xml4j/htgloss.xml#id(hypertext)" role="destination" title="Dr Frankenstein"/>
  </link>
  <link type='generic'>
    <locator href="#root().string(all,'WWW',1,3)" role="source" title="WWW gloss"/>
    <locator href="http://journals.ecs.soton.ac.uk/xml4j/htgloss.xml#id(WWW)" role="destination" title="Mr WWW"/>
    <locator href="http://journals.ecs.soton.ac.uk/xml4j/htgloss.xml#id(w3c)" role="destination" title="Home of Hypertext"/>
  </link>
</linkbase>
The glossary file

<?xml version="1.0"?>
<!DOCTYPE glossary [
  <!ELEMENT glossary (gloss)+>
  <!ELEMENT gloss (#PCDATA)>
  <!ATTLIST gloss id ID #REQUIRED>
]>
<glossary>
  <!-- some trivial hypertext thingies -->
  <gloss id="hypertext">(One of Vanevar Bush's better ideas!)</gloss>
  <gloss id="w3c">(Visit W3C - The home of hypertext on the Web!)</gloss>
  <gloss id="WWW">[Insanely good idea of Tim BL]</gloss>
</glossary>
The result of ExtremeKnitware -A -i

<?xml version="1.0"?>
<!DOCTYPE XLinkTest SYSTEM "http://journals.ecs.soton.ac.uk/xml4j/generic.dtd" >
<XLinkTest>
  <body>
    <div>
      <h1>Reasons to be Cheerful</h1>
      <p>One, two three. Well, the hypertext(One of Vanevar Bush's better ideas!)(Visit Webcosm - The home of hypertext on the Web!) conference is a good one!</p>
      <p>I think that open hypertext(One of Vanevar Bush's better ideas!)(Visit Webcosm - The home of hypertext on the Web!) has a lot going for it. There are still a lot of people in the WWW community that don't understand hypertext(One of Vanevar Bush's better ideas!)(Visit Webcosm - The home of hypertext on the Web!) very well. I could become a bit of a salesman!</p>
    </div>
  </body>
  <linkbases xml:link="group">
    <linkbase href="http://journals.ecs.soton.ac.uk/xml4j/linkbase.xml" xml:link="document"/>
  </linkbases>
</XLinkTest>
A sample XSL rule to display the linked glosses

<rule>
  <target-element type="gloss"/>
  <SPAN font-weight="bold" color="green">
    <children/>
  </SPAN>
</rule>
The result of running the MSXL processor on the output of ExtremeKnitware -A and the XSL fragment shown above.

Reasons to be Cheerful
One, two three. Well, the ACM hypertext (One of Vanevar Bush's better ideas!) conference is a good one!
I think that open hypertext (One of Vanevar Bush's better ideas!) has a lot going for it. There are still a lot of people that don't understand hypertext (One of Vanevar Bush's better ideas!) very well. I could become a bit of a WWW [Insanely good idea of Tim BL] salesman!
Doc d=parse(inputXMLFile);

The user can then process all the links by iterating through them:

Vector allLinks=getLinks(d);
for(int i=0; i<allLinks.size(); i++){
    XLink xl=(XLink)allLinks.elementAt(i);
    //process each link
}

And then process each link by iterating through the anchors:

Vector anchors=xl.getAnchors();
for(int j=0; j<anchors.size(); j++){
    XAnchor xa=(XAnchor)anchors.elementAt(j);
    //process each anchor
}

And then process each anchor according to its role, iterating through the pointed-at nodes if necessary:

String role=xa.getRole();
if(role.indexOf("source")>=0){
    Pointed ps=xa.getPoint();
    for(int k=0; k<ps.size(); k++){
        Item it=(Item)ps.elementAt(k);
        Node n=it.node;
        //insert a marker immediately before each pointed-at node
        Parent p=(Parent)n.getParentNode();
        Text t=d.createTextNode("Look here! ->");
        p.insertBefore(t,n);
    }
}
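For readers without the package to hand, the innermost step of the loop above can be tried with the standard JAXP DOM classes alone. What follows is a minimal sketch under stated assumptions: it uses org.w3c.dom instead of the xml4j Parent and Item classes, it selects the "pointed-at" nodes by element name instead of by XPointer, and the names LookHereDemo and markedText are hypothetical. It also assumes the chosen element is not the document root, since a text node cannot be inserted directly under the Document.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class LookHereDemo {
    // Parses an XML string with the JDK's default DOM parser.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample XML", e);
        }
    }

    // Inserts a marker text node before every element named tagName and
    // returns the text content of the whole document afterwards.
    static String markedText(String xml, String tagName, String marker) {
        Document doc = parse(xml);
        // Copy the live NodeList so the loop works on a fixed snapshot.
        NodeList live = doc.getElementsByTagName(tagName);
        List<Node> targets = new ArrayList<>();
        for (int i = 0; i < live.getLength(); i++) {
            targets.add(live.item(i));
        }
        for (Node n : targets) {
            // Same move as p.insertBefore(t, n) in the article's loop.
            n.getParentNode().insertBefore(doc.createTextNode(marker), n);
        }
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) {
        System.out.println(
            markedText("<body><p>a</p><p>b</p></body>", "p", "Look here! ->"));
    }
}
```

The marker is created by the document that will own it, which is why the article's code also asks the parsed document for its text node.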
Doc d=parse(inputXMLFile);
XLinkCollection allLinks=getLinks(d);

In a navigation-style hypertext, the imperative is to find all the source anchors and connect them to their destinations. To support this, the XLinkCollection can be scanned for all the sources as follows:

Enumeration sources=allLinks.getAnchorsByRole("source").elements();

And then each source anchor can be processed to find its matching destinations:

while(sources.hasMoreElements()){
    XAnchor theSource=(XAnchor)sources.nextElement();
    XLink xl=theSource.getLink();
    Vector dests=xl.getAnchorsByRole("destination");
    //Knit theSource together with all the dests. Somehow.
}

Alternatively, all the other anchors can be enumerated by the expression

dests=theSource.getSiblings();
The following usage allows the programmer to (almost) completely abandon the multiple layers of iteration over the XLinks, XAnchors and Items, reducing it instead to a single loop over every possible combination of pairs of matching source and destination link anchor points.
Enumeration combos=allLinks.getNodesComboByRoles("source","dest");
while(combos.hasMoreElements()){
    //the enumeration alternates source and destination nodes
    Node src=(Node)combos.nextElement();
    Node dest=(Node)combos.nextElement();
    //Now do the hard work at the DOM level
    Parent p=(Parent)src.getParentNode();
    Text newText=d.createTextNode("["+dest.getData()+"]");
    //place the destination text directly after the source node
    p.insertAfter(newText, src);
}

The drawback with this mechanism is that the collection implementation is unclear (please, not another Vector!), relying on the programmer's discipline to keep the alternating node roles synchronised. It is also not clear why combinations of two should not be extended to three or more, but this may be better handled by application-specific code.
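One way to avoid relying on alternating elements is to hand back explicit pair objects. The following is only a sketch of that idea and not part of the package: Anchor, Link, Pair and pairsByRoles are hypothetical stand-ins for XAnchor, XLink and getNodesComboByRoles, nodes are represented by plain strings, and roles are matched by substring in the same spirit as the role.indexOf("source") test used earlier.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class PairDemo {
    // Hypothetical stand-in for an XAnchor: a role plus the nodes it resolved to.
    static final class Anchor {
        final String role;
        final List<String> nodes;
        Anchor(String role, String... nodes) {
            this.role = role;
            this.nodes = Arrays.asList(nodes);
        }
    }

    // Hypothetical stand-in for an XLink: just a list of anchors.
    static final class Link {
        final List<Anchor> anchors;
        Link(Anchor... anchors) {
            this.anchors = Arrays.asList(anchors);
        }
    }

    // One matched source/destination combination.
    static final class Pair {
        final String source;
        final String destination;
        Pair(String source, String destination) {
            this.source = source;
            this.destination = destination;
        }
        @Override
        public String toString() {
            return source + "->" + destination;
        }
    }

    // Flattens links, anchors and nodes into a single list of pairs.
    // Pairs are only formed between anchors of the same link.
    static List<Pair> pairsByRoles(List<Link> links, String srcRole, String destRole) {
        List<Pair> out = new ArrayList<>();
        for (Link link : links) {
            for (Anchor s : link.anchors) {
                if (!s.role.contains(srcRole)) continue;
                for (Anchor d : link.anchors) {
                    if (d == s || !d.role.contains(destRole)) continue;
                    for (String sn : s.nodes) {
                        for (String dn : d.nodes) {
                            out.add(new Pair(sn, dn));
                        }
                    }
                }
            }
        }
        return out;
    }
}
```

With this shape the caller writes a single loop over the returned list and cannot get the source and destination roles out of step. The question of combinations of three or more remains open, as it does for the enumeration.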
A variant implementation (getNodesAndXAnchorsComboByRoles) alternates the Nodes with their containing XAnchors, allowing the API user to find more information about the linking context of the Node in question. From the XAnchor, the programmer can call the getSiblingNodes(thisNode) method to find all the other nodes that this anchor shares, or use the getLink() method to find the XLink to which the anchor belongs, and hence any other attributes of this or any of the other XAnchors. (In an ideal world, the owning XAnchor would be an attribute of each pointed Item.)
The naïve model (which has been presented thus far) resolves every link, evaluating every XPointer on every locator of every link, so that the complete forest of DOM Nodes is instantly available. For some limited applications this is fine, but more general processing may render this unusable.
Consider a typical Microcosm session: it contains many active linkbases, each with many dozens (or even hundreds) of generic, keyword links whose source anchor contains an XPointer of the form #root().string(all, keyword, 1, 7). Any particular document may only contain a small fraction of the phrases which trigger those generic links, and so the majority of source links for the linkbases will resolve to null places. Effectively, those links are inactive for that particular document. However, our naïve approach will force all the destination anchors to be resolved irrespective of whether they are needed or not. This will mean parsing and storing hundreds of irrelevant documents. By contrast, the required behaviour for this scenario is to first attempt to resolve the XAnchor with the "source" role, and only if it succeeds to resolve the remaining anchors.
In this scenario, the source anchor acts as a guard for the link. Other circumstances may act as further guards: in the Microcosm system all link processing acts on the current document. This would imply that links whose source anchors lie outside the current document are also ineligible for processing. Other more complex rules will apply for other applications.
Even if the link is found to be valid in the above sense, it may be sensible to defer the resolution of the other anchors (such as the destinations) until they are actually needed. Application-specific processing may mean that many of the links are not required (for example, only a window of a scrollable document may be rendered). In the ExtremeKnit application, it would have been pointless to defer the resolution of any of the destination anchors, as they are all eventually used. However, other applications may find it advantageous to delay resolving any anchor pointing outside the scope of the local document.
Expected use:

XLinkCollection l=getLinks(document, "source");

or

XLinkCollection l=getLinks(document, "source src");
If a guard is specified, only links containing an anchor with that guard role will be eligible for use. That anchor's XPointer must also correctly resolve to a non-null position in order for the link to be useful.
Further behaviour can be predicated on the guard role: if the LOCAL behaviour is requested then the guard must resolve to some data in the current document. If the DEFER behaviour is requested then only the guard will be resolved. The XPointers of other anchors will be resolved at the point when they are needed. This behaviour will be part of the XAnchor class, invisible to the user of the API.
Expected use: XLinkCollection l=getLinks(document, "src", XAnchor.DEFER);
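The guard and DEFER behaviours described above can be pictured with a small lazy-resolution sketch. It is an illustration only, under several assumptions: LazyAnchor, guardedLinks and sampleLinks are hypothetical names, a link is modelled as a plain list of anchors, resolved nodes are plain strings, and a Supplier stands in for evaluating an XPointer. A static counter records how many locators were actually evaluated.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Supplier;

class DeferDemo {
    // Counts how many locators have actually been evaluated.
    static int resolutions = 0;

    // Hypothetical anchor whose locator is evaluated only on first request.
    static final class LazyAnchor {
        final String role;
        private final Supplier<List<String>> resolver;
        private List<String> nodes; // stays null until first requested

        LazyAnchor(String role, Supplier<List<String>> resolver) {
            this.role = role;
            this.resolver = resolver;
        }

        List<String> getNodes() {
            if (nodes == null) {
                resolutions++;
                nodes = resolver.get();
            }
            return nodes;
        }
    }

    // Keeps only the links whose guard anchor resolves to at least one node.
    // Anchors with other roles are left unresolved.
    static List<List<LazyAnchor>> guardedLinks(List<List<LazyAnchor>> links, String guardRole) {
        List<List<LazyAnchor>> kept = new ArrayList<>();
        for (List<LazyAnchor> link : links) {
            for (LazyAnchor a : link) {
                if (a.role.contains(guardRole) && !a.getNodes().isEmpty()) {
                    kept.add(link);
                    break;
                }
            }
        }
        return kept;
    }

    // Two sample links: one whose source matches, one whose source is empty.
    static List<List<LazyAnchor>> sampleLinks() {
        List<LazyAnchor> hit = Arrays.asList(
            new LazyAnchor("source", () -> Arrays.asList("hypertext")),
            new LazyAnchor("destination", () -> Arrays.asList("gloss: hypertext")));
        List<LazyAnchor> miss = Arrays.asList(
            new LazyAnchor("source", () -> Collections.<String>emptyList()),
            new LazyAnchor("destination", () -> Arrays.asList("gloss: sheep")));
        return Arrays.asList(hit, miss);
    }
}
```

Filtering the two sample links evaluates only the two guards; the destination of the surviving link is evaluated the first time getNodes() is called on it, and the destination of the rejected link is never evaluated at all.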
Next, the FREEDOM interface was used with the aelfred parser to implement the Knit program from the ground up. This involved interpreting the XPointers by hand, but as the original Knit program only worked for the #id() forms this was not too great a problem.
In the middle of building up a fuller XPointer class, I discovered that IBM's xml4java already provided XPointer support, and so I switched to that product. Within that environment I have tried to keep strictly to the DOM interface, although some of the native facilities have been resorted to on occasion. Some changes have been made to the native XPointer package, in particular those relating to the parsing and handling of string() terms. Because Microcosm relies so much on the ability to address within-node data, it was important to add proper support for the string term to the XPointer package. This has been done, although the current implementation does not allow individual string matches to straddle node boundaries. Support for span() and origin() has still not been provided.
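The consequence of matching strings within single nodes can be seen in a small experiment. The sketch below is not the modified XPointer package; it is a hedged illustration using the standard JAXP DOM classes, with the hypothetical names StringTermDemo, countInTree and countMatches. It searches each text node independently, so an occurrence that is split by markup goes unnoticed.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class StringTermDemo {
    // Counts non-overlapping occurrences of keyword, looking inside each
    // text node separately (matches never straddle node boundaries).
    static int countInTree(Node node, String keyword) {
        if (keyword.isEmpty()) {
            throw new IllegalArgumentException("keyword must not be empty");
        }
        int count = 0;
        if (node.getNodeType() == Node.TEXT_NODE) {
            String data = node.getNodeValue();
            int at = data.indexOf(keyword);
            while (at >= 0) {
                count++;
                at = data.indexOf(keyword, at + keyword.length());
            }
        }
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            count += countInTree(c, keyword);
        }
        return count;
    }

    // Convenience wrapper: parse an XML string and count matches in it.
    static int countMatches(String xml, String keyword) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return countInTree(doc.getDocumentElement(), keyword);
        } catch (IllegalArgumentException e) {
            throw e;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample XML", e);
        }
    }
}
```

For an input such as `<p>open hyper<b>text</b> and hypertext</p>`, only the second occurrence is found, because the first is divided between two text nodes.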
In order to make the results of resolving a within-node XPointer generally useful in the situation where the document tree is to be processed by in-place transformation, the following strategy (shown graphically in Figure 1) was evolved.
Figure 1a: The word hypertext, the subject of a notional string XPointer #id(x).string(1,hypertext,1,9), is shown embedded in a DOM structure.
Figure 1b: The DOM structure is now rewritten so that the target string has its own separate TEXT node.
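The rewriting shown in Figure 1 can be approximated with the standard DOM method Text.splitText. The sketch below is an assumption about how such a step might look, not the package's own code: IsolateDemo, isolate and describe are hypothetical names, and describe assumes that the first child of the root element is the text node to be searched.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.Text;
import org.xml.sax.InputSource;

class IsolateDemo {
    // Splits textNode so that the first occurrence of target sits in a
    // Text node of its own. Returns that node, or null if there is no match.
    static Text isolate(Text textNode, String target) {
        int start = textNode.getData().indexOf(target);
        if (start < 0) {
            return null;
        }
        // splitText keeps the leading part in place and returns the remainder.
        Text match = (start == 0) ? textNode : textNode.splitText(start);
        if (match.getLength() > target.length()) {
            match.splitText(target.length()); // cut off whatever follows the match
        }
        return match;
    }

    // Parses xml, isolates target in the root's first text node, and
    // returns the root's children in order, each wrapped in brackets.
    static String describe(String xml, String target) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample XML", e);
        }
        Node root = doc.getDocumentElement();
        isolate((Text) root.getFirstChild(), target);
        StringBuilder sb = new StringBuilder();
        for (Node n = root.getFirstChild(); n != null; n = n.getNextSibling()) {
            sb.append('[').append(n.getNodeValue()).append(']');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(describe("<p>the hypertext conference</p>", "hypertext"));
    }
}
```

Once the target has a node of its own, an in-place transformation can wrap, replace or annotate it like any other node without disturbing the surrounding text.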
XPointer support for span() and origin().
As a specific example, imagine a document about Sheep Farming, held in an XML file at URL d. Imagine a glossary document at URL g which contains paragraphs describing the various breeds of sheep and also a set of links from mentions of each breed in d to the descriptions in g. If the hrefs include specific URIs then fine, but if not, does #root().string(1,'Merino',1,6) identify the first occurrence of the string 'Merino' in d or in g? If the former, how can a locator be specified for the glossary item? If the latter, how can a locator be specified for the data document? If specific URLs are resorted to, is it then possible to create a g that can be used with any document about Sheep Farming?
Davis, H. C. (1998) "Referential Integrity of Links in Open Hypermedia Systems", Proceedings of the Ninth ACM Conference on Hypertext, Pittsburgh.
Davis, H., W. Hall, I. Heath, G. Hill and R. Wilkins (1992) "Towards an Integrated Information Environment with Open Hypermedia Systems", in ECHT '92, Proceedings of the Fourth ACM Conference on Hypertext, Milan.
Nurnberg, P. (1997) "As we should have thought", Proceedings of the Eighth ACM Conference on Hypertext, Southampton.
Trigg, R. H. (1998) "A straw model for link traversal in open hypermedia systems", Proceedings of the Fourth Workshop on Open Hypermedia Systems at the Ninth ACM Conference on Hypertext, Pittsburgh.