SGML: XML Extended link

Subject: Re: XML questions
Date: 17 Jun 1997 23:52:58 GMT
From: cmsmcq@uic.edu (C M Sperberg-McQueen)
Newsgroup: comp.text.sgml
--------------------------------------------------------------

Lars Marius Garshol (larsga@ifi.uio.no) wrote:

: I've been reading the XML linkage specification and started wondering
: about the external links described in it. As I understood the spec,
: (which is not far, admittedly) these links will not have to reside in
: XML files that the links go to/from.

Simple links do reside in the document from which they start.
Extended links don't -- or at least, needn't.

: This leads me to wonder
: - where will they then be expected to be "declared"?

The XML Extended link element asserts the existence of a link
connecting two or more locations (perhaps one or more locations, but
there's a dispute in hypertext theology over whether single-ended
links are imaginable or not, and if imaginable whether or not they
should be countenanced). Unlike simple links, extended links need
not be located at any of the locations connected by the link.

They will, however, be located somewhere. And if by "where will they
be declared?" you mean "where will the link be asserted?", then the
answer is "in an XML document containing Extended links". If you
mean "how will the browser know that a given phrase should function
as one end of some link(s)?" then read on.

: - how will they be linked to the actual XML files? (Ie: how will the
: browser know that this linking element corresponds to that link)

The browser can know in the same way it can know *now* that a given
<a name='foo'> element in an HTML document is the target of a link
coded <a href="#foo"> -- i.e. by

  (1) displaying a document (call it D),
  (2) keeping track of the elements in D, using the data structure
      of its choice,
  (3) reading some documents (X, Y, Z, maybe also D) which contain
      link elements asserting the existence of two- or n-way links,
      and
  (4) noticing when one of the link elements in X, Y, Z points at
      an element in D.
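Steps (2) through (4) can be sketched in a few lines of Python. This is only an illustration under loud assumptions, not anything the draft prescribes: the element names (elink, loc), the attribute values xml-link="extended" and xml-link="locator", the role attribute, the plain "file#id" form of href, and the file names D.xml, N.xml and Q.xml are all made up for the example, and a real locator syntax would be richer.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: D is the displayed document, X holds the links.
D_XML = """<doc>
  <p id="p1">First paragraph.</p>
  <p id="p2">Second paragraph.</p>
</doc>"""

X_XML = """<links>
  <elink xml-link="extended" role="commentary">
    <loc xml-link="locator" href="D.xml#p2"/>
    <loc xml-link="locator" href="N.xml#ch7"/>
  </elink>
  <elink xml-link="extended" role="unrelated">
    <loc xml-link="locator" href="Q.xml#a"/>
    <loc xml-link="locator" href="N.xml#ch1"/>
  </elink>
</links>"""


def index_elements(doc_xml):
    """Step (2): keep track of the elements in D, here as a dict keyed by id."""
    root = ET.fromstring(doc_xml)
    return {el.get("id"): el for el in root.iter() if el.get("id") is not None}


def links_touching(doc_name, doc_index, link_xml):
    """Steps (3) and (4): read a link document and report every extended
    link that has an end in doc_name, with the local id and the other ends."""
    hits = []
    for link in ET.fromstring(link_xml).iter():
        if link.get("xml-link") != "extended":
            continue
        # Collect the hrefs of the locator children of this link element.
        ends = [loc.get("href") for loc in link
                if loc.get("xml-link") == "locator" and loc.get("href")]
        for end in ends:
            target_doc, _, fragment = end.partition("#")
            if target_doc == doc_name and fragment in doc_index:
                others = [e for e in ends if e != end]
                hits.append((fragment, link.get("role"), others))
    return hits


d_index = index_elements(D_XML)
found = links_touching("D.xml", d_index, X_XML)
for local_id, role, others in found:
    print(f"element {local_id!r} is one end of a {role!r} link to {others}")
```

The dictionary is just one choice of "data structure of its choice"; the point is only that the link lives in X while the element it decorates lives in D.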
A nice browser might then indicate the presence of an incoming or
outgoing link with an icon, or a special color / font / underscore
treatment.

How does the browser find X, Y, and Z? Two ways are fairly obvious;
there are probably others.

(1) The user says "Load document D, and also load documents X, Y,
and Z, because X contains the links constructed by my good friend
Eliot Kimber, showing the correspondence of each paragraph in
document D to one or more chapters in the prophecies of Nostradamus,
and Y contains Steve DeRose's elegant refutation of Kimber, and Z
contains Kimber's response to DeRose, and I don't want to miss a
single link in the chain of argument" (or words to similar effect;
XML-link will not actually require documents containing external
links to have been written by Steve DeRose and Eliot Kimber).

(2) The document D can contain an element (this is what XML-link
calls the Document element) pointing at other documents which
contain relevant links, which the browser may read to find out if
document D has any outgoing links the browser needs to know about
(and perhaps to find out about incoming links, too).

An XML browser may or may not be required by the spec to read the
external documents named by the Document element and scan through
them for links with ends in document D -- similarly, if the Document
element points at a document which itself has a Document element,
the browser may or may not be required to (a) follow the Document
links recursively up to some maximum depth, (b) notify the user and
ask for advice, (c) follow just one (or just two) recursive links,
(d) shell out and start a game of Rogue. Some people whose judgment
I trust assure me there is no consensus on this issue and that it
will therefore be left to the implementation to decide what to do.

: Can anyone give an example of how this is supposed to work?

OK.
Imagine that we have before us the following electronic documents:

  - C (the Constitution of the United States)
  - J (the Judiciary Act of 1789)
  - M (the Supreme Court decision in the case of Marbury vs. Madison,
    written by Chief Justice John Marshall; quotations of C and J
    in M are connected to C and J via simple links)
  - L (a document containing a set of links which connect the places
    in M where Marshall refers to but does not quote C and J, and
    various places where C and J are (a) consistent with each other,
    or (b) in conflict with each other, together with some modern
    legal commentary)

M has an element of the form

  <seealso xml-link="group">
    <xlinks xml-link="document" href="L.xml"/>
  </seealso>

which identifies L as a document containing relevant links. C and J
have the same thing, but also mention M, since M contains relevant
links, too:

  <seealso xml-link="group">
    <xlinks xml-link="document" href="L.xml"/>
    <xlinks xml-link="document" href="M.xml"/>
  </seealso>

Scenario 1:

  1 user loads L, reads the commentary; from time to time clicks on
    links to jump to C and J. Just like HTML, more or less.

Scenario 2:

  1 user loads document C
  2 browser detects the <seealso> element in C and asks the user
    "do you want to see the links to C from L and M? do you want
    to see the links *from* C that are defined in L and M?"
  3 user says "Yes, show me stuff from M but not stuff from L"
  4 browser reads document M and notices seventeen quotation links
    connecting quotations in M to the original passages in C
  5 browser displays a 'quoted-by' icon next to the passages in C
    quoted by M
  6 user reads, and clicks on a quoted-by icon
  7 browser loads M and shows the user where M quoted the passage
    in C that the user was just reading
  etc.

Scenario 3:

  1 user loads document J
  2 browser detects the <seealso> element in J and automatically
    (in the background) scans both L and M for links involving
    document J. Since it is actually holding M in memory, it also
    notices links involving M and L themselves (a sort of
    pre-emptive caching), though the only links involving L are in
    L itself.
  3 browser displays appropriate icons (quoted-by, paraphrased-by,
    consistent-with, in-contradiction-with, referred-to, or just a
    generic all-purpose 'something-points-at-this' icon) for each
    link to or from J, whether there is an explicit link element in
    J or the link is asserted by a link element in L or M
  4 user reads, and clicks on a link icon
  5 browser follows the link
  etc.

Scenario 4:

  1 user loads document M
  2 browser detects the <seealso> element in M and does nothing,
    because (a) the user has set the default to "Do what I tell you
    and nothing more, don't try to get clever and don't try to load
    external links", or because (b) the implementor of the browser
    has decided to ignore extended links and the user has no choice
  3 user reads document

Depending on what the conformance clause of the spec says in the
end, some of these may not be correct implementations, but for now
they are at least all imaginable.

: When reading the spec, I felt that it assumed the reader knew of some
: existing standards already, especially HyTime and TEI. Is this correct?
: Do you think it would help my understanding if I tried to understand
: HyTime (which is supposed to be difficult, no?) and/or TEI before
: approaching XML linkage again?

I don't *think* knowledge of either HyTime or TEI is required to
understand XML-link. A few more examples should help, but I think
what the current draft spec is really assuming is not knowledge of
HyTime and TEI but knowledge of how hypertext systems more advanced
than the World Wide Web have been constructed, and why. Some
examples in the spec would probably help a lot. But then, a spec is
not necessarily a tutorial ...

--
-C. M. Sperberg-McQueen
 University of Illinois at Chicago
 ACH/ACL/ALLC Text Encoding Initiative
 cmsmcq@uic.edu, tei@uic.edu
 +1 (312) 413-0317, fax +1 (312) 996-6834