SGML: Kimber on the HyTime parts of the NITF 2.0 DTD

Subject: Re: NITF and HyTime
Date: 25 Oct 1996 17:28:44 GMT
From: "W. Eliot Kimber" <>
Newsgroup: comp.text.sgml
----------
David Allen <> wrote in article <>...
> The International Press Telecommunications Council and the Newspaper
> Association of America have just released the (production) version 2 of
> the News Industry Text format (NITF) DTD. See --------
[...]
> Can anyone help by commenting on our method of including the HyTime
> links in the DTD and how we might provide full testing of this
> functionality?

Here is my initial take on the HyTime parts of the NITF 2.0 DTD. In
general, it's about right; the only serious problem is with the
declaration of ilink, which is close but slightly off. There are also
some possible simplifications and enhancements you can make. I discuss
these below.

What follows is an annotated version of the NITF 2.0 DTD (but only those
parts related directly to HyTime; I've commented on the declarations in
the order they occur in the original DTD).

<!-- This is the NITF DTD
<!DOCTYPE NITF PUBLIC "-//IPTC-NAA//DTD NITF 1.0F//EN" >
-->

<!ELEMENT LINKPOOL - o (NMLIST | NAMELOC | DATALOC | TREELOC)*
                       -- Hytime linkpool -->

Technically, the elements contained by LinkPool are *location addresses*
and not hyperlinks. One of the key concepts in HyTime is the clear
distinction between hyperlinking, which is the act of relating things
together, and addressing, which is the process of pointing to the
things. Location addressing may be used for purposes other than
hyperlinking. So a better name for LinkPool would be AddressPool,
reflecting that it contains location address elements and not links. But
this is not a conformance issue, only a HyTime Pedant Hotbutton issue.

<!ELEMENT NMLIST   - - (#PCDATA)   -- Hytime nmlist -->
<!ELEMENT NAMELOC  - - (NMLIST)*   -- Hytime nameloc -->
<!ELEMENT DATALOC  - o (DIMLIST)   -- Hytime dataloc -->
<!ELEMENT TREELOC  - - (MARKLIST*) -- Hytime treeloc -->

The Dimlist and Marklist elements are not strictly required if you don't
use marker-related element forms like dimref or mrkquery.
The HyTime processor should interpret integer lists as coordinate
specifications or tree specifications in the contexts where it expects
them (e.g., within treeloc- and dataloc-form elements). In the TC, we've
made this simplification option clearer.

<!ELEMENT DIMLIST  - - (#PCDATA) -- Hytime ? -->
<!ELEMENT MARKLIST o o (#PCDATA) -- Hytime treeloc marklist -->

These are correct content models. Note that marklist is not unique to
the treeloc form (although it is in this particular document type).

<!ELEMENT VIRTLOC  - - (#PCDATA) -- Virtual Location -->

Not sure what Virtloc is (couldn't find it in the NITF doc), but it
might be equivalent to a HyTime bibliographic location address (bibloc).

<!ELEMENT CLINK    - - (#PCDATA) -- Context Link -->
<!ELEMENT ILINK    - o (#PCDATA) -- Independent Link -->

There is a general problem with having an ilink-form element type called
"ilink": HyTime requires independent links to have "meaningful" types
and anchor roles that are fixed for a particular type in a particular
document type. Thus, unless you really intend to represent a *very*
generic and essentially semanticless hyperlink, an "Ilink" element type
can't satisfy the HyTime semantic requirement. The same goes for clink,
although it's not as critical because clink is already defined to be a
pretty generic cross-reference (as indicated by the anchor roles defined
in the HyTime standard itself). I generally give clink-form element
types names like "Link" or "Crossref" to reflect their referential
nature, although clink is acceptable if you really don't have a more
precise semantic for it.

Note that these requirements for hyperlinks are essentially the same as
SGML's for element types: element types are intended to be descriptive.
This means that, at least as a rule of thumb, over-general element types
are not proper.
[For example, Panorama's Web document type is not, strictly speaking,
correct, because it uses an "ILink" element type for the ilinks in the
Web files. Ideally, Panorama would let you define new link types and
anchor roles as part of its reader-defined linking features.]

Because HyTime independent links are always derived from the HyTime
ilink form, there shouldn't be any great difficulty with allowing
authors or DTD customizers to declare their own ilink-form element types
within the context of a larger, standardized DTD. For example, the NITF
DTD could provide a mechanism for allowing authors to add link types by
providing customization hook parameter entities. As I've set up the
entities below, authors need only define the link type name and declare
an attribute list for the link type defining the anchor roles:

<!-- Declared in external DTD subset: -->
<!ENTITY % Base-Ilinks   "GenericLink" >
<!ENTITY % Author-Ilinks "ZDummyIlink" -- To be redeclared by authors -->
<!ENTITY % Ilink-STD-Atts
  "HyTime   NAME   #FIXED 'ilink'
   Linkends IDREFS #REQUIRED"
>
<!ELEMENT (%Base-Ilinks; | %Author-Ilinks;) - O EMPTY >
<!-- End of stuff in external DTD subset -->

<!-- As used by an author in the internal DTD subset: -->
<!DOCTYPE NITF PUBLIC "..." [
 <!ENTITY % Author-Ilinks "Related-Person-Mention" >
 <!-- The Related-Person-Mention link links mentions of people
      to their occurrences in other stories. For example,
      you might want to link a mention of Ross Perot in one
      story to other stories in which he's mentioned.
   -->
 <!ATTLIST Related-Person-Mention
   %Ilink-STD-Atts;
   anchrole CDATA #FIXED "mentions #AGG stories #AGG"
 >
]>

Note: I wouldn't expect the average author to do this by hand. Either it
would be done through a user interface or by experts designing an
extension to the NITF DTD to be used as-is by specific authors.
For example, the stuff I've shown in the internal subset could itself be
put into an external parameter entity and simply included into each
document or into a modified version of the NITF DTD (I would prefer the
latter).

Note also that because ilinks are independent, you can imagine having a
separate document type (such as Panorama's Web documents) that contains
nothing but ilinks (and possibly text explaining the link types or
instances).

Finally, there's no reason ilink content has to be empty. I like to use
ilinks to contain explanations of the link instances. For example, I
could declare the Related-Person-Mention link to allow content and then
use it like so:

<!-- NOTE: Requires SGML declaration with name length of at least 32 -->
<?HyTime VERSION "ISO/IEC 10744:1992" HYQCNT=32 >
<?HyTime MODULE base exidrefs dcnatts>
<?HyTime MODULE locs multloc anysgml anydtd mixspace coordloc>
<?HyTime MODULE links >
<!DOCTYPE Related-Person-Mention [
 <!ELEMENT Related-Person-Mention - - (#PCDATA) +(nameloc | dataloc)
   -- The Related-Person-Mention link links mentions of people
      to their occurrences in other stories. For example,
      you might want to link a mention of Ross Perot in one
      story to other stories in which he's mentioned.

      Content of link is an explanation of why the link has
      related particular mentions to particular articles. --
 >
 <!ATTLIST Related-Person-Mention
   HyTime   NAME   #FIXED "ilink"
   linkends IDREFS #REQUIRED
   anchrole CDATA  #FIXED "mentions #AGG stories #AGG"
 >
 <!ENTITY % HyTime-Stuff SYSTEM "hystuff.ent"
   -- Standard declarations for HyTime stuff: loc addrs,
      notations, etc. --
 >%HyTime-Stuff;
 <!ENTITY dole961020 SYSTEM "" CDATA SGML>
 <!ENTITY dole961021 SYSTEM "" CDATA SGML>
 <!ENTITY ross-PC    SYSTEM "" CDATA SGML>
]>
<Related-Person-Mention linkends="dole-n-ross ross-PC">
This link links mentions of Ross Perot in articles on Dole's courting of
Ross to the article on Ross' latest press conference on the subject.
<nameloc id=dole-n-ross>
 <nmlist docorsub=dole961020 nametype=element>x1 x5 x10</nmlist>
 <nmlist docorsub=dole961021 nametype=element>x3 x7 x13</nmlist>
</nameloc>
<nameloc id=ross-PC>
 <nmlist nametype=entity>ross-PC</nmlist>
</nameloc>
</Related-Person-Mention>

I've now created a completely self-contained document that contains a
single hyperlink and all of its location addresses, as well as an
explanation of why this particular link exists. [For the complete
version of this sample, see]

Note that while this may seem very verbose, it would be dead easy to
write a little Visual Basic program, Tcl program, or HTML form to create
documents of this sort (I've put such a form at the site above). Also,
having created one, you simply copy and modify it to create others (not
counting the entity declarations, which will probably be unique for each
link, although these would probably be generated by the system that
manages the actual article files).

<!ATTLIST NITF
  VERSION     CDATA #FIXED "-//IPTC-NAA//DTD NITF 1.0F//EN"
  Change.Date CDATA #FIXED "24 April 1996"
  Change.Time CDATA #FIXED "1500"
  Baselang    CDATA #IMPLIED
  URN         CDATA #IMPLIED
  Class       NAMES #IMPLIED>

As these are HyTime documents (or can be), the NITF document element
should have a HyTime form of "HyDoc":

<!ATTLIST NITF
  ...
  HyTime      NAME  #FIXED "HyDoc"
>

<!ATTLIST CLINK
  Hytime  name   CLINK
  Linkend idrefs #REQUIRED
  ID      id     #IMPLIED>

These attributes are, of course, correct. However, to prevent authors
from changing the value, you would normally make the HyTime attribute
fixed, as I've done in my examples above, rather than just supplying a
default value as has been done here. The only exception to this is where
a particular element type might be able to conform to one of several
possible forms, in which case you should make the HyTime attribute a
name list of the possible forms.

<!ATTLIST DATALOC
  ID      id     #REQUIRED
  Locsrc  idref  #IMPLIED
  Quantum (norm) norm
  HyTime  name   DATALOC
>

No problem here.
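
To make the point about fixing the HyTime attribute concrete, here is a
minimal sketch of how the CLINK and DATALOC attribute lists quoted above
might look with the form name fixed. The other attributes are copied
unchanged from the NITF declarations; the lowercase form names simply
follow the usage in the examples earlier in this post, and the fragment
has not been run through a parser.

```sgml
<!-- Sketch only: same attributes as the NITF declarations quoted above,
     but with the HyTime form name declared #FIXED so that authors
     cannot override it in a document instance. -->
<!ATTLIST CLINK
  HyTime  NAME   #FIXED "clink"
  Linkend IDREFS #REQUIRED
  ID      ID     #IMPLIED
>
<!ATTLIST DATALOC
  ID      ID     #REQUIRED
  Locsrc  IDREF  #IMPLIED
  Quantum (norm) norm
  HyTime  NAME   #FIXED "dataloc"
>
```

With #FIXED, an instance may still spell out the attribute, but only
with the declared value, so the form mapping stays under the DTD's
control.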
<!ATTLIST DIMLIST
  HyTime name DIMLIST>

<!ATTLIST ILINK
  HyTime    NAME   ilink
  id        ID     #IMPLIED
  anchorole CDATA  #IMPLIED -- check this --
  linkends  IDREFS #REQUIRED>

This declaration for ILINK does *not* conform because the anchorole
attribute must be fixed in the DTD (or, more accurately, must be the
same for all elements of that type in the document, whether or not the
attribute itself is actually declared as #FIXED). Also, the attribute
name is "anchrole", not "anchorole". To use a different name for any
HyTime-defined attribute, you can use the HyNames attribute to define
the mapping:

<!ATTLIST ILINK
  HyTime    NAME   ilink
  id        ID     #IMPLIED
  anchorole CDATA  #FIXED "anchor1 anchor2"
  linkends  IDREFS #REQUIRED
  HyNames   NAMES  #FIXED "anchrole anchorole">

Note that I've not fixed the "linktype too general" problem; I've simply
declared equally semanticless anchor roles.

<!ATTLIST NAMELOC
  ID       id                 #REQUIRED
  HyTime   name               NAMELOC>

<!ATTLIST NMLIST
  Docorsub entity             #IMPLIED
  Nametype (element | entity) element
  HyTime   name               NMLIST>

<!ATTLIST TREELOC
  ID       id                 #REQUIRED
  Locsrc   idref              #IMPLIED
  HyTime   name               TREELOC>

These are fine.

The only other thing that is missing is a declaration for the data
content notation "SGML", which you need to have in order to declare data
entities that are the SGML documents you're linking to. The declaration
should always look like this:

<!NOTATION SGML PUBLIC
  "ISO 8879:1986//NOTATION Standard Generalized Markup Language//EN" >

--
<Address HyTime=bibloc homepage="">
W. Eliot Kimber, Senior SGML Consultant and HyTime Specialist
Passage Systems, Inc., 10596 N. Tantau Ave., Cupertino, CA 95014-3535
(408) 366-0300 (Cupertino), (512) 339-1400 (Austin),
</Address>
"If I never had existed, would you still remember me?..."
--Austin Lounge Lizards, "1984 Blues"