comp.text.sgml Shadow Archive

Subject: Reply: Inclusion exceptions: what are they good for?

Submitted to: COMP.TEXT.SGML
Submitted by: Erik Naggum (erik@naggum.no)
Date Of Submission: 13 Aug 1994 19:02:53 UT
Lines: 150

Organization: Naggum Software; +47 2295 0313
Newsgroups: comp.text.sgml
Reference: David Megginson

[David Megginson]

| I have been thinking a lot about inclusion and exclusion exceptions, and
| am beginning to wonder just how useful they are. I can see some
| justification for exclusion exceptions, to prevent elements from
| nesting (i.e., not allowing a footnote within a footnote), but what can
| inclusion exceptions do that couldn't be done better with well-designed
| parameter entities?

exclusion exceptions are almost obviously useful. to halt recursion is just one use. another is to allow larger content models than will be used in all contexts, then let some element higher up exclude unwanted elements. this allows a form of generic/specific separation that is not tied to the element, but rather to its relationship to other elements. this is a very powerful mechanism that requires the exclusion exception to work.

inclusion exceptions are _not_ useful, but probably not obviously so. nothing can be done with inclusion exceptions that cannot be done without, and the methods used without inclusions are better than those used with.

inclusions clutter the language with complicated rules for when record ends shall be ignored.

inclusions clutter the element structure by having two kinds of elements: proper, and included. included elements are parallel to the proper elements, but are nonetheless anchored in the same hierarchy. an application dealing with included elements must accept them anywhere, just as the parser will, and this screws up any reasonable grammar used to process the element hierarchy. an included element transcends the element structure and has to be processed in a different environment that needs to do various magic to position the included element close to, but not where it does occur in the element hierarchy, or do other interesting things with the position that it is not part of but relates to.

four typical uses of inclusions are normally mentioned.

(1) floating figures, tables, etc.
(2) footnotes and index entries.
(3) annotations.
(4) change bars.

floating elements will "float" to some position regardless of where they occur in the actual element structure. that is, there is code in the application software to reposition such elements. this code is the same regardless of where the elements actually occur, and does not in any way require that elements be included. instead of a floating element that can crop up anywhere within a paragraph or its subelements, why not allow such figures between paragraph elements? the code is the same, and the concept of "floating" elements is unrelated to the occurrence of the actual element. SGML, in other words, need not know about them. if there are means to refer to figures using ID/IDREF, there isn't even any requirement that they appear near the reference. if they can float _some_ distance, they should be able to float _any_ distance.

footnotes form a different kind of "floating" element. again, the same logic applies if they exist in footnote and footnote-reference pairs. let the footnotes follow the paragraph to which they apply. if they are inlined and only leave a reference where they are found and their text is kept for other display (such as at the bottom of the page), there is nothing to gain from making them inclusions; allow them as part of the text. parameter entities can reduce the need to list them explicitly.

index entries are also remembered, like footnotes, but are collected to be displayed at a different place. again, allow them in the course of the text.

annotations are slightly harder, but their biggest drawback is that the text is now really two intertwined texts. this is not a good idea to begin with, and has serious problems with scalability, merging of various sets of annotations, etc. better to provide a mechanism that will let one document be a set of annotations and refer to points in another document without changing the latter. HyTime comes to your aid in this respect.
reverse linking between document and annotations is not as difficult as it is made out to be by the many who therefore argue strongly for changing the annotated document to include them.

change bars are really processing instructions in disguise. they are created as a side effect of comparing two versions of a text, whether the versions are in the same document or not. change management in SGML is not trivial, but change bars only apply to the _result_ of such management, and are not sufficient to do anything really useful, anyway. using inclusions for these is only a good idea if you're very close to the printing end of the process, and then processing instructions make even more sense.

David argues that inclusions also screw up validation. that's a valid point, and it is even more valid with respect to the application. the parser can deal with it relatively easily, but a foreign element in the element structure means a lot more code to deal with it in unsuspected places. I view this validation business as reducing the need for error handling (that is, spurious elements or constructs) in the application code. inclusions reintroduce the need for such "error handling". also, if one uses the content models to write the left-hand side of a mapping from element structure to processing (as with LINK), inclusions multiply _much_ faster than they would if they were included in the content models where they belong.

| Should inclusion exceptions be discouraged?

yes. a parser that does not allow inclusions is also much easier to write than one that does allow them. applications doubly so if the parser is known not to allow them. there should be a FEATURE flag to announce that a document requires inclusion exceptions.
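
As an illustration of the parameter-entity alternative discussed above, here is a minimal sketch of the same small DTD written two ways. It is not taken from the original post: the element names para, emph and fn and the entity name text are hypothetical, and the two fragments are alternatives, not parts of one DTD. Under those assumptions, both should accept the same element structures; the difference is that in the second fragment fn is an ordinary token in the content models rather than an element the application has to expect anywhere.

```sgml
<!-- alternative A: inclusion exception (hypothetical element names).
     fn may appear anywhere inside para, at any depth, including
     inside emph, although emph's own model never mentions it. -->
<!ELEMENT para - - (#PCDATA | emph)* +(fn)>
<!ELEMENT emph - - (#PCDATA)>
<!ELEMENT fn   - - (#PCDATA | emph)* -(fn)>

<!-- alternative B: no inclusion exception.
     a parameter entity lists fn explicitly wherever running text is
     allowed, and an exclusion exception on fn still halts recursion,
     also for an emph that occurs inside a footnote. -->
<!ENTITY % text "#PCDATA | fn">
<!ELEMENT para - - (%text; | emph)*>
<!ELEMENT emph - - (%text;)*>
<!ELEMENT fn   - - (#PCDATA | emph)* -(fn)>
```

In alternative B the places where a footnote may occur can be read directly off the content models, which is the property the argument above relies on for validation and for mapping element structure to processing.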
---------------------------------------------------------------------------
for reference, here's my list of deprecated features in SGML:

    CDATA and RCDATA declared content
    DATATAG and RANK
    CONCUR
    SHUNCHAR
    MSOCHAR and MSICHAR
    quantities and capacities
    inclusion exceptions
    mismatch between start-tag and end-tag in minimization
    omitted start-tag when end-tag is present
    empty start-tag
    feature-dependent syntax

note that very little of this affects instances of already well-designed documents.
---------------------------------------------------------------------------

I'd like to hear about other cases of using inclusions that may show that they are still useful. I'm particularly interested in learning how they are processed, and if the processing requires inclusions or it could do better or at least as well with a proper element. CALS does _not_ count as an argument.

note that it is always possible to take a DTD with inclusions and convert it to one without (possibly using exceptions), and that it is the document that is validated, not the DTD. since validation with inclusions is lax and without is strict, if it passes without inclusions, it will pass with inclusions. thus, you should never validate a document against a DTD that uses inclusions, but rewrite the DTD to allow them in specific places, and validate against that.

best regards,
</Erik>
-- 
Microsoft is not the answer. Microsoft is the question. NO is the answer.

Archive last updated dd. 02/04/96
