Organization: Naggum Software; +47 2295 0313
Newsgroups: comp.text.sgml
Reference: David Megginson
[David Megginson]
| I have been thinking a lot about inclusion and exclusion exceptions, and
| am beginning to wonder just how useful they are. I can see some
| justification for exclusion exceptions, to prevent elements from
| nesting (i.e., not allowing a footnote within a footnote), but what can
| inclusion exceptions do that couldn't be done better with well-designed
| parameter entities?
exclusion exceptions are almost obviously useful. to halt recursion is
just one use. another is to allow larger content models than will be used
in all contexts, then let some element higher up exclude unwanted elements.
this allows a form of generic/specific separation that is not tied to the
element, but rather to its relationship to other elements. this is a very
powerful mechanism that requires the exclusion exception to work.
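a minimal sketch of both uses, assuming the reference concrete syntax. the element names (para, emph, fn, xref, title) and the parameter entity %phrase; are hypothetical and not taken from any particular DTD.

```sgml
<!-- hypothetical names; a sketch only -->
<!ENTITY % phrase "#PCDATA | emph | fn | xref">

<!ELEMENT para  - - (%phrase;)*>
<!ELEMENT emph  - - (%phrase;)*>
<!ELEMENT xref  - O EMPTY>

<!-- halting recursion: no footnote anywhere inside a footnote -->
<!ELEMENT fn    - - (%phrase;)*  -(fn)>

<!-- generic model, specific context: title reuses the generic phrase
     model, but fn and xref are excluded anywhere below title -->
<!ELEMENT title - - (%phrase;)*  -(fn | xref)>
```

emph keeps one generic content model throughout; whether a footnote may occur inside it depends on which ancestor it sits under, not on emph itself.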
inclusion exceptions are _not_ useful, but probably not obviously so.
nothing can be done with inclusion exceptions that cannot be done without,
and the methods used without inclusions are better than those used with.
inclusions clutter the language with complicated rules for when record ends
shall be ignored. inclusions clutter the element structure by having two
kinds of elements: proper, and included. included elements are parallel to
the proper elements, but are nonetheless anchored in the same hierarchy.
an application dealing with included elements must accept them anywhere,
just as the parser will, and this screws up any reasonable grammar used to
process the element hierarchy. an included element transcends the element
structure and has to be processed in a different environment that needs to
do various magic to position the included element close to, but not at, the
place where it occurs in the element hierarchy, or do other interesting
things with the position that it is not part of but relates to.
four typical uses of inclusions are normally mentioned. (1) floating
figures, tables, etc. (2) footnotes and index entries. (3) annotations.
(4) change bars.
floating elements will "float" to some position regardless of where they
occur in the actual element structure. that is, there is code in the
application software to reposition such elements. this code is the same
regardless of where the elements actually occur, and does not in any way
require that elements be included. instead of a floating element that can
crop up anywhere within a paragraph or its subelements, why not allow such
figures between paragraph elements? the code is the same, and the concept
of "floating" elements is unrelated to the occurrence of the actual
element. SGML, in other words, need not know about them. if there are
means to refer to figures using ID/IDREF, there isn't even any requirement
that they appear near the reference. if they can float _some_ distance,
they should be able to float _any_ distance.
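a small sketch of figures as proper elements between paragraphs, referenced with ID/IDREF. the names doc, figure, figref and refid are assumptions made for illustration, and the reference concrete syntax is assumed.

```sgml
<!DOCTYPE doc [
<!-- hypothetical names; figures are proper elements between paragraphs -->
<!ELEMENT doc     - - (para | figure)+>
<!ELEMENT para    - - (#PCDATA | figref)*>
<!ELEMENT figure  - - (#PCDATA)>
<!ATTLIST figure  id    ID    #REQUIRED>
<!ELEMENT figref  - O EMPTY>
<!ATTLIST figref  refid IDREF #REQUIRED>
]>
<doc>
<para>the layout is shown in <figref refid="fig1">.</para>
<figure id="fig1">caption text for the figure</figure>
<para>the figure element sits between paragraphs, not inside one.</para>
</doc>
```

where the figure ends up on the page is left entirely to the application; the parser only checks that every refid matches some id.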
footnotes form a different kind of "floating" element. again, the same
logic applies if they exist in footnote and footnote-reference pairs. let
the footnotes follow the paragraph to which they apply. if they are
inlined and only leave a reference where they are found and their text is
kept for other display (such as at the bottom of the page), there is
nothing to gain from making them inclusions; allow them as part of the
text. parameter entities can reduce the need to list them explicitly.
index entries are also remembered, like footnotes, but are collected to be
displayed at a different place. again, allow them in the course of the
text.
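a sketch of inline footnotes and index entries as ordinary parts of the text, listed once in a parameter entity rather than included. the names (%notes;, %inline;, fn, ix, item) are hypothetical.

```sgml
<!-- hypothetical names; fn and ix are listed once and reused -->
<!ENTITY % notes  "fn | ix">
<!ENTITY % inline "#PCDATA | emph | %notes;">

<!ELEMENT para - - (%inline;)*>
<!ELEMENT item - - (%inline;)*>
<!ELEMENT emph - - (%inline;)*>
<!ELEMENT ix   - - (#PCDATA)>

<!-- an exclusion keeps notes out of notes, even by way of emph -->
<!ELEMENT fn   - - (#PCDATA | emph)*  -(fn | ix)>
```

adding another note-like element means changing %notes; in one place, and the elements remain visible in every content model that uses %inline;.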
annotations are slightly harder, but their biggest drawback is that the
text is now really two intertwined texts. this is not a good idea to begin
with, and has serious problems with scalability, merging of various sets of
annotations, etc. better to provide a mechanism that will let one document
be a set of annotations and refer to points in another document without
changing the latter. HyTime comes to your aid in this respect. reverse
linking between document and annotations is not as difficult as it is made
out to be by those who, on that ground, argue strongly for changing the
annotated document to include the annotations.
change bars are really processing instructions in disguise. they are
created as a side effect of comparing two versions of a text, whether the
versions are in the same document or not. change management in SGML is not
trivial, but change bars only apply to the _result_ of such management, and
are not sufficient to do anything really useful, anyway. using inclusions
for these is only a good idea if you're very close to the printing end of the
process, and then processing instructions make even more sense.
David argues that inclusions also screw up validation. that's a valid
point, and it is even more valid with respect to the application. the
parser can deal with it relatively easily, but a foreign element in the
element structure means a lot more code to deal with it in unsuspected
places. I view this validation business as reducing the need for error
handling (that is, spurious elements or constructs) in the application
code. inclusions reintroduce the need for such "error handling".
also, if one uses the content models to write the left-hand side of a
mapping from element structure to processing (as with LINK), the cases to
cover multiply _much_ faster with inclusions than they would if the elements
were listed in the content models where they belong.
| Should inclusion exceptions be discouraged?
yes.
a parser that does not allow inclusions is also much easier to write than
one that does allow them. applications doubly so if the parser is known
not to allow them. there should be a FEATURE flag to announce that a
document requires inclusion exceptions.
---------------------------------------------------------------------------
for reference, here's my list of deprecated features in SGML:
CDATA and RCDATA declared content
DATATAG and RANK
CONCUR
SHUNCHAR
MSOCHAR and MSICHAR
quantities and capacities
inclusion exceptions
mismatch between start-tag and end-tag in minimization
omitted start-tag when end-tag is present
empty start-tag
feature-dependent syntax
note that very little of this affects instances of already well-designed
documents.
---------------------------------------------------------------------------
I'd like to hear about other cases of using inclusions that may show that
they are still useful. I'm particularly interested in learning how they
are processed, and if the processing requires inclusions or it could do
better or at least as well with a proper element.
CALS does _not_ count as an argument. note that it is always possible to
take a DTD with inclusions and convert it to one without (possibly using
exclusions), and that it is the document that is validated, not the DTD.
since validation with inclusions is lax and without is strict, if it passes
without inclusions, it will pass with inclusions. thus, you should never
validate a document against a DTD that uses inclusions, but rewrite the DTD
to allow the included elements in specific places, and validate against that.
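a sketch of such a rewrite, with hypothetical element names. the original declaration with the inclusion is quoted in a comment; the declarations that follow list ix explicitly where it is wanted.

```sgml
<!-- original, lax: <!ELEMENT chapter - - (title, para+) +(ix)>
     ix could then turn up anywhere below chapter, title included -->

<!-- rewritten, strict: ix is allowed only in running text -->
<!ELEMENT chapter - - (title, para+)>
<!ELEMENT title   - - (#PCDATA)>
<!ELEMENT para    - - (#PCDATA | emph | ix)*>
<!ELEMENT emph    - - (#PCDATA | ix)*>
<!ELEMENT ix      - - (#PCDATA)>
```

every place where the strict version admits ix is also admitted by the inclusion, so a document that passes the strict version should pass the original as well, while the reverse does not hold.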
best regards,
</Erik>
--
Microsoft is not the answer. Microsoft is the question. NO is the answer.