SGML: Inclusion exceptions, Sperberg-McQueen
Subject: Re: using a subset of a DTD with general inclusions
Date: 15 Apr 1996 20:21:33 GMT
From: cmsmcq@tigger.cc.uic.edu (C M Sperberg-McQueen)
References: <LUBELL.96Apr9173443@villars.nist.gov> <4kjlmt$md3@crl14.crl.com>
---------------------------------------------------------------------------
Josh Lubell <lubell@nist.gov> wrote:
>
>I was wondering if anyone has a good solution to the following problem:
>
>I have an SGML application where I need to validate and parse a subset
>of an SGML document. The document's DTD has elements that can appear
>anywhere in the document (such as notes, examples, lists, etc.).
>These elements are specified in the content model for the top level
>element as general inclusions.
>Now suppose that SUBELEMENT is an element nested somewhere inside
>TOPLEVEL and that I have a chunk of SGML-tagged text consisting of a
>SUBELEMENT and my chunk of text contains generally included elements.
>If I use the following DOCTYPE declaration with my SGML-tagged text, I
>get error messages from SGMLS because the general inclusions are
>unrecognized:
><!DOCTYPE SUBELEMENT SYSTEM "mydtd.dtd">
><SUBELEMENT>
>[...]
>Is there a good way to write a DTD so that it can easily be used to
>validate subsets of a document containing general inclusions? The
>only solution I can think of is to explicitly specify the inclusions
>in every single content model where they apply. There has got to be a
>better way, right?

Joe English responded:
: Other than not using inclusion exceptions to begin with, no,
: not really.

Actually, I think there's a fairly simple way to validate fragments of
this kind of document type -- which Lou Burnard and I call a 'Belgian
DTD', in honor of Jean-Pierre Gaspart, from whom we first learned about
it as a possible method for integrating different DTD fragments. Under
the right circumstances, the Belgian DTD seems to be an extremely
useful tool for DTD design; it's the major reason I dissent from the
view, occasionally expressed here, that use of any inclusion or
exclusion exceptions is inherently poor design. Applied in its pure
form (inclusion exceptions only on the document type element), it seems
to me to avoid many of the problems that lead people to swear off
exceptions.
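
A minimal sketch of the shape such a DTD takes, using the element
names from the question above. The content models are invented for
illustration and are not taken from any real DTD:

```sgml
<!-- Hypothetical content models, for illustration only. -->
<!ELEMENT toplevel   - - (subelement+)  +(note|example|list) >
<!ELEMENT subelement - - (#PCDATA) >
<!ELEMENT (note|example|list) - - (#PCDATA) >
```

Here NOTE, EXAMPLE, and LIST are legal anywhere inside an open
TOPLEVEL, but only because of the inclusion on TOPLEVEL itself. If
SUBELEMENT is made the document type element, no TOPLEVEL is ever
open, the inclusion never takes effect, and a NOTE inside SUBELEMENT
is reported as an error.
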
But enough: you didn't ask whether a Belgian-style DTD was a good
idea, but how to validate fragments of your document.
Assume (1) that your fragment, tagged as a SUBELEMENT, resides in the
file FRAG.SGM and (2) that the active inclusion exceptions, all
specified on the top-level element as inclusions, are NOTE, EXAMPLE,
LIST, and SUBELEMENT. You can validate your fragment by making an ad
hoc driver file for it which looks something like this:

<!DOCTYPE fragment SYSTEM "mydtd.dtd" [
<!ELEMENT fragment - - (#PCDATA) +(note|example|list|subelement) >
<!ENTITY myfrag SYSTEM 'FRAG.SGM'>
]>
<fragment>
&myfrag;
</fragment>

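For illustration, FRAG.SGM might contain something like the following.
The text, and the assumption that SUBELEMENT allows character data,
are hypothetical. Note that the file holds only tagged text, with no
DOCTYPE declaration of its own, since it is pulled in as an entity
within the document instance:

```sgml
<!-- Hypothetical contents of FRAG.SGM. -->
<subelement>Some text of the fragment,
<note>with a note admitted by the inclusion on FRAGMENT,</note>
and more text.</subelement>
```

When the driver file is parsed, SUBELEMENT and NOTE are both accepted
inside FRAGMENT because of its inclusion exception, and everything
within SUBELEMENT is then checked against the ELEMENT declarations in
mydtd.dtd.
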
I use a similar approach in validating fragmentary examples from DTD
documentation (similar, at least, in supplying a new document type for
an existing set of ELEMENT declarations): I embed each example in an
EXAMPLE element, and define a new document type, EXAMPLES, which adds
two elements (EXAMPLE and EXAMPLES) to the base DTD:

<!DOCTYPE examples SYSTEM 'tei2.dtd' [
<!ELEMENT examples - - (example*) >
<!ELEMENT example - - ANY >
]>
<examples>
<example> <!-- ... text of example here ... --> </example>
<example> <!-- ... --> </example>
<!-- ... -->
</examples>

One must, of course, be sure that EXAMPLE and EXAMPLES are not GIs
used in the base DTD.
-C. M. Sperberg-McQueen
University of Illinois at Chicago