DTDs and namespaces

From:      "C. M. Sperberg-McQueen" <cmsmcq@acm.org> 
To:        XML-DEV <xml-dev@lists.xml.org>, james.anderson@setf.de 
Date:      Mon, 07 May 2001 13:19:03 -0600 
Subject:   DTDs and namespaces [was: using namespaces to version]

At 2001-05-04 15:13, james anderson wrote:

>This is just a side point, but DTD's do contain sufficient information
>to "support namespaces in their full generality". Note please an earlier
>message in this thread 
>(http://lists.xml.org/archives/xml-dev/200105/msg00112.html).

[Pedantry alert:  I agree with the first clause above, but find
the details irresistible.  If you don't, skip to the next message.]

>It is possible to propagate namespace constraints within a DTD according
>to rules analogous to those which the namespace recommendation specified
>for names within the document entity so long as the same literal
>qualified name does not appear more than once in the same external
>entity. It is conceivable to establish scoping rules which would cover
>those cases as well, but that does not seem necessary.

It is certainly true that DTDs can contain sufficient information to
support namespaces, in the sense that they can be used to define the
names in a namespace, in a system which understands DTD notation and
which can resolve qualified names correctly.  But some outside system
is required; no system which validates using a DTD and the validation
rules of XML 1.0, without extension, can support all the syntactic
variations allowed by the namespaces recommendation.

When I say that DTDs cannot 'support' namespaces I mean simply that
given some plausible account of the rules which govern elements in
some set of namespaces, and the rules of the namespace recommendation
(which include the ability to bind arbitrary prefixes to arbitrary
namespaces), it is not possible to write a DTD which (using the normal
rules of DTD-based validation) recognizes the set of documents which
follow the rules, and distinguishes them from documents which don't.

It is possible, using clever parameter entity tricks, to allow the
user to associate namespaces with arbitrary prefixes.  This is a
partial victory.

In their full generality, however, the rules of the namespace
recommendation allow homography: elements with different universal
names (and thus potentially different declarations) can appear with
the same prefix + colon + localname as their generic identifier.

Consider the three namespaces defined as follows:

Namespace ns-bare defines an element 'doc', which takes as content any
number of 'x' elements from either namespace ns-a or namespace ns-b.
If we bind ns-a to the prefix 'a' and ns-b to 'b', we could write the
declaration this way:

   <!ELEMENT doc (a:x | b:x)* >

Namespace ns-a defines elements 'tick', 'tock' (each empty), and
'x', which consists of exactly one tick followed by one tock:

   <!ELEMENT x (tick, tock)>

Namespace ns-b defines elements 'tick', 'tock' (each empty), and
'x', which consists of exactly two ticks followed by one tock:

   <!ELEMENT x (tick, tick, tock)>

So the following is a legal document which follows all the
DTD-expressed rules:

   <doc xmlns="http://www.example.org/ns-bare">
    <x xmlns="http://www.example.org/ns-a"><tick/><tock/></x>
    <x xmlns="http://www.example.org/ns-b"><tick/><tick/><tock/></x>
    <x xmlns="http://www.example.org/ns-a"><tick/><tock/></x>
   </doc>

And the following does not follow all the DTD-expressed rules:

   <doc xmlns="http://www.example.org/ns-bare">
    <x xmlns="http://www.example.org/ns-b"><tick/><tock/></x>
    <x xmlns="http://www.example.org/ns-a"><tick/><tick/><tock/></x>
    <x xmlns="http://www.example.org/ns-a"><tick/><tock/></x>
   </doc>

I don't know how to make an XML 1.0 DTD which accepts the first document
but not the second.
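
The two documents above can be told apart once names are compared as universal names instead of literal generic identifiers. The following is a minimal sketch of that idea, assuming Python's standard xml.etree.ElementTree as the namespace-aware layer; the helper names (u, seq, MODELS, valid) and the encoding of content models as regular expressions are hypothetical, chosen only for illustration, and this is exactly the kind of outside system that goes beyond XML 1.0 DTD validation.

```python
import re
import xml.etree.ElementTree as ET

NS_BARE = "http://www.example.org/ns-bare"
NS_A = "http://www.example.org/ns-a"
NS_B = "http://www.example.org/ns-b"


def u(ns, local):
    """Universal name in ElementTree's '{uri}local' notation."""
    return "{%s}%s" % (ns, local)


def seq(*names):
    """Regex matching exactly this sequence of space-terminated names."""
    return "".join(re.escape(name + " ") for name in names)


# Hypothetical content models keyed by universal name rather than by the
# literal generic identifier; this keying is the extension beyond XML 1.0.
MODELS = {
    u(NS_BARE, "doc"): "(?:%s|%s)*" % (seq(u(NS_A, "x")), seq(u(NS_B, "x"))),
    u(NS_A, "x"): seq(u(NS_A, "tick"), u(NS_A, "tock")),
    u(NS_B, "x"): seq(u(NS_B, "tick"), u(NS_B, "tick"), u(NS_B, "tock")),
    u(NS_A, "tick"): "", u(NS_A, "tock"): "",
    u(NS_B, "tick"): "", u(NS_B, "tock"): "",
}


def valid(elem):
    """Check elem and its descendants; character data is ignored here."""
    model = MODELS.get(elem.tag)
    if model is None:
        return False
    children = "".join(child.tag + " " for child in elem)
    return (re.fullmatch(model, children) is not None
            and all(valid(child) for child in elem))


GOOD = """<doc xmlns="http://www.example.org/ns-bare">
 <x xmlns="http://www.example.org/ns-a"><tick/><tock/></x>
 <x xmlns="http://www.example.org/ns-b"><tick/><tick/><tock/></x>
 <x xmlns="http://www.example.org/ns-a"><tick/><tock/></x>
</doc>"""

BAD = """<doc xmlns="http://www.example.org/ns-bare">
 <x xmlns="http://www.example.org/ns-b"><tick/><tock/></x>
 <x xmlns="http://www.example.org/ns-a"><tick/><tick/><tock/></x>
 <x xmlns="http://www.example.org/ns-a"><tick/><tock/></x>
</doc>"""

print(valid(ET.fromstring(GOOD)))  # True
print(valid(ET.fromstring(BAD)))   # False
```

In both documents every 'x' element has the same literal generic identifier; only the resolved universal name selects the right content model, which is the step a plain XML 1.0 validator has no way to perform.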

In their full generality, the rules of the namespace recommendation
also allow synonymy: elements with the same universal name can appear
with different generic identifiers.  So the following are legal
documents:

    <x xmlns="http://www.example.org/ns-b"><tick/><tick/><tock/></x>

    <x xmlns="http://www.example.org/ns-b">
      <a:tick xmlns:a="http://www.example.org/ns-b"/>
      <b:tick xmlns:b="http://www.example.org/ns-b"/>
      <tock/>
    </x>

    <x xmlns="http://www.example.org/ns-b">
      <b:tick xmlns:b="http://www.example.org/ns-b"/>
      <a:tick xmlns:a="http://www.example.org/ns-b"/>
      <tock/>
    </x>

And indeed any such document is legal as long as the prefix used on
the tick (or tock) elements is the same string as the string which
follows the colon on an attribute value specification of the form

   xmlns:name='http://www.example.org/ns-b'

That is, the content model written above as (tick, tick, tock)
actually stands for an *infinite* number of content models (assuming
appropriate namespace declarations for the prefixes), including (tick,
tick, a:tock), (tick, tick, b:tock), (tick, a:tick, tock), (tick,
a:tick, a:tock), (tick, a:tick, b:tock), (tick, b:tick, tock), (tick,
b:tick, a:tock), (tick, b:tick, b:tock), (a:tick, tick, tock) ...
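
The synonymy can be observed directly with a namespace-aware parser. The sketch below assumes Python's standard xml.etree.ElementTree; the VARIANTS list and the universal_children helper are hypothetical names, and the fourth variant (prefixes 'q' and 'zz') is an extra illustration not taken from the examples above. Each prefix is declared on the element that uses it or on an ancestor.

```python
import xml.etree.ElementTree as ET

NS_B = "http://www.example.org/ns-b"

# Different spellings of the same ns-b 'x' element; '{ns}' is filled in below.
VARIANTS = [
    '<x xmlns="{ns}"><tick/><tick/><tock/></x>',
    '<x xmlns="{ns}"><a:tick xmlns:a="{ns}"/><b:tick xmlns:b="{ns}"/><tock/></x>',
    '<x xmlns="{ns}"><b:tick xmlns:b="{ns}"/><a:tick xmlns:a="{ns}"/><tock/></x>',
    '<q:x xmlns:q="{ns}"><q:tick/><zz:tick xmlns:zz="{ns}"/><q:tock/></q:x>',
]


def universal_children(text):
    """Universal names ('{uri}local') of the root element's children."""
    return [child.tag for child in ET.fromstring(text)]


expected = ["{%s}tick" % NS_B] * 2 + ["{%s}tock" % NS_B]
for variant in VARIANTS:
    # Every spelling resolves to the same sequence of universal names.
    assert universal_children(variant.format(ns=NS_B)) == expected
```

Since any string usable as a prefix could stand in for 'a', 'b', 'q', or 'zz', a DTD that matches literal names would have to enumerate every such spelling.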

If anyone can show me how to write DTDs to support these aspects of
the namespace recommendation, I will be very happy to learn how.  In
the meantime, however, I don't believe that such DTDs can be written,
and so I continue to believe that the DTD notation does not support
namespaces "in their full generality".

-CMSMcQ

Date:       Mon, 07 May 2001 21:29:45 -0600
From:       "C. M. Sperberg-McQueen" <cmsmcq@acm.org>
To:         james.anderson@setf.de
Cc:         xml-dev@lists.xml.org
Subject:    Re: DTDs and namespaces (was: using namespaces to version)

At 2001-05-07 19:41, james anderson wrote:

>C. M. Sperberg-McQueen wrote:
>
> > ... no system which validates using a DTD and the validation
> > rules of XML 1.0, without extension, can support all the syntactic
> > variations allowed by the namespaces recommendation.
>
>This is trivially true, but trivial invalidation is not significant to
>the question.

Er, you took exception -- or appeared to take exception* -- to the
observation that DTDs do not, and cannot, support namespaces in
their full generality.  By that I meant that you cannot write a DTD
for a namespace or set of namespaces which will correctly distinguish
document instances legal under the namespace rules and which follow
the rules for each namespace from document instances which don't do so.

If examples illustrating the kinds of things which are in principle
legal for anyone using namespaces, but which DTDs cannot make legal
without overgenerating, are not in your view significant for the
question "Can DTDs support namespaces in their full generality?"
then I think we do not have the makings of any useful exchange here.

   * A seventh reading of your note produces an alternative
     interpretation, in which your posting does not take
     exception to the proposition mentioned above, but notes the
     interesting fact that it is possible to get namespace-aware
     validation without extending DTD notation per se -- the DTD
     actually does provide all the information you need -- but
     merely by carefully defining a modified rule for matching
     elements in an instance with names in declarations in the
     DTD.

I agree with your claim that by adding knowledge of how namespaces
work, it is possible to construct systems which use DTD notation
and support namespaces.  This, if I may say so, is unsurprising and
not in contradiction with the remark which began this exchange.
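
One way such a modified matching rule might look is sketched below, under stated assumptions: the names in a DTD content model are resolved to universal names through a table of prefix bindings supplied alongside the DTD, and instance elements are then matched by universal name. The NAME pattern, the universal and declared_names helpers, and the DTD_BINDINGS table are all hypothetical; neither XML 1.0 nor the namespaces recommendation defines such a table for DTDs. The sketch uses Python's standard re and xml.etree.ElementTree modules.

```python
import re
import xml.etree.ElementTree as ET

NS_A = "http://www.example.org/ns-a"
NS_B = "http://www.example.org/ns-b"

# Simplified pattern for a (possibly prefixed) element name in a content model.
NAME = re.compile(r"[A-Za-z_][\w.-]*(?::[A-Za-z_][\w.-]*)?")


def universal(qname, bindings):
    """Resolve 'prefix:local' (or bare 'local') to '{uri}local'."""
    prefix, _, local = qname.rpartition(":")
    return "{%s}%s" % (bindings[prefix], local)


def declared_names(content_model, bindings):
    """Universal names of all element names mentioned in a content model."""
    return {universal(q, bindings) for q in NAME.findall(content_model)}


# Assumed prefix bindings for the names used inside the DTD itself.
DTD_BINDINGS = {"a": NS_A, "b": NS_B}
CONTENT_MODEL = "(a:x | b:x)*"

allowed = declared_names(CONTENT_MODEL, DTD_BINDINGS)

# The instance picks its own prefix 'p' for ns-a.
instance = ET.fromstring('<p:x xmlns:p="http://www.example.org/ns-a"/>')

print("p:x" in NAME.findall(CONTENT_MODEL))  # False: literal names differ
print(instance.tag in allowed)               # True: universal names agree
```

The DTD notation is unchanged here; only the rule for comparing an instance name with a declared name has been replaced.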

In sum, we appear to be in emphatic agreement.  If one is willing to
change the rules for interpreting DTDs and validating documents
against them, one can use DTD notation to define sets of documents
which use namespaces.  Otherwise, one can do so only with restrictions
on the use of namespace prefixes.

best regards,

C. M. Sperberg-McQueen

Prepared by Robin Cover for The XML Cover Pages archive. See "Namespaces in XML."