From: http://www.markbaker.ca/2002/01/draft-baker-generic-xmlns-dispatch-00.txt
Date: 2002-12-16
-------------------------------------------------------------------------------
Note that I never published this. The time doesn't seem right, as the
problem this addresses isn't yet recognized as a problem. Nor is it
even clear that application/xml won't evolve in practice to make this
problem moot (i.e. dispatch behaviour will be expected).
MB
============
Internet-Draft Mark Baker
Expires: December, 2002 Idokorro Mobile, Inc.
June 21, 2002
Generic Namespace Dispatch Behaviour for XML
draft-baker-generic-xmlns-dispatch-00.txt
Status of this Memo
This document is an Internet-Draft and is subject to all provisions
of Section 10 of RFC2026.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as
Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six
months and may be updated, replaced, or obsoleted by other
documents at any time. It is inappropriate to use Internet-
Drafts as reference material or to cite them other than as
"work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
Abstract
To date, the promise of constructing compound XML documents using
XML namespace declarations, and having the resultant document be
processed as a seamless whole, has not been realized. This document
defines rules for processors and content that should allow a
significant degree of generic processing to occur for many compound
documents. These rules are then bound to a new generic XML media
type, "application/xmld".
1. Introduction
RFC 3023 [XMLMIME], which registered the application/xml and text/xml
media types, specified common behaviour for several important
characteristics of those types and of types using the "+xml" suffix
convention. These common characteristics include the charset
parameter and encoding considerations, base URI processing, fragment
identifier interpretation, and others.
One characteristic that wasn't specified is how the processing of a
document that uses XML namespaces [XMLNS] should relate to its
namespace declarations. Currently, all proposed IETF tree media
types using the "+xml" naming convention, including XHTML, SOAP,
SMIL, and the upcoming one for SVG, declare that if the root
namespace of the document is that of the respective content format,
then the document should be initially processed by a processor of
that format.
This document aims to achieve two things: first, to lay out rules
for how namespace declarations in documents can be used to dispatch
processors for that content; and second, to bind this behaviour to a
new generic XML media type.
2. Namespace dispatching
The last paragraph of Section 3 of RFC 3023 reads:
"An XML document labeled as text/xml or application/xml might
contain namespace declarations, stylesheet-linking processing
instructions (PIs), schema information, or other declarations that
might be used to suggest how the document is to be processed. For
example, a document might have the XHTML namespace and a reference
to a CSS stylesheet. Such a document might be handled by
applications that would use this information to dispatch the
document for appropriate processing."
The use of "might" here means that the document might not be
dispatched in this way. In other words, RFC 3023 defines no
consistent behaviour; it only suggests that a document may be
interpreted this way. Certainly, there are many cases where this
weak guarantee is inappropriate.
As an example, consider this XML document:
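    <html xmlns="http://www.w3.org/1999/xhtml">
      <head><title>My document</title></head>
      <body><h1>My heading</h1></body>
    </html>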
According to RFC 3023, if a web server serves this content as either
application/xml or text/xml, a web browser may either process it as
"generic XML" (perhaps in a tree view, as in IE 5.0), or as XHTML,
rendering the title and heading as expected with HTML/XHTML.
RFC 3023 is also silent on how a document should be processed if the
namespace is used to dispatch a processor. Consider this document, a
slightly modified version of the above:
My document
My heading number
According to RFC 3236 [XHTMLMED] Section 5, this is an XHTML
document. Yet, according to the XSLT 1.0 Recommendation [XSLT],
this document, when served as application/xml or text/xml, should be
processed as an XSLT stylesheet. Clearly, some more specific rules
are required about how XML documents using namespaces should be
processed.
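The ambiguity can be illustrated with a minimal sketch. It assumes
that the second sample document is an XHTML document carrying an
"xsl:version" attribute on its root element, which is how XSLT 1.0
identifies a simplified stylesheet. The markup of both sample
documents, the "select" expression, and the function names are
assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
XSLT_NS = "http://www.w3.org/1999/XSL/Transform"

# Hypothetical sample documents (assumed markup, not taken from RFC 3023).
PLAIN = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<head><title>My document</title></head>"
    "<body><h1>My heading</h1></body></html>"
)
MIXED = (
    '<html xmlns="http://www.w3.org/1999/xhtml" '
    'xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xsl:version="1.0">'
    "<head><title>My document</title></head>"
    '<body><h1>My heading number <xsl:value-of select="1 + 1"/></h1></body>'
    "</html>"
)


def root_namespace(root):
    """Return the namespace URI of an element, or None if unqualified."""
    tag = root.tag
    return tag[1:].split("}", 1)[0] if tag.startswith("{") else None


def candidate_interpretations(xml_text):
    """List every interpretation that a namespace-aware receiver could pick."""
    root = ET.fromstring(xml_text)
    candidates = []
    if root_namespace(root) == XHTML_NS:
        candidates.append("xhtml")
    # ElementTree exposes xsl:version as '{namespace-uri}version'.
    if "{%s}version" % XSLT_NS in root.attrib:
        candidates.append("xslt-simplified-stylesheet")
    return candidates


print(candidate_interpretations(PLAIN))
print(candidate_interpretations(MIXED))
```

The first document yields a single candidate, while the second yields
two, and nothing in RFC 3023 says which one a receiver should choose.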
2.1 Out of scope
As this document attempts to define an 80/20 solution for namespace
dispatching, some topics inevitably fall in the "20". For the
moment, these include:
o other XML processing model issues
o XML Events (http://www.w3.org/TR/xml-events/)
o any meaning that crosses namespace boundaries, such as "id"
o XInclude (http://www.w3.org/TR/xinclude/)
o processors that produce documents or document fragments, and
  expect the output to be incorporated into the original document
  for re-processing
However, the author welcomes input into how this specification can
be further generalized without undue implementation or specification
complexity.
3. Rules for namespace dispatching
The intent of namespace dispatching is to use the namespace as
Internet media types are currently used: as a means of choosing a
software application to process the portion of a document using that
namespace. That is, instead of a "mailcap" style [@@ref?]
mapping of media types to applications, namespace dispatching
requires a namespace-to-application mapping.
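As a rough sketch of what such a mapping might look like, the
following uses an in-memory table in place of a mailcap-style file.
The table contents, the application names, and the function names are
hypothetical; this document does not define a storage format for the
mapping.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace-to-application table, playing the role that
# mailcap entries play for media types.
NAMESPACE_MAP = {
    "http://www.w3.org/1999/xhtml": "xhtml-renderer",
    "http://www.w3.org/2000/svg": "svg-renderer",
}


def namespace_of(element):
    """Return the namespace URI of an element, or None if unqualified."""
    tag = element.tag
    return tag[1:].split("}", 1)[0] if tag.startswith("{") else None


def application_for(element, table=NAMESPACE_MAP):
    """Look up the application registered for the element's namespace."""
    return table.get(namespace_of(element))


root = ET.fromstring('<svg xmlns="http://www.w3.org/2000/svg"/>')
print(application_for(root))
```

An element whose namespace has no entry yields None, leaving the
fallback behaviour to the caller.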
The processing rules for namespace dispatching are presented below
as both rules for dispatchable XML processors, and rules for content.
3.1 Conforming processors
These are the rules that must be followed by software claiming to
conform to this specification.
3.1.1 Only namespaces can specify processors
At all points in a document, the namespace of the element currently
being processed must be the authoritative determinant of the
processor doing that processing, except in the case described in
section 3.1.2.
Note that some processors may be able to process multiple namespaces.
If multiple processors for a particular namespace are available, this
specification says nothing about which one should be used.
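The rule can be sketched as a tree walk in which every element is
assigned a processor by its own namespace, so that control changes
hands at each namespace boundary. The sample document, the processor
names, and the "unhandled" marker are assumptions; the override case
of section 3.1.2 is not modelled.

```python
import xml.etree.ElementTree as ET

# Hypothetical compound document: XHTML containing an SVG fragment.
DOC = """<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <svg xmlns="http://www.w3.org/2000/svg"><circle r="5"/></svg>
  </body>
</html>"""

# Hypothetical processors, keyed by namespace URI.
PROCESSORS = {
    "http://www.w3.org/1999/xhtml": "xhtml",
    "http://www.w3.org/2000/svg": "svg",
}


def split_name(element):
    """Split ElementTree's '{namespace}local' tag into (namespace, local)."""
    tag = element.tag
    if tag.startswith("{"):
        namespace, local = tag[1:].split("}", 1)
        return namespace, local
    return None, tag


def dispatch_plan(element, processors, plan=None):
    """Return (local name, processor) pairs in document order."""
    if plan is None:
        plan = []
    namespace, local = split_name(element)
    # The element's own namespace, not its parent's, selects the processor.
    plan.append((local, processors.get(namespace, "unhandled")))
    for child in element:
        dispatch_plan(child, processors, plan)
    return plan


for name, processor in dispatch_plan(ET.fromstring(DOC), PROCESSORS):
    print(name, "->", processor)
```

The table here holds one processor per namespace for simplicity. How
to choose among several processors for the same namespace is left
open, as noted above.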
3.1.2 Permit dispatching to be overridden
Some namespace processors may
3.2 Conforming content
3.2.1 Containment must be permitted
An element that contains an element from another namespace must
explicitly permit that containment (via its schema, if specified).
3.2.1.1 XML Schema interpretation
XML Schema [XMLSCHP1] defines wildcards (Section 3.10) that specify
different rules about how the containment of content from other
namespaces is handled. For example, this declaration within a schema
can be used to permit any namespace;
In XML Schema parlance, this rule means;
o if the schema is known, the element must not contain
@@need to learn more about validation and the role of xsi:type
@@need to cover DTDs(?), Relax NG.
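For reference, the namespace test that a wildcard applies can be
sketched as follows. This is a simplified reading of the "namespace"
attribute of a wildcard in XML Schema 1.0 ("##any", "##other", or a
list that may include "##targetNamespace" and "##local"). It ignores
"processContents", and the function and parameter names are
hypothetical.

```python
def wildcard_allows(constraint, target_ns, element_ns):
    """Report whether a wildcard's namespace constraint admits an element.

    constraint: value of the wildcard's "namespace" attribute.
    target_ns:  target namespace of the schema holding the wildcard.
    element_ns: namespace URI of the contained element (None if unqualified).
    """
    if constraint == "##any":
        return True
    if constraint == "##other":
        # Assumed XML Schema 1.0 reading: a namespace is required, and it
        # must differ from the schema's target namespace.
        return element_ns is not None and element_ns != target_ns
    allowed = set()
    for token in constraint.split():
        if token == "##targetNamespace":
            allowed.add(target_ns)
        elif token == "##local":
            allowed.add(None)
        else:
            allowed.add(token)
    return element_ns in allowed


XHTML_NS = "http://www.w3.org/1999/xhtml"
SVG_NS = "http://www.w3.org/2000/svg"

print(wildcard_allows("##any", XHTML_NS, SVG_NS))
print(wildcard_allows("##other", XHTML_NS, XHTML_NS))
```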
3.2.2 Mandatory extension mechanisms must be respected
Some XML-based languages include a feature sometimes called a
"mandatory extension mechanism" [EXTLANG]. SMIL, for example,
defines the attribute "skip-content", which can be used to modify the
behaviour of the element to which it is bound, requiring that SMIL
processors either process (and perhaps fail trying), or not process,
the contained content.
If the containing element uses a mandatory extension to require that
contained content be understood, then
5. Registration of the "application/xmld" media type
MIME media type name: application
MIME subtype name: xmld
Required parameters: none
Optional parameters:
charset
This parameter has identical semantics to the charset parameter
of the "application/xml" media type as specified in [XMLMIME].
Encoding considerations:
Identical to those of "application/xml" as described in [XMLMIME],
Section 3.2.
Security considerations:
Interoperability considerations:
Published specification:
This document.
Applications which use this media type:
No known applications currently use this media type.
Additional information:
Magic number:
Same as [XMLMIME] Section 3.1
File extension:
".xml" may be used because [XMLMIME] suggests it be used for any
XML content, but this may result in the content being served as
text/xml or application/xml, removing the added benefits that
this new media type provides.
Macintosh File Type code: TEXT
Person & email address to contact for further information:
Mark Baker
Intended usage: COMMON
Author/Change controller:
4. Future considerations
This document may need to consider these other issues at some point:
o any generalized mandatory extension mechanism that arises in the
context of XML and/or RDF
5. Authors' Addresses
Mark A. Baker
Idokorro Mobile, Inc.
44 Byward Market, Suite 240
Ottawa, Ontario, CANADA. K1N 7A2
tel:+1-613-789-1818
mailto:mbaker@idokorro.com
mailto:distobj@acm.org
6. Acknowledgements
TBD.
7. References
[XML] Bray, T., Paoli, J., Sperberg-McQueen, C.M. and E. Maler,
"Extensible Markup Language (XML) 1.0 (Second Edition)",
World Wide Web Consortium Recommendation REC-xml, October
2000.
[XMLMIME] Murata, M., St. Laurent, S. and D. Kohn, "XML Media Types",
RFC 3023, January 2001.
[EXTLANG] Berners-Lee, T. and D. Connolly, "Web Architecture:
Extensible Languages", W3C Note, February 1998. Available at
http://www.w3.org/TR/NOTE-webarch-extlang
[XSLT] Clark, J., "XSL Transformations (XSLT) Version 1.0", W3C
Recommendation, November 1999. Available at
http://www.w3.org/TR/xslt
[XHTMLMED] Baker, M. and P. Stark, "The 'application/xhtml+xml' Media
Type", RFC 3236, January 2002.
8. Appendices
8.1 Processing model feature consideration matrix
This matrix reflects the thought process in trying to extract
content and processing rules, by examining the interaction between
schema wildcarding (such as seen in XML Schema), and the type of
extension (mandatory or not).
              (schema)    allow    deny    not-speced
(mandatory)
   true                    (1)      (3)       (5)
   false                   (2)      (4)       (6)
(1) if the schema for language foo explicitly permits content from
language bar, do we require namespace dispatch on bar content?
No, because the foo processor is in control. It can punt if it
wants, but it has to recognize that content (per the semantics of
the mandatory extension mechanism in use).
(2) same as for (1), except that it doesn't have to recognize the
content in order to (optionally) punt.
(3) a processor with knowledge of the schema should be able to
"intercept" dispatch and process the sub-namespace itself.
(4) same as (3)
(5) business as usual
(6) business as usual
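The matrix and its notes can also be read as a lookup table. The
sketch below encodes it that way; the outcome labels are paraphrases
of notes (1) through (6), "business as usual" is read here as ordinary
namespace dispatch, and the function name is hypothetical.

```python
# Outcome labels paraphrase notes (1)-(6); they are not normative text.
IN_CONTROL_MUST_RECOGNIZE = "containing processor in control; must recognize content"
IN_CONTROL_MAY_PUNT = "containing processor in control; may punt without recognizing"
MAY_INTERCEPT = "schema-aware processor may intercept dispatch"
NAMESPACE_DISPATCH = "business as usual (namespace dispatch)"

MATRIX = {
    ("allow", True): (1, IN_CONTROL_MUST_RECOGNIZE),
    ("allow", False): (2, IN_CONTROL_MAY_PUNT),
    ("deny", True): (3, MAY_INTERCEPT),
    ("deny", False): (4, MAY_INTERCEPT),
    ("not-speced", True): (5, NAMESPACE_DISPATCH),
    ("not-speced", False): (6, NAMESPACE_DISPATCH),
}


def dispatch_case(schema_rule, mandatory):
    """Return (case number, outcome) for a schema rule and extension type."""
    try:
        return MATRIX[(schema_rule, bool(mandatory))]
    except KeyError:
        raise ValueError("unknown schema rule: %r" % (schema_rule,))


print(dispatch_case("allow", True))
print(dispatch_case("not-speced", False))
```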
8.2 Options for binding to Internet media types
Two of the alternative solutions to binding this behaviour to an
Internet media type are presented here.
8.2.1 Redefine */xml and maybe the meaning of "+xml"
This possibility involves making normative changes to RFC 3023 to
associate this behaviour with application/xml, text/xml, and possibly
any type using the "+xml" convention.
The advantage of this approach is that these types are well known
and that much of the existing content out there being served as
*/xml conforms to the content rules specified here. The disadvantage
is that not all content conforms, including any XSLT [XSLT] style
sheet using the simplified stylesheet form.
8.2.2 Define a new media type
We could define a new generic XML media type to which the rules
specified here could be bound. An interesting consideration is
whether or not this new type should use the "+xml" suffix. An
advantage of doing this is that it can easily reuse all of the
common behaviour of "+xml". The main disadvantage is that, because
dispatch behaviour within "+xml" types is undefined (see [XMLMIME]
Section 3, last paragraph), a generic XML processor may not treat
the content properly, and may even modify it in ways that break its
conformance to the content rules specified here. For that reason,
"+xml" will not be used.
We might also want to consider specifying a suffix that custom types
could use to indicate that, although they enable namespace dispatch
and could be described with this new generic media type, they still
have a good reason for needing their own media type. The processing
behaviour of this new media type should also be consistent with the
processing rules defined here. Currently, the requirements for this
are unclear, so it will not be considered at this time.