We envision applications of XML in which a document instance may contain markup defined in multiple schemas. These schemas may have been authored independently. One motivation for this is that writing good schemas is hard, so it is beneficial to reuse parts from existing, well-designed schemas. Another is the advantage of allowing search engines or other tools to operate over a range of documents that vary in many respects but use common names for common element types.
These considerations require that document constructs should have universal names, whose scope extends beyond their containing document. This specification proposes a mechanism, XML Namespaces, which accomplishes this.
XML Namespaces are based on the use of qualified names. XML syntax does not allow direct use of a URI as part of a name, because URIs can contain characters not allowed in names. Instead, a namespace name serves as a proxy for a URI.
A special processing instruction, described below, is used to associate a namespace name with a URI.
To accomplish this, the production for
The namespace name, unless it is "xml", must have been declared in a namespace declaration. The namespace name "xml" is reserved, and is considered to have been implicitly declared.
To enable the proper use of
Element types may be given as
Attribute names are given as
I have to write a schema for manuals. As manuals have mathematical expressions, my schema has to allow them.
Fortunately, W3C has a schema called MathML. As I know little about mathematical expressions, I would like to use MathML as is. I do not even want to read what is defined in MathML. If a better schema for mathematical expressions appears later, I will switch to that schema.
Although I do not care about internal structures of mathematical expressions, I do want to restrict where they may appear. I do not allow them in footnotes. I only allow them as direct subordinates of sections and subsections.
Writers will use XML editors to edit manuals. In the near future, mathematical expression editors for MathML will show up in the market. My writers will use such editors to edit mathematical expressions in manuals. While editing manuals with XML editors, writers can introduce mathematical expressions, provided that my schema allows mathematical expressions there. Then, mathematical expression editors are automatically invoked. After creating mathematical expressions, writers close math editor windows, and resume editing in XML editors.
Observe that implementation of XML editors does not require implementation of mathematical expressions, and that mathematical expression editors are dedicated to MathML. Neither editor need know the entire document or schema.
I have to write a schema for on-line novels. Because of some regulation, each novel has to have metadata. The schema of such metadata is already defined by somebody (the government, for example). I have to use that metadata schema as is. No change is allowed.
As in the previous case, I do not care about the internal structure of metadata. But I allow metadata to appear as the eldest child of the novel element only.
Writers write novels with XML editors. The editors use my schema. But writers do not provide metadata. Writers see no metadata.
Somebody examines each novel and then creates metadata. If the novel is pornographic, he or she will record this in the metadata. The novel itself is not modified at all; the only change is to the metadata. Therefore, metadata editing tools do not provide editing of novels, but only editing of metadata.
If the schema for metadata is revised by the government, I simply refer to the new schema. Writers do not rewrite novels. If the revision is backward-compatible, existing metadata (embedded within novels) remains valid. If not, metadata has to be edited manually or converted automatically.
My schema for manuals should provide tables. But I do not want to study columns and rows. I would like to use somebody's schema for tables. I simply refer to that schema.
Although I do not care about columns and rows, I do care about the permissible subordinates of entries. I would like to allow data characters, phrase elements, and mathematical expressions only. Nothing else can appear within entries.
Editing of tables is similar to that of mathematical expressions, but we need recursive editing. A writer edits a manual with an XML editor. When he or she introduces a table, a table editor is automatically invoked. The table editor is dedicated to tables, and does not use my schema. After the writer creates rows, columns, and entries, the XML editor is recursively invoked to create subordinates of entries.
Imagine an XML document representing an invoice for books. If public schemas exist for elements and attributes describing books, electronic transactions and digital signatures, the invoice author should be able to use these, rather than inventing new element and attribute types. Any reader of the invoice document should be able to infer a consistent meaning to its contents, the same meaning as if the elements and attributes had appeared in a different kind of document (such as an invoice for automotive parts, or an inventory of books or a digital signature on a legal contract). Any search tool should locate the elements, regardless of the document in which they reside. Further, since several schemas may choose the same name (e.g. "size") for elements or attributes with different meanings, these must be distinguished if used within the same document.
The namespace syntax presented in this working draft is intended
to support the namespace needs expressed by other W3C activities, to
enable interoperability and to provide for future enhancements to the
XML specification. Unfortunately, the syntax presented is not sufficiently
robust to describe the blind interchange of validated documents which
contain elements and attributes whose types are defined in several
schemas. Therefore, to provide insight into the intent of the namespace
syntax, this draft includes a brief summary of the SIG discussion and
rationale. This section will not attempt to present detailed technical
discussion nor will it document the individual contributions of those
who participated in the discussion. These details are available in the
The namespace discussion has resulted in no changes to the XML 1.0 syntax; colons continue to be valid name characters. Our intent is to enable the development of namespace aware applications without adding large passages to the XML specification and without adding significant burden to XML application developers. In developing this proposal we have avoided several features and functions we believe will be included in the full namespace specification in a future release of the XML specification.
Specifically, this working draft does not establish semantics for validating document instances against multiple schemas or the mechanics for minimizing namespace names, nor does it address whether qualified attributes should be constrained or whether constraints should be placed on other name characters. It is anticipated that this proposal will promote the development of industry experience with regard to multiple schema validation, inheritance, sub-classing, editing and cut-and-paste operations, and application behaviors that will be reflected in future versions of the XML specification.
This working draft does add constraints to the XML syntax by limiting the use of colons in names and establishing a convention for namespace declarations. It is the intent of these constraints to limit the namespace syntax sufficiently that future extensions can be defined to resolve remaining namespace issues. It is expected that legacy data conforming to this note will be compatible with these solutions. Note: there is no guarantee that any namespace mechanism will be adopted for XML, nor that the mechanism will in fact be compatible with the syntax described in this working draft.
The XML specification uses Name for the following:
PI target
Root element type in doctype declaration
Element type in start-, empty-element, and end-tags, and in element type declarations
Attribute names, in start-tags, and in attribute list declarations
As the value of ID, IDREF(S), and ENTITY(IES) attributes (note that the values in NMTOKEN(S) attributes are NMTOKENS, not names)
Entity names, in declarations and references (general and parameter entities)
Notation names, in NDATA entity declarations and Notation declarations
(As LatinName) as the encoding name in an XML declaration
The instance of Name that is the first and least controversial candidate for qualification is the element type.
Existing SGML practice shows that attributes are often used for much the same purposes as elements, with the choice determined by evolutionary history and design aesthetics as often as by differences in element and attribute capabilities. Also, certain kinds of attributes are used on a wide range of element types (for example, those employed in XML Links, those that might indicate the datatype of an element, etc.). Thus, attribute names are a strong candidate for qualification.
Furthermore, on the basis of consistency and simplicity, it might be argued that if one instance of Name is to be qualified, all should be.
On the other hand, attributes are already qualified by element types; that is, the permissible values, defaults, and semantics of attributes depend on element types. It has been argued that further qualification of attributes by namespaces is unnecessary, and is even harmful for validation (see 4.2). Those attributes (e.g., xml:lang) which apply to all element types should rather be captured by a different mechanism (e.g., #ALL in the WebSGML adaptation), as their permissible values, defaults, and semantics do not depend on element types.
Perhaps the most complicated issue surrounding namespaces is validation. In the discussion, many viewed validation as any process that verifies a document instance against a schema, while others often referred to validation as "plain old DTD-wise validation with an XML processor, not any other kind of validation with an application". This white paper tries to support both.
For generic schema validation it is anticipated that applications will validate against a set of semantics that are either predefined in an application standard or expressed in machine readable syntax. The system literal in the namespace declaration should be used to identify the specific schema and any associated data resources.
The two most commonly discussed approaches to XML validation were fragment merging and validation of a grove of independent fragments. These methods differ in how the validation process is implemented, but should yield the same result. This specification most directly supports validation by fragment merging. Fragment merging validates the document instance against a DTD created by merging elements from the constituent DTD fragments. The merged DTD may be created either by man or machine, on the fly or prior to validation. This approach has the advantage that once the merged DTD is created, the validation process is unchanged from current SGML/XML practice.
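As a rough illustration of fragment merging, the following Python sketch combines element declarations from two independently authored fragments into a single set of declarations keyed by qualified name. The function name `merge_fragments`, the prefixes, the element names, and the content models are all hypothetical, and the sketch assumes that each fragment has already been reduced to a simple mapping. It does not rewrite references inside content models, which a real merge would also have to qualify.

```python
# Hypothetical sketch of fragment merging: element declarations from
# independently authored DTD fragments are combined into one merged set.

def merge_fragments(fragments):
    """Merge per-namespace element declarations into one flat mapping.

    `fragments` maps a namespace prefix to a dict of
    {local element name: content model string}. Each merged key is the
    qualified name "prefix:local", so identical local names taken from
    different fragments do not collide.
    """
    merged = {}
    for prefix, declarations in fragments.items():
        for local_name, content_model in declarations.items():
            merged[f"{prefix}:{local_name}"] = content_model
    return merged


# Illustrative fragments only; none of these names come from a real schema.
fragments = {
    "book": {"title": "(#PCDATA)", "size": "(#PCDATA)"},
    "sig": {"signature": "(#PCDATA)", "size": "(#PCDATA)"},
}

merged = merge_fragments(fragments)

# Emit the merged declarations in DTD element-declaration form.
for qname, model in sorted(merged.items()):
    print(f"<!ELEMENT {qname} {model}>")
```

Both fragments declare an element named "size", yet the merged set keeps them apart as `book:size` and `sig:size`, after which ordinary DTD validation can proceed against the merged declarations.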
XML validation using a
In considering the validation issues, it became apparent to many that we have not arrived at a consensus about what it means, in the general case, to apply an attribute from one schema to an element from another schema. Coupled with a growing concern about the detailed meaning of multiple schema validation and a desire to develop an XML-based syntax to express a superset of the SGML DTD semantics, it was decided to support the most straightforward XML validation technique that supported the requirements and which will also foster an environment where application developers can evolve working models of multiple namespace validation with various schema syntaxes.
Two mechanisms were discussed for associating elements with namespace schemas: qualified names and a reserved attribute. While many in the discussion admired the simplicity of the reserved-attribute approach, which required no namespace declaration and no new syntax, the qualified namespace prefix syntax was chosen because it supports two requirements not possible with URI attributes. In addition, namespace qualifiers may be more compact, more meaningful to a reader, and easier to understand, describe and use than a reserved attribute.
Qualified names support the requirement to identify the schema for an attribute and to be able to apply an attribute from one document fragment to an element from another document fragment. The use of URI attributes would not allow an attribute from one namespace to be applied to an element from a different namespace fragment. Additionally, many believe qualifiers are needed for other names, such as IDs, IDREFs, and enumerated attribute values, and perhaps should be permitted on all names. This qualified name syntax will not need to change if we decide to support additional name types in the future.
This working draft does not specify how the URI system literal in the Namespace Declaration PI is to be used, aside from saying that it identifies the namespace schema. We anticipate that validating XML processors will use the QName for validation. Other applications are free to choose whether and how the URI, LocalPart pair is interpreted. Note: a future version of the namespace specification that addresses schema validation semantics may require the interpretation of the URI, LocalPart pair to be formalized.
A validating XML processor must be able to disambiguate between QNames that have the same LocalParts. One intent is to support the ability to merge two document fragments that contain the same generic identifier while still performing XML validation against the original content models. This ability to differentiate may also ease the deployment of stylesheets and the development of other processing applications. Finally, because applications need only control the NSPart prefix, qualified names simplify avoiding namespace collisions.
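The distinction between a QName and the URI, LocalPart pair can be sketched in Python as follows. This is an illustration only: the function names, the example namespace names, and the example URIs are hypothetical, and the sketch assumes that a qualified name contains exactly one colon separating the NSPart from the LocalPart. The implicitly declared "xml" namespace name is not modelled.

```python
def split_qname(qname):
    """Split a name into (NSPart, LocalPart) at its colon.

    Returns (None, name) for an unqualified name. Assumes a qualified
    name carries exactly one colon, with text on both sides of it.
    """
    if qname.count(":") > 1:
        raise ValueError(f"more than one colon in name: {qname!r}")
    prefix, sep, local = qname.partition(":")
    if not sep:
        return None, qname
    if not prefix or not local:
        raise ValueError(f"empty NSPart or LocalPart in name: {qname!r}")
    return prefix, local


def resolve(qname, declarations):
    """Map a name to its (URI, LocalPart) pair.

    `declarations` maps each declared namespace name to its URI.
    Unqualified names resolve to (None, name).
    """
    prefix, local = split_qname(qname)
    if prefix is None:
        return None, local
    if prefix not in declarations:
        raise KeyError(f"undeclared namespace name: {prefix!r}")
    return declarations[prefix], local


# Hypothetical declarations; the URIs are placeholders.
declarations = {
    "book": "http://example.org/books",
    "sig": "http://example.org/signatures",
}

book_size = resolve("book:size", declarations)
sig_size = resolve("sig:size", declarations)
print(book_size)
print(sig_size)
```

The two names share the LocalPart "size" but resolve to different pairs, which is the disambiguation that a validating processor or a stylesheet engine would rely on.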
The syntax presented in this working draft uses a reserved processing instruction (PI) to associate a namespace qualifier with a network resource. In the discussions many alternatives were offered, including using a new declaration type, system notations or XLL links. While many considered PIs the least desirable long-term option, the PI was chosen for this draft because it is the most compatible with existing processing systems and should prove easy to migrate to another syntax in the future.
The ability to associate multiple schemas with a namespace qualifier was discussed. The multiple associations could provide both machine and human readable schemas or even multiple machine schemas. The namespace syntax in this working draft requires that there be exactly one namespace declaration PI for each namespace name. Future namespace specifications may establish conventions which support multiple associations as well as a mechanism to qualify a particular attribute of a specific element in a namespace.
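The requirement of exactly one declaration per namespace name lends itself to a simple check. The Python sketch below assumes the declarations have already been extracted from the document into (namespace name, URI) pairs; it does not parse the PI syntax itself, and the function name and sample values are hypothetical.

```python
from collections import Counter


def duplicate_namespace_names(declarations):
    """Return, sorted, the namespace names that are declared more than once.

    `declarations` is a list of (namespace name, URI) pairs in document
    order. An empty result means each name has exactly one declaration.
    """
    counts = Counter(name for name, _uri in declarations)
    return sorted(name for name, count in counts.items() if count > 1)


# Hypothetical declarations; "book" is declared twice on purpose.
declarations = [
    ("book", "http://example.org/books"),
    ("sig", "http://example.org/signatures"),
    ("book", "http://example.org/other-books"),
]

duplicates = duplicate_namespace_names(declarations)
print(duplicates)
```

A processor following this draft could treat a non-empty result as an error, since a namespace name with two declarations would no longer identify a single schema.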
The namespace syntax presented in this working draft severely constrains qualified names. The following constraints were chosen based on an understanding of the immediate requirements of other W3C activities and to reserve syntactic constructs to support extensions of the namespace mechanisms.
This work reflects input from a very large number of people, including especially the members of the World Wide Web Consortium XML Working Group and Special Interest Group.
In particular, Murata Makoto contributed the operational scenarios in the examples section.