

W3C NOTE-ddml-19990119


Document Definition Markup Language (DDML) Specification, Version 1.0

W3C Note, 19-Jan-1999

This Version:
http://www.w3.org/TR/1999/NOTE-ddml-19990119
Latest Version:
http://www.w3.org/TR/NOTE-ddml
Editors:
Ronald Bourret <rbourret@ito.tu-darmstadt.de>, Darmstadt University of Technology
John Cowan <cowan@locke.ccil.org>
Ingo Macherius <macherius@gmd.de>, GMD
Simon St. Laurent <simonstl@simonstl.com>, simonstl.com

Status of this document

This document is a submission to the World Wide Web Consortium from GMD (see Submission Request, W3C Staff Comment).

This document is a NOTE made available by W3C for discussion only. This indicates no endorsement of its content, nor that W3C has had any editorial control in its preparation, nor that W3C has, is, or will be allocating any resources to the issues addressed by the NOTE.

DDML was formerly known as XSchema. For a list of differences between DDML and XSchema, see Appendix B, "Differences between DDML and XSchema".

Abstract

This document proposes Document Definition Markup Language (DDML), a schema language for XML documents. DDML encodes the logical (as opposed to physical) content of DTDs in an XML document. This allows schema information to be explored and used with widely available XML tools.

DDML is deliberately simple, providing an initial base for implementations. While introducing as few complicating factors as possible, DDML has been designed with future extensions, such as data typing and schema reuse, in mind.

Table of Contents

1 Introduction
  1.1 Origin and Goals
  1.2 Relation to Standards
  1.3 Related Work
  1.4 Terminology
  1.5 Comments and History
2 DDML Syntax
  2.1 The DocumentDef Element
  2.2 Element Declarations
  2.3 Content Model Declarations
     2.3.1 Empty Content Model
     2.3.2 Any Content Model
     2.3.3 PCData Content Model
     2.3.4 Reference Content Model
     2.3.5 Mixed Content Model
     2.3.6 Choice Content Model
     2.3.7 Sequence Content Model
  2.4 Attribute Declarations
     2.4.1 Attribute Types
     2.4.2 Attribute Defaults
     2.4.3 Combinations of Types, Defaults, and Default Values
  2.5 Notation Declarations
  2.6 Unparsed Entity Declarations
  2.7 DDML Extensions
     2.7.1 Documentation Extensions
     2.7.2 Other Extensions
  2.8 id Attributes
3 DDML and Namespaces
  3.1 The DDML Namespace
  3.2 Namespaces of Elements and Attributes Being Defined
4 DDML Documents and DTDs
  4.1 DTDs in DDML Documents
  4.2 DTDs in Documents Described by DDML Documents
  4.3 Converting Between DDML Documents and DTDs
     4.3.1 Converting DTDs to DDML Documents
     4.3.2 Converting DDML Documents to DTDs
5 Using DDML Documents
  5.1 Associating DDML Documents with XML Documents
     5.1.1 DDML Processing Instruction
     5.1.2 Inline DDML Elements (Non-Normative)
  5.2 Validation
  5.3 Suggested Uses of DDML Documents (Non-Normative)
     5.3.1 Parsed Entities in DDML Documents
     5.3.2 Entity Support in DDML
     5.3.3 DTD Replacement
     5.3.4 Schema Repository
     5.3.5 Reusing Element Declarations with Entities or Processing Instructions
     5.3.6 Reusing Schema Definitions through XLinks
     5.3.7 Authoring
     5.3.8 General Schema Information
     5.3.9 Custom Uses

Appendices

A: References
B: Differences between DDML and XSchema
C: DDML DTD
D: DDML in DDML
E: Contributors

1 Introduction

In order for document processing to be reliable, it is necessary to be able to describe classes of documents and to verify individual documents' membership in these classes -- in other words, to be able to express constraints on documents and thus define 'document types'. XML inherits a mechanism for doing this from SGML: the Document Type Definition. XML DTDs can perform a subset of the functions of SGML DTDs.

DTDs have limited expressiveness and it is necessary to experiment with new ideas in schema design. These ideas include a syntax that is more like that of XML document content, certain kinds of extensibility and a cleaner separation between parsing and verifying. DDML is an experimental schema language designed to provide a starting point for these experiments.

So that DDML documents will be immediately useful with existing software, the DDML specification will describe a conversion from DDML documents to DTDs. This initial version of the DDML specification is deliberately simple, providing an initial base for implementations while introducing as few complicating factors as possible. Authors accustomed to DTD creation will find their tool set constricted; it is hoped that supporting software and tools available from other standards will make up for this reduced tool set.

1.1 Origin and Goals

Proposals for describing SGML document type definitions using document syntax rather than the separate declaration syntax have been under development for a number of years, and have been used by several tools for documentation. The current proposal arose from a number of concerns surrounding XML's usability and consistency. Originally conceived of as a mapping of DTD syntax to document syntax, the project has developed into an effort focused on creating schemas describing element and attribute structures rather than preserving every function provided by XML 1.0 DTDs.

The list of goals developed by the xml-dev discussion follows:

  1. DDML documents shall use XML document syntax, using element nesting and attributes to describe all constraints that may be verified by a processor using DDML.
  2. DDML shall define a transformation from DDML documents to DTDs.
  3. DDML documents shall be capable of representing the normalized element and attribute structures defined in XML 1.0 DTDs, and provide namespace support.
  4. DDML documents shall be parseable, manageable, and manipulable using the same tools used to parse, manage, and manipulate XML documents.
  5. DDML documents shall be easy to create, read, and modify, and shall provide authoring support for XML documents.
  6. DDML documents shall be easy to use in combination with a parser to provide structural validation of documents.
  7. DDML shall include a DDML document and an XML 1.0 DTD defining the structure of DDML documents.
  8. DDML shall suggest mechanisms for applying DDML documents to documents.
  9. DDML shall include mechanisms for extending the information included in DDML documents to support metadata.
  10. The DDML specification shall be readable, clear, and rigorous, using terminology and nomenclature as close to the XML 1.0 specification as possible.
  11. The DDML specification will comply with and be consistent with W3C recommendations.
  12. DDML documents shall provide constructs for human- and machine-readable documentation.

1.2 Relation to Standards

DDML documents use XML 1.0 document instance syntax and may be applied to XML 1.0 [XML] documents. This specification refers to several IETF standards, notably Multipurpose Internet Mail Extensions (MIME) ([RFC 2046] and [RFC 2048]) and XML Media Types [RFC 2376].

Namespace usage in DDML is based on the 17 November, 1998 "Namespaces in XML" Proposed Recommendation [Namespaces]. Because this recommendation is still subject to change, all namespace attributes (the xmlns and xmlns:DDML attributes of the DocumentDef element and all ns, prefix, and ElementNS attributes) and processing (Section 3, "DDML and Namespaces") are subject to change, even after the rest of the DDML specification is finalized.

It is hoped that future versions of DDML will use [XLink] and [XPointer] to implement schema reuse.

DDML has been influenced by the XML-Data proposal [XML-Data]. It is hoped that DDML may be mapped to an RDF vocabulary.

1.3 Related Work

Several other proposals for XML schema languages also exist. These are Document Content Description for XML ([DCD]), Schema for Object-Oriented XML ([SOX]), and XML-Data [XML-Data]. Resource Description Framework (RDF) Schema Specification (RDF Schemas) proposes a related language, although this is not currently suitable for replacing DTDs.

1.4 Terminology

The requirement levels used throughout this document reflect the approach of [RFC 2119], though keywords (like may and must) are not capitalized. Other terms used are defined in the XML 1.0 Recommendation [XML].

1.5 Comments and History

Please send public comments to the XML-Dev mailing list [XML-DEV] and private comments to Simon St. Laurent or Ronald Bourret.

DDML is a cooperative effort of members of the XML-Dev mailing list and has been submitted as a Note to the W3C through the GMD - Forschungszentrum Informationstechnik GmbH, Bonn, Germany. For a complete list of contributors, see Appendix E, "Contributors". Historical information regarding the development of DDML is available at http://purl.oclc.org/NET/ddml.

2 DDML Syntax

This section describes DDML document syntax. In version 1.0, the DDML document is an XML document containing a single DocumentDef element in which information describing the schema is nested. The DocumentDef element must be preceded by an XML declaration and may be preceded by other declarations, comments, and processing instructions. In future versions of DDML, DocumentDef elements may be embedded in instance documents.

2.1 The DocumentDef Element

The DocumentDef element is the root element for all DDML documents. The declaration for the DocumentDef element is:

<!ELEMENT DocumentDef (Doc?, More?, (ElementDecl | Model | AttDef | AttGroup | Notation | UnparsedEntity | Enumeration | DocumentDef)*)>
<!ATTLIST DocumentDef
    xmlns         CDATA   #FIXED   "http://www.purl.org/NET/ddml/v1"
    xmlns:DDML    CDATA   #FIXED   "http://www.purl.org/NET/ddml/v1"
    ns            CDATA   #IMPLIED
    ElementNS     CDATA   #IMPLIED
    prefix        NMTOKEN #IMPLIED

    Version       CDATA   #FIXED   "1.0"
    MimeType      CDATA            "application/xml"
    FileExtension CDATA            "xml"
    id            ID      #IMPLIED>

The DocumentDef element contains other elements describing the document type and building a schema. These elements are described in later sections of this specification. The DocumentDef element may also contain other DocumentDef elements nested inside of it. This nesting of DocumentDef elements improves reusability of DDML documents by allowing the combination of multiple DDML documents inside of a single DDML document. It also allows finer-grained control over documentation for subsections of a DDML document.
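
As a non-normative illustration of nesting, the following sketch places one DocumentDef element inside another. The element names Species and Habitat are assumptions chosen for this example, and the content models used are described in Section 2.3, "Content Model Declarations".

```xml
<?xml version="1.0"?>
<!-- Outer DocumentDef: the root element of the DDML document -->
<DocumentDef xmlns="http://www.purl.org/NET/ddml/v1" Version="1.0">
  <ElementDecl Name="Species">
    <Model>
      <PCData/>
    </Model>
  </ElementDecl>
  <!-- Nested DocumentDef: groups a second, separately reusable set of declarations -->
  <DocumentDef Version="1.0">
    <ElementDecl Name="Habitat">
      <Model>
        <PCData/>
      </Model>
    </ElementDecl>
  </DocumentDef>
</DocumentDef>
```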

The DocumentDef element's attributes include information about the namespaces used by DDML and elements defined with DDML, the version of the DDML specification used, and information about the type of documents described by the DDML document.

The xmlns and xmlns:DDML attributes identify the URI of the namespace containing the DocumentDef elements. For more information, see Section 3.1, "The DDML Namespace."

The ns attribute provides the URI of the namespace containing elements and attributes being declared, as with an ElementDecl element. The ElementNS attribute identifies the URI of the namespace of elements being referenced, as with a Ref element. The prefix attribute identifies the prefix to be used when converting the DDML document to a DTD. These attributes can be overridden in the elements that declare and reference elements and attributes. For more information, see Section 3.2, "Namespaces of Elements and Attributes Being Defined."

Information about the DDML specification version used to create this DDML document, contained in the Version attribute, is critical to proper handling of documents should the specification be updated in the future. This specification is identified as version 1.0. Future major and minor versions of the DDML specification should identify themselves differently. No provision is made at this time for nesting DocumentDef elements using different versions of the specification under a parent DocumentDef element.

The MimeType and FileExtension attributes are used to provide a suggested MIME (Multipurpose Internet Mail Extensions) Content-type and file extension for documents associated with a particular DDML document. Applications may use this information to identify XML document types. A document library that generates XML documents dynamically could assign file extensions and MIME types based on the DDML document used.

Applications using this information should use the values stored in the first DocumentDef element encountered during processing. For instance, if a DocumentDef element includes another nested DocumentDef element, the values for the MimeType and FileExtension attributes of the root DocumentDef element should be used.

By default, most XML documents are assumed to have a MIME type of application/xml, as described in [RFC 2376]. Developers who need different MIME types for documents associated with particular DDML documents may register other MIME types with the IETF, as described in [RFC 2048], or use the 'x-' prefix syntax for subtypes, as described in [RFC 2046].
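
The following non-normative sketch shows a root DocumentDef element that suggests a MIME type and a file extension. The subtype x-species and the extension spc are hypothetical values chosen for illustration; neither is registered or defined by this specification.

```xml
<?xml version="1.0"?>
<!-- MimeType and FileExtension below are hypothetical, illustrative values -->
<DocumentDef xmlns="http://www.purl.org/NET/ddml/v1"
             Version="1.0"
             MimeType="application/x-species"
             FileExtension="spc">
  <ElementDecl Name="Species">
    <Model>
      <PCData/>
    </Model>
  </ElementDecl>
</DocumentDef>
```

If both attributes were omitted, the defaults from the DTD (application/xml and xml) would apply.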

For information about the id attribute, see Section 2.8, "id Attributes".

2.2 Element Declarations

Element declarations in DDML documents are made using the ElementDecl element and its contents:

<!ELEMENT ElementDecl (Doc?, More?, Model, AttGroup?)>
<!-- Name is the element name -->
<!ATTLIST ElementDecl
    Name   NMTOKEN #REQUIRED
    ns     CDATA   #IMPLIED
    prefix NMTOKEN #IMPLIED
    id     ID      #IMPLIED
    Root   (Recommended | Possible | Unlikely) "Possible">

The Name attribute identifies the name of the element, and is required. An element declaration would look like:

<ElementDecl Name="Species">
...additionalElementInformation...
</ElementDecl>

This declaration would declare an element named "Species", which would appear in an instance as:

<Species>...content...</Species>

The Name attribute must be unique within the set of elements in the defined namespace. It provides the name of the element as declared here and is also used by other elements to refer to this element in their content model declarations. The Name attribute must match the NCName production in [Namespaces]. (Effectively, this requires element names to begin with a letter or underscore and not include a colon.)

The ns attribute identifies the URI of the namespace containing this element and its attributes. The prefix attribute identifies the prefix that will be applied to this element and its attributes during conversion to DTDs. These attributes override the values of the same attributes on the DocumentDef element and can be overridden in the AttGroup and AttDef elements. For more information, see Section 3.2, "Namespaces of Elements and Attributes Being Defined."

The Root attribute provides authoring tools with a guide for which elements are likely root elements for documents. This is intended to simplify the choices presented to authors during document composition. Composition tools could use this to build a menu of likely starting points for a document. The Root attribute is purely a suggestion and does not require any action on the part of the processor.
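
A non-normative sketch of how the Root attribute might be used follows. The element names Catalog and Species are illustrative assumptions. An authoring tool could offer Catalog first when a new document is started and leave Species off its menu of starting points.

```xml
<!-- Fragment: both declarations would sit inside a DocumentDef element -->
<ElementDecl Name="Catalog" Root="Recommended">
  <Model>
    <Ref Element="Species" Frequency="OneOrMore"/>
  </Model>
</ElementDecl>
<ElementDecl Name="Species" Root="Unlikely">
  <Model>
    <PCData/>
  </Model>
</ElementDecl>
```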

For information about the id attribute, see Section 2.8, "id Attributes".

Note that an element must declare a content model of some type, using the Model element, even if that content model is empty. Documentation (in the Doc element), non-DDML extensions (in the More element) and attribute declarations (using the AttGroup element) are optional.

Documentation about the element, additional extensions, content-model information, and attribute information are stored as sub-elements of the ElementDecl element. Documentation is covered in Section 2.7.1, "Documentation Extensions". Additional extensions are covered in Section 2.7.2, "Other Extensions". Content models are covered in Section 2.3, "Content Model Declarations", and attributes are covered in Section 2.4, "Attribute Declarations".

2.3 Content Model Declarations

Content model declarations are made within the Model sub-element of the declaration for the element to which they apply.

Model elements may appear inside DocumentDef elements for reusability, documentation, and reference, but will need to be linked to particular element declarations through mechanisms not yet defined (most likely XLink). All content model declarations have an optional id attribute; for more information, see Section 2.8, "id Attributes".

The Model element holds the content model for an element.

<!ELEMENT Model (Doc?, More?, (Ref | Choice | Seq | Empty | Any | PCData | Mixed))>
<!ATTLIST Model
    id ID #IMPLIED>

Model elements are pure containers. A Model element nested inside a Choice or Seq element can only contain Doc, More, Ref, Choice, and Seq elements.
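
As a non-normative sketch of a nested Model element, the following declaration wraps a Choice in a Model inside a Seq. The element name Synonym is an assumption made for this example, and the DTD equivalent given in the comment is a presumed mapping rather than one stated by this section.

```xml
<ElementDecl Name="Species">
  <Model>
    <Seq>
      <Ref Element="CommonName"/>
      <!-- Nested Model: a pure container, here holding a Choice -->
      <Model>
        <Choice>
          <Ref Element="LatinName"/>
          <Ref Element="Synonym"/>
        </Choice>
      </Model>
    </Seq>
  </Model>
</ElementDecl>
<!-- Presumed DTD equivalent: <!ELEMENT Species (CommonName, (LatinName | Synonym))> -->
```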

2.3.1 Empty Content Model

The simplest content model is empty, which indicates that the parent element has no sub-elements and no character data content. The Empty element indicates that an element is empty.

<!ELEMENT Empty EMPTY>
<!ATTLIST Empty
    id ID #IMPLIED>

For example, to declare the Species element shown in the previous section empty, use the following DDML declaration:

<ElementDecl Name="Species">
  <Model>
    <Empty/>
  </Model>
</ElementDecl>

This would not allow the Species element to contain any text or sub-elements.

2.3.2 Any Content Model

The Any content model, which allows the element to contain parsed character data or any other elements as content, is equally simple:

<!ELEMENT Any EMPTY>
<!ATTLIST Any
    id ID #IMPLIED>

Using the Any content model is much like using the Empty content model. To declare that the Species element had a content model of any, use the following declaration:

<ElementDecl Name="Species">
  <Model>
    <Any/>
  </Model>
</ElementDecl>

This allows the Species element to contain text and any sub-elements an author desires.

2.3.3 PCData Content Model

The PCData content model, which allows the element to contain only parsed character data, is also represented by a single empty element.

<!ELEMENT PCData EMPTY>
<!ATTLIST PCData
    id ID #IMPLIED>

Using the PCData content model is much like using the Empty and Any content models. For example, to assign the Species element a content model of PCData, use the following declaration:

<ElementDecl Name="Species">
  <Model>
    <PCData/>
  </Model>
</ElementDecl>

This allows the Species element to contain text, but no sub-elements.

2.3.4 Reference Content Model

The Reference content model allows an element to specify other elements that it may contain. Ref elements identify the contained elements, as well as the frequency with which they appear:

<!ELEMENT Ref EMPTY>
<!-- Element references the name in an ElementDecl element -->
<!ATTLIST Ref
    Element       NMTOKEN #REQUIRED
    ElementNS     CDATA   #IMPLIED
    id            ID      #IMPLIED
    Frequency (Required | Optional | ZeroOrMore | OneOrMore) 'Required'>

A Model element may directly contain at most one Ref element. To define content models that permit or require the use of more elements, use the Any, Mixed, Choice, or Sequence content models.

The Element and ElementNS attributes identify the contained element. These must match the values of the Name and ns attributes, respectively, of an ElementDecl element elsewhere in the DocumentDef document. The ElementNS attribute overrides the value of the same attribute on the DocumentDef, Mixed, Choice, or Seq element. For more information, see Section 3.2, "Namespaces of Elements and Attributes Being Defined."
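
The matching rule above can be sketched as follows (non-normative). The two namespace URIs are hypothetical and use the example.org domain purely for illustration. The Ref element finds its target because its Element and ElementNS values equal the Name and ns values of the CommonName declaration.

```xml
<!-- Fragment: both declarations would sit inside a DocumentDef element -->
<ElementDecl Name="CommonName" ns="http://example.org/taxonomy">
  <Model>
    <PCData/>
  </Model>
</ElementDecl>
<ElementDecl Name="Species" ns="http://example.org/catalog">
  <Model>
    <!-- Element matches Name, and ElementNS matches ns, on the declaration above -->
    <Ref Element="CommonName" ElementNS="http://example.org/taxonomy"/>
  </Model>
</ElementDecl>
```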

The Frequency attribute controls the number of referenced elements that may occur.

To declare that the Species element must contain a single CommonName element, and nothing else, use the following declaration:

<ElementDecl Name="Species">
  <Model>
    <Ref Element="CommonName" Frequency="Required"/>
  </Model>
</ElementDecl>

This requires the Species element to contain a single CommonName element. To make the CommonName element optional (though it may still appear only once), set the Frequency attribute to 'Optional':

<ElementDecl Name="Species">
  <Model>
    <Ref Element="CommonName" Frequency="Optional"/>
  </Model>
</ElementDecl>

Optional is the equivalent of the ? occurrence indicator in XML 1.0 DTDs.

To require the Species element to contain at least one but possibly multiple CommonName elements, set the Frequency attribute to 'OneOrMore':

<ElementDecl Name="Species">
  <Model>
    <Ref Element="CommonName" Frequency="OneOrMore"/>
  </Model>
</ElementDecl>

OneOrMore is the equivalent of the + occurrence indicator in XML 1.0 DTDs.

Finally, to allow the Species element to contain any number (including zero) of CommonName elements, set the Frequency attribute to 'ZeroOrMore':

<ElementDecl Name="Species">
  <Model>
    <Ref Element="CommonName" Frequency="ZeroOrMore"/>
  </Model>
</ElementDecl>

ZeroOrMore is the equivalent of the * occurrence indicator in XML 1.0 DTDs.

2.3.5 Mixed Content Model

The mixed content model allows the unordered use of different element types and parsed character data. Content within an element declared as mixed can be parsed character data, one or more of the elements referenced by Ref elements nested in the Mixed element, or a mixture of both. The Mixed element in a DDML document must contain only Ref elements; there is no need to include a PCData element because this is inherent in the mixed content model.

<!ELEMENT Mixed (Ref+)>
<!ATTLIST Mixed
    ElementNS CDATA        #IMPLIED
    id        ID           #IMPLIED
    Frequency (ZeroOrMore) #FIXED   "ZeroOrMore">

To declare that the Species element may contain a mix of parsed character data, CommonName elements, LatinName elements, and PreferredFood elements in any order, use the following declaration:

<ElementDecl Name="Species">
  <Model>
    <Mixed>
      <Ref Element="CommonName"/>
      <Ref Element="LatinName"/>
      <Ref Element="PreferredFood"/>
    </Mixed>
  </Model>
</ElementDecl>

The DDML processor must ignore any Frequency attributes in Ref elements that appear as subelements of the Mixed element.
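
For illustration only, an instance that would conform to a mixed declaration listing CommonName, LatinName, and PreferredFood might look like the following. The text content is invented for this example, and the DTD equivalent in the comment is the usual XML 1.0 mixed-content form, assumed here rather than taken from this section.

```xml
<!-- Presumed DTD equivalent:
     <!ELEMENT Species (#PCDATA | CommonName | LatinName | PreferredFood)*> -->
<Species>The <CommonName>indigo bunting</CommonName>
(<LatinName>Passerina cyanea</LatinName>) eats
<PreferredFood>seeds</PreferredFood> and
<PreferredFood>insects</PreferredFood>.</Species>
```

Note that PreferredFood appears twice and that character data is interleaved freely with the elements, which is what the fixed ZeroOrMore frequency of the Mixed element permits.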

For information about the ElementNS attribute, see Section 2.3.4, "Reference Content Model."

2.3.6 Choice Content Model

The Choice content model allows for either-or inclusions of elements and groups of elements. The Choice content model represents groups of element content possibilities and must contain at least two sub-elements. Situations where only one element is needed should use the Ref content model instead of Choice. The Choice element may indicate a frequency, allowing the content model defined by the Choice element to appear exactly once, at most once, one or more times, or zero or more times.

<!-- A Choice must have two or more children -->
<!ELEMENT Choice ((Seq | Ref | Model), (Seq | Ref | Model)+)>
<!ATTLIST Choice
    ElementNS CDATA #IMPLIED
    id        ID    #IMPLIED
    Frequency (Required | Optional | ZeroOrMore | OneOrMore) 'Required'>

The simplest Choice element will contain two Ref elements and a frequency attribute. By default, the Choice element's content model is required to appear once.

To declare that a Species element may contain either a common name or a Latin name, but not both, use the following declaration:

<ElementDecl Name="Species">
  <Model>
    <Choice Frequency="Required">
      <Ref Element="CommonName"/>
      <Ref Element="LatinName"/>
    </Choice>
  </Model>
</ElementDecl>

The Ref elements in a Choice element may also specify the frequency with which they appear, as may the Seq elements described in Section 2.3.7, "Sequence Content Model". The Choice element is the equivalent of the choice group (element | element) in XML 1.0 DTDs. The ordering of the sub-elements within a Choice element has no effect.
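
A non-normative sketch combining these features follows: a repeatable Choice whose second alternative is a Seq containing an optional element. The element name Authority is an assumption made for this example. The DTD form in the comment is derived from the equivalences stated in this specification for Choice, Seq, and the Frequency values.

```xml
<ElementDecl Name="Species">
  <Model>
    <Choice Frequency="OneOrMore">
      <Ref Element="CommonName"/>
      <Seq>
        <Ref Element="LatinName"/>
        <Ref Element="Authority" Frequency="Optional"/>
      </Seq>
    </Choice>
  </Model>
</ElementDecl>
<!-- DTD equivalent: <!ELEMENT Species (CommonName | (LatinName, Authority?))+> -->
```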

For information about the ElementNS attribute, see Section 2.3.4, "Reference Content Model."

2.3.7 Sequence Content Model

The Sequence content model allows for the sequential appearance of sub-elements. Elements, if they are required to appear, must appear in the order of the Choice and Ref sub-elements in the Seq element. The Seq element may also indicate a frequency, allowing the content model defined by the Seq element to appear exactly once, at most once, one or more times, or zero or more times.

<!-- A Seq must have two or more children -->
<!ELEMENT Seq ((Choice | Ref | Model),(Choice | Ref | Model)+)>
<!ATTLIST Seq
    ElementNS CDATA #IMPLIED
    id        ID    #IMPLIED
    Frequency (Required | Optional | ZeroOrMore | OneOrMore) 'Required'>

The simplest Seq element will contain two Ref elements in the order in which they should appear and a frequency attribute. By default, the Seq element's content model is required to appear once.

To declare that the Species element requires a common name and a Latin name, in that order, use the following declaration:

<ElementDecl Name="Species">
  <Model>
    <Seq Frequency="Required">
      <Ref Element="CommonName"/>
      <Ref Element="LatinName"/>
    </Seq>
  </Model>
</ElementDecl>

The Ref elements in a Seq element may also specify the frequency with which they appear, as may the Choice elements. The Seq element is the equivalent of the sequence group (element, element) in XML 1.0 DTDs.
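
As a non-normative sketch, the following sequence assigns a different frequency to each Ref element and is followed by an instance that would satisfy it. The text content of the instance is invented for illustration.

```xml
<ElementDecl Name="Species">
  <Model>
    <Seq>
      <Ref Element="CommonName" Frequency="OneOrMore"/>
      <Ref Element="LatinName"/>
      <Ref Element="PreferredFood" Frequency="ZeroOrMore"/>
    </Seq>
  </Model>
</ElementDecl>
<!-- DTD equivalent: <!ELEMENT Species (CommonName+, LatinName, PreferredFood*)> -->

<!-- A conforming instance: two common names, one Latin name, no preferred food -->
<Species>
  <CommonName>indigo bunting</CommonName>
  <CommonName>indigo bird</CommonName>
  <LatinName>Passerina cyanea</LatinName>
</Species>
```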

For information about the ElementNS attribute, see Section 2.3.4, "Reference Content Model."

2.4 Attribute Declarations

Attributes are declared with AttDef elements. The name, type, and default value (if any) of an attribute are defined with attributes of the AttDef element, as well as whether the attribute is required. Values for enumerated types are provided by subelements.

AttGroup elements provide a container for multiple AttDef elements. If an AttGroup element is directly beneath an ElementDecl element, the contained AttDef elements apply to the element being declared. If an AttGroup element or AttDef element is directly beneath a DocumentDef element, the defined attributes are "global", and can only be used by reference.

<!ELEMENT AttGroup (Doc?, More?, (AttDef | AttGroup)*)>
<!ATTLIST AttGroup
    ns        CDATA   #IMPLIED
    prefix    NMTOKEN #IMPLIED
    id        ID      #IMPLIED>

<!ELEMENT AttDef (Doc?, More?, Enumeration?)>
<!ATTLIST AttDef
    Name      NMTOKEN      #REQUIRED
    ns        CDATA        #IMPLIED
    prefix    NMTOKEN      #IMPLIED
    Type      (CData    |
               ID       |
               IDRef    |
               IDRefs   |
               Entity   |
               Entities |
               Nmtoken  |
               Nmtokens |
               Notation |
               Enumerated) "CData"
    Required  (Yes | No)   "No"
    AttValue  CDATA        #IMPLIED
    id        ID           #IMPLIED>

<!ELEMENT Enumeration (Doc?, More?, EnumerationValue+)>
<!ATTLIST Enumeration
    id ID #IMPLIED>

<!ELEMENT EnumerationValue (Doc?, More?)>
<!ATTLIST EnumerationValue

    Value CDATA #REQUIRED>

The Name and ns attributes provide the name of the attribute and the URI of its namespace, respectively. The value of the Name attribute must match the NCName production in [Namespaces]; that is, it must begin with a letter or underscore and cannot include a colon. If an attribute uses the same namespace as the element to which it applies, its name must be unique within that element. If it uses a different namespace or does not apply to an element (that is, it is a global attribute), its name must be unique within the Global Attribute Partition of its namespace.

If an AttDef element is nested beneath an ElementDecl element, it applies to that element. The following example declares that the Species element has a Latin attribute:

<ElementDecl Name="Species">
  ...additionalElementInformation...
  <AttGroup>
    <AttDef Name="Latin" ...additionalInformation.../>
  </AttGroup>

</ElementDecl>

If an AttDef element is not nested beneath an ElementDecl element, it is a global attribute; a mechanism for using global attributes will be defined in a later version. The following example declares date, time, and location as global attributes:

<AttDef Name="date" ...additionalInformation.../>
...
<AttGroup>
  <AttDef Name="time" ...additionalInformation.../>
  <AttDef Name="location" ...additionalInformation.../>
</AttGroup>

The prefix attribute identifies the prefix that will be applied to the attribute during conversion to DTDs. For more information about prefixes and namespaces, see Section 3.2, "Namespaces of Elements and Attributes Being Defined."
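
The following non-normative sketch shows an attribute that overrides the namespace and prefix of the element to which it applies. The namespace URIs and the prefixes cat and i18n are hypothetical values chosen for this example. How the prefixes are applied during conversion is described in Section 3.2 and is not shown here.

```xml
<ElementDecl Name="Species" ns="http://example.org/catalog" prefix="cat">
  <Model>
    <Empty/>
  </Model>
  <AttGroup>
    <!-- Takes its namespace and prefix from the enclosing ElementDecl -->
    <AttDef Name="Latin"/>
    <!-- Overrides both values for this one attribute -->
    <AttDef Name="lang" ns="http://example.org/i18n" prefix="i18n"/>
  </AttGroup>
</ElementDecl>
```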

For information about the id attribute, see Section 2.8, "id Attributes".

By default, attributes are assumed to contain character data (CData), not be required, and have no default value. Thus, the simplest attribute declaration requires only an attribute name:

<AttDef Name="Latin"/>

2.4.1 Attribute Types

DDML 1.0 provides equivalents for all of the XML 1.0 DTD attribute types. All of them are declared using attribute values within the AttDef element.

The CData attribute type is one of the most common, permitting an attribute to contain character data as defined by the XML 1.0 specification. If the Species element were to contain an attribute providing the Latin name of the species, the declaration could look like the following. (The Type attribute could actually be omitted in this case, as CData is the default type.)

<ElementDecl Name="Species">
  ...additionalElementInformation...
  <AttGroup>
    <AttDef Name="Latin" Type="CData"/>
  </AttGroup>
</ElementDecl>

This attribute would then be available for use in instances of the Species element:

<Species Latin="Passerina cyanea">...additionalContent...</Species>

The ID attribute type is used to uniquely identify elements in a document for application processing. IDRef and IDRefs attribute types are used to refer to a single ID value in the same document or multiple ID values in the same document, separated by whitespace, respectively. These attribute declarations must be used with the same constraints as apply to ID, IDREF, and IDREFS attribute types in XML 1.0.
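
For illustration, a non-normative sketch of ID and IDRefs declarations follows, together with instances that use them. The attribute names code and relatedTo, and the values s1, s2, and s3, are assumptions made for this example.

```xml
<ElementDecl Name="Species">
  <Model>
    <Empty/>
  </Model>
  <AttGroup>
    <AttDef Name="code" Type="ID" Required="Yes"/>
    <AttDef Name="relatedTo" Type="IDRefs"/>
  </AttGroup>
</ElementDecl>

<!-- Instances: relatedTo lists, separated by whitespace, ID values found in the same document -->
<Species code="s1" relatedTo="s2 s3"/>
<Species code="s2"/>
<Species code="s3"/>
```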

The Entity and Entities attribute types identify the names of unparsed entities. The use of these attribute types must be made with the same constraints as apply to the ENTITY and ENTITIES attribute types in XML 1.0. The name of an unparsed entity identified by an Entity or Entities attribute must match the Name attribute of an UnparsedEntity element elsewhere in the DDML document.

The Nmtoken and Nmtokens attribute types are used to declare attributes that must contain information conforming to the Nmtoken and Nmtokens productions in XML 1.0.

The Notation and Enumerated attribute types are more complex, requiring an Enumeration subelement, which in turn contains EnumerationValue subelements, to identify their possible content. These two declarations use similar syntax, but the allowed values of Notation declarations must match the Notations declared elsewhere in the DDML document.

If the status attribute of the Species element were to allow the values of extinct, endangered, protected, and non-threatened, an appropriate enumerated type declaration would look like:

<ElementDecl Name="Species">
  ...additionalElementInformation...
  <AttGroup>
    <AttDef Name="status" Type="Enumerated">
      <Enumeration>
        <EnumerationValue Value="extinct"/>
        <EnumerationValue Value="endangered"/>
        <EnumerationValue Value="protected"/>
        <EnumerationValue Value="non-threatened"/>
      </Enumeration>
    </AttDef>
  </AttGroup>
</ElementDecl>

A Species element created conforming to this declaration might look like:

<Species status="extinct">...additionalContentAboutDodos...</Species>
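
A Notation attribute declaration follows the same pattern. In this non-normative sketch, the element name Photo, the attribute name format, and the notation names gif and jpeg are assumptions; the notations themselves would have to be declared elsewhere in the DDML document, as described in Section 2.5, "Notation Declarations".

```xml
<ElementDecl Name="Photo">
  <Model>
    <Empty/>
  </Model>
  <AttGroup>
    <AttDef Name="format" Type="Notation">
      <Enumeration>
        <!-- Each Value is assumed to name a notation declared elsewhere in the DDML document -->
        <EnumerationValue Value="gif"/>
        <EnumerationValue Value="jpeg"/>
      </Enumeration>
    </AttDef>
  </AttGroup>
</ElementDecl>
```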

2.4.2 Attribute Defaults

DDML requires attribute declarations to provide information about the default value of a given attribute. DDML provides for the four cases supported by XML 1.0: #REQUIRED, #IMPLIED, #FIXED AttValue, and AttValue, though they are expressed as choices between required and not required with an optional default value. There may be only one default value declaration per attribute.

Required attributes (identified in XML 1.0 by #REQUIRED) are identified by assigning the value "Yes" to the Required attribute of an AttDef element and not assigning a value to the AttValue attribute. For instance, if the Latin attribute described above was required by the Species element, the AttDef element would contain a Required attribute with a value of "Yes":

<ElementDecl Name="Species">
  ...additionalElementInformation...
  <AttGroup>
    <AttDef Name="Latin" Required="Yes"/>
  </AttGroup>
</ElementDecl>

Optional attributes (identified in XML 1.0 by #IMPLIED) are identified by assigning the value "No" to the Required attribute of an AttDef element and not assigning a value to the AttValue attribute. Implied indicates that there is no default value provided, and also that no value is required. If the Latin attribute is optional, the AttDef element would contain a "No" value for the Required attribute. (Note that this is the default status and the Required declaration does not need to be made explicitly.)

<ElementDecl Name="Species">
  ...additionalElementInformation...
  <AttGroup>
    <AttDef Name="Latin" Required="No"/>
  </AttGroup>
</ElementDecl>

Fixed attributes (identified in XML 1.0 by #FIXED AttValue) are identified through the use of the Required attribute in combination with the AttValue attribute, which must contain the fixed value for the attribute. Attributes declared as fixed can only contain the declared value for that attribute. Fixed effectively hard codes attribute values into particular elements. If the Required attribute has a value of "Yes", and the AttValue attribute is present, the attribute value should be treated as a #FIXED value in XML 1.0.

For example, to declare a planet attribute for the Species element, a Required attribute given the value of "Yes" would identify the fixed nature of the attribute and the AttValue attribute would provide the value.

<ElementDecl Name="Species">
  ...additionalElementInformation...
  <AttGroup>
    <AttDef Name="planet" Required="Yes" AttValue="Earth"/>
  </AttGroup>
</ElementDecl>

Attributes may also be provided with a default value that may be overridden by other declarations. These default values are identified through the use of the AttValue attribute. The status attribute of species elements described above would be an appropriate target for such a default value, especially if most species being described fell into a particular category:

<ElementDecl Name="Species">
  ...additionalElementInformation...
  <AttGroup>
    <AttDef Name="status" Type="Enumerated" AttValue="non-threatened">
      <Enumeration>
        <EnumerationValue Value="extinct"/>
        <EnumerationValue Value="endangered"/>
        <EnumerationValue Value="protected"/>
        <EnumerationValue Value="non-threatened"/>
      </Enumeration>
    </AttDef>
  </AttGroup>
</ElementDecl>

Any default (required, fixed, etc.) may be used with any attribute type, though default values must always correspond to acceptable values for the attribute type.

2.4.3 Combinations of Types, Defaults, and Default Values

This notation also permits the declaration of certain attributes (IDs with defaults, for instance) that are prohibited by the standard XML 1.0 DTD syntax. Developers who use these combinations should test that their documents behave as expected in DTD-only environments as well as DDML environments. Document instances that use such attributes may need additional processing before they can be used with a DTD. In DDML-to-DTD conversion, the attribute type should always be considered more important than its default values.

The table below summarizes the possible combinations of DDML attribute defaults and their XML 1.0 DTD equivalents.

Required    AttValue    XML 1.0 Equivalent

Yes         <value>     #FIXED <value>
Yes         --          #REQUIRED
No          <value>     AttValue
No          --          #IMPLIED

(-- indicates an undeclared value)
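
The mapping in the table can be expressed as a small function. The following Python sketch is illustrative only: the name xml_default_decl is hypothetical, and it assumes the attribute value contains no characters that would need escaping in a DTD (such as < or &).

```python
def xml_default_decl(required, att_value=None):
    """Map a DDML Required/AttValue pair to an XML 1.0 default declaration.

    required  -- the value of the Required attribute, "Yes" or "No"
    att_value -- the value of the AttValue attribute, or None if undeclared
    """
    if required not in ("Yes", "No"):
        raise ValueError("Required must be 'Yes' or 'No'")
    if att_value is None:
        return "#REQUIRED" if required == "Yes" else "#IMPLIED"
    if '"' in att_value and "'" in att_value:
        raise ValueError("value with both quote characters is not handled here")
    # Use whichever quote character does not occur in the value.
    quote = "'" if '"' in att_value else '"'
    literal = quote + att_value + quote
    return "#FIXED " + literal if required == "Yes" else literal


fixed = xml_default_decl("Yes", "Earth")
required = xml_default_decl("Yes")
default = xml_default_decl("No", "non-threatened")
implied = xml_default_decl("No")
```

A DDML-to-DTD converter would still have to deal with combinations that DTDs prohibit, such as ID attributes with defaults; this sketch does not attempt that.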

2.5 Notation Declarations

Notation declarations are made with Notation elements nested in the DocumentDef element.

<!ELEMENT Notation (Doc?, More?)>
<!ATTLIST Notation
    Name          NMTOKEN #REQUIRED
    PubidLiteral  CDATA   #IMPLIED
    SystemLiteral CDATA   #IMPLIED
    id            ID      #IMPLIED>

The Name attribute provides the name of the notation. It must match the Name production in the XML 1.0 specification.

Notations may include a public identifier or a system literal, or both. DDML processors should ignore Notation elements that contain neither. Public identifiers and system literals should conform to the rules in Section 4.7 of the XML 1.0 Specification.

For information about the id attribute, see Section 2.8, "id Attributes".

2.6 Unparsed Entity Declarations

Unparsed entities are declared with UnparsedEntity elements nested in the DocumentDef element.

<!ELEMENT UnparsedEntity (Doc?, More?)>
<!ATTLIST UnparsedEntity
    Name          NMTOKEN #REQUIRED
    SystemLiteral CDATA   #REQUIRED
    PubidLiteral  CDATA   #IMPLIED
    Notation      NMTOKEN #REQUIRED
    id            ID      #IMPLIED>

The Name attribute provides the name of the unparsed entity. It must match the Name production in the XML 1.0 specification and must be unique within the set of unparsed entities defined in the DDML document. The Notation attribute provides the name of a notation that gives the format of the unparsed entity. It must match the Name production in the XML 1.0 specification and must also match the Name attribute of a Notation element elsewhere in the DDML document.

UnparsedEntity elements must include a system literal and may include a public identifier. Public identifiers and system literals should conform to the rules in Section 4.7 of the XML 1.0 Specification.

For information about the id attribute, see Section 2.8, "id Attributes".
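
The requirement that the Notation attribute of an UnparsedEntity element match the Name attribute of a Notation element can be checked mechanically. The Python sketch below is one possible approach, not a required algorithm. It assumes unprefixed element names, and the function name undeclared_notations and the sample entity and notation names are hypothetical.

```python
import xml.etree.ElementTree as ET

# A hypothetical DDML fragment: "logo" refers to a declared notation,
# "photo" does not.
SAMPLE = """
<DocumentDef>
  <Notation Name="gif" SystemLiteral="viewer.exe"/>
  <UnparsedEntity Name="logo" SystemLiteral="logo.gif" Notation="gif"/>
  <UnparsedEntity Name="photo" SystemLiteral="photo.jpg" Notation="jpeg"/>
</DocumentDef>
"""


def undeclared_notations(document_def):
    """Return (entity name, notation name) pairs for every UnparsedEntity
    whose Notation attribute matches no Notation element in the document."""
    declared = {n.get("Name") for n in document_def.iter("Notation")}
    return [
        (entity.get("Name"), entity.get("Notation"))
        for entity in document_def.iter("UnparsedEntity")
        if entity.get("Notation") not in declared
    ]


problems = undeclared_notations(ET.fromstring(SAMPLE))
```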

2.7 DDML Extensions

DDML provides areas in which DDML developers can provide supplemental information and metadata regarding DDML components in both human- and machine-readable formats. Human-readable information is provided through the use of a subset of HTML that conforms to XML syntax, while machine-readable information may be provided through the DDML:More element.

2.7.1 Documentation Extensions

Human-readable documentation for DDML documents should be provided using the Itsy Bitsy Teeny Weeny Simple Hypertext [IBTWSH]. This is an XML DTD which describes a subset of HTML 4.0 for embedded use within other XML DTDs. It is equivalent (within its scope) to -//W3C//DTD HTML 4.0 Transitional//EN. Documentation that uses portions of the IBTWSH format may be included in the DDML:Doc element, a subelement available to all declarations. The DDML:Doc element provides basic formatting options for DDML documentation.

<!ENTITY % ibtwsh SYSTEM "http://www.ccil.org/~cowan/XML/ibtwsh.dtd">
%ibtwsh;
<!ELEMENT DDML:Doc %struct.model;>
<!ATTLIST DDML:Doc
    xmlns CDATA #FIXED "">

Note that because DDML:Doc redefines the default namespace to support IBTWSH, the DDML: prefix must be used for DDML:Doc. Any element allowed in the IBTWSH struct.model set of elements (A, ABBR, ACRONYM, ADDRESS, BIG, BLOCKQUOTE, BR, CITE, CODE, DFN, DIR, DIV, DL, EM, H1, H2, H3, HR, KBD, OL, P, PRE, SAMP, SMALL, SPAN, STRONG, UL, VAR, XML) may be used in the DDML:Doc element. To preserve compatibility with HTML, IBTWSH does not use namespaces.

DDML applications should ignore all DDML declarations (i.e., elements prefixed with DDML: or another appropriate DDML prefix) within a DDML:Doc element. (The XML element of IBTWSH allows an ANY content model.)

2.7.2 Other Extensions

The DDML:More element provides an area which developers can use to create their own supplements to DDML, defining content types more tightly than is possible through DDML 1.0. The DDML:More element has a simple ANY content model, though DDML processors should ignore the appearance of any elements from the DDML namespace in this area.

<!ELEMENT DDML:More ANY>
<!ATTLIST DDML:More
    xmlns CDATA "">

Because DDML:More redefines the default namespace, the DDML: prefix must be used for DDML:More. Developers may override the blank value of the xmlns attribute to define their own default namespace for elements contained in the DDML:More element.

2.8 id Attributes

All DDML elements except EnumerationValue, More, and Doc have an optional id attribute. These attributes, if they appear, must have a unique value within the document. They have no defined use in DDML 1.0, but are included so that future extensions (possibly involving XLink) can uniquely identify elements in a DDML document.

3 DDML and Namespaces

DDML uses namespaces for its own operations and also supports schemas that take advantage of namespace facilities. DDML processors are responsible only for elements that use the DDML namespace appropriate to the version of DDML they are processing. Elements in other namespaces may be used in the DDML:Doc and DDML:More elements and passed to other applications as the processor deems appropriate.

DDML documents can be used by namespace-unaware applications provided the following conditions are met:

Note: This section is subject to change, even after the DDML specification is otherwise finalized. For more information, see Section 1.2, "Relation to Standards."

3.1 The DDML Namespace

The namespace for DDML 1.0 is built into the DDML DTD as the fixed default value of the xmlns and xmlns:DDML attributes of the DocumentDef element. The URL of the DDML namespace is a PURL (permanent URL) provided by the OCLC. PURLs use redirection to maintain a permanent address for sites that may change address. (For more information, see http://www.purl.org.) While DDML specification information may be stored at the location to which the PURL server redirects visitors, DDML applications should not rely on any of that information being there.

DDML:Doc and DDML:More must use the DDML: prefix because they declare other values for the default namespace. All other DDML elements may use the DDML: prefix if desired, but are not required to do so.

3.2 Namespaces of Elements and Attributes Being Defined

Each element or attribute defined in a DDML document can belong to its own namespace. The URI of this namespace is provided by the ns attribute and any documents that contain the element or attribute must use the same URI. For example, if the Species element is part of the http://www.taxonomy namespace, a DDML document might contain the following declaration:

<DocumentDef ns="http://www.taxonomy">
  <ElementDecl Name="Species">
    ...additionalElementInformation...
  </ElementDecl>
  ...
</DocumentDef>

The document that uses the Species element might contain:

<TAXON:Species xmlns:TAXON="http://www.taxonomy">
  ...additionalElementContent...
</TAXON:Species>

The ns attribute occurs on the DocumentDef, AttGroup, ElementDecl, and AttDef elements. Its value is inherited by lower-level elements, which can override it. In the simplest case, only the root DocumentDef element has an ns attribute. It is strongly recommended that the ns attribute be used only on the DocumentDef element.

Ref elements refer to an element defined elsewhere in the DDML document and which may belong to a different namespace. The name and namespace of the referenced element are provided with the Element and ElementNS attributes, respectively. The ElementNS attribute occurs on the DocumentDef, Mixed, Choice, and Seq elements, as well as the Ref element. Its value is inherited by lower-level elements, which can override it. In the simplest case, only the root DocumentDef element has an ElementNS attribute.

The following example shows how the Species element, defined in the http://www.taxonomy namespace, is referenced in the content model of the Animal element, defined in the http://www.zooinventory namespace.

<DocumentDef>
  <DocumentDef ns="http://www.taxonomy">
    <ElementDecl Name="Species">
      ...additionalElementInformation...
    </ElementDecl>
    ...
  </DocumentDef>
  <DocumentDef ns="http://www.zooinventory">
    <ElementDecl Name="Animal">
      <Model>
        <Seq>
          <Ref Element="Species" ElementNS="http://www.taxonomy" />
          <Ref Element="Quantity" />
          ...additionalReferences...
        </Seq>
      </Model>
    </ElementDecl>
  </DocumentDef>
</DocumentDef>

If no ns or ElementNS attribute applies to an element or attribute being defined or referenced, then that element or attribute is not considered to belong to any particular namespace. In particular, the element or attribute does not belong to the DDML namespace, nor does it belong to the current default namespace of any documents in which it is used, assuming a default namespace is defined.
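
The inheritance rules for the ns and ElementNS attributes can be sketched as a recursive walk over the DDML document. The Python code below is a literal reading of the rules in this section, offered as an illustration rather than a reference implementation. It assumes unprefixed element names; the function name effective_namespaces is hypothetical. A result of None means that no ns or ElementNS attribute applies.

```python
import xml.etree.ElementTree as ET

ZOO = """
<DocumentDef>
  <DocumentDef ns="http://www.taxonomy">
    <ElementDecl Name="Species"><Model><Any/></Model></ElementDecl>
  </DocumentDef>
  <DocumentDef ns="http://www.zooinventory">
    <ElementDecl Name="Animal">
      <Model>
        <Seq>
          <Ref Element="Species" ElementNS="http://www.taxonomy"/>
          <Ref Element="Quantity"/>
        </Seq>
      </Model>
    </ElementDecl>
  </DocumentDef>
</DocumentDef>
"""


def effective_namespaces(root):
    """Return (kind, name, namespace) for each ElementDecl and Ref,
    in document order, applying inheritance from enclosing elements."""
    results = []

    def walk(elem, ns, element_ns):
        # A local attribute overrides the inherited value.
        ns = elem.get("ns", ns)
        element_ns = elem.get("ElementNS", element_ns)
        if elem.tag == "ElementDecl":
            results.append(("ElementDecl", elem.get("Name"), ns))
        elif elem.tag == "Ref":
            results.append(("Ref", elem.get("Element"), element_ns))
        for child in elem:
            walk(child, ns, element_ns)

    walk(root, None, None)
    return results


resolved = effective_namespaces(ET.fromstring(ZOO))
```

Under this reading, the reference to Quantity carries no namespace, because no ElementNS attribute applies to it even though the enclosing DocumentDef has an ns attribute.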

For conversion to and from DTDs, DDML provides prefix attributes, which declare the namespace prefixes used in element and attribute declarations in DTDs. This allows documents and their associated DDML documents to track the same namespace using different prefixes if necessary. DDML-to-DTD converters should use the prefix attribute of a DocumentDef, ElementDecl, AttGroup, or AttDef element when creating DTD element and attribute declarations. DTD-to-DDML converters should use the prefixes assigned in the DTD and request further information about the 'real' namespace for use in the ns attribute. This may be accomplished by parsing a sample document instance, or by direct input from the person doing the conversion.

4 DDML Documents and DTDs

A DDML document is related to two different DTDs: the DTD of the DDML document itself and the DTD of the document described by the DDML document. This section discusses the relationship of DDML documents to these DTDs and describes what conversions are possible between the DDML document and the latter DTD. There is no requirement that either DTD actually exist.

4.1 DTDs in DDML Documents

A DDML document may include a DTD as an internal subset, external subset, or both. If included, the Name in the DOCTYPE statement must be DocumentDef and the DTD must include all of the markup declarations in Appendix C, "DDML DTD." It may also include additional markup declarations, such as declarations of elements to be used under the More element. However, these declarations must not override any of the declarations from Appendix C.

The main reason to include a DTD in a DDML document is so a DDML-unaware XML parser can supply default attribute values and determine the system and public identifiers of notations and unparsed general entities. Default attribute values are used in the DDML DTD defined in Appendix C. Notations and unparsed general entities can be used by user-defined elements under the More element.

Secondary reasons for including a DTD in a DDML document are to declare parsed entities (see Section 5.3.1, "Parsed Entities in DDML Documents") and to allow the document to be validated by DDML-unaware software.

4.2 DTDs in Documents Described by DDML Documents

A document described by a DDML document may include a DTD as well as processing instructions that refer to DDML documents (see Section 5.1.1, "DDML Processing Instruction"). This DTD can describe the same information as the DDML documents as well as additional information.

Reasons to include a DTD in a document described by a DDML document include:

If an XML document includes both a DTD and processing instructions that refer to DDML documents, it is the responsibility of the document author to ensure that the information common to both is the same. If the common information is different, it might not be possible to use the document with both DDML-aware and -unaware software. For example, it might not be possible to validate the document against both the DTD and the DDML documents.

If a DDML processor is built on top of an XML parser, the DDML processor is not required to process the DTD of the XML document. If a DDML processor also functions as an XML parser, it is required to process the DTD only to the extent required of a non-validating parser.

4.3 Converting Between DDML Documents and DTDs

Schema information can be converted between DDML documents and DTDs, although some information may be lost. Most logical information (such as element and attribute declarations) can be converted from DTDs to DDML documents, while some logical information (such as attribute declarations not assigned to elements) cannot be converted from DDML documents to DTDs. In general, physical information (such as parsed entity declarations and use, the order of declarations, and the distribution of declarations among different files) either cannot be converted or is converted only at the option of the converter.

Converters may include as many or as few comments in the output document as they choose and may place these at any (legal) locations they choose. In particular, converters are not limited to converting between DDML Doc elements and XML comments. For example, a converter might place the entire input document in one or more comments in the output document for documentation's sake or it might generate comments noting which structures it does not convert.

4.3.1 Converting DTDs to DDML Documents

The following DTD structures must be converted to the corresponding DDML structures:

The following DTD structures may be converted to the corresponding DDML structures or discarded:

The following DTD structures cannot be converted to DDML structures because such structures do not exist:

4.3.2 Converting DDML Documents to DTDs

The following DDML structures must be converted to the corresponding DTD structures:

The following DDML structures may be converted to the corresponding DTD structures or discarded:

The following DDML structures cannot be converted to DTD structures because such structures do not exist:

5 Using DDML Documents

This section describes how to associate DDML documents with XML documents and suggests ways to use DDML documents.

5.1 Associating DDML Documents with XML Documents

A DDML document can define a class of XML documents in the same way a DTD defines a class of XML documents. A document declares that it conforms to a class by including the DDML processing instruction. A document fragment can declare that it conforms to a class by including a nested DDML element; this latter usage is experimental.

5.1.1 DDML Processing Instruction

The DDML processing instruction is similar to the SYSTEM declaration in a DOCTYPE statement. It states that the document conforms to the class of documents described by the DDML document. The processing instruction has the following form:

[1] DDMLPI ::= '<?DDML' S SystemID PubID? Version? S? '?>'

[2] SystemID ::= 'System' Eq SystemLiteral

[3] PubID ::= S 'Public' Eq PubidLiteral

[4] Version ::= S 'Version' Eq ("'" VersionNum "'" | '"' VersionNum '"')

where the productions S, Eq, SystemLiteral, PubidLiteral, and VersionNum are the same as in [XML]. VersionNum describes the version number of the DDML document and must match the value of the Version attribute on the root DocumentDef element of the document. The rules for retrieving the DDML document are the same as those for retrieving external entities, as described in Section 4.2.2, "External Entities," of [XML], except that a DDML processor may choose not to retrieve the document if the version number specified by VersionNum is different from the version of DDML supported by the processor.
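
Productions [1] through [4] can be approximated with a regular expression. The Python sketch below is one possible transcription, assuming the definitions of S, Eq, SystemLiteral, PubidLiteral, and VersionNum from XML 1.0; the function name parse_ddml_pi and the helper quoted are hypothetical. It operates on the text of a single processing instruction and does not retrieve the DDML document.

```python
import re

S = r"[ \t\r\n]+"                   # production S in [XML]
EQ = r"[ \t\r\n]*=[ \t\r\n]*"       # production Eq in [XML]
VERSION_NUM = r"[a-zA-Z0-9_.:\-]+"  # production VersionNum in [XML]
# PubidChar from [XML], without the apostrophe.
PUBID = r"a-zA-Z0-9 \r\n\-()+,./:=?;!*#@$_%"


def quoted(name, body_dq, body_sq):
    """Build a pattern for a literal in double or single quotes."""
    return f'(?:"(?P<{name}_dq>{body_dq})"|\'(?P<{name}_sq>{body_sq})\')'


DDML_PI = re.compile(
    r"<\?DDML" + S
    + "System" + EQ + quoted("system", r'[^"]*', r"[^']*")
    + "(?:" + S + "Public" + EQ
    + quoted("public", "[" + PUBID + "']*", "[" + PUBID + "]*") + ")?"
    + "(?:" + S + "Version" + EQ
    + quoted("version", VERSION_NUM, VERSION_NUM) + ")?"
    + r"[ \t\r\n]*\?>"
)


def parse_ddml_pi(text):
    """Return the System, Public, and Version values of a DDML processing
    instruction, or None if text does not match production [1]."""
    match = DDML_PI.fullmatch(text)
    if match is None:
        return None

    def pick(name):
        value = match.group(name + "_dq")
        return value if value is not None else match.group(name + "_sq")

    return {"System": pick("system"),
            "Public": pick("public"),
            "Version": pick("version")}


with_version = parse_ddml_pi('<?DDML System="letter.ddm" Version="1.0" ?>')
with_public = parse_ddml_pi("<?DDML System='a.ddm' Public=\"-//X//DTD Y//EN\"?>")
wrong_keyword = parse_ddml_pi('<?DDML SystemID="names.ddm" ?>')
```

The last call returns None because production [2] requires the keyword System.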

A DDML processing instruction must occur before the root element to be used; any DDML processing instructions that occur after the root element will be ignored.

An XML document may include multiple DDML processing instructions. The effect is as if a superior root DocumentDef element contains the root DocumentDef element of each DDML document. This allows a document to conform to elements in many existing DDML documents. For more information, see Section 5.3.5, "Reusing Element Declarations with Entities or Processing Instructions."
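
The effect of multiple processing instructions can be pictured as follows. This Python sketch assumes the DDML documents have already been retrieved as strings and use unprefixed element names. The function name combine_ddml and the contents of the two sample documents are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical, abbreviated contents of two DDML documents.
NAMES = (
    '<DocumentDef><ElementDecl Name="FullName">'
    "<Model><Any/></Model></ElementDecl></DocumentDef>"
)
LETTER = (
    '<DocumentDef><ElementDecl Name="Letter">'
    '<Model><Ref Element="FullName"/></Model></ElementDecl></DocumentDef>'
)


def combine_ddml(documents):
    """Nest the root DocumentDef element of each DDML document under one
    superior DocumentDef element, in the order given."""
    superior = ET.Element("DocumentDef")
    for text in documents:
        root = ET.fromstring(text)
        if root.tag != "DocumentDef":
            raise ValueError("root element must be DocumentDef, found " + root.tag)
        superior.append(root)
    return superior


combined = combine_ddml([NAMES, LETTER])
declared = [d.get("Name") for d in combined.iter("ElementDecl")]
# True when every Ref names an element declared somewhere in the combination.
all_resolved = all(r.get("Element") in declared for r in combined.iter("Ref"))
```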

5.1.2 Inline DDML Elements (Non-Normative)

NOTE: Inline DDML elements are considered experimental and may change in the future.

In some applications it is useful to repeatedly change the schema of the XML document at run time. For example, consider a system that continuously logs data in XML format. From an XML standpoint, it is as if a root element was started when the system was started, all incoming information is nested beneath the root element, and the root element ends only when the system stops. For practical purposes, the root element might not actually exist.

If the system logs information from different sources, the format (schema) of the nested elements might be different for each source. DDML elements can be interspersed in this stream to describe the format of following information:

<Root>
  <DocumentDef>...schema #1...</DocumentDef>
  ...log information that conforms to schema #1...
  <DocumentDef>...schema #2...</DocumentDef>
  ...log information that conforms to schema #2...
  ...
</Root>

Because such use is not well-defined today, DDML processors that use inline DDML elements should follow these rules for the greatest chance of forward compatibility:

5.2 Validation

A DDML processor can validate an XML document against a DDML document; such a processor is called a validator. Because DDML does not support parsed entity declarations (see Section 5.3.2, "Entity Support in DDML"), this validation is slightly less comprehensive than that defined in [XML]. Validators must enforce all Validity Constraints in [XML] except:

If the instance document contains a DOCTYPE statement and the validator can discover the root element type declared there (for example, the validator parses the document itself or the parser informs it of the root element type), then the validator must enforce the Root Element Type Validity Constraint as well.

Validators, like all DDML processors, are not required to parse XML documents.

5.3 Suggested Uses of DDML Documents

(Non-Normative)

The following sections suggest possible uses of DDML documents. They are not binding on DDML processors or documents.

5.3.1 Parsed Entities in DDML Documents

Parsed general entities are used in DDML documents for the same reasons they are used in XML documents: to distribute documents across multiple files, to enable multiple character encodings, to act as text substitution macros, and so on. They can also be used in a manner similar to parameter entities in a DTD. For example, suppose the DTD contains the following declaration:

<!ENTITY latinattribute "<AttDef Name='Latin' Type='CData' Required='No'/>">

This can be used in the content of the DDML document to add the Latin attribute to an element:

<ElementDecl Name="Species">
  ...additionalElementInformation...
  <AttGroup>
    &latinattribute;
  </AttGroup>
</ElementDecl>

Because parameter entities are used only in the DTD, they offer no special advantages to DDML documents.

5.3.2 Entity Support in DDML

DDML does not directly support parsed entities. That is, there is no DDML element that can be used to define a parsed entity. The reason for this is that parsed entities are physical in nature. They describe how to physically construct a document: how to distribute a document across multiple files and what text substitutions to perform. This is orthogonal to the function of a schema, which is logical in nature: it describes the legal content of a document and how that content can be arranged. That a DTD can describe both the logical and physical structure of a document was viewed as a historical accident that DDML chose not to perpetuate.

One benefit of this is that it reduces the amount of information that must be duplicated in a DTD. Because DDML processors are expected to be built on top of DDML-unaware parsers, any parsed entity declarations that might otherwise have been in a DDML document would also have been required in the DTD; without them, DDML-unaware parsers could not parse the instance documents. Such duplication of declarations is prone to error.

5.3.3 DTD Replacement

As was noted in Section 5.1, a DDML document can define a class of XML documents. In this respect, it fulfills some of the functions of a DTD. That is, a DDML processor can validate an XML document against a DDML document and a DDML-aware XML parser can retrieve information about the XML document, such as default attribute values and the system and public identifiers of notations and unparsed general entities.

5.3.4 Schema Repository

DDML documents are not required to define a particular class of XML documents. For example, a DDML document might consist of nothing but attribute definitions. In this manner, a DDML document can function as a repository for schema definitions, which can then be reused by other DDML documents. Note that while a DDML document that defines a class of XML documents can always act as a repository, the converse is not always true.

5.3.5 Reusing Element Declarations with Entities or Processing Instructions

Element declarations in one DDML document can be reused by referring to them in a Ref element in a second DDML document. For example, suppose a DDML repository defines a FullName element:

<ElementDecl Name="FullName">
  <Model>
    <Seq>
      <Ref Element="LastName"/>
      <Ref Element="FirstName"/>
      <Ref Element="MiddleName" Frequency="ZeroOrMore"/>
    </Seq>
  </Model>
</ElementDecl>

The DDML document that describes Letter documents might include FullName by reference, where the first instance is the author of the letter and the second instance is the recipient:

<ElementDecl Name="Letter">
  <Model>
    <Seq>
      <Ref Element="FullName"/>
      <Ref Element="FullName"/>
      <Ref Element="Paragraph" Frequency="OneOrMore"/>
    </Seq>
  </Model>
</ElementDecl>

The referenced declaration can be resolved in one of two ways. First, the second DDML document can include the first, either by cutting and pasting or through an external parsed general entity. For example:

<!DOCTYPE DocumentDef [
<!ENTITY nameRepository SYSTEM "names.ddm">
]>
<DocumentDef>
  &nameRepository;
  ... other declarations ...
</DocumentDef>

Second, a Letter (instance) document can include processing instructions that point to both DDML documents. For example:

<!DOCTYPE Letter>
<?DDML System="names.ddm" ?>
<?DDML System="letter.ddm" ?>
<Letter>
  ...
</Letter>

Note that including a DDML processing instruction in letter.ddm that points to names.ddm will not have the intended effect. Rather than including names.ddm, this processing instruction states that letter.ddm (a DDML document) conforms to the elements declared in names.ddm. This is unlikely to be true.

5.3.6 Reusing Schema Definitions through XLinks

In the future, it should be possible to reuse schema definitions in a DDML document through XLinks. Although the exact manner in which this works cannot be determined until the XLink and XPointer specifications are complete, the example from section 5.3.5 might be performed as follows:

<ElementDecl Name="Letter">
  <Model>
    <Seq>
      <Ref Element="FullName"/>
      <Ref Element="FullName"/>
      <Ref Element="Paragraph" Frequency="OneOrMore"/>
    </Seq>
  </Model>
</ElementDecl>

<ElementDecl xml:link="simple"
             href="names.ddm#id(FullName)"
             inline="true"
             show="replace"/>

The second ElementDecl element points to, and is replaced by, the ElementDecl element for the FullName element in names.ddm. This eliminates the need to include names.ddm through cut-and-paste, an entity, or a processing instruction.

DDML has been designed with such linking in mind. It is partially because of this that the container elements AttGroup, Enumeration, and Model exist and can be directly or indirectly nested inside themselves. For example, a new AttGroup might be constructed by nesting multiple AttGroup elements inside it, each of which contains an XLink to an AttGroup in a different DDML document.

5.3.7 Authoring

DDML documents support authoring tools (editors) by providing human-readable documentation and a template for legal document structures.

A typical editing session using a DDML-aware editor might proceed as follows:

  1. The editor displays a list of available DDML documents. The user chooses a DDML document to use as a template.
  2. The editor reads the chosen document and displays a list of elements for which the Root attribute has a value of Recommended. The user chooses a starting element.
  3. The editor prompts the user for element content and attributes based on the information in the DDML document. When the user requests help about a particular structure, the editor retrieves it from the corresponding Doc element.
  4. When the user is done, the editor saves the new document, using the file extension specified by the FileExtension attribute of the root DocumentDef element. At the start of this document, the editor inserts a DOCTYPE statement declaring the root element type and a DDML processing instruction declaring the template DDML document.
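
Step 2 of this session could be implemented along the following lines. The Python sketch assumes unprefixed element names and applies the default Root value of "Possible" from Appendix C itself, since a parser that does not read the DDML DTD will not supply it. The function name recommended_roots and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET

# A hypothetical, abbreviated DDML document used as a template.
TEMPLATE = """
<DocumentDef>
  <ElementDecl Name="Letter" Root="Recommended"><Model><Any/></Model></ElementDecl>
  <ElementDecl Name="FullName"><Model><Any/></Model></ElementDecl>
  <ElementDecl Name="Paragraph" Root="Unlikely"><Model><PCData/></Model></ElementDecl>
</DocumentDef>
"""


def recommended_roots(document_def):
    """Return the names of the elements whose Root attribute is Recommended."""
    return [
        decl.get("Name")
        for decl in document_def.iter("ElementDecl")
        if decl.get("Root", "Possible") == "Recommended"
    ]


choices = recommended_roots(ET.fromstring(TEMPLATE))
```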

An editor can also support schema building and modification. For example, it might allow the user to construct a new DDML document from elements in existing DDML documents or add new elements to existing DDML documents.

5.3.8 General Schema Information

Because DDML documents can contain information about a class of documents, they can be used by tools that work with (as opposed to on) these documents. For example, a database tool might read a DDML document and construct a database schema or a programming tool might read a DDML document and create Java classes for each element. DDML documents can also be used as starting points for search engines, which can use them to construct query-by-example interfaces.

5.3.9 Custom Uses

The More element in DDML provides a way for users to customize their DDML documents. For example, subelements of the More element might be used to assign the data type (integer, date, string, etc.) of PCData elements or associate Java classes with elements.

NOTE: There are a number of existing proposals for data types in XML and it is hoped that the W3C (and therefore DDML) will adopt one of these in the future. For example, see [DCD] or [SOX].

Appendix A: References

[DCD]
Tim Bray, Charles Frankston, and Ashok Malhotra. Document Content Description for XML. 31 July 1998. See http://www.w3.org/TR/NOTE-dcd.
[IBTWSH]
John Cowan. Itsy Bitsy Teeny Weeny Simple Hypertext. See http://www.ccil.org/~cowan/XML/ibtwsh.dtd.
[Namespaces]
Tim Bray, Dave Hollander, and Andrew Layman. Namespaces in XML. 17 Nov 1998. See http://www.w3.org/TR/1998/PR-xml-names-19981117.
[RDF Schemas]
Dan Brickley, R. V. Guha, and Andrew Layman. Resource Description Framework (RDF) Schema Specification. 14 August, 1998. See http://www.w3.org/TR/WD-rdf-schema.
[RFC 2046]
IETF (Internet Engineering Task Force). RFC 2046: Multipurpose Internet Mail Extensions (MIME) Part Two: Media Types, ed. N. Freed and N. Borenstein. November, 1996. See http://www.isi.edu/in-notes/rfc2046.txt.
[RFC 2048]
IETF (Internet Engineering Task Force). RFC 2048: Multipurpose Internet Mail Extensions (MIME) Part Four: Registration Procedures, ed. N. Freed, J. Klensin, and J. Postel. November, 1996. See http://www.isi.edu/in-notes/rfc2048.txt.
[RFC 2119]
IETF (Internet Engineering Task Force). RFC 2119: Key words for use in RFCs to Indicate Requirement Levels, ed. Scott Bradner. 1997. See http://www.isi.edu/in-notes/rfc2119.txt.
[RFC 2376]
IETF (Internet Engineering Task Force). RFC 2376: XML Media Types, ed. E.J.Whitehead and Murata Makoto. July, 1998. See http://www.isi.edu/in-notes/rfc2376.txt.
[SOX]
Matt Fuchs, Murray Maloney, and Alex Milowski. Schema for Object-oriented XML. 1998. See http://www.w3.org/TR/NOTE-SOX.
[XML]
Tim Bray, Jean Paoli, and C.M. Sperberg-McQueen. Extensible Markup Language (XML) 1.0. 1998. See http://www.w3.org/TR/REC-xml.
[XML-Data]
Andrew Layman, et al. XML-Data. 5 Jan 1998. See http://www.w3.org/TR/1998/NOTE-XML-data.
[XML-DEV]
XML-DEV Mailing List, archived at http://www.lists.ic.ac.uk/hypermail/xml-dev/.
[XLink]
Eve Maler and Steve DeRose. XML Linking Language (XLink). 1998. See http://www.w3.org/TR/WD-xlink.
[XPointer]
Eve Maler and Steve DeRose. XML Pointer Language (XPointer). 1998. See http://www.w3.org/TR/WD-xptr.

Appendix B: Differences between DDML and XSchema

There are only two technical differences between DDML and XSchema:

Appendix C: DDML DTD

<!ELEMENT DocumentDef (Doc?, More?, (ElementDecl | Model | AttDef | AttGroup | Notation | UnparsedEntity | Enumeration | DocumentDef)*)>
<!ATTLIST DocumentDef
    xmlns         CDATA   #FIXED   "http://www.purl.org/NET/ddml/v1"
    xmlns:DDML    CDATA   #FIXED   "http://www.purl.org/NET/ddml/v1"
    ns            CDATA   #IMPLIED
    ElementNS     CDATA   #IMPLIED
    prefix        NMTOKEN #IMPLIED
    Version       CDATA   #FIXED   "1.0"
    MimeType      CDATA            "application/xml"
    FileExtension CDATA            "xml"
    id            ID      #IMPLIED>

<!ELEMENT ElementDecl (Doc?, More?, Model, AttGroup?)>
<!-- Name is the element name -->
<!ATTLIST ElementDecl
    Name   NMTOKEN #REQUIRED
    ns     CDATA   #IMPLIED
    prefix NMTOKEN #IMPLIED
    id     ID      #IMPLIED
    Root   (Recommended | Possible | Unlikely) "Possible">

<!ELEMENT Model (Doc?, More?, (Ref | Choice | Seq | Empty | Any | PCData | Mixed))>
<!ATTLIST Model
    id ID #IMPLIED>

<!ELEMENT Empty EMPTY>
<!ATTLIST Empty
    id ID #IMPLIED>

<!ELEMENT Any EMPTY>
<!ATTLIST Any
    id ID #IMPLIED>

<!ELEMENT PCData EMPTY>
<!ATTLIST PCData
    id ID #IMPLIED>

<!ELEMENT Ref EMPTY>
<!-- Element references the name in an ElementDecl element -->
<!ATTLIST Ref
    Element       NMTOKEN #REQUIRED
    ElementNS     CDATA   #IMPLIED
    id            ID      #IMPLIED
    Frequency (Required | Optional | ZeroOrMore | OneOrMore) 'Required'>

<!ELEMENT Mixed (Ref+)>
<!ATTLIST Mixed
    ElementNS CDATA        #IMPLIED
    id        ID           #IMPLIED
    Frequency (ZeroOrMore) #FIXED   "ZeroOrMore">

<!-- A Choice must have two or more children -->
<!ELEMENT Choice ((Seq | Ref | Model), (Seq | Ref | Model)+)>
<!ATTLIST Choice
    ElementNS CDATA #IMPLIED
    id        ID    #IMPLIED
    Frequency (Required | Optional | ZeroOrMore | OneOrMore) 'Required'>

<!-- A Seq must have two or more children -->
<!ELEMENT Seq ((Choice | Ref | Model),(Choice | Ref | Model)+)>
<!ATTLIST Seq
    ElementNS CDATA #IMPLIED
    id        ID    #IMPLIED
    Frequency (Required | Optional | ZeroOrMore | OneOrMore) 'Required'>

<!ELEMENT AttGroup (Doc?, More?, (AttDef | AttGroup)*)>
<!ATTLIST AttGroup
    ns        CDATA   #IMPLIED
    prefix    NMTOKEN #IMPLIED
    id        ID      #IMPLIED>

<!ELEMENT AttDef (Doc?, More?, Enumeration?)>
<!ATTLIST AttDef
    Name      NMTOKEN      #REQUIRED
    ns        CDATA        #IMPLIED
    prefix    NMTOKEN      #IMPLIED
    Type      (CData    |
               ID       |
               IDRef    |
               IDRefs   |
               Entity   |
               Entities |
               Nmtoken  |
               Nmtokens |
               Notation |
               Enumerated) "CData"
    Required  (Yes | No)   "No"
    AttValue  CDATA        #IMPLIED
    id        ID           #IMPLIED>

<!ELEMENT Enumeration (Doc?, More?, EnumerationValue+)>
<!ATTLIST Enumeration
    id ID #IMPLIED>

<!ELEMENT EnumerationValue (Doc?, More?)>
<!ATTLIST EnumerationValue
    Value CDATA #REQUIRED>

<!ELEMENT Notation (Doc?, More?)>
<!ATTLIST Notation
    Name          NMTOKEN #REQUIRED
    PubidLiteral  CDATA   #IMPLIED
    SystemLiteral CDATA   #IMPLIED
    id            ID      #IMPLIED>

<!ELEMENT UnparsedEntity (Doc?, More?)>
<!ATTLIST UnparsedEntity
    Name          NMTOKEN #REQUIRED
    SystemLiteral CDATA   #REQUIRED
    PubidLiteral  CDATA   #IMPLIED
    Notation      NMTOKEN #REQUIRED
    id            ID      #IMPLIED>

<!ENTITY % ibtwsh SYSTEM "http://www.ccil.org/~cowan/XML/ibtwsh.dtd">
%ibtwsh;
<!ELEMENT DDML:Doc %struct.model;>
<!ATTLIST DDML:Doc
    xmlns CDATA #FIXED "">

<!ELEMENT DDML:More ANY>
<!ATTLIST DDML:More
    xmlns CDATA "">
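
As a quick check of the DTD above, the following is a minimal sketch of a DDML document that is intended to conform to it. The element names memo, title, and para and the attribute name status are invented for this illustration, and the DTD declarations quoted in the comments are only approximate equivalents.

```xml
<?xml version="1.0"?>
<DocumentDef xmlns="http://www.purl.org/NET/ddml/v1"
             xmlns:DDML="http://www.purl.org/NET/ddml/v1"
             Version="1.0">

  <!-- Roughly: <!ELEMENT memo (title, para+)> -->
  <ElementDecl Name="memo" Root="Recommended">
    <Model>
      <Seq>
        <Ref Element="title"/>
        <Ref Element="para" Frequency="OneOrMore"/>
      </Seq>
    </Model>
    <AttGroup>
      <!-- Roughly: <!ATTLIST memo status (draft | final) "draft"> -->
      <AttDef Name="status" Type="Enumerated" AttValue="draft">
        <Enumeration>
          <EnumerationValue Value="draft"/>
          <EnumerationValue Value="final"/>
        </Enumeration>
      </AttDef>
    </AttGroup>
  </ElementDecl>

  <!-- Roughly: <!ELEMENT title (#PCDATA)> -->
  <ElementDecl Name="title">
    <Model>
      <PCData/>
    </Model>
  </ElementDecl>

  <!-- Roughly: <!ELEMENT para (#PCDATA)> -->
  <ElementDecl Name="para">
    <Model>
      <PCData/>
    </Model>
  </ElementDecl>

</DocumentDef>
```

The Seq has two children, as the DTD requires, and each ElementDecl lists its Model before its optional AttGroup.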

Appendix D: DDML in DDML

<?xml version="1.0"?>

<!DOCTYPE DocumentDef>

<DocumentDef FileExtension="ddm" prefix="">

  <ElementDecl Name="DocumentDef" Root="Recommended">
    <Model>
      <Seq>
        <Ref Element="Doc" Frequency="Optional"/>
        <Ref Element="More" Frequency="Optional"/>
        <Choice Frequency="ZeroOrMore">
          <Ref Element="ElementDecl"/>
          <Ref Element="Model"/>
          <Ref Element="AttDef"/>
          <Ref Element="AttGroup"/>
          <Ref Element="Notation"/>
          <Ref Element="UnparsedEntity"/>
          <Ref Element="Enumeration"/>
          <Ref Element="DocumentDef"/>
        </Choice>
      </Seq>
    </Model>
    <AttGroup>
      <AttDef Name="xmlns" Required="Yes" AttValue="http://www.purl.org/NET/ddml/v1"/>
      <AttDef Name="DDML" prefix="xmlns" Required="Yes" AttValue="http://www.purl.org/NET/ddml/v1"/>
      <AttDef Name="ns"/>
      <AttDef Name="ElementNS"/>
      <AttDef Name="prefix" Type="Nmtoken"/>
      <AttDef Name="Version" Required="Yes" AttValue="1.0"/>
      <AttDef Name="MimeType" AttValue="application/xml"/>
      <AttDef Name="FileExtension" AttValue="xml"/>
      <AttDef Name="id" Type="ID"/>
    </AttGroup>
  </ElementDecl>

  <ElementDecl Name="ElementDecl">
    <Model>
      <Seq>
        <Ref Element="Doc" Frequency="Optional"/>
        <Ref Element="More" Frequency="Optional"/>
        <Ref Element="Model"/>
        <Ref Element="AttGroup" Frequency="Optional"/>
      </Seq>
    </Model>
    <AttGroup>
      <AttDef Name="Name" Type="Nmtoken" Required="Yes"/>
      <AttDef Name="ns"/>
      <AttDef Name="prefix" Type="Nmtoken"/>
      <AttDef Name="id" Type="ID"/>
      <AttDef Name="Root" Type="Enumerated" AttValue="Possible">
        <Enumeration>
          <EnumerationValue Value="Recommended"/>
          <EnumerationValue Value="Possible"/>
          <EnumerationValue Value="Unlikely"/>
        </Enumeration>
      </AttDef>
    </AttGroup>
  </ElementDecl>

  <ElementDecl Name="Model">
    <Model>
      <Seq>
        <Ref Element="Doc" Frequency="Optional"/>
        <Ref Element="More" Frequency="Optional"/>
        <Choice>
          <Ref Element="Ref"/>
          <Ref Element="Choice"/>
          <Ref Element="Seq"/>
          <Ref Element="Empty"/>
          <Ref Element="Any"/>
          <Ref Element="PCData"/>
          <Ref Element="Mixed"/>
        </Choice>
      </Seq>    
    </Model>
    <AttGroup>
      <AttDef Name="id" Type="ID"/>
    </AttGroup>
  </ElementDecl>

  <ElementDecl Name="Empty">
    <Model>
      <Empty/>
    </Model>
    <AttGroup>
      <AttDef Name="id" Type="ID"/>
    </AttGroup>
  </ElementDecl>

  <ElementDecl Name="Any">
    <Model>
      <Empty/>
    </Model>
    <AttGroup>
      <AttDef Name="id" Type="ID"/>
    </AttGroup>
  </ElementDecl>

  <ElementDecl Name="PCData">
    <Model>
      <Empty/>
    </Model>
    <AttGroup>
      <AttDef Name="id" Type="ID"/>
    </AttGroup>
  </ElementDecl>

  <ElementDecl Name="Ref">
    <Model>
      <Empty/>
    </Model>
    <AttGroup>
      <AttDef Name="Element" Type="Nmtoken" Required="Yes"/>
      <AttDef Name="ElementNS"/>
      <AttDef Name="id" Type="ID"/>
      <AttDef Name="Frequency" Type="Enumerated" AttValue="Required">
        <Enumeration>
          <EnumerationValue Value="Required"/>
          <EnumerationValue Value="Optional"/>
          <EnumerationValue Value="ZeroOrMore"/>
          <EnumerationValue Value="OneOrMore"/>
        </Enumeration>
      </AttDef>
    </AttGroup>
  </ElementDecl>

  <ElementDecl Name="Mixed">
    <Model>
      <Ref Element="Ref" Frequency="OneOrMore"/>
    </Model>
    <AttGroup>
      <AttDef Name="ElementNS"/>
      <AttDef Name="id" Type="ID"/>
      <AttDef Name="Frequency" Type="Enumerated" Required="Yes" AttValue="ZeroOrMore">
        <Enumeration>
          <EnumerationValue Value="ZeroOrMore"/>
        </Enumeration>
      </AttDef>
    </AttGroup>
  </ElementDecl>

  <ElementDecl Name="Choice">
    <Model>
      <Seq>
        <Choice>
          <Ref Element="Seq"/>
          <Ref Element="Ref"/>
          <Ref Element="Model"/>
        </Choice>
        <Choice Frequency="OneOrMore">
          <Ref Element="Seq"/>
          <Ref Element="Ref"/>
          <Ref Element="Model"/>
        </Choice>
      </Seq>
    </Model>
    <AttGroup>
      <AttDef Name="ElementNS"/>
      <AttDef Name="id" Type="ID"/>
      <AttDef Name="Frequency" Type="Enumerated" AttValue="Required">
        <Enumeration>
          <EnumerationValue Value="Required"/>
          <EnumerationValue Value="Optional"/>
          <EnumerationValue Value="ZeroOrMore"/>
          <EnumerationValue Value="OneOrMore"/>
        </Enumeration>
      </AttDef>
    </AttGroup>
  </ElementDecl>

  <ElementDecl Name="Seq">
    <Model>
      <Seq>
        <Choice>
          <Ref Element="Choice"/>
          <Ref Element="Ref"/>
          <Ref Element="Model"/>
        </Choice>
        <Choice Frequency="OneOrMore">
          <Ref Element="Choice"/>
          <Ref Element="Ref"/>
          <Ref Element="Model"/>
        </Choice>
      </Seq>
    </Model>
    <AttGroup>
      <AttDef Name="ElementNS"/>
      <AttDef Name="id" Type="ID"/>
      <AttDef Name="Frequency" Type="Enumerated" AttValue="Required">
        <Enumeration>
          <EnumerationValue Value="Required"/>
          <EnumerationValue Value="Optional"/>
          <EnumerationValue Value="ZeroOrMore"/>
          <EnumerationValue Value="OneOrMore"/>
        </Enumeration>
      </AttDef>
    </AttGroup>
  </ElementDecl>

  <ElementDecl Name="AttGroup">
    <Model>
      <Seq>
        <Ref Element="Doc" Frequency="Optional"/>
        <Ref Element="More" Frequency="Optional"/>
        <Choice Frequency="ZeroOrMore">
           <Ref Element="AttDef"/>
           <Ref Element="AttGroup"/>
        </Choice>
      </Seq>
    </Model>
    <AttGroup>
      <AttDef Name="ns"/>
      <AttDef Name="prefix" Type="Nmtoken"/>
      <AttDef Name="id" Type="ID"/>
    </AttGroup>
  </ElementDecl>

  <ElementDecl Name="AttDef">
    <Model>
      <Seq>
        <Ref Element="Doc" Frequency="Optional"/>
        <Ref Element="More" Frequency="Optional"/>
        <Ref Element="Enumeration" Frequency="Optional"/>
      </Seq>
    </Model>
    <AttGroup>
      <AttDef Name="Name" Type="Nmtoken" Required="Yes"/>
      <AttDef Name="ns"/>
      <AttDef Name="prefix" Type="Nmtoken"/>
      <AttDef Name="Type" Type="Enumerated" AttValue="CData">
        <Enumeration>
          <EnumerationValue Value="CData"/>
          <EnumerationValue Value="ID"/>
          <EnumerationValue Value="IDRef"/>
          <EnumerationValue Value="IDRefs"/>
          <EnumerationValue Value="Entity"/>
          <EnumerationValue Value="Entities"/>
          <EnumerationValue Value="Nmtoken"/>
          <EnumerationValue Value="Nmtokens"/>
          <EnumerationValue Value="Notation"/>
          <EnumerationValue Value="Enumerated"/>
        </Enumeration>
      </AttDef>
      <AttDef Name="Required" Type="Enumerated" AttValue="No">
        <Enumeration>
          <EnumerationValue Value="Yes"/>
          <EnumerationValue Value="No"/>
        </Enumeration>
      </AttDef>
      <AttDef Name="AttValue"/>
      <AttDef Name="id" Type="ID"/>
    </AttGroup>
  </ElementDecl>

  <ElementDecl Name="Enumeration">
    <Model>
      <Seq>
        <Ref Element="Doc" Frequency="Optional"/>
        <Ref Element="More" Frequency="Optional"/>
        <Ref Element="EnumerationValue" Frequency="OneOrMore"/>
      </Seq>
    </Model>
    <AttGroup>
      <AttDef Name="id" Type="ID"/>
    </AttGroup>
  </ElementDecl>

  <ElementDecl Name="EnumerationValue">
    <Model>
      <Seq>
        <Ref Element="Doc" Frequency="Optional"/>
        <Ref Element="More" Frequency="Optional"/>
      </Seq>
    </Model>
    <AttGroup>
      <AttDef Name="Value" Required="Yes"/>
    </AttGroup>
  </ElementDecl>

  <ElementDecl Name="Notation">
    <Model>
      <Seq>
        <Ref Element="Doc" Frequency="Optional"/>
        <Ref Element="More" Frequency="Optional"/>
      </Seq>
    </Model>
    <AttGroup>
      <AttDef Name="Name" Type="Nmtoken" Required="Yes"/>
      <AttDef Name="PubidLiteral"/>
      <AttDef Name="SystemLiteral"/>
      <AttDef Name="id" Type="ID"/>
    </AttGroup>
  </ElementDecl>

  <ElementDecl Name="UnparsedEntity">
    <Model>
      <Seq>
        <Ref Element="Doc" Frequency="Optional"/>
        <Ref Element="More" Frequency="Optional"/>
      </Seq>
    </Model>
    <AttGroup>
      <AttDef Name="Name" Type="Nmtoken" Required="Yes"/>
      <AttDef Name="SystemLiteral" Required="Yes"/>
      <AttDef Name="PubidLiteral"/>
      <AttDef Name="Notation" Type="Nmtoken" Required="Yes"/>
      <AttDef Name="id" Type="ID"/>
    </AttGroup>
  </ElementDecl>

  <ElementDecl Name="Doc" prefix="DDML">
    <Model>
       <!-- The struct model from IBTWSH goes here.
            Defining IBTWSH in DDML is left as an
            exercise to the reader. -->
    </Model>
    <AttGroup>
      <AttDef Name="xmlns" Required="Yes" AttValue=""/>
    </AttGroup>
  </ElementDecl>

  <ElementDecl Name="More" prefix="DDML">
    <Model>
      <Any/>
    </Model>
    <AttGroup>
      <AttDef Name="xmlns" AttValue=""/>
    </AttGroup>
  </ElementDecl>

</DocumentDef>

Appendix E: Contributors

The DDML specification is the result of contributions from a large number of people on the XML-Dev mailing list [XML-DEV], coordinated by a smaller group of editors. We apologize if we have left any contributors off the list below.

Eric Albright
Jacek Ambroziak
James Anderson
Mark D. Anderson
Curt Arnold
Jack Bolles
Jon Bosak
Frank Boumphrey
Tim Bray
Dan Brickley
David Brownell
Marcus Carr
Steven Champeon
Robin Cover
Alain Deseine
David G. Durand
Lars Marius Garshol
Bryan Gilbert
Matthew Gertner
Dirk Gouders
Jeremy H. Griffith
Paul Haahr
Carl Hage
Guy Huard
Will Hunt
Rick Jelliffe
Parameshwor Karki
Michael Kay
Bill la Forge
Andrew Layman
Chris Maden
Murata Makoto
Murray Maloney
Sean McGrath
David Megginson
Kenneth J. Meltsner
Matt Mower
Peter Murray-Rust
Steven R. Newcomb
Thuy-Lin Nguyen
Simon North
Francis Norton
Gisli Olafsson
David Ornstein
Don Park
W.E. Perry
Paul Prescod
Liam Quin
Paul Rabin
Lisa Rein
David Rosenborg
James K. Tauber
Arjun Ray
Todd Ross
John Simpson
Toby Speight
Jarle Stabell
Jeni Tennison
Mark Tucker
Scott Vanderbilt
Stefan Wagner
Steve Withall
Akitoshi Yoshida