[This local archive copy is from the official and canonical URL, http://www.extensibility.com/dt4dtd/spec.htm, 1999-12-01; please refer to the canonical source document if possible.]



Datatypes for DTDs (DT4DTD)

1.0

Public Specification, November-1999

Copyright ©1999 Extensibility, Charles F. Goldfarb, Paul Prescod.

Editors

Lee Buck, Extensibility
Charles F. Goldfarb, The XML Handbook™
Paul Prescod, ISOGEN International

Abstract

This specification lets legacy systems that cannot yet convert their DTD markup declarations to XML Schema use XML Schema-conformant datatypes. With it, DTD authors can specify datatypes for attribute values and data content, which lays the foundation for a smoother transition later.

NOTE: Free open-source code that supports this specification for both SAX and DOM is available at www.extensibility.com/dt4dtd.

Table of Contents

1. Datatype Declarations
2. User-defined datatypes
2.1 XML Schema datatypes
2.2 XML-Data datatypes
2.3 Other datatypes
3. Strict Conformance
3.1 Conformance Declaration
4. Information Set Contribution

1. Datatype Declarations

A datatype declaration is a fixed attribute named "a-dtype" or "e-dtype". The fixed default value of "a-dtype" must consist of a series of name/value pairs according to the following syntax:

Attribute datatyping


[1] a-dtype   ::= S? ( attrName S dtypeName ) ( S attrName S dtypeName )* S?
[2] dtypeName ::= NCName

The fixed default value of "e-dtype" must consist of a single name.

Content datatyping


[3] e-dtype   ::= dtypeName

For example:


<!ATTLIST person
   birthdate CDATA #IMPLIED
   height CDATA #IMPLIED
   e-dtype CDATA #FIXED
      "social-security-number"
   a-dtype CDATA #FIXED
      "birthdate date
       height length">

The attrNames are the names of attributes declared in the same attribute-list declaration for the element type. The dtypeName that follows is associated with the attribute of that name.
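
NOTE: The following Python sketch is not part of this specification. It shows one way a processor might split a fixed a-dtype default value into attrName/dtypeName pairs according to production [1]. The function name parse_a_dtype and the ASCII-only NCName pattern are assumptions made for illustration; the full NCName production admits many more characters.

```python
import re

# Simplified ASCII approximation of an XML NCName (assumption: the real
# production also allows many non-ASCII letters, which this sketch ignores).
NCNAME = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")
XML_SPACE = " \t\r\n"  # the characters matched by the XML "S" production


def parse_a_dtype(value):
    """Return {attrName: dtypeName} for a fixed a-dtype default value."""
    tokens = [t for t in re.split("[ \t\r\n]+", value.strip(XML_SPACE)) if t]
    # Production [1] requires one or more complete name/value pairs.
    if not tokens or len(tokens) % 2 != 0:
        raise ValueError("a-dtype needs one or more attrName/dtypeName pairs")
    pairs = {}
    for attr_name, dtype_name in zip(tokens[0::2], tokens[1::2]):
        if not NCNAME.fullmatch(dtype_name):
            raise ValueError(f"not an NCName: {dtype_name!r}")
        pairs[attr_name] = dtype_name
    return pairs
```

Given the value "birthdate date height length", the function associates birthdate with date and height with length. A processor would additionally have to confirm that each attrName is declared in the same attribute-list declaration, which this sketch does not attempt.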

The e-dtype attribute can only be declared if the associated element type's content allows data but no sub-elements. The dtypeName is associated with the data content of elements of that type.
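
NOTE: As a non-normative illustration of the restriction above, the Python sketch below tests whether the content specification of an element type declaration allows data but no sub-elements. The function name may_declare_e_dtype is hypothetical, and the sketch assumes that the content specification has already been extracted from the declaration as a string.

```python
import re

# Data-only content is the XML 1.0 Mixed production with no element names:
# "(#PCDATA)", optionally followed by "*", with optional white space inside.
DATA_ONLY = re.compile(r"\([ \t\r\n]*#PCDATA[ \t\r\n]*\)\*?")


def may_declare_e_dtype(content_spec):
    """True if the content specification allows data but no sub-elements."""
    return DATA_ONLY.fullmatch(content_spec.strip()) is not None
```

Under this test "(#PCDATA)" qualifies, while "EMPTY" (no data), "ANY" and "(#PCDATA | emph)*" (sub-elements allowed) do not.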

NOTE: The "dtype" attributes are based on notations instead of XML Namespaces because they are meaningful in the DTD and not in the document instance. It must be possible for DTD-reading and writing applications to recognize the attributes in an entity containing DTD declarations without parsing a document for namespace declarations. The set of namespace declarations that apply to an element type can vary according to context and cannot in general be recognized in a DTD.

A dtypeName must be the name of a notation declared in the DTD; the notation declaration must include a system identifier that is a URI reference. However, if the datatype's name is known to the information system because of the processing context, these notation declarations may be omitted. Notation declarations for datatypes native to the W3C XML Schema specification may always be omitted; every information system must natively recognize them from their dtypeName.
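
NOTE: The lookup described above can be sketched in Python as follows. This is an illustration under stated assumptions, not a reference implementation: the function names are hypothetical, the regular expression recognizes only the SYSTEM form with a double-quoted literal, and the set of names known from the processing context is supplied by the caller. Giving a declared notation precedence over a known name is also an assumption; this specification does not say which should win.

```python
import re

# Assumption: only the SYSTEM form with a double-quoted literal is handled;
# PUBLIC identifiers, comments and parameter entities are ignored.
NOTATION_DECL = re.compile(r'<!NOTATION\s+(\S+)\s+SYSTEM\s+"([^"]*)"\s*>')


def declared_notations(dtd_text):
    """Map each declared notation name to its system identifier."""
    return {m.group(1): m.group(2) for m in NOTATION_DECL.finditer(dtd_text)}


def resolve_dtype(dtype_name, notations, known_names=frozenset()):
    """Return the URI for a dtypeName, or None if it is known without one."""
    if dtype_name in notations:
        return notations[dtype_name]
    if dtype_name in known_names:
        return None  # recognized from the processing context
    raise LookupError(f"no notation declared for datatype {dtype_name!r}")
```

For a DTD containing the declaration <!NOTATION int SYSTEM "urn:schemas-microsoft-com:datatypes/int">, resolve_dtype("int", ...) returns that URN, and a name that is neither declared nor known raises LookupError.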

2. User-defined datatypes

When explicit notations are provided for dtypeNames, their associated system URIs must refer to one of the following.

2.1 XML Schema datatypes

The referent of the associated URI reference may be an XML Schema datatypeDefn. If so, a conforming processor may validate the attribute value or data content according to that specification.

2.2 XML-Data datatypes

The referent of the associated URI may be XML-Data datatypes as defined by these declarations, which may be included directly or through an entity reference.


<!NOTATION string
   SYSTEM "urn:schemas-microsoft-com:datatypes/string">
<!NOTATION number
   SYSTEM "urn:schemas-microsoft-com:datatypes/number">
<!NOTATION int
   SYSTEM "urn:schemas-microsoft-com:datatypes/int">
<!NOTATION float
   SYSTEM "urn:schemas-microsoft-com:datatypes/float">
<!NOTATION fixed.14.4
   SYSTEM "urn:schemas-microsoft-com:datatypes/fixed.14.4">
<!NOTATION boolean
   SYSTEM "urn:schemas-microsoft-com:datatypes/boolean">
<!NOTATION dateTime.iso8601
   SYSTEM "urn:schemas-microsoft-com:datatypes/dateTime.iso8601">
<!NOTATION dateTime.iso8601tz
   SYSTEM "urn:schemas-microsoft-com:datatypes/dateTime.iso8601tz">
<!NOTATION date.iso8601
   SYSTEM "urn:schemas-microsoft-com:datatypes/date.iso8601">
<!NOTATION time.iso8601
   SYSTEM "urn:schemas-microsoft-com:datatypes/time.iso8601">
<!NOTATION time.iso8601.tz
   SYSTEM "urn:schemas-microsoft-com:datatypes/time.iso8601.tz">
<!NOTATION i1
   SYSTEM "urn:schemas-microsoft-com:datatypes/i1">
<!NOTATION i2
   SYSTEM "urn:schemas-microsoft-com:datatypes/i2">
<!NOTATION i4
   SYSTEM "urn:schemas-microsoft-com:datatypes/i4">
<!NOTATION i8
   SYSTEM "urn:schemas-microsoft-com:datatypes/i8">
<!NOTATION ui1
   SYSTEM "urn:schemas-microsoft-com:datatypes/ui1">
<!NOTATION ui2
   SYSTEM "urn:schemas-microsoft-com:datatypes/ui2">
<!NOTATION ui4
   SYSTEM "urn:schemas-microsoft-com:datatypes/ui4">
<!NOTATION ui8
   SYSTEM "urn:schemas-microsoft-com:datatypes/ui8">
<!NOTATION r4
   SYSTEM "urn:schemas-microsoft-com:datatypes/r4">
<!NOTATION r8
   SYSTEM "urn:schemas-microsoft-com:datatypes/r8">
<!NOTATION float.IEEE.754.32
   SYSTEM "urn:schemas-microsoft-com:datatypes/float.IEEE.754.32">
<!NOTATION float.IEEE.754.64
   SYSTEM "urn:schemas-microsoft-com:datatypes/float.IEEE.754.64">
<!NOTATION uuid
   SYSTEM "urn:schemas-microsoft-com:datatypes/uuid">
<!NOTATION uri
   SYSTEM "urn:schemas-microsoft-com:datatypes/uri">
<!NOTATION bin.hex
   SYSTEM "urn:schemas-microsoft-com:datatypes/bin.hex">
<!NOTATION char
   SYSTEM "urn:schemas-microsoft-com:datatypes/char">
<!NOTATION string.ansi
   SYSTEM "urn:schemas-microsoft-com:datatypes/string.ansi">

2.3 Other datatypes

The referent of the associated URI may be some other notation established for use within some processing context. Optionally, such referents may be Java classes, XSL stylesheets, and the like that are able to programmatically test for validity.
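
NOTE: One way a closed information system might attach programmatic validity tests to such notations is a registry keyed by system URI. The Python sketch below is purely illustrative: the URI, the pattern and the names VALIDATORS and is_valid are all invented for this example and are not defined by this specification.

```python
import re

# Hypothetical URI and pattern, invented for illustration only.
SSN_URI = "urn:example:datatypes/social-security-number"

# Registry mapping a notation's system URI to a validity test.
VALIDATORS = {
    SSN_URI: lambda text: re.fullmatch(r"[0-9]{3}-[0-9]{2}-[0-9]{4}", text) is not None,
}


def is_valid(system_uri, text):
    """Return True or False if a test is registered for the URI, else None."""
    validator = VALIDATORS.get(system_uri)
    return None if validator is None else validator(text)
```

Returning None for an unregistered URI keeps "not checked" distinct from "checked and invalid", which suits a specification in which validation is optional.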

3. Strict Conformance

This specification has certain optional features intended to support an existing installed base and also to allow customization of the specification for closed information systems. This section defines a subset called the "Strict Conformance Subset" that should be used for blind (not pre-arranged) information interchange on the World Wide Web.

A DTD that conforms to the strict subset will have a notation declaration for every dtypeName used in a datatype declaration except datatypes declared in the W3C XML Schema specification. It will also have a single explicit conformance declaration.

3.1 Conformance Declaration

The conformance declaration must only be used in DTDs that otherwise conform to the strict subset. This declaration is a notation declaration with the URI "http://www.w3.org/TR/1999/dt4dtd".

If present, the conformance notation declaration must precede the first attribute list declaration in a set of DTD declarations. For the convenience of human readers, we suggest that the conformance notation be named "dt4dtd".
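
NOTE: The requirements of the Strict Conformance Subset can be checked mechanically. The Python sketch below is a non-normative illustration: the function name strict_subset_problems is hypothetical, the caller supplies the dtypeNames used in the DTD and the names treated as XML Schema datatypes, and the regular expression recognizes only the SYSTEM form with a double-quoted literal.

```python
import re

CONFORMANCE_URI = "http://www.w3.org/TR/1999/dt4dtd"

# Assumption: only the SYSTEM form with a double-quoted literal is recognized,
# and comments or parameter entities in the DTD are not handled.
NOTATION_DECL = re.compile(r'<!NOTATION\s+(\S+)\s+SYSTEM\s+"([^"]*)"\s*>')


def strict_subset_problems(dtd_text, used_dtype_names, schema_native_names):
    """List the ways a DTD falls short of the Strict Conformance Subset."""
    problems = []
    decls = list(NOTATION_DECL.finditer(dtd_text))
    declared = {m.group(1) for m in decls}

    # Every dtypeName needs a notation unless XML Schema declares the datatype.
    missing = set(used_dtype_names) - declared - set(schema_native_names)
    for name in sorted(missing):
        problems.append(f"no notation declared for datatype {name!r}")

    # Exactly one conformance declaration is expected.
    conformance = [m for m in decls if m.group(2) == CONFORMANCE_URI]
    if len(conformance) != 1:
        problems.append(
            f"found {len(conformance)} conformance declarations, expected 1")

    # It must precede the first attribute list declaration.
    first_attlist = dtd_text.find("<!ATTLIST")
    if first_attlist != -1 and any(m.start() > first_attlist for m in conformance):
        problems.append(
            "conformance declaration follows an attribute list declaration")
    return problems
```

An empty list means that no problem was found by these three checks; it does not establish conformance in every respect.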

4. Information Set Contribution

NOTE: The principle of Information Set Contributions is established in the W3C XML Schema specification.

In an information set supporting this specification: