ISO/IEC JTC1/SC34 has published an overview document for the Document Schema Definition Language (DSDL) and has appointed editors for three of the seven major parts which will make up the new International Standard. The Document Schema Definition Language (DSDL) is to be "a multipart International Standard defining a modular set of specifications for describing the document structures, data types, and data relationships in structured information resources. Two kinds of integrated specifications are included: (1) specifications for describing aspects of validity of a document, and (2) rules for combining and packaging a collection of processes applicable to the task of validating a document. This integration makes DSDL applicable to both business and publishing applications of structured information resources. This applicability reflects the expansion of Extensible Markup Language (XML) applications beyond the publishing environment in which XML and its foundation (Standard Generalized Markup Language, SGML) were first developed." The seven primary parts of the standard are: Part 1 - Framework; Part 2 - Grammar-oriented schema languages; Part 3 - Primitive data type semantics; Part 4 - Path-based integrity constraints; Part 5 - Object-oriented schema languages; Part 6 - Information item manipulation; Part 7 - Namespace-aware processing with DTD syntax.
Excerpts from the 'Part 0' overview for [ISO] 19757:
Part 1 - Framework. The DSDL framework includes: (1) a method of identifying the validation processes to be applied in pipelined paths of discrete steps; (2) a language for choreographing the validation processes as a set of available pipelines; (3) a description of the set of information items applicable to these validation processes. Portions of this Part will be initially based on RELAX Namespaces.
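As a rough illustration of the "pipelined paths of discrete steps" idea, the following Python sketch chains independent validation steps and collects their messages. It is not DSDL syntax: the names `Step`, `require_root`, `require_child`, `run_pipeline`, and `PIPELINES`, and the `memo`/`title` vocabulary, are hypothetical and chosen only for this example.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Dict, List

# Assumption: a validation step is any function that takes the parsed
# root element and returns a list of error messages (empty if it passes).
Step = Callable[[ET.Element], List[str]]


def require_root(name: str) -> Step:
    """Build a step that checks the name of the root element."""
    def step(root: ET.Element) -> List[str]:
        if root.tag == name:
            return []
        return [f"root is <{root.tag}>, expected <{name}>"]
    return step


def require_child(name: str) -> Step:
    """Build a step that checks the root has at least one such child."""
    def step(root: ET.Element) -> List[str]:
        if root.find(name) is not None:
            return []
        return [f"missing <{name}> child"]
    return step


def run_pipeline(steps: List[Step], xml_text: str) -> List[str]:
    """Parse the document once, then apply each step in order."""
    root = ET.fromstring(xml_text)
    errors: List[str] = []
    for step in steps:
        errors.extend(step(root))
    return errors


# A set of available pipelines, selected by a (hypothetical) name.
PIPELINES: Dict[str, List[Step]] = {
    "memo": [require_root("memo"), require_child("title")],
}
```

Calling `run_pipeline(PIPELINES["memo"], "<memo/>")` would report the missing `title` child while the root-name step passes; each step stays independent of the others, which is the property the framework description emphasizes.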
Part 2 - Grammar-oriented schema languages. Grammar-oriented schema languages validate [that] the structure of information items in an instance conforms to a set of constraints described by a tree grammar. This includes constraining the text in the tree found at the terminal symbols in the grammar to data types and parameters described in Part 3 of this IS. This Part includes a syntax for specifying and identifying: (1) the grammar of the hierarchy; (2) the identity of data types, their parameters and the parameter values standardized by DSDL; (3) the identity of non-DSDL data types. This Part is initially based on RELAX NG.
Part 3 - Primitive data type semantics. Terminal symbols of text in the hierarchical tree may represent values of a data type. This Part defines: (1) a set of standardized named data types (e.g., integer); (2) a set of parameters and their values for each data type (e.g., minimum and maximum values); (3) a set of constraints describing a possibly infinite set of strings representing values of the data type. This Part is initially based on a subset of primitive data types and their facets from Part 2 of W3C XML Schema.
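To make the three ingredients concrete (a named data type, parameters such as minimum and maximum, and a constraint on the strings that represent values), here is a minimal Python sketch for an integer type. The lexical rule used (an optional sign followed by decimal digits) and the names `INTEGER_LEXICAL` and `is_valid_integer` are assumptions for illustration, not text from the standard.

```python
import re
from typing import Optional

# Assumed lexical space: optional sign followed by one or more ASCII digits.
# This describes an infinite set of strings with a single pattern.
INTEGER_LEXICAL = re.compile(r"[+-]?[0-9]+")


def is_valid_integer(text: str,
                     minimum: Optional[int] = None,
                     maximum: Optional[int] = None) -> bool:
    """Check a text node against an integer type with optional bounds."""
    text = text.strip()
    # First check the string form, then the value it denotes.
    if not INTEGER_LEXICAL.fullmatch(text):
        return False
    value = int(text)
    if minimum is not None and value < minimum:
        return False
    if maximum is not None and value > maximum:
        return False
    return True
```

Under these assumptions, `is_valid_integer("42", minimum=0, maximum=100)` is accepted, while `"4.2"` fails the lexical check and `"101"` fails the maximum parameter.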
Part 4 - Path-based integrity constraints. The non-hierarchical links between information items in a structured resource can be reconstituted by addressing the items and expressing the relationship between them found in the original graph of information. The addressing mechanism includes hierarchy-based paths of steps along the tree to the information item being addressed... This Part is initially based on Schematron.
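A path-based integrity check of this kind can be sketched in Python with the standard library's limited XPath support. The example below is not Schematron; it only illustrates the idea of addressing items by path and testing a non-hierarchical relationship between them. The attribute names `id` and `ref` and the function `dangling_refs` are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import List


def dangling_refs(xml_text: str) -> List[str]:
    """Return every ref value that does not match some id in the document."""
    root = ET.fromstring(xml_text)
    # Collect the targets: every element (root included) carrying an id.
    ids = {el.get("id") for el in root.iter() if el.get("id") is not None}
    # Address the sources by path: any descendant element with a ref attribute.
    sources = root.findall(".//*[@ref]")
    return [el.get("ref") for el in sources if el.get("ref") not in ids]
```

For a document such as `<doc><sec id="a"/><link ref="a"/><link ref="b"/></doc>`, the function returns `["b"]`: the link to `a` resolves, while the link to `b` points at nothing, a constraint that a tree grammar alone cannot express.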
Part 5 - Object-oriented schema languages. Object-oriented schema languages validate [that] the structure of information items in an instance conforms to a set of constraints described using inheritance. These constraints can be useful when using XML in conjunction with object-oriented concepts used widely in modern programming languages (e.g., Java) and modern modeling languages (e.g., UML). This Part is initially based on Part 1 of W3C XML Schema and the sections of Part 2 of W3C XML Schema describing the derivation of new simple types and describing the syntax for referring to primitive data types.
Part 6 - Information item manipulation. Structured information resources may need to be augmented, reduced, or have information items otherwise manipulated as part of the validation process. XML Document Type Definitions (DTDs) and HyTime include methods of defaulting attributes and information item renaming that characterize the changes that are sometimes necessary... This Part will be declarative in nature and will not attempt to provide totally general purpose transformation requirements.
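Attribute defaulting, one of the manipulations mentioned above, can be sketched declaratively in Python: the rules are plain data and a small generic function applies them. The `DEFAULTS` table, the `para`/`align` vocabulary, and `apply_defaults` are hypothetical names for illustration and do not reflect any syntax defined by this Part.

```python
import xml.etree.ElementTree as ET
from typing import Dict

# Declarative rules (assumed shape): element name -> {attribute: default}.
DEFAULTS: Dict[str, Dict[str, str]] = {
    "para": {"align": "left"},
}


def apply_defaults(root: ET.Element,
                   defaults: Dict[str, Dict[str, str]]) -> ET.Element:
    """Add each default attribute only where the instance omits it."""
    for el in root.iter():
        for name, value in defaults.get(el.tag, {}).items():
            if el.get(name) is None:
                el.set(name, value)
    return root
```

Applied to `<doc><para/><para align="right"/></doc>`, the first `para` gains `align="left"` and the second keeps its explicit value, much as a DTD attribute default behaves.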
Part 7 - Namespace-aware processing with DTD syntax. Existing structural constraints on and defaulted values for information items in a structured resource may already be described using XML Document Type Definition (DTD) syntax. These constraints could be interpreted accommodating namespaces. These constraints need not be directly coupled to the XML document through a document type declaration. This Part will address: (1) the semantics of the validation of a tree according to the syntax of a DTD; (2) decoupling the specification of the DTD from the instance to be validated by the DTD.
Principal references:
- WG1 Overview of DSDL (Part 0). ISO/IEC JTC 1/SC 34 N-0275. CD 19757-0. DSDL Part 0, Overview. Edited by G. Ken Holman. 11-December-2001.
- Proposed CD text of Part 2 of DSDL: Grammar-based Schema Languages. Document: ISO/IEC JTC 1/SC34 N0276. From ISO/IEC JTC 1/SC34: Information Technology -- Document Description and Processing Languages. Project: ISO/IEC 19757-2. Project editors: James J. Clark and Makoto Murata. Document date: 13-December-2001. [Temporary stand-in for the complete text.] "The text proposed as ISO/IEC CD 19757-2 is that developed in OASIS as RELAX NG. For a short while, until the project editors can convert the document to ISO style, the text can be viewed at their site, http://www.relaxng.org or http://www.oasis-open.org/committees/relax-ng/spec-20011203.html."
- "First Public Working Draft of Document Schema Definition Language (DSDL)." 2001-11-03.
- "Recommendations of the ISO/IEC JTC1/SC34/WG1 Meeting Orlando December 9-12, 2001." From SC34/WG1. ISO/IEC JTC 1/SC34 N285. Informational content: "(1) Recommendation 1: Programme of Work. SC34/WG1 accepts the list of parts indicated in SC34 N275 as an appropriate subdivision of CD 19757, Document Schema Description Language (DSDL). SC34/WG1 requests that SC34 instruct the Secretariat to make appropriate changes in the SC34 Programme of Work. (2) Recommendation 2: DSDL, Part 0. SC34/WG1 accepts N275 as the proposed text for Part 0 of DSDL and recommends that SC34 forward it to the Secretariat for CD registration and ballot. (3) Recommendation 3: DSDL, Part 2. SC34/WG1 accepts N276 as the proposed text for Part 2, 'Grammar-based schema languages', of DSDL and recommends that SC34 forward it to the Secretariat for CD registration and ballot. (4) Recommendation 4: DSDL Editors. WG1 appoints editors for DSDL: Part 0, Ken Holman; Part 1, Ken Holman, James Clark, and Makoto Murata; Part 2, James Clark, and Makoto Murata; Part 4, Rick Jelliffe." See also the "Resolutions of the SC34 Meeting, Orlando, 8-13 December 2001."
- "Document Schema Definition Language (DSDL)" - Main reference page.