The OASIS Cover Pages: The Online Resource for Markup Language Technologies
Last modified: April 29, 2009
Document Schema Definition Languages (DSDL)

Overview

Document Schema Definition Languages (DSDL) is an ISO project under ISO/IEC JTC 1/SC 34 Information Technology -- Document Description and Processing Languages. DSDL was initially an abbreviation for Document Schema Definition Language; the name was later changed to the plural, Document Schema Definition Languages, to reflect that the standard covers a set of closely related languages. "The objective of developing Document Schema Definition Languages (DSDL) is to create a framework within which multiple validation tasks of different types can be applied to an XML document in order to achieve more complete validation results than just the application of a single technology." See details in the draft Part 1: Overview of April 28, 2007.

Principal URLs

DSDL as a Multiple-Part Standard

Drafts and final versions of DSDL's several parts are referenced below (parts 0-9 are now parts 1-10):

Part 0: DSDL Overview

Update 2005-02: see now the 'Overview' as DSDL Part 1 below.

Part 1: DSDL Overview

[September 18, 2008] Information Technology — Document Schema Definition Languages (DSDL) — Part 1: Overview. Text for FCD [Final Committee Draft] ballot (Announcement of Document Availability). WG1 Project Editor: Mr. Martin Bryan. See memo to P, O and L members of ISO/IEC JTC 1/SC 34 with reference number: ISO/IEC JTC 1/SC 34 N1075: "In accordance with Resolution 6 adopted at the SC 34 Plenary meeting held in Kyoto, Japan, 2007-12-08/11 (SC 34 N 968), this document is circulated to the SC 34 members for an FCD ballot. Please vote/comment by 2009-01-19." Provided by Secretariat, ISO/IEC JTC 1/SC 34 IPSJ/ITSCJ (Information Processing Society of Japan/Information Technology Standards Commission of Japan — A Standard Organization accredited by JISC), Room 308-3, Kikai-Shinko-Kaikan Bldg., 3-5-8, Shiba-Koen, Minato-ku, Tokyo 105-0011 JAPAN; Tel: +81 3 3431 2808; Fax: +81 3 3431 6493; email to Toshiko Kimura. Copyright © ISO/IEC 2008. Status: Not an ISO International Standard. It is distributed for review and comment. It is subject to change without notice and may not be referred to as an International Standard. See 1075 in the ISO/IEC JTC 1/SC 34 Document Register.

Summary from the 'Foreword' and 'Introduction': ISO/IEC 19757-1 was prepared by Joint Technical Committee ISO/IEC JTC 1, Information Technology, Subcommittee SC 34, Document Description and Processing Languages. ISO/IEC 19757 consists of the following parts, under the general title Information Technology — Document Schema Definition Languages (DSDL): Part 1: Overview; Part 2: Regular-grammar-based validation — RELAX NG; Part 3: Rule-based validation — Schematron; Part 4: Namespace-based validation dispatching language — NVDL; Part 5: Datatypes; Part 7: Character repertoire description language — CRDL; Part 8: Document schema renaming language — DSRL; Part 9: Datatype- and namespace-aware DTDs.

ISO/IEC 19757 defines a set of Document Schema Definition Languages (DSDL) that can be used to specify one or more validation processes performed against Extensible Markup Language (XML) or Standard Generalized Markup Language (SGML) documents. (XML is an application profile of SGML — ISO 8879.) A document model is an expression of the constraints to be placed on the structure and content of documents to be validated against the model and the information set that needs to be transmitted to subsequent processes. Since the development of Document Type Definitions (DTDs) as part of ISO 8879, a number of technologies have been developed through various formal and informal consortia notably by the World Wide Web Consortium (W3C) and the Organization for the Advancement of Structured Information Standards (OASIS). A number of validation technologies are standardized in DSDL to complement those already available as standards or from industry. Historically, when many applications act on a single document, each application inefficiently duplicates the task of confirming that validation requirements have been met. Furthermore, such tasks and expressions have been developed and utilized in isolation, without consideration of how the features and functionality available in other technologies might enhance validation objectives.

The main objective of ISO/IEC 19757 is to bring together different validation-related tasks and expressions to form a single extensible framework that allows technologies to work in series or in parallel to produce a single or a set of validation results. The extensibility of DSDL accommodates validation technologies not yet designed or specified. In the past, different design and use criteria have led users to choose different validation technologies for different portions of their information. Bringing together information within a single XML document sometimes prevents existing document models from being used to validate sections of data. By providing an integrated suite of constraint description languages that can be applied to different subsets of a single XML document, ISO/IEC 19757 allows different validation technologies to be integrated under a well-defined validation policy.

This multi-part International Standard integrates constraint description technologies into a suite that:

  • provides user control of names, order and repeatability of information objects and their properties (elements and their attributes)
  • allows users to identify restrictions on the coexistence of information objects
  • allows specific information object within structured documents to be validated
  • allows restrictions to be placed on the contents of specific elements and attributes, including restrictions based on the content of other elements in the same document
  • allows the character set that can be used within specific elements to be managed, based on the application of ISO/IEC 10646
  • allows default values to be assigned to element contents and attribute values, and provides facilities for the incorporation of predefined fragments of structured data to be incorporated within documents
  • extends SGML DTDs to include functions such as namespace-controlled validation and datatypes by adapting XML techniques for these capabilities to SGML

Scope: ISO/IEC 19757 specifies a suite of technologies that can be used to validate the structure and contents of structured documents marked up using SGML (ISO 8879) and its derivatives, notably XML (W3C XML). ISO/IEC 19757 defines a set of semantics for describing and ordering validation rules, a set of syntaxes for declaring validation rules, and a syntax for defining models for the management of validation sequences. It includes: (1) Specifications of relevant validation technologies that can be used in isolation or within the DSDL framework. (2) References to validation technologies defined outside of ISO/IEC 19757 that can be used within the DSDL framework. (3) Semantics for managing the sequence in which different validation technologies are to be applied during the production of validation results. ISO/IEC 19757 identifies specifications that can be used by a data validator that accepts a structured input document and produces one or more validation results. ISO/IEC 19757 does not standardize how these specifications shall be invoked, or the error messages they produce... [source PDF]
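
The idea of applying several validation technologies in sequence to one document and collecting "one or more validation results" can be illustrated with a small Python sketch. This is not part of DSDL and does not reflect how ISO/IEC 19757 is invoked (the standard explicitly leaves invocation unspecified). The names `ValidationResult`, `has_root`, `max_depth` and `run_in_series` are hypothetical, and the two checks are toy stand-ins for real validators such as a RELAX NG or Schematron processor.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ValidationResult:
    """Outcome of one validation step (hypothetical structure)."""
    validator: str
    ok: bool
    message: str = ""


# A validator is any callable that inspects the parsed document.
Validator = Callable[[ET.Element], ValidationResult]


def has_root(expected_tag: str) -> Validator:
    """Toy structural check: the root element must have the given name."""
    def check(root: ET.Element) -> ValidationResult:
        ok = root.tag == expected_tag
        msg = "" if ok else f"unexpected root <{root.tag}>"
        return ValidationResult("root-check", ok, msg)
    return check


def max_depth(limit: int) -> Validator:
    """Toy constraint check: element nesting must not exceed `limit`."""
    def depth(elem: ET.Element) -> int:
        return 1 + max((depth(child) for child in elem), default=0)

    def check(root: ET.Element) -> ValidationResult:
        d = depth(root)
        ok = d <= limit
        msg = "" if ok else f"depth {d} exceeds {limit}"
        return ValidationResult("depth-check", ok, msg)
    return check


def run_in_series(xml_text: str,
                  validators: List[Validator],
                  stop_on_failure: bool = False) -> List[ValidationResult]:
    """Apply each validator in order and collect a set of results."""
    root = ET.fromstring(xml_text)
    results = []
    for validator in validators:
        result = validator(root)
        results.append(result)
        if stop_on_failure and not result.ok:
            break  # a simple "validation policy": abort on first failure
    return results
```

Calling `run_in_series(doc, [has_root("order"), max_depth(3)])` returns one result per step; whether a failure stops the sequence is the kind of policy decision a validation management layer would make.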

[April 28, 2007] Information Technology — Document Schema Definition Languages (DSDL) — Part 1: Overview. [Draft] ISO/IEC FCD 19757-1. Copyright © ISO/IEC 2005. 15 pages. ISO/IEC 19757-1 was prepared by Joint Technical Committee ISO/IEC JTC 1, Information Technology, Subcommittee SC 34, Document Description and Processing Languages. Posted to the DSDL members discussion list by Martin Bryan on April 28, 2007. "Dear DSDLers: Please find attached a revision of Part 1 that takes into account the change of name and purpose for Part 6, and contains minor revisions to the first and last paragraphs of the description of Part 10 to indicate that the pipelining techniques employed will be those in the W3C XProc standard [XProc: An XML Pipeline Language]... I haven't referenced this as yet as I want to see the draft report before committing myself this..." ISO/IEC 19757 consists of the following parts, under the general title Document Schema Definition Languages (DSDL): Part 1: Overview; Part 2: Regular-grammar-based validation — RELAX NG; Part 3: Rule-based validation — Schematron; Part 4: Namespace-based validation dispatching language — NVDL; Part 5: Datatypes; Part 6: Stream-based integrity constraints; Part 7: Character repertoire description language — CRDL; Part 8: Document schema renaming language — DSRL; Part 9: Datatype- and namespace-aware DTDs; Part 10: Validation management. ISO/IEC 19757 defines a set of Document Schema Definition Languages (DSDL) that can be used to specify one or more validation processes performed against Extensible Markup Language (XML) or Standard Generalized Markup Language (SGML) documents. (XML is an application profile of SGML — ISO 8879:1986.)... 
This multi-part International Standard integrates constraint description technologies into a suite that: (1) provides user control of names, order and repeatability of information objects and their properties -- elements and their attributes; (2) allows users to identify restrictions on the coexistence of information objects; (3) allows specific information object within structured documents to be validated; (4) allows restrictions to be placed on the contents of specific elements and attributes, including restrictions based on the content of other elements in the same document; (5) allows the character set that can be used within specific elements to be managed, based on the application of the ISO/IEC 10646 Universal Multiple-Octet Coded Character Set [UCS]; (6) allows default values to be assigned to element contents and attribute values, and provides facilities for the incorporation of predefined fragments of structured data to be incorporated within documents; (7) extends SGML DTDs to include functions such as namespace-controlled validation and datatypes by adapting XML techniques for these capabilities to SGML. [PDF source]

Document Schema Definition Language (DSDL) - Part 1: Overview. 2004-11-16. 16 pages. Reference from ISO/IEC JTC 1/SC 34 N0567. Produced by ISO/IEC JTC 1/SC 34 Information Technology — Document Description and Processing Languages. Project Editor: Mr. Martin Bryan. ['This part of the standard introduces the role of each of the other parts of the standard, and identifies the user requirements that the standard addresses.'] "This International Standard defines a set of semantics for describing and ordering validation rules, a set of syntaxes for declaring validation rules, and a syntax for defining models for the management of validation sequences. It includes: [i] Specifications of relevant validation technologies that can be used in isolation or within the DSDL framework; [ii] References to validation technologies defined outside of this International Standard that can be used within the DSDL framework; [iii] Semantics for managing the sequence in which different validation technologies are to be applied during the production of validation results... This International Standard identifies specifications that can be used by a data validator that accepts a structured input document and produces one or more validation results. 
This multi-part International Standard integrates constraint description technologies into a suite that: (1) provides user control of names, order and repeatability of information objects and their properties (elements and their attributes) (2) allows users to identify restrictions on the coexistence of information objects (3) allows specific information object within structured documents to be validated (4) allows restrictions to be placed on the contents of specific elements and attributes, including restrictions based on the content of other elements in the same document (5) allows the character set that can be used within specific elements to be managed, based on the application of the ISO/IEC 10646 Universal Multiple-Octet Coded Character Set (UCS) (6) allows default values to be assigned to element contents and attribute values, and provides facilities for the incorporation of predefined fragments of structured data to be incorporated within documents (7) extends SGML DTDs to include functions such as namespace-controlled validation and datatypes by adapting XML techniques for these capabilities to SGML... Documents that are not conformant with ISO 8879 (SGML) or one of its derivatives are not within the field of application of this International Standard. Documents prepared using SGML must be validated against an SGML DTD as the first stage in the validation process to produce a well-formed output that is conformant with the W3C XML information set. All intermediate and final expressions of information used for DSDL processing must be expressible using the XML Information Set, except where specific extensions are defined within this standard. The information set may be generated from external sources such as the ESIS of SGML. No expression of any concept supported by DSDL shall require anything beyond which can be expressed in an XML document..."

[December 2002] Early partitioning of DSDL registered 'Part 1' as the Interoperability framework document. Later, 'Part 1' became associated with the Overview. From the December 2002 presentation of Eric van der Vlist: "The Interoperability Framework is the glue between all the pieces of DSDL. The design principle of DSDL is to split the issue of describing and validating documents into simpler issues (grammar based validation, rule based validation, content selection, datatypes, ...). Different tools exist which needs to be integrated. Different types of validations and transformations (defined inside or outside the DSDL project) often need to be associated and a framework is needed to perform the integration. Examples of such mixing include localization of numeric or date formats, pre-validation canonicalization to simplify the expression of a schema, independent content split into different documents validated independently, aggregation of complex content into a single text node or split of structured simple content into a set of elements, etc"

Part 2: Regular-grammar-based validation — RELAX NG

"Regular-grammar-based schema languages can validate that the structure and content of information items in a document instance conforms to a model described by a tree grammar. Tree grammars are characterized by the specification of node patterns. Validation is based on the matching of elements identified in the stream being analyzed with one of the pattern definitions permitted at a particular point in a data tree. The regular-grammar-based language defined in this Part is based on the OASIS RELAX NG specification..." [Part 1: Overview]

Information technology — Document Schema Definition Language (DSDL) — Part 2: Regular-grammar-based validation — RELAX NG. [== Technologies de l'information — Langage de définition de schéma de documents (DSDL) — Partie 2: Validation de grammaire orientée courante — RELAX NG.] In PDF format. International Standard. First edition: 2003-12-01. 42 pages. Reference number ISO/IEC 19757-2:2003(E). Copyright © ISO/IEC 2003. "This part of ISO/IEC 19757 specifies RELAX NG, a schema language for XML. A RELAX NG schema specifies a pattern for the structure and content of an XML document. The pattern is specified by using a regular tree grammar. This part of ISO/IEC 19757 establishes requirements for RELAX NG schemas and specifies when an XML document matches the pattern specified by a RELAX NG schema... Clause 5 describes the data model, which is the abstraction of an XML document used throughout the rest of the document. Clause 6 describes the syntax of a RELAX NG schema. Clause 7 describes a sequence of transformations that are applied to simplify a RELAX NG schema, and also specifies additional requirements on a RELAX NG schema. Clause 8 describes the syntax that results from applying the transformations; this simple syntax is a subset of the full syntax. Clause 9 describes the semantics of a correct RELAX NG schema that uses the simple syntax; the semantics specify when an element is valid with respect to a RELAX NG schema. Clause 10 describes requirements that apply to a RELAX NG schema after it has been transformed into simple form. Finally, Clause 11 describes conformance requirements for RELAX NG validators..." See also: Information technology — Document Schema Definition Language (DSDL) — Part 2: Regular-grammar-based validation — RELAX NG — Amendment 1: Compact Syntax. In PDF format [cache ZIP: ISO-RELAXNG-19757-2, Amendment 1]
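
As a rough illustration of grammar-based validation (matching the children found at each point in the tree against a permitted pattern), the following Python sketch checks a document against a table of content models. It is deliberately much simpler than RELAX NG: each element name has exactly one content model, written as a regular expression over child element names, so it behaves like a DTD-style grammar rather than a full regular tree grammar, and it ignores attributes, text and datatypes. The element names and the `GRAMMAR` table are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical content models: element name -> regular expression over the
# sequence of child element names (each name followed by one space).
GRAMMAR = {
    "order": r"(item )+",       # one or more <item> children
    "item": r"name (price )?",  # a <name>, optionally followed by a <price>
    "name": r"",                # no element children
    "price": r"",               # no element children
}


def matches(elem: ET.Element) -> bool:
    """True if `elem` and all of its descendants match their content models."""
    model = GRAMMAR.get(elem.tag)
    if model is None:
        return False  # element name not declared in the grammar
    child_names = "".join(child.tag + " " for child in elem)
    if re.fullmatch(model, child_names) is None:
        return False  # children do not fit the pattern permitted here
    return all(matches(child) for child in elem)


def conforms(xml_text: str) -> bool:
    """Parse the document and match it against the grammar from the root."""
    return matches(ET.fromstring(xml_text))
```

A real RELAX NG validator additionally lets the same element name carry different content in different contexts, and treats attributes and text as part of the pattern; the sketch only conveys the general matching idea.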

[December 15, 2003] RELAX NG XML Schema Language Published as an ISO Standard (DSDL Part 2). A posting from James Clark announces the publication of the RELAX NG specification as an ISO standard, being Part 2 'Regular-Grammar-Based Validation' of the multi-part ISO/IEC 19757 Document Schema Definition Language (DSDL). In Clark's vision, the RELAX NG schema language is "based firmly on the labelled-tree abstraction," distinguished from other XML schema languages by what it leaves out; in RELAX NG, the syntax and minimal labelled-tree abstraction implicit in that syntax are at the center of XML processing." According to the DSDL Part 2 abstract, ISO/IEC 19757-2:2003 "specifies RELAX NG, a schema language for XML. A RELAX NG schema specifies a pattern for the structure and content of an XML document. The pattern is specified by using a regular tree grammar. A RELAX NG schema is itself an XML document. ISO/IEC 19757-2:2003 specifies (1) when an XML document is a correct RELAX NG schema and (2) when an XML document is valid with respect to a correct RELAX NG schema." RELAX NG is supported by a growing collection of software tools, including validators, conversion utilities, code generators, and XML editors. ISO/IEC 19757-2:2003 is Part 2 of a planned ten-part ISO standard which will include "Rule-Based Validation: Schematron" (Part 3) as well. The goal of ISO SC34/WG1 (Document Description and Processing Languages, Information Description) in developing Document Schema Definition Languages (DSDL) is "to create a framework within which multiple validation tasks of different types can be applied to an XML document in order to achieve more complete validation results than just the application of a single technology."

Part 3: Rule-based validation — Schematron

"Rule-based schema languages allow documents to be validated by confirming that they do not conflict with a set of rules describing permitted relationships between document components. Rules do not need to be based on hierarchical relationships, but can use hierarchical relationships to identify applicable parts of data streams. Rules are required to allow the specification of constraints such as 'If the contents of the element named "Sex" is "Male" then the contents of the element "Diagnosis" may not include "Pregnant".' Rules can also ensure that sets of data are compatible, e.g. "If there are multiple items in an order for which different delivery dates have been specified, ensure that all delivery dates are between the order date and the date specified as the maximum permitted time for completion of the order." The rule-based grammar defined in this part is based on the widely adopted Schematron specification..." [Part 1: Overview]
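
The "Sex"/"Diagnosis" constraint quoted above can be sketched in plain Python to show what a rule-based check does: it selects a context and asserts a relationship between components, independent of any grammar. This is not Schematron syntax and not a Schematron implementation. The element names `Sex` and `Diagnosis` come from the quoted example; the `Patient` and `Records` wrapper elements and the function names are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def check_patient(patient: ET.Element) -> list:
    """Return the failed-assertion messages for one <Patient> element."""
    failures = []
    sex = (patient.findtext("Sex") or "").strip()
    diagnosis = patient.findtext("Diagnosis") or ""
    # Rule from the quoted example: Male patients may not be "Pregnant".
    if sex == "Male" and "Pregnant" in diagnosis:
        failures.append('Diagnosis may not include "Pregnant" when Sex is "Male"')
    return failures


def validate(xml_text: str) -> list:
    """Apply the rule to every <Patient> context found in the document."""
    root = ET.fromstring(xml_text)
    failures = []
    for patient in root.iter("Patient"):
        failures.extend(check_patient(patient))
    return failures
```

In Schematron itself the context and the assertion would be written as XPath expressions inside rule and assert elements, so no programming is needed; the Python version only makes the logic explicit.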

[June 15, 2006] "ISO Schematron Standard Published." By Rick Jelliffe. O'Reilly News. "The paper and online versions of the ISO Schematron standard are now available from ISO for CHF120 and from ANSI for US$98. I believe it is being translated into Japanese as a JIS standard and will be cloned as a British Industrial Standard. I'd like to thank everyone involved at ISO SC34, notably Ken Holman, Martin Bryan, Murata Makoto, Yushi Komachi, James Clark, Alex Brown, Eric van der Vlist, Lynn Price, and Charles Goldfarb. Special mention to my far-thinking sponsors at Academia Sinica, Taipei, for letting me develop the ideas and implementation, in particular Dr Simon Lin and Prof. C.C.Hsieh. Thanks also to various patient bosses or business partners at Geotempo, Topologi and Allette Systems. After almost a year with little news, it seems not a day goes by without someone from a large government organization or Fortune 500 company dropping me a line saying that they use Schematron: millions of documents. Schematron has been ticking away as a grassroots phenomenon: indeed AFAIK every implementation of it is Open Source. But Schematron's strength is not comprehensiveness but that it is a simple layer to allow validation using XPaths without requiring programming knowledge (e.g. XSLT skills). XPaths really are fantastic." According to the Schematron Implementer's FAQ, the ISO Secretariat "is still considering the request from ISO SC34 to make [the specification] free. There are drafts available at Schematron.com. The only technical difference in the final version is that "/" was missing as an allowed node in the context XPath, and the attribute name @queryLanguage was misnamed in one place..."

[November 09, 2004] Final Committee Draft of ISO Schematron Released for Public Review. A communiqué from Rick Jelliffe describes the availability of an ISO FCD (Final Committee Draft) for ISO/IEC 19757-3 Document Schema Definition Languages (DSDL) — Part 3: Rule-Based Validation — Schematron. Schematron is a language for making assertions about patterns found in XML documents, and serves as a schema language for XML. As Part 3 of the multi-part ISO/IEC 19757 (DSDL) standard, it defines "requirements for Schematron schemas and specifies when an XML document matches the patterns specified by a Schematron schema." This Final Committee [Review] Draft of ISO Schematron incorporates feedback from national standards bodies and from implementers. It is available online in PDF, HTML, and RTF formats. Improvements "include an annex on multilingual schemas, further treatment of abstract patterns, and validated schemas. The predicate logic used to specify Schematron formally has also been reworked. The specification remains very small, at about 35 pages including front matter, schemas and non-normative annexes." This FCD draft has been made publicly available for comment, for identification of spelling errors, and as an aid to implementers and users until the final International Standard is published in paper by ISO and other nations that adopt Schematron as a national standard, expected in 2005. This text is suitable as the interim reference for organizations adopting Schematron. The editors encourage all Schematron implementers to check the draft standard and to add support for it for 2005. According to the Schematron.com Overview, "the Schematron differs in basic concept from other schema languages in that it is not based on grammars but on finding tree patterns in the parsed document. This approach allows many kinds of structures to be represented which are inconvenient and difficult in grammar-based schema languages. 
If you know XPath or the XSLT expression language, you can start to use The Schematron immediately. It allows you to develop and mix two kinds of schemas: (1) Report elements allow you to diagnose which variant of a language you are dealing with; (2) Assert elements allow you to confirm that the document conforms to a particular schema."

[See provisionally: "Schematron: XML Structure Validation Language Using Patterns in Trees."]

Part 4: Namespace-based validation dispatching language — NVDL

NVDL.org is a "home for user information related to the International Standard for document model validation of instances with multiple namespaces using different document models. NVDL is Part 4 of ISO/IEC 19757 DSDL (Document Schema Definition Languages)."

This NVDL web site contains information not included in the ISO specification itself: schemas for NVDL scripts (RELAX NG compact syntax, not shown in ISO/IEC 19757-4); proposed technical corrigenda; a SourceForge project for the development of tutorials and test suites [NVDL test suites and tutorials]; public NVDL demonstrations available on the web; and NVDL tools, including enovdl (Mono .NET development platform), JNVDL (an open-source NVDL implementation written in Java), and oNVDL (the oXygen XML NVDL implementation based on Jing).

DSDL Part 4 provides an XML-based language for selecting elements and attributes in specific namespaces within a document instance that are to be validated by a specified schema. Such elements are known as validation candidates. This part also enables sets of validation rules to be shared between applications that share data components.

International Standard ISO/IEC 19757-4. First edition. 2006-06-01. Information technology — Document Schema Definition Languages (DSDL) — Part 4: Namespace-based Validation Dispatching Language (NVDL). [== Technologies de l'information — Langages de définition de schéma de documents (DSDL) — Partie 4: Langage de diffusion de validation d'espace de nom orienté (NVDL).] Reference number: ISO/IEC 19757-4:2006(E). Copyright (c) ISO/IEC 2006. 58 pages. "ISO/IEC 19757 defines a set of Document Schema Definition Languages (DSDL) that can be used to specify one or more validation processes performed against Extensible Markup Language (XML) documents. A number of validation technologies are standardized in DSDL to complement those already available as standards or from industry. The main objective of ISO/IEC 19757 is to bring together different validation-related technologies to form a single extensible framework that allows technologies to work in series or in parallel to produce a single or a set of validation results. The extensibility of DSDL accommodates validation technologies not yet designed or specified. The motivations of this part of ISO/IEC 19757 are twofold. One is to allow the interworking of schemas describing different markup vocabularies. The other is to allow these schemas to be written in different schema languages. For this purpose, this part of ISO/IEC 19757 specifies a Namespace-based Validation Dispatching Language (NVDL). The structure of this part of ISO/IEC 19757 is as follows. Clause 5 describes the data model, which is the abstraction of an XML document used throughout the rest of the document. Clause 6 describes the full syntax and the simple syntax of NVDL scripts, and further describes the transformation from the full syntax to the simple syntax. Clause 7 describes primitive operations for the NVDL data model, which are used for defining the NVDL semantics. 
Clause 8 describes the semantics of a correct NVDL script in the simple syntax; the semantics specify how elements and attributes in a given document are dispatched to different validators and which schema is used by each of these validators. Clause 9 describes conformance requirements for NVDL dispatchers. Annex A and Annex B define the full syntax and the simple syntax using RELAX NG, respectively. Annex C defines the full syntax using NVDL and RELAX NG. Finally, Annex D provides examples of the application of NVDL.... An NVDL script controls the dispatching of elements or attributes in a given XML document to different validators, depending on the namespaces of the elements or attributes. An NVDL script also specifies which schemas are used by these validators. These schemas may be written in any schema languages, including those specified by ISO/IEC 19757." [Source: Information technology -- Document Schema Definition Languages (DSDL) —Part 4: Namespace-based Validation Dispatching Language (NVDL)]
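
The core idea of namespace-based dispatching, routing parts of a document to different validators according to their namespace, can be sketched in a few lines of Python. This is only an illustration of the concept, not NVDL: real NVDL scripts are XML documents with modes, rules and actions, and they dispatch whole sections rather than individual elements. The function names, the `urn:example:*` namespaces and the callables standing in for validators are all hypothetical.

```python
import xml.etree.ElementTree as ET


def namespace_of(elem: ET.Element) -> str:
    """ElementTree stores qualified names as '{namespace}local'."""
    tag = elem.tag
    return tag[1:].split("}", 1)[0] if tag.startswith("{") else ""


def local_name(elem: ET.Element) -> str:
    tag = elem.tag
    return tag.split("}", 1)[1] if tag.startswith("{") else tag


def dispatch(xml_text: str, validators: dict) -> dict:
    """Send each element to the validator registered for its namespace.

    `validators` maps a namespace URI to a callable(element) -> bool.
    The report maps each namespace to (local name, outcome) pairs, where
    the outcome is None if no validator is registered for that namespace.
    """
    root = ET.fromstring(xml_text)
    report = {}
    for elem in root.iter():  # document order
        ns = namespace_of(elem)
        validator = validators.get(ns)
        outcome = validator(elem) if validator is not None else None
        report.setdefault(ns, []).append((local_name(elem), outcome))
    return report
```

In practice each callable would wrap a schema processor (for example a RELAX NG or Schematron validator), which is what lets schemas in different languages cooperate on one document.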

See also Section 15: 'Related Work' in the Namespace Routing Language (NRL) specification. It clarifies that [as of 2003-06-19] "... Some of the evolution of NRL from MNS was inspired by the Namespace Switchboard. A Final Committee Draft (FCD) of Part 4 is currently in preparation; NRL will be submitted as input. At this stage, no guarantees can be made about how NRL will relate to the FCD. In the opinion of this document's author and of the DSDL Part 4 project editor (Murata Makoto), the functionality is likely to be similar, with the following possible exceptions: (1) There are concerns about 11 Element-name context: some feel it is too complicated; some feel it is too simple; (2) The functionality corresponding to 14 Transparent namespaces, was rejected on the last occasion it was discussed; one reason was the lack of implementation experience. It is hoped that this can be reconsidered in the light of NRL; (3) The functionality provided by the 'option' element in 3 Specifying the schema has not yet been considered for the FCD..."

2006-01-31. Proposed Technical Corrigenda for ISO/IEC 19757-4 - Document Schema Definition Languages (DSDL) - Part 4: Namespace-based Validation Dispatching Language (NVDL)

[2005-01-10] ISO/IEC 19757-4/FCD — Document Schema Definition Language (DSDL) — Part 4: Namespace-based Validation Dispatching Language (NVDL). PDF. See details for the FCD following.

[January 03, 2005] Document Schema Definition Languages (DSDL) — Part 4: Namespace-based Validation Dispatching Language — NVDL. [Draft] ISO/IEC JTC 1/SC 34. [Posted] Date: 2005-01-03. 55 pages. ISO/IEC FCD 19757-4. ISO/IEC JTC 1/SC 34/WG 1. Secretariat: Standards Council of Canada. Available from MURATA Makoto (Project Editor). Early access to NVDL. Scope: "This part of the International Standard specifies a Namespace-based Validation Dispatching Language (NVDL). An NVDL schema controls the dispatching of elements or attributes in a given XML document to different validators, depending on the namespaces of the elements or attributes. An NVDL schema also specifies which schemas are used by these validators. These schemas may be written in any schema languages, including those specified by this International Standard..." Warning: "This document is not an ISO International Standard. It is distributed for review and comment. It is subject to change without notice and may not be referred to as an International Standard. Recipients of this document are invited to submit, with their comments, notification of any relevant patent rights of which they are aware and to provide supporting documentation..." Schemas: (1) Full syntax: nvdlWithForeignEA.rnc in RNC [src]; nvdlWithForeignEA.rng in RNG [src]; (2) Simple syntax: nvdlSimple.rnc in RNC [src]; nvdlSimple.rng in RNG [src]; (3) Full syntax in the combination of RNC and NVDL: nvdl.nvdl in NVDL [src]; nvdl.rnc in RNC [src]; nvdl.rng in RNG [src]. [source PDF]

[May 03, 2006] Jirka Kosek announced a first alpha release of JNVDL — an open-source NVDL implementation written in Java, including a binary distribution available for download from SourceForge. "NVDL is upcoming ISO standard which can be used to define 'meta-schemas' that define how to validate XML documents that are composed from elements from multiple namespaces. For each namespace, NVDL schema can define a schema against which validation should be performed. This schema can be written in an arbitrary schema language like RELAX NG, DTD or W3C XML Schema. NVDL was heavily inspired by NRL language. Although syntax details of NVDL are different from NRL, it is still useful to go through NRL specification to see what can be done with NRL — and thus also with NVDL. Until now, there was only one NVDL implementation, written for .NET. This is no longer true, as you can now download Java-based implementation called JNVDL." According to Rick Jelliffe, "NVDL will provide a great mechanism for allowing you to selectively dispatch different parts of your document to different validators. So you can pick the best schema language for the job. Or, as is more often the case, you may be working with different vocabularies each defined in a different schema language (DTD, RELAX NG, XSD, Schematron, etc)."

[December 23, 2003] "XML 2003 Session Report: Namespace Routing Language." By Uche Ogbuji. From XMLHack.com (December 22, 2003). At the XML 2003 Conference in Philadelphia "James Clark followed a block of sessions on ISO Document Schema Definition Languages (DSDL) with a presentation on Namespace Routing Language (NRL), which is a key contribution to DSDL Part 4: 'Selection of validation candidates'... Clark said that NRL tried to redeem some of the cost of namespaces by using them to divide-and-conquer schema problems, using the best independent schema in the next schema language to address each sub-problem. NRL identifies groups of elements and attributes based on namespaces. The developer specifies a schema for validating each group. The data model for the entire XML document to be processed is a tree of trees. The big tree is divided into 'sections', which must be subtrees. This division uses a simple set of rules considering the relative subtree for each element and its namespace compared to that of its parents. Sections can also be applied against attributes according to whether they have the same namespace as its owner element, allowing for processing of what some call 'global attributes'. The NRL schema language defines a set of rules for sectioning documents and instructions for executing validation on each section. Rules can invoke validation against multiple schemata in multiple languages, and they can be constructed to handle otherwise unspecified namespaces, say for extremely lax or extremely strict processing. NRL supports modes similar to those in XSLT (in fact the overall processing model is much like that in XSLT). Actions can specify modes to be used for processing children of the context element. NRL also supports explicit setting of context, which allows for processing patterns that can't be expressed with modes alone. For example, one could specify a rule for processing any RDF/XML only if it was contained within an XHTML head element.
NRL is designed for streaming implementation, though a subschema language might enforce building of a subtree in memory. SAX is the basis of the implementation of NRL in the open-source RELAX NG processor Jing..."
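
The "tree of trees" sectioning described in the report can be sketched in a few lines of Python. This is a simplified illustration, not the NRL algorithm as specified: it handles elements only (no attribute sections, modes, or context), and the function names and the sample document are assumptions. A new section starts wherever an element's namespace differs from that of its parent.

```python
import xml.etree.ElementTree as ET

def namespace_of(tag):
    """Return the namespace URI of an ElementTree tag ('{uri}local'), or '' if none."""
    return tag[1:].split("}", 1)[0] if tag.startswith("{") else ""

def local_name(tag):
    """Return the local part of an ElementTree tag."""
    return tag.split("}", 1)[1] if tag.startswith("{") else tag

def split_sections(root):
    """Cut the tree into sections; each section holds elements of one namespace."""
    sections = []

    def visit(elem, current):
        current["elements"].append(local_name(elem.tag))
        for child in elem:
            if namespace_of(child.tag) == current["namespace"]:
                visit(child, current)   # same namespace as parent: same section
            else:
                start(child)            # namespace changes: a new section begins

    def start(elem):
        section = {"namespace": namespace_of(elem.tag), "elements": []}
        sections.append(section)
        visit(elem, section)

    start(root)
    return sections

XHTML = "http://www.w3.org/1999/xhtml"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DOC = f"""<html xmlns="{XHTML}">
  <head><rdf:RDF xmlns:rdf="{RDF}"><rdf:Description/></rdf:RDF></head>
  <body><p/></body>
</html>"""
sections = split_sections(ET.fromstring(DOC))
```

For the sample document this yields one XHTML section (html, head, body, p) and one RDF section (RDF, Description), each of which could then be validated against the schema chosen for its namespace.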

  • Document Schema Definition Languages (DSDL) — Part 4: Namespace-based Validation Dispatching Language — NVDL. 2004-06-01. ISO/IEC JTC 1/SC 34. Date: 2004-05-31. ISO/IEC CD 19757-4. "This part of the International Standard specifies a Namespace-based Validation Dispatching Language (NVDL). An NVDL schema controls the dispatching of elements or attributes in a given XML document to different validators, depending on the namespaces of the elements or attributes. An NVDL schema also specifies which schemas are used by these validators. These schemas may be written in any schema languages, including those specified by this International Standard... The motivations of this part of ISO/IEC 19757 are twofold. One is to allow the interworking of schemas describing different markup vocabularies. The other is to allow these schemas to be written in different schema languages. For this purpose, this part of ISO/IEC 19757 specifies a Namespace-based Validation Dispatching Language (NVDL). The structure of this part of ISO/IEC 19757 is as follows. Clause 5 describes the data model, which is the abstraction of an XML document used throughout the rest of the document. Clause 6 describes the full syntax and the simple syntax of NVDL scripts, and further describes the transformation from the full syntax to the simple syntax. Clause 7 describes primitive operations for the NVDL data model, which are used for defining the NVDL semantics. Clause 8 describes the semantics of a correct NVDL script in the simple syntax; the semantics specify how elements and attributes in a given document are dispatched to different validators and which schema is used by each of these validators. Clause 9 describes conformance requirements for NVDL dispatchers. Annex A and Annex B define the full syntax and the simple syntax using RELAX NG, respectively. Annex C defines the full syntax using NVDL and RELAX NG. Finally, Annex D provides examples of the application of NVDL..."
See the ISO/IEC JTC 1/SC 34 N0525 reference page. [cache]
  • "Comment disposition of JTC 1/SC 34 N 363 CD Ballot for 19757-4 - DSDL Part 4 - Selection of Validation Candidates, London, 3-4 May 2003." ISO/IEC JTC 1/SC34 N415. 7 May 2003. "Based on the comment disposition, Project Editors are requested to create a text for the FCD."
  • "Summary of Voting on JTC 1/SC 34 N 363 CD Ballot for 19757-4 - DSDL Part 4 - Selection of Validation Candidates." Project: 19757-4 - DSDL Part 4. ISO/IEC JTC 1/SC34 N0389. 26 March 2003. "Based on the ballot responses, this CD is approved to advance to FCD ballot. Project Editors are requested to review comments and take them into consideration when preparing revised text..."
  • [February 03, 2003] ISO Working Group Publishes Committee Draft for DSDL Standard, Part 4. An ISO Committee Draft for Document Schema Definition Languages (DSDL) -- Part 4: Selection of Validation Candidates has been released by members of the ISO DSDL Project. The DSDL standard is being published as a multi-part specification; it "brings together multiple schema languages into a single framework that allows them to work together." The DSDL Validation Candidate Selection Language (VCSL) is an "XML-based language for controlling selection of validation candidates. DSDL allows specific parts of an XML document to be extracted and then validated; different schema languages and validators may be applied to different candidates. Descriptions in DSDL VCSL may be independent XML documents or they may be embedded in other XML documents. Specifically, when a DSDL framework is represented by an XML document, it may reference to or contain descriptions in DSDL VCSL." DSDL Part 4 has been produced under the direction of project editor MURATA Makoto within ISO/IEC JTC 1/SC34/WG1 (Information Technology -- Document Description and Processing Languages -- Information Description).
  • "Document Schema Definition Languages (DSDL) -- Part 4: Selection of Validation Candidates." Committee Draft. ISO/IEC JTC 1/SC34 N363 (Draft). 2002-12-11.

Part 5: Datatypes

"This Part defines a syntax for: (1) creating named sets of user-defined datatypes (e.g. ISBN-number) (2) parsing element and attribute contents using regular expression grammars to identify component parts which need to be validated (3) defining sets of permitted (enumerated) values that can be used to validate the contents of a specific element or attribute (4) defining sets of repeatable, choice or optional items which can make up a datatype definition (5) identifying constraints that can be used to limit a particular datatype (e.g. minimum and maximum values) defining conditions that must be met if a datatype is to be considered valid (6) identifying relationships between datatypes (e.g. supertype/subtype relationships) and techniques for mapping from one datatype to another (7) defining collation orders for datatypes, or identifying externally defined collation rules..." [Part 1: Overview]

[December 16, 2008] Document Schema Definition Languages (DSDL) — Part 5: Extensible Datatypes. ISO/IEC FCD 19757-5. Date: 2008-12-17. 20 pages. Project Editor: Dr. Alex Brown. Normative Annex A provides the RELAX NG schema for Extensible Datatypes documents, with default namespace http://purl.oclc.org/dsdl/extensible-types[;]. Circulated to P- and O-members, and to technical committees and organizations in liaison for comment and voting. See the source PDF via the SC 34 Document, Reference number: ISO/IEC JTC 1/SC 34 N 1130. Introductory note for FCD 19757-5: "In accordance with Barcelona Resolution 1, 19757 was subdivided by Part. The first CD ballot was conducted as SC 34 N 716 and its summary of voting is contained in SC 34 N 740. The Disposition of Comments is contained in SC 34 N 762. In accordance with Resolution 6 of the SC 34 Plenary meeting held in Kyoto, Japan, 2007-12-08/11, this document is circulated to the SC 34 members for an FCD ballot. SC 34 members are requested to vote and comment via the CIB system on the ISO/TC server by the due date." ISO/IEC 19757-5 was prepared by Joint Technical Committee ISO/IEC JTC 1, Information Technology, Subcommittee SC 34, Document Description and Processing Languages. DSDL Part 5 "specifies a XML language that allows users to create and extend datatype libraries for their own purposes. The datatype definitions in these libraries may be used by XML validators and other tools to validate content and make comparisons between values. DSDL Part 5, Extensible Datatypes, is a powerful, XML-based language which enables users to create and extend their own libraries of datatypes using straightforward declarative XML constructs. Such libraries are well-suited to being used in pipelining validation processes in conjunction with other XML schema languages. Unlike W3C [XML] Schema, ISO 19757-2:2003 (RELAX NG) does not itself provide a declarative mechanism for users to define their own datatypes.
If they are not satisfied with the two built-in types of string and token, RELAX NG users have had either to use a pre-written library bundled with their validator, or to program a datatype library using that validator's API. Such programmed datatype libraries are hard to construct for non-programmer users, and built-in datatype libraries are often insufficient for users' needs. The schema language used is the compact syntax of RELAX NG, as defined by ISO 19757-2:2003 Amendment 1. Datatype libraries are defined in ISO 19757-2:2003 as being identified by an IRI, with each datatype within a given datatype library being identified by a NCName. An Extensible Datatypes document presents one or more such datatype libraries to implementations. Each datatype definition has a qualified name; the Namespace IRI identifies the datatype library to which the datatype belongs, and the local part identifies the name of the datatype within that datatype library..."

[August 23, 2006] Discussion Draft - Document Schema Definition Languages (DSDL) - Part 5: Datatypes. Project: CD 19757-5: Information technology - Document Schema Definition Languages (DSDL) - Part 5: Datatypes. Edited by Mr. Martin Bryan (CSW). Discussion document (for discussion in Montréal). 2006-08-03. 22 pages. See the cover page for ISO/IEC JTC 1/SC 34 N0777. "This International Standard specifies an XML language that allows users to create datatype libraries. The datatype definitions in these libraries may be used by XML validators and other tools to validate content and test equality between values... The primary use case for a language for datatype libraries is to enable users to construct their own datatypes without having to resort to a procedural programming language or use pre-defined sets of datatypes which might not be suitable. Unlike W3C Schema, ISO 19757-2:2003 (RELAX NG) does not provide a mechanism for users to define their own datatypes. If they are not satisfied with the two built-in types of string and token, RELAX NG users have to create a datatype library which they then refer to from the schema. Most RELAX NG validators provide built-in support for W3C Datatypes. Many also support an API that allows users to develop datatype modules to define extra datatypes. But because these datatype libraries have to be programmed, [and] non-programmer users find them hard to construct..." [source PDF]

[August 05, 2005] "Datatype Library Language (DTLL)." By Jeni Tennison (ed). Version 0.4. Posted August 2005. Martin Bryan, Project Editor for ISO/IEC JTC 1/SC 34 Information Technology -- Document Description and Processing Languages, reported that Jeni Tennison had produced a revision to her DTLL proposals for Part 5 of DSDL. 'The primary motivation for putting together a language for datatype libraries is to enable RELAX NG users to construct their own datatypes without having to resort to a procedural programming language or having to learn how to use XML Schema, which might not be suited for their needs.' Jeni: "This document is a basic specification of the Datatype Library Language (DTLL). It includes, embedded within it, the RELAX NG Compact Syntax schema for DTLL. There are still many areas that require greater detail. This version is a simplification of the previous version of DTLL which attempts to find the minimum required to support the definition of datatypes for the purposes of validation. In particular, the changes are: (1) Hierarchies of datatypes have been removed: datatypes no longer have supertypes or subtypes, and consequently do not have parameters or constraints. The concept of abstract datatypes is also no longer needed. It is still possible to create datatypes that are based on other datatypes, however; for example, to create an integer between 1 and 10, you could do... (2) There's no longer a specialised parsing method for enumerated values: these can be parsed using regular expressions and tested against external code lists using normal constraints by accessing the code lists using the doc() function. (3) The method for parsing lists of values has been simplified: the DTLL processor only has to break up the list into separate values; testing that these values are of particular types can be done using constraints. (4) There's no method for specifying the collation used to compare values of a particular datatype. 
The main purpose of supplying a collation is to facilitate XPath datatyping rather than validation. Although the lack of collations makes writing conditions harder, it's still generally possible to do so without them. (5) A couple of extra extension functions to XPath 1.0 have been added, though others have been removed. (6) Mapping has been modified to support the kinds of things you'd otherwise do with hierarchies, including strong and weak typing..." See also the RNC schema. [cache HTML and RELAX NG Compact Syntax Grammar]
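
The approach summarised above (parse the lexical form with a regular expression, then test constraints on the parsed parts) can be sketched in Python. This is not DTLL syntax, which is XML-based; the helper make_datatype, the group name value, and the whitespace trimming are assumptions made for illustration, using the "integer between 1 and 10" case mentioned in the text.

```python
import re

def make_datatype(pattern, constraint=lambda parts: True):
    """Build a validity test from a regex (parsing) and a predicate (constraint).

    pattern    -- regular expression with named groups for the component parts
    constraint -- function taking the dict of named groups, returning a bool
    """
    regex = re.compile(pattern)

    def is_valid(text):
        # Assumption: surrounding whitespace is not significant.
        match = regex.fullmatch(text.strip())
        return match is not None and constraint(match.groupdict())

    return is_valid

# A datatype based on "integer", narrowed by a constraint to the range 1..10.
small_int = make_datatype(
    r"(?P<value>[+-]?\d+)",
    lambda parts: 1 <= int(parts["value"]) <= 10,
)
```

With these definitions, small_int("7") is accepted, small_int("11") fails the constraint, and small_int("seven") fails at the parsing step.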

Part 6: Path-based integrity constraints

"Path-based integrity constraints allow path-based languages, such as the XML Path Language (XPath), to be used to identify relationships between elements that must, or may not, occur in valid documents. This Part is based on the four-directional tree path navigation paradigm (parent, child, preceding sibling and following sibling) defined in XPath. Path-based constraints can be used to identify a fragment of a document against which a specific schema may be applied. For example, if a document has adopted the CALS table model without assigning a specific namespace to CALS-conformant elements, this Part can be used to select a table and parse it according to a schema fragment that defines the structure of CALS tables. Path-based constraints will also permit the specification of inter-document relationships, allowing the validity of one document to depend on the presence of specific information in another document..." [From Part 1: Overview]

Part 7: Character repertoire description language — CREPDL

[Note early title variants: Part 7: Character repertoire description language (CRDL) or CREPDL / Part 7: Character repertoire validation — CRVL]

[April 25, 2009] Document Schema Definition Languages (DSDL) — Part 7: Character Repertoire Description Language (CREPDL). ISO/IEC FDIS 19757-7:2009(E). [Candidate] Final Draft International Standard (FDIS). 17 pages. See the posting by MURATA Makoto on April 25, 2009 (source PDF). Message: "Dear colleagues, Attached please find my latest draft for ISO/IEC FDIS 19757-7. Unless I hear comments next week, I will send this document to ITTF..." Comment by Rick Jelliffe on 'minor typos'. Excerpt: "ISO/IEC 19757 defines a set of Document Schema Definition Languages (DSDL) that can be used to specify one or more validation processes performed against Extensible Markup Language (XML) documents. A number of validation technologies are standardized in DSDL to complement those already available as standards or from industry. The main objective of ISO/IEC 19757 is to bring together different validation-related technologies to form a single extensible framework that allows technologies to work in series or in parallel to produce a single or a set of validation results. The extensibility of DSDL accommodates validation technologies not yet designed or specified. This part of ISO/IEC 19757 provides a language for describing character repertoires. Descriptions in this language may be referenced from schemas. Furthermore, they may also be referenced from forms and stylesheets... follows. Clause 5 introduces kernels and hulls of repertoires. Clause 6 specifies the syntax of CREPDL schemas. Clause 7 specifies the semantics of a correct CREPDL schema; the semantics specify when a character is in a repertoire described by a CREPDL schema. Clause 8 defines CREPDL processors and their behaviour. Finally, Annex A describes differences of conformant CREPDL processors, and Annex B provides examples of CREPDL schemas... 
An CREPDL schema shall be an XML document (W3C XML) valid against the NVDL (ISO/IEC 19757-4) script in Clause 6.3, which in turn relies on the RELAX NG (ISO/IEC 19757-2) schema in Clause 6.2. The elements allowed in the RELAX NG schema is of the namespace (W3C XML-Names) http://purl.oclc.org/dsdl/crepdl/ns/structure/1.0. Further constraints on the character content of the char, kernel or hull elements are shown in Clause 6.4..."

[January 2009] Document Schema Definition Languages (DSDL) — Part 7: Character Repertoire Description Language (CREPDL). ISO/IEC FDIS 19757-7:2009(E). [Candidate] Final Draft International Standard. 2009-01-13. 17 pages. Informative Annex A describes 'Differences of Conformant Processors'; Informative Annex B supplies 'Example CREPDL schemas' (B.1 ISO/IEC 8859-6; B.2 ISO/IEC; B.3 Armenian script; B.4 Malayalam script; B.5 The Japanese list of kanji characters for the first grade; B.6 The Japanese list of kanji characters for the second grade). Source: see the January 13, 2009 posting by MURATA Makoto [WWW home, Wikipedia] to the 'dsdl-discuss' mailing list with subject line "Draft of the Part 7 FDIS": "Here is a revised version. I will submit the FDIS probably in February [2009] to ITTF [ISO/IEC Information Technology Task Force]." ISO/IEC 19757-7 was prepared by Joint Technical Committee ISO/IEC JTC 1, Information technology, Subcommittee SC 34, Document description and processing languages. ISO/IEC 19757 consists of [several] parts, under the general title Information technology — Document Schema Definition Language (DSDL)... ISO/IEC 19757 defines a set of Document Schema Definition Languages (DSDL) that can be used to specify one or more validation processes performed against Extensible Markup Language (XML) documents. A number of validation technologies are standardized in DSDL to complement those already available as standards or from industry. The main objective of ISO/IEC 19757 is to bring together different validation-related technologies to form a single extensible framework that allows technologies to work in series or in parallel to produce a single or a set of validation results. The extensibility of DSDL accommodates validation technologies not yet designed or specified. This part of ISO/IEC 19757 provides a language for describing character repertoires. Descriptions in this language may be referenced from schemas. 
Furthermore, they may also be referenced from forms and stylesheets. [Note: At the time of this writing, no schema languages provide mechanisms for referencing CREPDL schemas.] Descriptions of repertoires need not be exact. Non-exact descriptions are made possible by kernels and hulls, which provide the lower and upper limits, respectively. The structure of this part of ISO/IEC 19757 is as follows. Clause 5 introduces kernels and hulls of repertoires. Clause 6 describes the syntax of CREPDL schemas. Clause 7 describes the semantics of a correct CREPDL schema; the semantics specify when a character is in a repertoire described by a CREPDL schema. Clause 8 defines CREPDL processors and their behaviour... 6.1 General: "An CREPDL schema shall be an XML document (W3C XML) valid against the NVDL (ISO/IEC 19757-4) script in 6.3, which in turn relies on the RELAX NG (ISO/IEC 19757-2) schema in 6.2. The elements allowed in the RELAX NG schema is of the namespace (W3C XML-Names) http://purl.oclc.org/dsdl/crepdl/ns/structure/1.0. Further constraints on the character content of the char, kernel or hull elements are shown in 6.4..." Note the 'Related Resources for CREPDL' referenced from the XML namespace document at http://purl.oclc.org/dsdl/crepdl/ns/structure/1.0, including, as of 7-January-2009, (1) NVDL, A (normative) NVDL crepdl.nvdl for CREPDL. It references to crepdl.rnc. (2) RELAX NG, A (normative) RELAX NG schema in the compact syntax crepdl.rnc for CREPDL. [Source PDF]

Document Schema Definition Languages (DSDL) — Part 7: Character Repertoire Description Language. 17 pages. Date: 2008-01-11. ISO/IEC FCD 19757-7. From: ISO/IEC JTC 1/SC 34/WG 1. Secretariat: Japanese Industrial Standards Committee. In accordance with Resolution 6 of the SC 34 Plenary meeting held in Kyoto, Japan (SC 34 N 968rev), this document is circulated to the SC 34 members for four months FCD ballot. P-members of SC 34 are requested to vote as soon as possible, but not later than 2008-05-11. See the reference document ISO/IEC JTC 1/SC 34 N0978. Warning: "This document is not an ISO International Standard. It is distributed for review and comment. It is subject to change without notice and may not be referred to as an International Standard." ISO/IEC 19757-7 was prepared by Joint Technical Committee ISO/IEC JTC 1, Information Technology, Subcommittee SC 34, Document Description and Processing Languages. The document "Foreword" reports: "ISO/IEC 19757 consists of the following parts, under the general title Document Schema Definition Languages (DSDL): [...] Part 7: Character repertoire description language — CREPDL." [Or perhaps, per the reference document N0978: "Part 7: Character Repertoire Description Language (CRDL)".] "This International Standard defines a set of Document Schema Definition Languages (DSDL) that can be used to specify one or more validation processes performed against Extensible Markup Language (XML) documents. A number of validation technologies are standardized in DSDL to complement those already available as standards or from industry. The main objective of this International Standard is to bring together different validation-related technologies to form a single extensible framework that allows technologies to work in series or in parallel to produce a single or a set of validation results. The extensibility of DSDL accommodates validation technologies not yet designed or specified. 
This part of ISO/IEC 19757 provides a language for describing character repertoires. Descriptions in this language may be referenced from schemas. Furthermore, they may also be referenced from forms and stylesheets... Clause 5 introduces kernels and hulls of repertoires. Clause 6 describes the syntax of CREPDL schemas. Clause 7 describes the semantics of a correct CREPDL schema; the semantics specify when a character is contained by a repertoire described by a CREPDL schema. Clause 8 defines CREPDL validators and their behaviours. Clause 9 defines conformance of CREPDL processors. Finally, Annex A provides examples of the application of CREPDL..." [source PDF]

Document Schema Definition Languages (DSDL) — Part 7: Character Repertoire Description Language. ISO/IEC JTC 1/SC 34. Date: 2006-11-04. ISO/IEC CD 19757-7. ISO/IEC JTC 1/SC 34/WG 1. Secretariat: Standards Council of Canada. [This document is not an ISO International Standard. It is distributed for review and comment. It is subject to change without notice and may not be referred to as an International Standard. Recipients of this document are invited to submit, with their comments, notification of any relevant patent rights of which they are aware and to provide supporting documentation.] "This part of the International Standard specifies a Character Repertoire Description Language (CRDL). A CRDL schema describes a collection of characters defined in ISO/IEC 10646 or Unicode or default grapheme clusters defined in UAX#29. The structure of this part of ISO/IEC 19757 is as follows. Clause 5 introduces kernels and hulls of collections. Clause 6 shows how Unicode regular expression can be used to describe permissible characters and default grapheme characters. Clause 7 describes the syntax of CRDL schemas. Clause 8 describes the semantics of a correct CRDL schema; the semantics specify when a character is contained by a collection described by a CRDL schema. Clause 9 describes modes of CRDL validators..." [source PDF]

2007-02-22. Summary of Voting on JTC 1/SC 34 N 799 - Text for CD ballot for ISO/IEC 19757-7: Document Schema Definition Language (DSDL) Part 7 - Character Repertoire Description Language (CRDL)

"At present SGML and XML users have no control over which set of characters can appear in a particular element or attribute value. For example, an element could have an xml:lang attribute indicating it is in English but contain Chinese or Sanskrit characters. This Part provides a mechanism for checking that the contents of an element or attribute are taken from a formally defined subset of the ISO/IEC 10646 Universal Multiple-Octet Coded Character Set (UCS) that is the basis for XML encoded documents. This Part provides a syntax for: (1) defining named subsets of the ISO/IEC 10646 character set, and (2) identifying which named character set shall be used to validate the content of a specific element or attribute. CRVL utilizes the hull and kernel approach to character set definition whereby a minimal set of 'compulsory characters' (the kernel) can be supplemented by characters in a wider set of 'optional characters' (the hull) from which certain 'exceptions' have been excluded... [Part 1: Overview]

Document Schema Definition Languages (DSDL) — Part 7: Character Repertoire Description Language. ISO/IEC CD 19757-7. ISO/IEC JTC 1/SC 34. Date: 2006-11-1. ISO/IEC JTC 1/SC 34/WG 1. Secretariat: Standards Council of Canada. [MURATA Makoto]. See the posting for a "significantly revised version" of 'part7SecondCD.pdf' as of 2006-11-03. "This part of ISO/IEC 19757 provides a language for describing collections of characters defined in ISO/IEC 10646 or Unicode or default grapheme clusters defined in UAX#29. Descriptions in this language may be referenced from schemas. Furthermore, they may also be referenced from forms and stylesheets. Descriptions of collections need not to be exact. To provide non-exact descriptions, this part of ISO/IEC 19757 provides kernels and hulls, which provide the lower limit and upper limits, respectively. Clause 5 introduces kernels and hulls of collections. Clause 6 shows how Unicode regular expression can be used to describe permissible characters and default grapheme characters. Clause 7 describes the syntax of CRDL schemas. Clause 8 describes the semantics of a correct CRDL schema; the semantics specify when a character is contained by a collection described by a CRDL schema. Clause 9 describes modes of CRDL validators. The 19757 [draft, in-progress] Standard defines a set of Document Schema Definition Languages (DSDL) that can be used to specify one or more validation processes performed against Extensible Markup Language (XML) documents. A number of validation technologies are standardized in DSDL to complement those already available as standards or from industry. The main objective of this International Standard is to bring together different validation-related technologies to form a single extensible framework that allows technologies to work in series or in parallel to produce a single or a set of validation results. The extensibility of DSDL accommodates validation technologies not yet designed or specified..."
Note: "This document is not an ISO International Standard. It is distributed for review and comment. It is subject to change without notice and may not be referred to as an International Standard." [source PDF 2006-11-01, source PDF, updated 2006-11-03]

Document Schema Definition Languages (DSDL) — Part 7: Character Repertoire Validation Language. ISO/IEC CD 19757-7. Date: 2005-02-18. Posted by MURATA Makoto. Produced by ISO/IEC JTC 1/SC 34/WG 1. Secretariat: Standards Council of Canada. 12 pages. ['Warning: This document is not an ISO International Standard. It is distributed for review and comment. It is subject to change without notice and may not be referred to as an International Standard.'] "ISO/IEC 19757-7 was prepared by Joint Technical Committee ISO/IEC JTC 1, Information Technology, Subcommittee SC 34, Document Description and Processing Languages. This part of ISO/IEC 19757 provides a schema language for describing collections of ISO/IEC 10646 characters. Schemas in this language may be referenced from other schema languages. The structure of this part of ISO/IEC 19757 is as follows. Clause 5 introduces kernels and hulls of character collections ['A kernel contains characters that are guaranteed to be in the collection; the collection may contain other characters. A hull gives an outer boundary so that characters which are not in the hull are guaranteed not to be in the collection; some characters in the hull may not actually be in the collection.'] Clause 6 describes the syntax of CRVL schemas ['An CRVL schema in the full syntax shall be an XML document valid against the following RELAX NG schema in the compact syntax...']. Clause 7 describes the semantics of a correct CRVL schema; the semantics specify when a character is contained by a collection described by a CRVL schema. Clause 8 describes conformance requirements for CRVL validators..."
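
The kernel and hull definitions quoted above amount to a three-way test per character, which a short Python sketch can make concrete. This is an illustration only and does not use CRVL or CREPDL schema syntax; the function names and the sample kernel and hull (ASCII lowercase letters, plus digits in the hull) are assumptions.

```python
import string

def classify(ch, kernel, hull):
    """Return 'in', 'out', or 'unknown' for a single character."""
    if ch in kernel:
        return "in"        # kernel members are guaranteed to be in the collection
    if ch not in hull:
        return "out"       # anything outside the hull is guaranteed to be excluded
    return "unknown"       # inside the hull but not the kernel: undetermined

def rejected(text, kernel, hull):
    """List the characters of text that are guaranteed not to be in the collection."""
    return [ch for ch in text if classify(ch, kernel, hull) == "out"]

kernel = set(string.ascii_lowercase)   # illustrative lower limit
hull = kernel | set(string.digits)     # illustrative upper limit; contains the kernel
```

Under these sample sets, "a" is classified as in the collection, "7" is undetermined, and "é" is outside it. A validator built on this idea can reject text with certainty only for characters outside the hull.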

Part 8: Document semantics renaming language — DSRL

Note: Second FCD uses the title "Document Semantics Renaming Language", where earlier drafts used the title "Document Schema Renaming Language." The (proposed) FDIS title is Document Semantics Renaming Language (DSRL)

"Information Technology — Document Schema Definition Languages (DSDL) — Part 8: Document Semantics Renaming Language (DSRL) as ISO/IEC 19757-8 Second FCD was prepared by Joint Technical Committee ISO/IEC JTC 1, Information Technology, Subcommittee SC 34, Document Description and Processing Languages. ISO/IEC 19757 defines a set of Document Schema Definition Languages (DSDL) that can be used to specify one or more validation processes performed against Extensible Markup Language (XML) or Standard Generalized Markup Language (SGML) documents. (XML is an application profile of SGML, ISO 8879:1986.) To validate that a structured document conforms to specified constraints in structure and content relieves the potentially many applications acting on the document from having to duplicate the task of confirming that such requirements have been met. Historically, such tasks and expressions have been developed and utilized in isolation, without consideration for how the features and functionality available in other technologies might enhance validation objectives. The main objective of this part of ISO/IEC 19757 is to bring together different validation-related tasks and expressions to form a single extensible framework that allows technologies to work in series or in parallel to produce a single or a set of validation results. The extensibility of DSDL accommodates validation technologies not yet designed or specified. In the past, different design and use criteria have led users to choose different validation technologies for different portions of their information. Bringing together information within a single XML document sometimes prevents existing document models from being used to validate sections of data. By providing an integrated suite of constraint description languages that can be applied to different subsets of a single XML document, this part of ISO/IEC 19757 allows different validation technologies to be integrated under a well-defined validation policy... 
The Document Semantics Renaming Language (DSRL) provides a mechanism for declaring how an application can map locally meaningful element, attribute, entity and processing instruction names to the names assigned to equivalent XML elements, attributes, entities and processing instructions within a document model without having to completely rewrite the DTD or schema to which they are required to conform. In addition, DSRL provides an XML-based format for declaring the replacement text for entity references and provides a mechanism that allows users to define default values for both element content and attribute values. To allow for schemas that do not support the use of attributes, DSRL also allows users to convert attribute values to element content. DSRL maps are used to map names within document instances to names, or alternative structures, used within a validation schema. [from ISO/IEC FCD 19757-8, Second FCD]
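
The renaming and default-value mechanisms described above can be illustrated with a minimal sketch. This is not DSRL syntax: real DSRL maps are themselves XML documents, whereas the Python dictionaries below (ELEMENT_NAMES, ATTRIBUTE_NAMES, DEFAULT_CONTENT, DEFAULT_ATTRIBUTES), the sample element names and the apply_map function are all hypothetical, chosen only to show the effect such a map might have on a document instance before validation.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping tables (assumptions for illustration only).
# Real DSRL expresses this information as an XML map document.
ELEMENT_NAMES = {"titre": "title", "auteur": "author"}  # local name -> schema name
ATTRIBUTE_NAMES = {"langue": "lang"}                    # local name -> schema name
DEFAULT_CONTENT = {"author": "Anonymous"}               # applied only to empty elements
DEFAULT_ATTRIBUTES = {"title": {"lang": "en"}}          # applied only when attribute is absent


def apply_map(root):
    """Rewrite a parsed document in place so that it uses the schema's names."""
    for elem in root.iter():
        # Map a locally meaningful element name to the schema's name.
        elem.tag = ELEMENT_NAMES.get(elem.tag, elem.tag)
        # Map locally meaningful attribute names to the schema's names.
        for local, target in ATTRIBUTE_NAMES.items():
            if local in elem.attrib:
                elem.set(target, elem.attrib.pop(local))
        # Supply default content for elements that carry no value.
        is_empty = len(elem) == 0 and not (elem.text or "").strip()
        if is_empty and elem.tag in DEFAULT_CONTENT:
            elem.text = DEFAULT_CONTENT[elem.tag]
        # Supply default attribute values where none is given.
        for name, value in DEFAULT_ATTRIBUTES.get(elem.tag, {}).items():
            elem.attrib.setdefault(name, value)
    return root


SAMPLE = '<doc><titre langue="fr">Bonjour</titre><titre>Hi</titre><auteur/></doc>'

if __name__ == "__main__":
    mapped = apply_map(ET.fromstring(SAMPLE))
    print(ET.tostring(mapped, encoding="unicode"))
```

The mapped result uses only the schema's names, so it could then be passed to an unmodified validator. That is the point of the approach: the DTD or schema itself does not have to be rewritten.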

"The Document Schema Renaming Language provides mechanisms whereby locally meaningful names can be associated with element, attribute and entity names used within schemas, and for the definition of locally meaningful 'templates' of reusable data. DSRL templates can be used to assign default values to specific parts of a data stream. This includes mechanisms for defining standard sequences of data that can be incorporated into document instances by reference to an identifying name, the provision of default content for elements and attributes for which no value is provided, and the matching of local element and attribute names to those used in a specific schema. DSRL defines a syntax for describing simple modifications to be made to the information set of a DSDL document instance, without requiring the full power of a general-purpose transformation language such as XSLT[3]. This Part provides a syntax for: (1) identifying elements, attributes and entities whose names have been altered from those required by the validation schema; (2) assigning a default value to the contents of a specific type of element or attribute for which no value is specified in the document instance; (3) defining named entities which can be referenced at points at which users wish to incorporate predefined sets of data elements (a template) within a document instance; (4) removing elements or attributes from specific locations within the document model..." [Part 1: Overview]

[December 31, 2007] Information Technology - Document Schema Definition Languages (DSDL) - Part 8: Document Semantics Renaming Language (DSRL). Proposed ISO/IEC FDIS 19757-8. 2007-12-30. 23 pages. Edited by Martin Bryan. "Proposed FDIS text for Part 8 incorporating the changes agreed in Kyoto and documented in N0947 'Disposition of Comments to N0937 Summary of Voting on DSRL'." Posted to the 'dsdl-discuss@dsdl.org' list 2007-12-30. [source PDF]

[FCD-2] Information Technology — Document Schema Definition Languages (DSDL) — Part 8: Document Semantics Renaming Language (DSRL). ISO/IEC FCD 19757-8. © ISO/IEC 2007. 23 pages. [PDF source]

[0792 text] ISO/IEC FCD 19757-8. Information Technology — Document Schema Definition Languages (DSDL) — Part 8: Document Schema Renaming Language — DSRL. "The Document Schema Renaming Language (DSRL) provides a mechanism whereby users can assign locally meaningful names to XML elements, attributes, entities and processing instructions without having to completely rewrite the DTD or schema to which they are required to conform. In addition, DSRL provides an XML-based format for defining entities. DSRL allows users to define default values for both element content and attribute values. Default values can be forced to apply for all occurrences of the element or attribute, or only for those for which no value is provided in the document instance. DSRL maps can be used to: (1) Map document instances to validation schema; (2) Create schema that can be used to validate documents coded using alternative element or attribute names..." [source]

2007-02-22. Summary of Voting on JTC 1/SC 34 N 792 - Text for FCD ballot for ISO/IEC 19757-8: Document Schema Definition Language (DSDL) Part 8 - Document Schema Renaming Language (DSRL)

Document Schema Definition Languages (DSDL) — Part 8: Document Schema Renaming Language — DSRL. ISO/IEC 19757-8. Edited by Martin Bryan. Preparatory draft (with summary of editorial comments). 2004-10-23. See the reference page ISO/IEC JTC 1/SC 34 N0552.

Part 9: Datatype- and namespace-aware DTDs

"This Part specifies how the ISO 8879 and XML Document Type Declaration (DTD) syntaxes can be extended to validate documents that make use of XML Namespaces and Datatypes. Doing so will ensure that the investment that individuals and organizations have made in DTD development and deployment over many years can be preserved. It also simplifies conversion between DTDs and other forms of schema languages. The specification does not require documents using the schema language to violate XML's well-formedness or validity checks. It simply identifies processing instructions whose role can be considered to be that of specifying additional validation rules to be applied to specific elements or attributes..." [Part 1: Overview]

Document Schema Definition Languages (DSDL) — Part 9: Namespace- and datatype-aware DTDs. Francis Cave [and Martin Bryan]. 2006-11-03. [Proposed candidate to be sent for ballot as a CD.] ISO/IEC 19757-9. ISO/IEC 19757-9 was prepared by Joint Technical Committee ISO/IEC JTC 1, Information Technology, Subcommittee SC 34, Document Description and Processing Languages. From the Introduction: "The language of Document Type Definitions (DTDs) was the original schema language defined by W3C XML and was closely based upon the DTD language defined by SGML. For a variety of reasons, both technical and economic, many users of XML for document-centric applications, especially among those who were previously (and in some cases continue to be) users of SGML, still favour the use of DTD language for grammar-based schema definition in such applications. It is important to provide users that have made a significant investment in DTDs with a migration path that will enable them to adopt DSDL without having to translate all their existing DTDs to a different schema language, especially as this would oblige them to replace all systems that only work with DTDs, with all the expense and organizational upheavals thereby entailed. A sensible migration path should enable such users to continue to use DTDs for as much of the document validation process as can reasonably be managed, but also enable them to reap the benefits of using those parts of DSDL that most obviously complement and extend the use of DTDs for validation purposes. It is equally important that the migration path should enable users to continue to use legacy systems that are incapable of using any kind of extension to the DTD language, while at the same time introducing new systems that are equipped to use such extensions. The method of extension must therefore be such that DTDs with extended functionality are valid XML DTDs in accordance with W3C XML. 
It should remain possible to validate an instance that does not make any use of the extended functionality against a DTD that contains the extended functionality, using legacy system tools, and achieve the same result as would be achieved if the DTD did not contain the extended functionality. Scope: This International Standard defines a language that is designed to extend the functionality of an XML DTD to include: (a) specifying one or more namespaces to which some or all of the element and attribute names in a DTD belong; (b) constraining elements with content model 'ANY' to contain elements whose names belong to one or more specified namespaces; (c) specifying datatypes for elements that contain data content and for attribute values..." See the posting and source PDF.

References:

Part 10: Validation Management

Part 10 of ISO/IEC 19757 shows how a language for managing the validation and pre-validation transformation pipelines can be used to integrate functions described in the other parts of this standard. Validation management includes: (1) a mechanism for invoking parsers which read non-XML sources (and XML sources that can't be identified by a single IRI) to create XML information sets (infosets) that can be used for subsequent processing. Examples of such sources include SGML and HTML documents, RDBMS query results, CSV documents and Web Services query results; (2) pre-validation transformations used to normalize and/or subset documents before validation; (3) multiple validations and transformations that are to be applied to the same document; (4) transformations that split a document into multiple resulting documents; (5) facilities to generate customized validation reports which can be output as XML document instances so that they can be further processed by other applications. This part also illustrates how technologies other than those specified in the parts of ISO/IEC 19757, such as the W3C XML Schema and XSLT transformation language, can be used in combination to manage XML and other forms of structured documentation using the XProc pipeline processing language developed by W3C. [adapted from Martin Bryan's Part 1: Overview of April 28, 2007]
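
As a rough illustration of items (2), (3) and (5) above, the following sketch chains a pre-validation transformation with two independent validation steps and collects the results in an XML report. It is only a sketch under stated assumptions: it does not use XProc, DPML, Cocoon or XPL, and the function names (normalize, check_structure, check_rules, run_pipeline), the sample 'order' vocabulary and the report format are all hypothetical stand-ins for real grammar-based and rule-based validators.

```python
import xml.etree.ElementTree as ET


def normalize(root):
    """Pre-validation transformation: trim whitespace around text content."""
    for elem in root.iter():
        if elem.text:
            elem.text = elem.text.strip()
    return root


def check_structure(root):
    """Hypothetical stand-in for a grammar-based validator."""
    errors = []
    if root.tag != "order":
        errors.append("root element must be 'order'")
    if root.find("item") is None:
        errors.append("at least one 'item' is required")
    return errors


def check_rules(root):
    """Hypothetical stand-in for a rule-based validator."""
    errors = []
    for item in root.iter("item"):
        qty = item.get("qty", "")
        if not qty.isdigit() or int(qty) < 1:
            errors.append(f"item {item.text!r} has invalid qty {qty!r}")
    return errors


def run_pipeline(xml_text, transforms, validators):
    """Apply transforms in order, then every validator; return an XML report."""
    root = ET.fromstring(xml_text)
    for transform in transforms:
        root = transform(root)
    report = ET.Element("report")
    for name, validator in validators:
        result = ET.SubElement(report, "validation", name=name)
        for message in validator(root):
            ET.SubElement(result, "error").text = message
        # A step is valid when it produced no error children.
        result.set("valid", "false" if len(result) else "true")
    return report


SAMPLE = '<order><item qty="2"> pen </item><item qty="0">ink</item></order>'

if __name__ == "__main__":
    report = run_pipeline(
        SAMPLE,
        transforms=[normalize],
        validators=[("structure", check_structure), ("rules", check_rules)],
    )
    print(ET.tostring(report, encoding="unicode"))
```

Because the report is itself an XML element tree, it can be serialized and handed to another application for further processing, which is the role item (5) assigns to validation reports.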

References:

  • [April 20, 2005] Document Schema Definition Languages (DSDL) — Part 10: Validation Management. Committee Draft Technical Report. ISO/IEC PDTR 19757-10. Working Draft Status (only), for review and comment. Copyright (c) ISO/IEC 2005. 20 pages. Slightly updated version relative to the "April 15, 2005" draft cited immediately below. [Source: PDF referenced in Martin Bryan's posting.]

  • [April 15, 2005] [Early Collection of Proposals for Consideration in DSDL Part 10, Validation Management]. Discusses "Using Pipelining for Validation Management." Reference: 'ISO/IEC PDTR 19757-10'. 'Committee/Preliminary Draft Technical Report'. Posted to the 'dsdl-discuss' list by Martin Bryan (Convenor, SC34/WG1) on April 15, 2005 with the subject "PDTR for Part 10." Contents: 3 Terms and definitions; 4 Using Pipelining for Validation; 4.1 Validation Management using DPML; 4.1.1 Example of Linear DPML; 4.1.2 Example of Asynchronous DPML; 4.2 Validation Management using Cocoon; 4.3 Validation Management using XPL. The document combines the three submissions on the subject of using pipelining techniques to tackle Validation Management, released in the form of a Preliminary Draft Technical Report, to use as the basis for a discussion on the potential role of pipelining techniques for validation management at the SC34/WG1 meeting of 2005-05 Amsterdam. From the Scope Statement: "This International Standard specifies a suite of technologies that can be used to validate the structure and contents of structured documents marked up using ISO 8879 (SGML) and its derivatives (e.g., the W3C Extensible Markup Language, XML). This International Standard defines a set of semantics for describing and ordering validation rules, a set of syntaxes for declaring validation rules, and a syntax for defining models for the management of validation sequences. It includes: (1) Specifications of relevant validation technologies that can be used in isolation or within the DSDL framework. (2) References to validation technologies defined outside of this International Standard that can be used within the DSDL framework. (3) Semantics for managing the sequence in which different validation technologies are to be applied during the production of validation results. This technical report illustrates how existing pipelining languages can be used to manage the validation processes..."
[source PDF]

    Proposals in the 2005-04-15 draft:

    1. Validation Management using DPML: "The Declarative Process Markup Language (DPML) is a very simple language for composing NetKernel hosted services into processes: it is not explicitly an XML pipelining language, rather it is one among many language runtimes that can be used on NetKernel to build Unix-like applications. NetKernel treats all software components as URI addressable services and uses a RESTful abstraction to invoke them. Higher level language runtimes (DPML, XRL, Java, Beanshell, Python, Groovy, Javascript) are used to compose services into 'pipelines'. DPML uses the term 'process' rather than 'pipeline' as the latter implies a linear synchronous flow whereas 'process' encompasses asynchronous forks, conditions and loops, etc." According to the DPML Quick Reference Guide, "DPML is a declarative XML processing language that is itself XML. It is more than a simple transform pipelining language as it allows conditional processing, loops, sub-routines, and exception handling. It considers XML documents as the basic data quantum and URIs as the method of addressing them. The XML processing instructions and resolution of URI schemes are abstracted from the language so as to be fully extensible."
    2. Validation Management using Cocoon: "The Apache Cocoon Project provides mechanisms that can be used to control the flow of files through data pipelines. The basic idea behind Cocoon is to identify a set of actions that need to be taken when a file whose location and name match a specific pattern is requested from an Apache server."
    3. Validation Management using XPL: "XPL is a powerful declarative language for processing XML using a pipeline metaphor. XML documents enter a pipeline, are efficiently processed by one or more processors as specified by XPL instructions, and are then output for further processing, display, or storage. XPL features advanced capabilities such as document aggregation, conditionals ('if' conditions), loops, schema validation, and sub-pipelines. XPL pipelines are built up from smaller components called XML processors. An XML processor is a software component which consumes and produces XML documents..."


Articles, Papers, News

  • [August 26, 2008] "Mapping of YANG to DSDL." By Ladislav Lhotka (CESNET). IETF Internet Draft 'draft-lhotka-yang-dsdl-map-00'. July 6, 2008; expires January 7, 2009. Produced by members of the IETF NETCONF Data Modeling Language (NETMOD) Working Group. "This memo describes the algorithm for translating YANG data models to a schema (or multiple schemas) utilizing a subset of the DSDL schema languages together with a limited number of other annotations. The IETF NETMOD working group complements the results of the NETCONF WG by addressing the data modeling issues. The major item in the NETMOD charter is a new data modeling language called YANG. This language, being based on SMIng (RFC 3216), builds on the experience of previous network management systems, most notably SNMP. However, since NETCONF chose eXtensible Markup Language as the method for encoding both configuration data and their envelope (RPC layer), this work can and should also benefit from the body of knowledge, standards and software tools that have been established in the XML world. To this end, YANG also provides an alternative syntax called YIN that is able to represent the same information using XML. Despite the syntactic differences, the information models of YANG and YIN are virtually identical and conversion between YANG and YIN is straightforward in both directions. However, having data models expressed in an XML syntax is not by itself sufficient for leveraging the existing XML know-how and tools. It is also necessary to convey the meaning of YANG models and present it in a way that the existing XML tools can understand. As a matter of fact, YANG/YIN can be viewed as yet another XML schema language. 
While there are several aspects that make YANG models specific to the NETCONF realm, for the most part the grammatical and semantic constraints that the models express can be equivalently represented in the general-purpose XML schema languages such as W3C XML Schema, RELAX NG, Schematron and others. Therefore, one of the chartered items of the NETMOD WG is to define a mapping from YANG to the Document Schema Definition Languages (DSDL) that is being standardized as ISO/IEC 19757. The DSDL framework comprises a set of XML schema languages that address grammar rules, semantic constraints and other data modeling aspects but also, and more importantly, can do it in a coordinated and consistent way... The aim is to map as much structural, datatyping and semantic information as possible from YANG to DSDL with annotations so that the resulting schema(s) can be used with standard XML tools for a relatively comprehensive validation of the contents of configuration datastores. The most important schema language in the DSDL framework is RELAX NG. The specification of the mapping algorithm given below assumes that output of RELAX NG uses the XML syntax. However, RELAX NG compact syntax is often the preferred form for perusal by human readers. A secondary goal therefore is to guarantee a reasonable level of readability of the resulting RELAX NG in the compact syntax as well, even if multiple embedded annotation types are used..." See also (1) the presentation at IETF 72; (2) YANG: A Data Modeling Language for NETCONF.

  • [June 24, 2004] "ISO's Document Schema Definition Languages (DSDL)." By Martin Bryan, CSW Group Ltd. Presented at the meeting "Developments in XML Schema Languages", a joint meeting of XML UK and W3C Office for UK and Ireland. Thursday 24-June-2004. Rutherford Appleton Laboratory, Didcot, Oxon UK. Conference Chair: Michael Wilson (W3C Office for UK and Ireland). "The multipart ISO 19757 Document Schema Definition Languages (DSDL) will provide an integrated suite of data validation techniques that will include grammar-based validation (e.g. RELAX NG), rule-based validation (e.g. Schematron), namespace-based segmentation of validation candidates (e.g. NVDL), advanced user-customizable datatyping, path-based inter-element validation, character repertoire definition and validation, declarative document architectures and extensions to DTDs to access namespaces, datatypes, etc. The suite will interact through a validation management standard that will be used to control the order in which otherwise separated validation processes are fully integrated...." [Martin Bryan, a Senior Technical Consultant at CSW Informatics, convenes the ISO working group responsible for the development of DSDL. He represents XML UK on BSI's IST/41 panel that monitors the work of ISO/IEC JTC1/SC34. A regular contributor to Interchange and a member of ISUG, Martin has for many years promoted the use of structured document standards such as SGML, DSSSL, Topic Maps, XML and XSL throughout Europe.] See: the abstract, and source for presentation (.ppt)

  • [April 21, 2004] "ISO DSDL Overview." By Eric van der Vlist (Dyomedea). Presentation given at XML Europe 2004. Full text in PDF format. "The notion of 'validation' of XML documents covers too many different aspects (structure, content, integrity, business rules, ...) to be performed by a single schema language. Furthermore, even when a single language is used, it is often the case that documents need to be transformed, split or normalized to keep the schemas simpler. The ISO DSDL project (ISO/IEC JTC 1 SC 34 WG 1) is standardizing a set of specific and simple schema and pre-validation transformation languages and a framework to define how these operations must be applied. These languages include well known technologies such as Relax NG and Schematron as well as new languages. This talk gives a full project overview, explaining the goal of each of the parts, and presents the latest developments of DSDL..."

  • [December 15, 2003] "XML 2003 Session Report: News from the World of DSDL." By Uche Ogbuji. From XMLhack.com (December 15, 2003). "On 10-December-2003 at XML 2003 in Philadelphia, Eric van der Vlist kicked off a block of presentations opening up the world of ISO Document Schema Definition Languages (DSDL) (ISO/IEC JTC 1 SC 34 WG 1), and some of the innovative work being undertaken in that working group. Eric presented an 'Update on ISO DSL Overview and Update'. He proceeded through the various parts of DSDL in order... Part 2: Grammar-based validation is a re-write of the RELAX NG OASIS Specification to meet the requirements of ISO publications, i.e., more formal language. The features will remain the same and the specifications are meant to be identical for assessment of conformance. Eventually RELAX NG compact syntax will be added as an addendum to DSDL Part 2. In Part 3: Rule-based validation, the intent is to create a hosting language for expressing general-purpose rules in XML. The main input is Schematron, and it has been decided that in effect, DSDL Part 3 will present the evolution of Schematron. An example of what DSDL Part 3 will add to Schematron is extension so that not only XPath 1.0 is supported, but also expressions taken from other languages such as EXSLT, XPath 2.0, XSLT 2.0, and even XQuery 1.0... An audience member expressed concern that DSDL is too 'secretive'. He mentioned too a dearth of documents available for public content, despite the clear volume of activity. He noticed that the public mailing list archives were very sparse and many of the archives were private. DSDL members in attendance reassured him that exclusion is not the intention, and expressed a willingness to address concerns about the openness of the project..." See also on RELAX NG (== DSDL Part 2) as an ISO standard: "RELAX NG XML Schema Language Published as an ISO Standard (DSDL Part 2)."

  • [June 19, 2003] Namespace Routing Language (NRL) Supports Multiple Independent Namespaces. James Clark has announced the publication of a Namespace Routing Language (NRL) specification. NRL is "an XML language for combining schemas for multiple namespaces; it allows the schemas that it combines to use arbitrary schema languages." The release includes a tutorial and specification document and a sample implementation in the Jing (RELAX NG Validator in Java) distribution. NRL "is the successor to Clark's Modular Namespaces (MNS) language and is intended to be another step on the path towards Document Schema Definition Languages (DSDL) Part 4." The W3C XML Namespaces Recommendation itself "allows an XML document to be composed of elements and attributes from multiple independent namespaces: each of these namespaces may have its own schema and the schemas for different namespaces may be in different schema languages. The problem then arises of how the schemas can be composed in order to allow validation of the complete document." The Namespace Routing Language attempts to solve this problem. Among the features and benefits of NRL: it supports schema language coexistence, allows extension of schemas not designed to be extended, makes authoring of extensible schemas easier, supports 'transparent' namespaces, allows contextual control of extension, and allows concurrent validation. "For RELAX NG, it can be used to provide some of the namespace-based modularity features that are built-in to XSD. NRL is designed to allow an implementation to stream, and the sample implementation does so. The sample implementation has a SAX-based plug-in architecture that allows new schema languages to be added dynamically. It comes with support for RELAX NG (both XML and compact syntax), W3C XML Schema (via a wrapper around Xerces-J), Schematron, and (recursively) NRL; it can also use any schema language with an implementation that supports the JARV interface."

  • [May 01, 2003] "DSDL Interoperability Framework." By Eric van der Vlist. From XML.com (April 30, 2003). ['While W3C XML Schema has had rapid uptake in many web services and data-oriented XML applications, another set of technologies, ISO DSDL, has been under development by the self-proclaimed "document-heads." The needs of document validation are different, and markedly more pluralist, than those of the users of W3C XML Schema, and an ISO Working Group has started work on a standard to address those needs. Our main feature this week, from XML.com's resident schema expert Eric van der Vlist, briefly introduces ISO DSDL as well as the Document Schema Definition Languages, and gives an overview of the work underway to create the DSDL Interoperability Framework. This framework is essentially the glue that will join together the multiple schema languages and transformation steps supported by DSDL.'] "... Why does DSDL need an Interoperability Framework? The quick answer is that the Interoperability Framework is the glue between all the pieces of DSDL. The chief design principle of DSDL is to split the issue of describing and validating documents into simpler issues: grammar based validation, rule based validation, content selection, datatypes, and so on. Different types of validations and transformations, defined inside or outside the DSDL project, often need to be associated with each other. The framework allows for the integration of these validations and transformations. Examples of such mixing include localization of numeric or date formats, prevalidation canonicalization to simplify the expression of a schema, independent content separated into different documents to be validated independently, aggregation of complex content into a single text node, separation of structured simple content into a set of elements, and so on...
The two initial proposals (Schemachine and XVIF) were presented to the ISO DSDL working group in Baltimore (December 2002); although they were considered a valuable input, both were rejected, for different reasons: (1) Schemachine was considered 'too procedural': its focus is on defining pipes, that is, defining the algorithm used to validate a document, while it would be more appropriate to focus on defining the rules to meet to consider that a document is valid. (2) XVIF was considered too intrusive: to fully support XVIF, the semantics of the different schema languages must be extended and the schema validators need to be upgraded. An interoperability framework should work with existing schema languages and processors without requiring any update. To take these two requirements into account, a new proposal has been made which builds upon ideas from Schemachine and XVIF, but also from XSLT and Schematron. This proposal has been named 'XVIF/Outie', after a joke from Rick Jelliffe. A description of XVIF/Outie can be found online and a prototype implementation is available... Xvif/Outie or something derived from it should become an ISO DIS. I am also committed to develop Xvif and its micro pipes. When Outie becomes more stable, I will make sure to find a convergence between the two Xvif flavors..."

  • [January 31, 2003] Modular Namespaces (MNS) Language Supports Validation from Multiple Independent Namespaces. A posting from James Clark announces the design and implementation of a language called Modular Namespaces (MNS). "The XML Namespaces Recommendation allows an XML document to be composed of elements and attributes from multiple independent namespaces. Each of these namespaces may have its own schema. The problem then arises of how the schemas can be composed in order to allow validation of the complete document. In RELAX Namespace, Murata Makoto pioneered the idea of dividing the document into islands, with each island containing a single namespace, and validating each island separately against the schema for its namespace. RELAX Namespace formed the basis for the recently published Committee Draft of Document Schema Definition Languages (DSDL) -- Part 4: Selection of Validation Candidates. [This] language named Modular Namespaces (MNS) is an evolution of the ideas in RELAX Namespace and DSDL Part 4. RELAX Namespace was designed to work well with RELAX Core. RELAX Core cannot deal with documents that use multiple namespaces, nor does it provide any namespace-based wildcards. These limitations of RELAX Core are reflected in the design of RELAX Namespace. MNS is designed to be able to take advantage of more recent schema languages, such as RELAX NG, that are not limited in this way... There's a new release of Jing that includes a sample implementation. MNS is designed to be useful when: (1) you have an instance that uses elements and attributes from multiple namespaces; (2) you have one or more schemas (not necessarily all in the same schema language) each of which deals with one or more of these namespaces; (3) you want to use all these schemas to validate the instance..."

  • [July 24, 2002] "Schemachine. A Framework For Modular Validation of XML Documents." By Rick Jelliffe. June 21, 2002. ["This note specifies a possible framework for supporting modular XML validation. It has no official status whatsoever. It is for discussion purposes only. Review comments are welcome... This has been developed as a strawman for the ISO DSDL effort. For another strawman using a different basis, see Eric van der Vlist's Xml Validation Interoperability Framework (xvif)..."] "The strawman has the following features: (1) based on XML Pipeline structures, but with rearrangement and renaming, (2) embedded in Schematron-like superstructure with titles and phases, (3) a minimal implementation is possible, where all validators and translators are command-line executable programs, and the framework document is translated into BAT files or Bourne shell scripts (i.e., validators etc. are treated as black boxes), (4) the purpose is validation rather than declarative description per se. (In particular, the further down a transformation chain that data gets, the more difficult it will be to tie the effect of a schema to the original document.) (5) this framework supports both validation of explicit structure and validation of complex data values. It leaves issues of simple datatyping to particular validators, (6) validation is a tree of processes, (7) supports inband signalling (@exclude) and out-of-band signalling (@haltOnFail)..." See also the reference in "Locally Linked Infosets," by Francis Norton. [cache]

  • [June 26, 2002] "DSDL Examined." By Leigh Dodds. From XML.com. June 26, 2002. ['In his final column Leigh looks at DSDL, the ISO activity to standardise XML document validation.'] "The core of DSDL will be the Interoperability Framework (Part 1): the glue that binds together the other modules. This week Eric van der Vlist, who is the appointed editor of this section, and Rick Jelliffe have separately produced proposals that aim to explore these kind of framework structures in more detail. The two proposals, neither of which have any formal standing, take very different approaches to the same problem. Van der Vlist's XML Validation Interoperability Framework (XVIF) takes the approach of embedding validation and transformation pipelines within another vocabulary. The specification and online demonstrator both show how this could be achieved by embedding the pipelines within a schema language, but in principle the XVIF is language-neutral so could be embedded within an XSLT transformation for example. XVIF elements just rely on their container to provide the context node on which they will interact. The embedded pipelines may generate other nodes or a simple boolean validation flag. Van der Vlist has produced a prototype that supports using pipelines containing XPath expressions, XSLT transformations, and manipulating content with simple regular expressions, or using Regular Fragmentations. In contrast, Rick Jelliffe's proposal, 'Schemachine' is closer to other pipeline frameworks such as XPipe and Cocoon in that the pipelines are defined by a separate vocabulary. In fact Jelliffe notes that the proposal borrows a lot from XPipe and Schematron in that it has a number of similar elements and structures, e.g., phases. Schemachine divides pipeline elements up into particular roles such as Selectors (e.g., XPath expressions), Tokenizers (e.g., Regular Fragmentations) and Validators (e.g., RELAX NG, Schematron). Jelliffe differentiated XVIF and Schemachine as 'innies and outies'. Technology aside, the important aspect of these proposals is the intent: publicly exploring strawman proposals and implementations to gather feedback before considering standardization. That's a path which seems not only likely to produce viable results, but may actually deliver useful tools that others benefit from in the shorter term..."

  • [May 29, 2002] DSDL expanded: On 2002-05-29, Google search for "Document Schema Definition Language" turned up "Results 1 - 10 of about 170" while the search for "Document Schema Definition Languages" returned "Results 1 - 3 of about 6." As of this same date, the DSDL.org website and WikiWikiWiki explain DSDL as the name of an ISO Project "Document Schema Definition Language" -- "DSDL (Document Schema Definition Language) is a new project to create a framework within which multiple validation tasks of different types can be applied to an XML document in order to achieve more complete validation results than just the application of a single technology." However, ISO/IEC DIS 19757-2 (DSDL Part 2) uses the plural in the document title: "Document Schema Definition Languages (DSDL)." The earliest attestations of "DSDL" in reference to the (proposed) project and (proposed) standard used an expansion "Document Schema Definition Language."

  • [March 29, 2002] "DISARM: Document Information Set Articulated Reference Model." By Rick Jelliffe. Discussion Draft. February 24, 2002. ISO/IEC JTC1/SC34 Document #292. "This note proposes an ISO standard 'Document Information Set Articulated Reference Model' be developed, to provide the basis for ISO DSDL and for renewing ISO 8879 SGML... The utility of DISARM might include that it can provide an attractive way to allow a top-down re-specification of SGML in a future ISO 8879. It might also provide some help for DSDL." Motivation: "Since 1986, there have been four notable streams in markup languages: (1) ISO 8879 SGML, extended by the General Facilities, Architectural Forms Definitions Requirements (ADFR), Lexical Types Definition Requirements (LTDR), Formal System Identifiers (FSI), Annexes J to L, augmented with OASIS Catalogs. A parser implementation of mature SGML in Open Source is James Clark's SP. (2) W3C HTML, in various versions, with dialects including ASP, JSP, PHP, and Blogger. A parser implementation for mature HTML in Open Source is Dave Raggett's Tidy. (3) W3C XML, extended by Namespaces, XBase, XInclude. Widespread implementations of parsers use the mature SAX API. (4) The current ISO DSDL project, informed by RELAX Namespaces, RELAX NG, W3C XML Schemas, Schematron. The Xerces XNI API is a recent attempt to cope with post-processing XML, for uses such as validation and creating typed information sets. In all these cases, the natural increase in complexity of evolving standards has made it difficult to understand the processing order and operation. ISO 8879 has been widely criticized for not being amenable to simple grammatical analysis ('not using "computer science concepts"'), yet the same problems are experienced even with overtly layered specifications such as the XML family, due to this entropy. These problems would be reduced by introducing a reference model which was neutral with regard to each of the four main streams, but allowed clear and diagrammatic exposition of the stages of parsing and processing a marked-up document incrementally from bits to a terminal information set... The reference model uses UML terminology and diagrams at the top-level only. If desired, specific graphical stereotypes could be created, as allowed by UML. It models the kinds of markup processing of interest as a chain of components, one connected to the next, each of which implements a common event-passing interface. Different markup languages and SGML features can be modeled using particular chains of components..." Cf. also the DSDL list. Also available from the SC 34 website. [cache]

  • [January 20, 2002] Rick Jelliffe 2002-01-20: ISO DSDL, mentioned on the list recently, aims to go one step beyond RDDL, in one sense. It aims to provide a processing chain for schema and augmentation systems (e.g., perhaps even to be able express things like "I want to validate my document using RELAX NG with W3C Schemas datatypes, then run it through a Regular Fragmentations converter and then a Schematron validation, then convert the RELAX schema to a W3C XML Schema and generate the appropriate PSVI.") That is a particularly hairy example, but the idea, as I understand it, is that to allow schema language modularity (i.e. to fight the "one-schema-language -should-fit-all" bloat) we need to provide not only a way to specify and group the schemas, but also their processing inter-dependencies... [XML-DEV]

  • [January 20, 2002] 'Schematron and DSDL'. Email thread on "Schematron-love-in" list, 2002-01. Gary Robertson asked: "Could anybody tell me whether the ISO DSDL standard based on Schematron is going ahead? If so, how would I go about monitoring and contributing to it? Is there a mailing list? Is the intention to produce a complete XML schema language that can be used to describe the structure of arbitrarily complex data or is this a longer term goal?..." Rick Jelliffe: "Yes, I have agreed to edit it. It would be nice to have a draft available in April. I would like to use this list to discuss it. I think I would like the ISO standard to be much simpler than the current spec. ISO have structures they like, but probably it only needs to be (1) Terms: 1-2 page; (2) Motivation/overview: 1-2 page; (3) Formal spec: 1-2 page; (4) DTD and Schematron schema: 5-20 pages; (5) References: 1 page. The key technical questions I see are: (1) Do we want to (use this as an excuse to) simplify it? E.g. get rid of key? (2) Do we want to (use this as an excuse to) enhance it? There have been various wishlist features: I would like to parameterize patterns very much, and there have been other suggestions on this list in the past. There is one case (providing value() in assertions) which I oppose (because an assertion is not a diagnostic) but many users expect/want, which maybe we should revisit. (3) How do we handle XPath 2 and EXSLT?..." See followups.

  • [December 13, 2001] ISO Publishes Initial Draft Portions of the Document Schema Definition Language (DSDL). ISO/IEC JTC1/SC34 has published an overview document for the Document Schema Definition Language (DSDL) and has appointed editors for three of the seven major parts which will make up the new International Standard. The Document Schema Definition Language (DSDL) is to be "a multipart International Standard defining a modular set of specifications for describing the document structures, data types, and data relationships in structured information resources. Two kinds of integrated specifications are included: (1) specifications for describing aspects of validity of a document, and (2) rules for combining and packaging a collection of processes applicable to the task of validating a document. This integration makes DSDL applicable to both business and publishing applications of structured information resources. This applicability reflects the expansion of Extensible Markup Language (XML) applications beyond the publishing environment in which XML and its foundation (Standard Generalized Markup Language, SGML) were first developed." The seven primary parts of the standard are: Part 1 - Framework; Part 2 - Grammar-oriented schema languages; Part 3 - Primitive data type semantics; Part 4 - Path-based integrity constraints; Part 5 - Object-oriented schema languages; Part 6 - Information item manipulation; Part 7 - Namespace-aware processing with DTD syntax. [Full context]

  • [December 13, 2001] "Recommendations of the ISO/IEC JTC1/SC34/WG1 Meeting Orlando December 9-12, 2001." From SC34/WG1. ISO/IEC JTC 1/SC34 N285. [cache]

  • [November 03, 2001] First Public Working Draft of Document Schema Definition Language (DSDL). In October 2001, ISO Subcommittee 34 (ISO/IEC JTC 1/SC34, Information Technology: Document Description and Processing Languages) released a first working draft for DSDL. Edited by Martin Bryan, the draft 'Document Schema Definition Language' (DSDL) "allows the definition of document structures, data types and data relationship constraints that can be applied to data represented using the ISO/IEC 8879 Standard Generalized Markup Language and its derivatives, such as ISO/IEC 10744, Hypermedia/Time-based Structuring Language (HyTime), and the W3C Extensible Markup Language (XML). A new, compact, efficient and XML-based document type definition for the integrated description of document structures, data types and data relationships will make it possible to automate the processing of structured information resources to the level required by business users, which has a higher level of requirements than those identified from the publishing community for which SGML was originally developed. The standard will also define the scope and notation for converting and interworking a core subset of document structure, data type, and data relationship constraint models among the three notations: DSDL, DTD declarations and XSD." Informative Annex 4 of the draft ['Alphabetical List of DSDL Components'] supplies (1) a list of DSDL components common to SGML and XML, viz. DSDL components which can be used to describe documents conforming to the WebSGML subset of ISO/IEC 8879, and (2) DSDL components specific to SGML, viz. extensions which could be made if it is decided that DSDL should be able to express all constructs in SGML document instances as well as the WebSGML subset. [Full context]

  • [November 03, 2001] DSDL First Working Draft. 2001-10-22. [cache]

  • [October 22, 2001] First Working Draft bibliographic information: "U.K. National Body Contribution to First Working Draft of Document Schema Definition Language (DSDL)." Document: ISO/IEC JTC 1/SC34 N264. 22-October-2001. First Working Draft. Edited by (Project Editor) Martin Bryan. Source provided by G. Williams, UK NB. From the activity of ISO/IEC JTC 1/SC34 Information Technology -- Document Description and Processing Languages.

  • [October 06, 2001] "DSDL Use Cases." [Presented as 'Possible Extensions to RELAX NG.'] By Martin Bryan (The SGML Centre). BSI IST/41. Posted to the RELAX-NG mailing list. "In talking to James Clark earlier today about the relationship between RELAX NG and the proposed new ISO Document Structure Definition Language (DSDL) James asked if I could provide some use cases that would justify the initial set of requirements that the DSDL proposal contained. The attached document starts by listing the requirements identified as being essential for DSDL, and then provides a set of use case statements that seeks to justify each requirement. It also contains brief use cases for supporting three optional features of SGML that are not supported by XML, and not listed as being requirements for DSDL, but for which cases can be made within data streams being used by businesses..."

  • [October 04, 2001] "Summary of Voting on SC 34 N 223 - Document Schema Definition Language (DSDL)." Reference: ISO/IEC JTC 1/SC34 N257. "Based on the voting results, this project is approved. This document is forwarded to WG 1, which is instructed to begin development of this new work item. The JTC 1 NP number for this project is ISO/IEC 19757." [cache]

  • [September 14, 2001] From ISO/IEC JTC 1/SC34 N0253 ("Business Plan for JTC1/SC34 for the year 2002," by James D. Mason, Chairman, JTC1/SC34): "The development of XML has led to the creation of alternative systems for specifying the allowable structures in SGML and XML applications. The traditional method of DTDs (Document Type Definitions) specified in ISO 8879 has been supplemented by the W3C's XML Schema. SC34 has just completed the Fast Track process for RELAX (Regular Language description for XML), ISO/IEC TR 22250-1, Part 1: RELAX Core (http://www.xml.gr.jp/relax/). Because the creators of the RELAX project have already moved on to later technology, SC34 has submitted a proposal for a Document Schema Definition Language (DSDL) that will incorporate later developments, such as RELAX-NG..."

  • [June 12, 2001] ISO JTC1/SC34/WG1 recently approved a proposal for a new work item on a 'Document Schema Definition Language (DSDL)'. The NP was submitted by the British Standards Institution (BSI), who have been asked to appoint an editor "to complete a first draft based on extensions to RELAX-NG and forward it to SC34 for review." The specification would govern "the definition of document structures, data types and data relationship constraints that can be applied to data represented using the ISO/IEC 8879 Standard Generalized Markup Language and its derivatives, such as ISO/IEC 10744, Hypermedia/Time-based Structuring Language (HyTime), and the W3C Extensible Markup Language (XML)." Background for the NP: "SGML Document Type Definitions (DTDs) allow document structures to be formally modelled but do not allow details of data types or data relationships to be recorded in an XML-compatible way. While the W3C XML Schema Definition language (XSD) does allow data types to be used to validate the contents of SGML elements and values of attributes, it does not allow the relationships between the values of different attributes and contents of elements to be validated. A new, compact, efficient and XML document type definition for the integrated description of document structures, data types and data relationships will make it possible to automate the processing of structured information resources to the level required by business users, which has a higher level of requirements than those identified from the publishing community for which SGML was originally developed. The standard will also define the scope and notation for converting and interworking a core subset of document structure, data type, and data relationship constraint models among the three notations: DSDL, DTD declarations, and XSDL." According to the draft proposal, a "preparatory draft will be contributed by the UK National Body for the SC34 meeting in December 2001. Liaison with the W3C XML Coordinating Committee will be undertaken to keep the standard aligned with the work being done to manage information sets developed for XML. The committee expects to be able to integrate the best practices of [recent] proposals to form the basis of a first draft of the new standard... [for example,] the RELAX TR developed by the Japanese National Body as ISO 22250 and the TREX language developed by James Clark (the editor of the ISO/IEC 10179 Document Style Semantics and Specification Language) both propose efficient XML representations of document models, including data types. The widely acknowledged XML Schema Data Types specification will be referenced. The Schematron language allows the relationships between data elements and attributes to be described."

  • [June 12, 2001] "Document Schema Definition Language (DSDL) Proposed as ISO New Work Item."

  • [May 23, 2001] Bibliographic reference: Proposal for a New Work Item. Title: "Document Schema Definition Language (DSDL)." Date of presentation of proposal: 2001-05-23. Proposer: Martin Bryan, ISO/IEC JTC 1/SC 34. Secretariat: National Body (Acronym) BSI. NWI is expected to lead to a single International Standard. NWI proposed for assignment to ISO/IEC JTC 1 / SC34 / WG1. [Martin Bryan will contribute a first draft for the requirements statement; James Clark may be proposed as editor by BSI.]

  • [May 23, 2001] Proposed NP for Document Schema Definition Language (DSDL). [PDF]

  • Proposal For a New Work Item [.DOC], [cache]
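
Illustrative sketch. The modular-validation idea that recurs in the 2002 entries above (Jelliffe's "processing chain" of schema processors, and Schemachine's treatment of validators as black boxes with a halt-on-fail signal) can be pictured in a few lines of Python. This is only a sketch under stated assumptions, not code from any DSDL part, from Schemachine, or from XVIF: the names Stage, run_chain, halt_on_fail, and STAGES are hypothetical, and the two toy checks stand in for real schema processors such as RELAX NG or Schematron validators.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple
import xml.etree.ElementTree as ET


@dataclass
class Stage:
    """One black-box validation step (hypothetical structure, for illustration only)."""
    name: str
    check: Callable[[ET.Element], bool]  # returns True when the document passes
    halt_on_fail: bool = False           # loosely modelled on the @haltOnFail signal


def run_chain(xml_text: str, stages: List[Stage]) -> List[Tuple[str, bool]]:
    """Run the stages in order and collect (stage name, passed) pairs."""
    root = ET.fromstring(xml_text)
    results = []
    for stage in stages:
        passed = stage.check(root)
        results.append((stage.name, passed))
        if not passed and stage.halt_on_fail:
            break  # later stages are skipped once a halting stage fails
    return results


# Toy stand-ins for real validators (assumptions, not real schema languages).
def has_title(root: ET.Element) -> bool:
    """Structural check: the root must have a <title> child."""
    return root.find("title") is not None


def price_is_number(root: ET.Element) -> bool:
    """Data-value check: <price> must exist and parse as a number."""
    price = root.find("price")
    if price is None or price.text is None:
        return False
    try:
        float(price.text)
    except ValueError:
        return False
    return True


STAGES = [
    Stage("structure", has_title, halt_on_fail=True),
    Stage("datatype", price_is_number),
]

if __name__ == "__main__":
    print(run_chain("<book><title>T</title><price>abc</price></book>", STAGES))
```

Each stage sees only the parsed document and reports pass or fail, so a structural check and a data-value check can be swapped or reordered independently. That separation is the point the quoted proposals make about avoiding a single all-purpose schema language.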


Hosted By
OASIS - Organization for the Advancement of Structured Information Standards

Document URI: http://xml.coverpages.org/dsdl.html
Robin Cover, Editor: robin@oasis-open.org