A communiqué from Rick Jelliffe describes the availability of an ISO FCD (Final Committee Draft) for ISO/IEC 19757-3 Document Schema Definition Languages (DSDL) — Part 3: Rule-Based Validation — Schematron.
Schematron is a language for making assertions about patterns found in XML documents, and serves as a schema language for XML. As Part 3 of the multi-part ISO/IEC 19757 (DSDL) standard, it defines "requirements for Schematron schemas and specifies when an XML document matches the patterns specified by a Schematron schema."
This Final Committee [Review] Draft of ISO Schematron incorporates feedback from national standards bodies and from implementers. It is available online in PDF, HTML, and RTF formats. Improvements "include an annex on multilingual schemas, further treatment of abstract patterns, and validated schemas. The predicate logic used to specify Schematron formally has also been reworked. The specification remains very small, at about 35 pages including front matter, schemas and non-normative annexes."
This FCD has been made publicly available for comment, for identification of spelling errors, and as an aid to implementers and users until the final International Standard is published on paper by ISO and by the nations that adopt Schematron as a national standard, which is expected in 2005. The text is suitable as an interim reference for organizations adopting Schematron. The editors encourage all Schematron implementers to check the draft standard and to add support for it in 2005.
According to the Schematron.com Overview, "the Schematron differs in basic concept from other schema languages in that it [is] not based on grammars but on finding tree patterns in the parsed document. This approach allows many kinds of structures to be represented which are inconvenient and difficult in grammar-based schema languages. If you know XPath or the XSLT expression language, you can start to use The Schematron immediately. It allows you to develop and mix two kinds of schemas: (1) Report elements allow you to diagnose which variant of a language you are dealing with; (2) Assert elements allow you to confirm that the document conforms to a particular schema."
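The difference between the two kinds of elements can be sketched in a few lines of Python. This is an illustration only, not a Schematron implementation: the rule table, the sample document, and the `check` function are hypothetical, and the tests use the small path subset of the standard library's ElementTree rather than full XPath. The point it shows is that an assert message is emitted when its test fails, while a report message is emitted when its test succeeds.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for this sketch.
DOC = """
<order>
  <item><name>pen</name><price>2</price></item>
  <item><name>ink</name></item>
</order>
"""

# Hypothetical rule table: (context path, kind, test path, message).
RULES = [
    (".//item", "assert", "price", "An item should have a price."),
    (".//item", "report", "name", "This item is named."),
]

def check(root, rules):
    """Collect messages: asserts fire on failure, reports fire on success."""
    messages = []
    for context, kind, test, text in rules:
        for node in root.findall(context):
            # The test "holds" if the path selects at least one element.
            holds = node.find(test) is not None
            if (kind == "assert" and not holds) or (kind == "report" and holds):
                messages.append((kind, node.findtext("name"), text))
    return messages

results = check(ET.fromstring(DOC), RULES)
for kind, item_name, text in results:
    print(kind, item_name, text)
```

Only the second item triggers the assertion, because it lacks a price; both items trigger the report, because both are named.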
The Schematron "can be useful in conjunction with many grammar-based structure-validation languages: DTDs, XML Schemas, RELAX, TREX, etc. As part of an ISO DSDL (Document Schema [Definition] Languages) standard, Schematron is designed to allow multiple, well-focussed XML validation languages to work together. One can even embed a Schematron schema inside an XML Schema <appinfo> element or inside a RELAX NG schema."
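As a rough sketch of the embedding idea, the following Python pulls Schematron patterns out of an XML Schema annotation using only the standard library. The sample schema and the `embedded_patterns` helper are hypothetical. The Schematron namespace is the ISO one cited in this article; schemas written for Schematron 1.5 would use a different namespace. A real toolchain would go on to assemble the extracted patterns into a complete schema and run a Schematron validator over it, which is not shown here.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
SCH = "http://purl.oclc.org/dsdl/schematron"  # ISO Schematron namespace

# Hypothetical XML Schema fragment carrying one embedded Schematron pattern.
XSD = f"""
<xs:schema xmlns:xs="{XS}" xmlns:sch="{SCH}">
  <xs:element name="order">
    <xs:annotation>
      <xs:appinfo>
        <sch:pattern>
          <sch:rule context="order">
            <sch:assert test="item">An order should contain an item.</sch:assert>
          </sch:rule>
        </sch:pattern>
      </xs:appinfo>
    </xs:annotation>
  </xs:element>
</xs:schema>
"""

def embedded_patterns(xsd_text):
    """Return every Schematron pattern that sits directly inside an appinfo."""
    root = ET.fromstring(xsd_text)
    return root.findall(f".//{{{XS}}}appinfo/{{{SCH}}}pattern")

patterns = embedded_patterns(XSD)
asserts = [a.text for p in patterns for a in p.iter(f"{{{SCH}}}assert")]
print(len(patterns), asserts)
```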
ISO Schematron differs from the earlier Schematron Version 1.5 in four ways: (1) A new namespace has been adopted using a Persistent URL (PURL), in common with other ISO DSDL languages: http://purl.oclc.org/dsdl/schematron. (2) The title element is always used for human-readable titles, and the attribute is always an ID, or key. (3) Variables (let) are now available on all basic elements, with scoping. (4) Abstract patterns have been introduced.
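The idea behind abstract patterns can be sketched as parameter substitution: a pattern is written once with placeholder names, then instantiated with concrete element names for each vocabulary. The Python below models this on plain dictionaries rather than Schematron markup, because the exact element and attribute names used by the draft are not given here. The `$name` placeholder convention, the `ABSTRACT` table, and the `instantiate` function are all assumptions made for illustration.

```python
import re

# Hypothetical abstract pattern: rules written against placeholder names.
ABSTRACT = {
    "table": [
        {"context": "$table", "test": "$row",
         "message": "A table should contain rows."},
        {"context": "$table/$row", "test": "$cell",
         "message": "A row should contain cells."},
    ]
}

TOKEN = re.compile(r"\$(\w+)")

def instantiate(rules, params):
    """Return concrete rules with each $name replaced by its actual value."""
    def fill(text):
        # Placeholders without a supplied value are left untouched.
        return TOKEN.sub(lambda m: params.get(m.group(1), m.group(0)), text)
    return [
        {"context": fill(r["context"]), "test": fill(r["test"]),
         "message": r["message"]}
        for r in rules
    ]

# The same abstract pattern applied to two different vocabularies.
html_rules = instantiate(ABSTRACT["table"],
                         {"table": "table", "row": "tr", "cell": "td"})
calendar_rules = instantiate(ABSTRACT["table"],
                             {"table": "year", "row": "week", "cell": "day"})
print(html_rules[1]["context"], calendar_rules[1]["context"])
```

The design benefit is that a structural constraint is stated once and reused, so a correction to the abstract pattern reaches every instance.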
The main objective of the ISO/IEC 19757 Document Schema Definition Languages (DSDL) standard "is to bring together different validation-related tasks and expressions to form a single extensible framework that allows technologies to work in series or in parallel to produce a single or a set of validation results. The extensibility of DSDL accommodates validation technologies not yet designed or specified. By providing an integrated suite of constraint description languages that can be applied to different subsets of a single XML document, DSDL allows different validation technologies to be integrated under a well-defined validation policy."
Bibliographic Information
ISO/IEC 19757-3 Document Schema Definition Languages (DSDL) — Part 3: Rule-Based Validation — Schematron. Final Committee Draft for review/reference. October 2004. 35 pages. ISO/IEC 19757-3 was prepared by Joint Technical Committee ISO/IEC JTC 1, Information Technology, Subcommittee SC 34, Document Description and Processing Languages.
ISO/IEC 19757-3 Schematron: FCD Document Scope and Structure
Document Schema Definition Languages (DSDL) — Part 3: Rule-based validation — Schematron
This part [Part 3] of ISO/IEC 19757 specifies Schematron, a schema language for XML.
Considered as a document type, a Schematron schema contains natural-language assertions concerning a set of documents, marked up with various elements and attributes for testing these natural-language assertions, and for simplifying and grouping assertions.
Considered theoretically, a Schematron schema reduces to a non-chaining rule system whose terms are boolean functions invoking an external query language on the instance and other visible XML documents, with syntactic features to reduce specification size and to allow efficient implementation.
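A minimal sketch of such a rule system follows, with Python functions standing in for the external query language. All names here are hypothetical. The sketch assumes the usual Schematron behavior that, within one pattern, a node is handled by the first rule whose context matches it, and that the outcome of one rule never triggers another (no chaining).

```python
import xml.etree.ElementTree as ET

DOC = "<doc><p id='a'>x</p><p>y</p><note>z</note></doc>"

# Boolean functions over a node: stand-ins for query-language expressions.
def is_identified_p(node):
    return node.tag == "p" and "id" in node.attrib

def is_p(node):
    return node.tag == "p"

def has_text(node):
    return bool(node.text)

def has_id(node):
    return "id" in node.attrib

# One pattern: an ordered list of (context, [(test, message), ...]) rules.
PATTERN = [
    (is_identified_p, [(has_text, "An identified p should have text.")]),
    (is_p, [(has_id, "A p should carry an id.")]),
]

def run_pattern(root, pattern):
    """Apply the first matching rule to each node and collect failed tests."""
    failures = []
    for node in root.iter():
        for matches, assertions in pattern:
            if matches(node):
                for test, message in assertions:
                    if not test(node):
                        failures.append((node.tag, node.text, message))
                break  # later rules in this pattern are skipped for this node
    return failures

failures = run_pattern(ET.fromstring(DOC), PATTERN)
print(failures)
```

The first p element is claimed by the first rule and passes; the second p falls through to the second rule and fails its test; the other elements match no rule and are ignored.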
Considered analytically, Schematron has two characteristic high-level abstractions: the pattern and the phase. These allow the representation of non-regular, non-sequential constraints that Part 2 cannot specify, and various dynamic or contingent constraints.
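The role of the phase can be sketched as a named selection of active patterns, so that the same document is held to different constraints at different stages of its life. In the Python below, the pattern functions, the phase names, and the sample document are all hypothetical; each pattern is reduced to a function that returns a list of messages.

```python
import xml.etree.ElementTree as ET

DOC = "<report status='draft'><title>Quarterly results</title></report>"

# Each pattern is modeled as a function from a document root to messages.
def structure(root):
    return [] if root.find("title") is not None else ["A report should have a title."]

def publication(root):
    return [] if root.get("status") == "final" else ["A released report should be final."]

PATTERNS = {"structure": structure, "publication": publication}

# A phase activates a subset of the patterns.
PHASES = {
    "drafting": ["structure"],
    "release": ["structure", "publication"],
}

def validate(root, phase):
    """Run only the patterns that are active in the chosen phase."""
    messages = []
    for pattern_id in PHASES[phase]:
        messages.extend(PATTERNS[pattern_id](root))
    return messages

root = ET.fromstring(DOC)
drafting_messages = validate(root, "drafting")
release_messages = validate(root, "release")
print(drafting_messages, release_messages)
```

The draft passes the drafting phase but not the release phase, which is the kind of contingent constraint a single fixed grammar does not express.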
This part of ISO/IEC 19757 establishes requirements for Schematron schemas and specifies when an XML document matches the patterns specified by a Schematron schema.
The structure of this part of ISO/IEC 19757 [19757-3, FCD] is as follows:
- Clause 5 describes the syntax of an ISO Schematron schema
- Clause 6 describes the semantics of a correct ISO Schematron schema; the semantics specify when a document is valid with respect to an ISO Schematron schema
- Clause 7 describes conformance requirements for implementations of ISO Schematron validators
- Annex A is a normative annex providing the Part 2 (RELAX NG) schema for ISO Schematron
- Annex B is a normative annex providing the ISO Schematron schema for constraints in ISO Schematron that cannot be expressed by the schema of Annex A
- Annex C is a normative annex providing the default query language binding to XSLT
- Annex D is a non-normative annex providing a DTD and corresponding ISO Schematron schema for a simple XML language, Schematron Validation Report Language
- Annex E is a non-normative annex providing motivating design requirements for ISO Schematron
- Annex F is a normative annex allowing certain Schematron elements to be used in external vocabularies
- Annex G is a non-normative annex with a simple example of a multi-lingual schema
About Document Schema Definition Languages (DSDL)
The "Document Schema Definition Languages (DSDL)" local reference document contains bibliographic citations for some parts of DSDL. ISO/IEC 19757 is being prepared by Joint Technical Committee ISO/IEC JTC 1, Information Technology, Subcommittee SC 34, Document Description and Processing Languages. ISO/IEC 19757 consists of the following parts, under the general title Document Schema Definition Languages (DSDL):
- Part 1: Overview
- Part 2: Grammar-based validation — RELAX NG
- Part 3: Rule-based validation — Schematron
- Part 4: Namespace-based validation dispatching language — NVDL
- Part 5: Datatypes
- Part 6: Path-based integrity constraints
- Part 7: Character repertoire description language — CRDL
- Part 8: Document schema renaming language — DSRL
- Part 9: Datatype- and namespace-aware DTDs
- Part 10: Validation management [from ISO/IEC FDIS 19757-3]
From the Introduction:
This International Standard [ISO/IEC 19757] defines a set of Document Schema Definition Languages (DSDL) that can be used to specify one or more validation processes performed against Extensible Markup Language (XML) or Standard Generalized Markup Language (SGML) documents. (XML is an application profile of SGML, ISO 8879:1986.)
A document model is an expression of the constraints to be placed on the structure and content of documents to be validated with the model. A number of technologies have been developed through various formal and informal consortia since the development of Document Type Definitions (DTDs) as part of ISO 8879, notably by the World Wide Web Consortium (W3C) and the Organization for the Advancement of Structured Information Standards (OASIS). A number of validation technologies are standardized in DSDL to complement those already available as standards or from industry.
To validate that a structured document conforms to specified constraints in structure and content relieves the potentially many applications acting on the document from having to duplicate the task of confirming that such requirements have been met. Historically, such tasks and expressions have been developed and utilized in isolation, without consideration of how the features and functionality available in other technologies might enhance validation objectives.
The main objective of this International Standard is to bring together different validation-related tasks and expressions to form a single extensible framework that allows technologies to work in series or in parallel to produce a single or a set of validation results. The extensibility of DSDL accommodates validation technologies not yet designed or specified.
In the past, different design and use criteria have led users to choose different validation technologies for different portions of their information. Bringing together information within a single XML document sometimes prevents existing document models from being used to validate sections of data. By providing an integrated suite of constraint description languages that can be applied to different subsets of a single XML document, this International Standard allows different validation technologies to be integrated under a well-defined validation policy.
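The "in series or in parallel" idea can be sketched with ordinary function composition. This does not model any DSDL part or API: the three toy validators, the `in_series` and `in_parallel` combinators, and the resulting policy are hypothetical. Each validator takes document text and returns a list of messages, with an empty list meaning the document passed.

```python
import xml.etree.ElementTree as ET

def well_formed(text):
    try:
        ET.fromstring(text)
        return []
    except ET.ParseError:
        return ["not well-formed"]

def grammar_check(text):
    # Stand-in for a grammar-based check.
    root = ET.fromstring(text)
    return [] if root.tag == "order" else ["root element should be order"]

def rule_check(text):
    # Stand-in for a rule-based check.
    root = ET.fromstring(text)
    return [] if root.find("item") is not None else ["an order should contain an item"]

def in_parallel(*validators):
    """Run every validator and merge all of their messages."""
    def run(text):
        messages = []
        for validator in validators:
            messages.extend(validator(text))
        return messages
    return run

def in_series(*validators):
    """Run validators in order and stop at the first one that reports a problem."""
    def run(text):
        for validator in validators:
            messages = validator(text)
            if messages:
                return messages
        return []
    return run

# Hypothetical policy: check well-formedness first, then run both checks together.
policy = in_series(well_formed, in_parallel(grammar_check, rule_check))

print(policy("<order"))
print(policy("<invoice/>"))
print(policy("<order><item/></order>"))
```

Running the well-formedness check first protects the later validators from input they cannot parse, while the parallel stage reports grammar and rule problems in one pass.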
Principal references:
- ISO/IEC 19757-3 Document Schema Definition Languages (DSDL) — Part 3: Rule-Based Validation — Schematron. PDF. Also in HTML format.
- Schematron specification reference page
- Schematron Resources
- Schematron reference page. From Academia Sinica Computing Centre, Taibei.
- "Comment Disposition for the voting on JTC 1/SC 34 N 538 -- Document Schema Definition Languages (DSDL) -- Part 3: Rule-based validation (Schematron)." 2004-10-27. ISO/IEC JTC 1/SC 34 N0553.
- "Summary of Voting on JTC 1/SC 34 N 524 -- Document Schema Definition Languages (DSDL) -- Part 3: Rule-base validation (Schematron)." 2004-09-10. ISO/IEC JTC 1/SC 34 N0538.
- CD for National Body ballot, ISO/IEC JTC 1/SC 34 N0524, and PDF
- ISO/IEC JTC 1/SC 34 — Document Description and Processing Languages — Secretariat Manager's Report
- Contact: Rick Jelliffe
- "Yet More Schematron Activity." By Uche Ogbuji. From O'Reilly Developer Weblogs. October 08, 2004.
- "Discover the Flexibility of Schematron Abstract Patterns. Advanced Schematron Features Open Up Extraordinary Possibilities for XML Schemata." By Uche Ogbuji (Principal Consultant, Fourthought, Inc). From IBM developerWorks. October 08, 2004.
- "Schematron 1.5: Looking Under the Hood." By Bob DuCharme. From XML.com (October 06, 2004).
- "ISO DSDL Overview." By Eric van der Vlist (Dyomedea). Presentation given at XML Europe 2004. Full text in PDF format. "The notion of 'validation' of XML documents covers too many different aspects (structure, content, integrity, business rules, ...) to be performed by a single schema language. Furthermore, even when a single language is used, it is often the case that documents needs to be transformed, split or normalized to keep the schemas simpler. The ISO DSDL project (ISO/IEC JTC 1 SC 34 WG 1) is standardizing a set of specific and simple schema and pre validation transformation languages and a framework to define how these operations must be applied. These languages include well known technologies such as Relax NG and Schematron as well as new languages. This talk gives a full project overview, explaining the goal of each of the parts and present the latest developments of DSDL..."
- "XML Schemas" - Main reference page.
- "Document Schema Definition Languages (DSDL)" - Main reference page.
- "Schematron: XML Structure Validation Language Using Patterns in Trees" - Main reference page.