[June 02, 2001] RELAX NG is a "specification for a language that validates XML documents," otherwise characterized as a "simple schema language for XML" which focuses upon description and validation of the structure and content of an XML document without attempting to specify application processing semantics.
[December 15, 2003] RELAX NG XML Schema Language Published as an ISO Standard (DSDL Part 2). A posting from James Clark announces the publication of the RELAX NG specification as an ISO standard, being Part 2 'Regular-Grammar-Based Validation' of the multi-part ISO/IEC 19757 Document Schema Definition Language (DSDL). In Clark's vision, the RELAX NG schema language is "based firmly on the labelled-tree abstraction," distinguished from other XML schema languages by what it leaves out; "in RELAX NG, the syntax and minimal labelled-tree abstraction implicit in that syntax are at the center of XML processing." According to the DSDL Part 2 abstract, ISO/IEC 19757-2:2003 "specifies RELAX NG, a schema language for XML. A RELAX NG schema specifies a pattern for the structure and content of an XML document. The pattern is specified by using a regular tree grammar. A RELAX NG schema is itself an XML document. ISO/IEC 19757-2:2003 specifies (1) when an XML document is a correct RELAX NG schema and (2) when an XML document is valid with respect to a correct RELAX NG schema." RELAX NG is supported by a growing collection of software tools, including validators, conversion utilities, code generators, and XML editors. ISO/IEC 19757-2:2003 is Part 2 of a planned ten-part ISO standard which will include "Rule-Based Validation: Schematron" (Part 3) as well. The goal of ISO SC34/WG1 (Document Description and Processing Languages, Information Description) in developing Document Schema Definition Languages (DSDL) is "to create a framework within which multiple validation tasks of different types can be applied to an XML document in order to achieve more complete validation results than just the application of a single technology."
[November 21, 2002] RELAX NG Compact Syntax Published as an OASIS Committee Specification. The OASIS RELAX NG Technical Committee has released a committee specification for RELAX NG Compact Syntax. Edited by James Clark, the committee specification describes a compact, non-XML syntax for the RELAX NG Specification (OASIS Committee Specification 3-December-2001). The compact syntax is specified by a grammar in BNF; the translation into the XML syntax is specified by annotations in the grammar. "The goals of this compact syntax are to: (1) maximize readability; (2) support all features of RELAX NG -- it must be possible to translate a schema from the XML syntax to the compact syntax and back without losing significant information; (3) support separate translation -- a RELAX NG schema may be spread amongst multiple files, it must be possible to represent each of the files separately in the compact syntax, and the representation of each file must not depend on the other files. The compact syntax has similarities to W3C XQuery 1.0 Formal Semantics, to Regular Expression Types for XML (XDuce), and to the DTD syntax of XML 1.0. The body of the document contains an informal description of the syntax and how it maps onto the XML syntax. Developers should consult Appendix A for a complete, rigorous description. The non-normative Appendix B presents an example compact syntax RELAX NG schema for RELAX NG."
[December 03, 2001] OASIS Releases RELAX NG Version 1.0 XML Language Validation Specification. The OASIS RELAX NG Technical Committee has produced a version 1.0 Committee Specification for RELAX NG, a simple schema language for XML. The three principal work products completed in the version 1.0 release include: (1) the RELAX NG Specification itself, which supplies the definitive specification of RELAX NG, a simple schema language for XML based on RELAX and TREX; (2) a RELAX NG Tutorial; (3) a RELAX NG DTD Compatibility document, which defines datatypes and annotations for use in RELAX NG schemas; the purpose of these datatypes and annotations is to support some of the features of XML 1.0 DTDs that are not supported directly by RELAX NG. RELAX NG "offers a complementary alternative to the W3C XML Schema Recommendation, providing an option for developers who value ease-of-use and a middle ground for those adopting multiple schema languages." According to James Clark, 'The key to RELAX NG's simplicity lies in the fact that it does not have any mechanisms specific to particular XML applications. Instead, RELAX NG concentrates on the syntax of XML documents. This opens RELAX NG to as wide a variety of applications as XML itself.' "Publication RELAX NG and XML 1.0 as JIS (Japanese Industrial Standards) is under consideration, and INSTAC (Japanese Information Technology Research and Standardization Centre) plans to prepare the draft." James Clark has updated several software tools and resources to support XML processing under the RELAX NG 1.0 specification. [Full context]
[August 15, 2001] RELAX NG Version 0.9 Released for Two-Month Review and Implementation Period. James Clark (OASIS RELAX NG Technical Committee Chair) has posted an announcement for the release of the RELAX NG Version 0.9 specification. The technical committee has "allocated a period of two months for public comment and implementation. At the end of this period, the team plans to resolve all comments received and release RELAX NG version 1.0." A RELAX NG Tutorial has also been published as an OASIS Committee Specification. Appendices in this tutorial document provide summary comparisons of RELAX NG with XML DTDs, RELAX Core, and TREX. RELAX NG is "a simple schema language for XML, based on RELAX and TREX. A RELAX NG schema specifies a pattern for the structure and content of an XML document; a RELAX NG schema thus identifies a class of XML documents consisting of those documents that match the pattern... The key features of RELAX NG are that it is simple, easy to learn, uses XML syntax, does not change the information set of an XML document, supports XML namespaces, treats attributes uniformly with elements so far as possible, has unrestricted support for unordered content, has unrestricted support for mixed content, has a solid theoretical basis, and can partner with a separate datatyping language. RELAX NG itself performs only validation: it does not change the infoset of an XML document. Most of the features of XML 1.0 DTDs that are not supported by RELAX NG involve modification to the infoset. In XML 1.0, validation and infoset modification are combined in a monolithic XML processor. It is a goal of the [RELAX NG] specification to provide a clean separation between validation and infoset modification, so that a wide variety of implementation scenarios are possible." [Full context]
Comparison of RELAX NG with XML DTDs. "RELAX NG provides functionality that goes beyond XML DTDs. In particular, RELAX NG (1) uses XML syntax to represent schemas; (2) supports datatyping; (3) integrates attributes into content models; (4) supports XML namespaces; (5) supports unordered content; (6) supports context-sensitive content models; (7) has improved support for cross-references. RELAX NG does not support features of XML DTDs that involve changing the infoset of an XML document. In particular, RELAX NG (1) does not allow defaults for attributes to be specified; (2) does not (so!) allow entities to be specified; (3) does not (so!) allow notations to be specified; (4) does not specify whether white-space is significant. Also RELAX NG does not define a way for an XML document to associate itself with a RELAX NG pattern." [From the June 08, 2001 RELAX NG Tutorial, edited by James Clark and Makoto MURATA; see posting 27-June-2001 for the crucial missing "not" in tutorial drafts, bis.]
[July 05, 2001] Initial Release of a RELAX NG Working Draft Specification. James Clark has announced the release of an initial working draft specification for RELAX NG. Edited by James Clark and Makoto MURATA for the OASIS TC, this working draft is not [yet] an official committee work product; comments are invited. The document presents "the definitive specification of RELAX NG, a simple schema language for XML, based on RELAX and TREX. A RELAX NG schema specifies a pattern for the structure and content of an XML document. The WD specifies (1) when an XML document is a correct RELAX NG schema, and (2) when an XML document is valid with respect to a correct RELAX NG schema. Section 2 describes the RELAX NG data model, which is the abstraction of an XML document used throughout the rest of the document. Section 3 describes the syntax of a RELAX NG schema; any correct RELAX NG schema must conform to this syntax. Section 4 describes a sequence of transformations that are applied to simplify a RELAX NG schema; applying the transformations also involves checking certain restrictions that must be satisfied by a correct RELAX NG schema. Section 5 describes the syntax that results from applying the transformations; this simple syntax is a subset of the full syntax. Section 6 describes the semantics of a correct RELAX NG schema that uses the simple syntax; the semantics specify when an element is valid with respect to a RELAX NG schema. Section 7 describes restrictions in terms of the simple syntax; a correct RELAX NG schema must be such that, after transformation into the simple form, it satisfies these restrictions. Finally, Section 8 describes conformance requirements for RELAX NG validators." Appendix A supplies the proposed RELAX NG schema for RELAX NG. [Full context]
Because RELAX NG represents the unification of TREX and RELAX Core, its (early) development may be understood, in large measure, by reviewing Tree Regular Expressions for XML (TREX) and REgular LAnguage description for XML (RELAX) as of Q1 2001. The 2001-06-01 tutorial for RELAX NG referenced below delineates some sixteen differences between TREX and RELAX NG; one may see that the changes relate to design/implementation details rather than philosophy. The purpose of the TREX TC (renamed RELAX NG TC) as originally chartered: "to create a specification for a schema language for XML based on the TREX proposal (http://www.thaiopensource.com/trex/). The key features of TREX are that it: (1) is simple, (2) is easy to learn, (3) uses XML syntax, (4) does not change the information set of an XML document, (5) supports XML namespaces, (6) treats attributes uniformly with elements so far as possible, (7) has unrestricted support for unordered content, (8) has unrestricted support for mixed content, (9) has a solid theoretical basis, (10) can partner with a separate datatyping language [such as W3C XML Schema Datatypes]..."
[June 05, 2001] TREX and RELAX Unified as RELAX NG, a Lightweight XML Language Validation Specification. Significant progress has been made on the specification for 'RELAX NG' since the April 2001 announcement by the TREX and RELAX design teams declaring their intent to unify the two similar structure-validation languages. The OASIS Technical Committee originally chartered under the name TREX has been named RELAX NG, and key draft documents have been published as sketches for the new validation language. These include a RELAX NG Tutorial, a RELAX NG Formal Semantics specification, and a draft RELAX NG schema for RELAX NG. The goals for RELAX NG are summarized in a recent announcement from the TC: "Members of the OASIS TREX Technical Committee announced their decision to integrate TREX (Tree Regular Expressions for XML) and RELAX (REgular LAnguage description for XML) in order to collaborate on a unified lightweight specification for validating XML-based languages. They renamed their work RELAX NG. RELAX was initially developed at the Information Technology Research and Standardization Centre (INSTAC) in Japan, which advances Japanese national standards for XML under the auspices of the Japanese Standard Association (JSA). TREX was created by James Clark, widely regarded as one of the most prolific contributors to the field of structured information standards. Clark decided to continue development of his schema language at the OASIS XML interoperability consortium in March 2001. 'RELAX and TREX both focus on simplicity,' said James Clark, chair of what is now the OASIS RELAX NG Technical Committee. 'RELAX NG will remain straightforward and easy to use, incorporating the best of TREX and RELAX.' Said Murata Makoto, one of the original developers of RELAX: 'It is important to note that RELAX NG is not intended to replace the W3C XML Schema Recommendation. Instead, it represents a lightweight alternative to Schema. 
We believe that users are likely to adopt multiple schema languages, and many will find RELAX NG fills a very important need.' According to the OASIS technical committee, the specification offers a middle ground that will make RELAX NG a useful tool for many developers. The team is interested in facilitating conversion among DTDs, XML Schema and RELAX NG. 'RELAX NG fits in well with the W3C XML Schema Formal Description,' added Clark. 'Our hope is that RELAX NG will be a constructive influence on the future development of XML Schema'." [Full context]
In April 2001, it was decided that RELAX Core and TREX (Tree Regular Expressions for XML) would be unified, since the two are very similar as structure-validation languages. The unified TREX/RELAX language will be called RELAX NG [for "Relax Next Generation," pronounced "relaxing"]. This design work is now being conducted within the OASIS TREX [now: RELAX NG] Technical Committee, where a (first) specification is expected by July 1, 2001. The OASIS TC has also been renamed 'RELAX NG' [mailing list: 'email@example.com'] to reflect the new name of the unified TREX/RELAX language. The RELAX NG development team plans to submit the OASIS specification to ISO, given the importance of ISO standards in Europe.
A snapshot of RELAX NG extracted from James Clark's tutorial of June 01, 2001:
RELAX NG is a simple schema language for XML, based on RELAX and TREX. A RELAX NG schema specifies a pattern for the structure and content of an XML document. A RELAX NG schema thus identifies a class of XML documents consisting of those documents that match the pattern. A RELAX NG schema is itself an XML document.
RELAX NG Non-features: The role of RELAX NG is simply to specify a class of documents, not to assist in interpretation of the documents belonging to the class. It does not change the infoset of the document. In particular, RELAX NG (1) does not allow defaults for attributes to be specified; (2) does not allow entities to be specified; (3) does not allow notations to be specified; (4) does not specify whether white-space is significant. Also, RELAX NG does not define a way for an XML document to associate itself with a RELAX NG pattern.
RELAX NG Cross references: RELAX NG generalizes the ID/IDREF feature of XML. A data pattern may have either a key or a keyRef attribute. A data pattern with a key attribute behaves like an XML ID; a data pattern with a keyRef attribute behaves like an XML IDREF. Whereas XML has a single symbol-space of IDs and IDREFs, RELAX NG has an unlimited number of named symbol-spaces. The value of the key or keyRef is an unprefixed name identifying the symbol-space. An element or attribute that matches a data pattern with a key attribute is called a key; an element or attribute that matches a data pattern with a keyRef attribute is called a key-reference. A document is invalid if it has two distinct keys in the same symbol-space with the same value; it is also invalid if it contains a key-reference that does not have a corresponding key in the same symbol-space in the same document with the same value. Whereas in XML IDs and IDREFs must be names, in RELAX NG keys and key-references may have any datatype; whether an element or attribute is a key or key-reference is orthogonal to its datatype. The values of keys and key-references are compared using the datatype specified by the data pattern. All data patterns sharing the same symbol space must specify the same value for the type attribute.
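The two validity rules for keys and key-references described above can be sketched in a few lines of Python. This is only an illustration of the rules as stated in the June 2001 tutorial draft, not code from any RELAX NG implementation; the function name check_cross_references and the (kind, symbol_space, value) tuple layout are assumptions made for the example, and values are assumed to have already been converted by the datatype named in the data pattern so that ordinary equality compares them correctly.

```python
from collections import defaultdict


def check_cross_references(entries):
    """Check key / key-reference rules over one document.

    entries: iterable of (kind, symbol_space, value) tuples, where kind is
    "key" or "keyRef". Values are assumed to be already parsed by the
    datatype of the matching data pattern (hypothetical input format).
    Returns a list of error messages; an empty list means no violation.
    """
    keys = defaultdict(set)   # symbol-space name -> set of key values seen
    references = []           # (symbol-space, value) pairs to resolve later
    errors = []

    for kind, space, value in entries:
        if kind == "key":
            # Rule 1: two keys in the same symbol-space may not share a value.
            if value in keys[space]:
                errors.append(
                    f"duplicate key {value!r} in symbol-space {space!r}")
            keys[space].add(value)
        elif kind == "keyRef":
            references.append((space, value))
        else:
            raise ValueError(f"unknown kind: {kind!r}")

    # Rule 2: every key-reference needs a key with the same value in the
    # same symbol-space, anywhere in the same document.
    for space, value in references:
        if value not in keys.get(space, set()):
            errors.append(
                f"key-reference {value!r} has no key in symbol-space {space!r}")

    return errors
```

Because each symbol-space has its own set of values, the same value may appear as a key in two different symbol-spaces without conflict, which is the generalization over the single ID/IDREF space of XML 1.0.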
RELAX NG Non-restrictions: RELAX NG does not require patterns to be "deterministic" or "unambiguous". [Cf. ambiguity and determinism as defined in SGML/XML.]
RELAX NG Nested grammars: There is no prohibition against nesting grammar patterns. A ref pattern refers to a definition from the nearest grammar ancestor. There is also a parentRef element that escapes out of the current grammar and references a definition from the parent of the current grammar. Imagine the problem of writing a pattern for tables. The pattern for tables only cares about the structure of tables; it doesn't care about what goes inside a table cell. First, we create a RELAX NG pattern table.rng... [see the example]
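The scoping difference between ref and parentRef can be modelled with a small Python class. This is a toy model of the lookup rule described above, not a RELAX NG parser; the class name Grammar, its method names, and the use of plain strings as stand-ins for patterns are all assumptions for illustration.

```python
class Grammar:
    """Toy model of a grammar pattern: named definitions plus an optional
    enclosing (parent) grammar. Pattern bodies are plain strings here."""

    def __init__(self, definitions, parent=None):
        self.definitions = dict(definitions)
        self.parent = parent

    def resolve_ref(self, name):
        # ref: look only in the nearest enclosing grammar (this one);
        # there is no fall-through to outer grammars.
        return self.definitions[name]

    def resolve_parent_ref(self, name):
        # parentRef: escape exactly one level, to the parent grammar.
        if self.parent is None:
            raise LookupError("parentRef used in a grammar with no parent")
        return self.parent.definitions[name]


# Usage in the spirit of the table example: the nested table grammar does
# not care what a cell contains, so its cell content is taken from the
# enclosing document grammar through a parentRef.
document = Grammar({"inline": "text or emphasis"})
table = Grammar({"cell": "element cell"}, parent=document)

cell_pattern = table.resolve_ref("cell")              # found in table grammar
cell_content = table.resolve_parent_ref("inline")     # found in document grammar
```

In this model a ref to "inline" from inside the table grammar raises KeyError, which mirrors the rule that ref never reaches past the nearest grammar ancestor.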
RELAX NG Datatyping: RELAX NG allows patterns to reference externally-defined datatypes, such as those defined by W3C XML Schema Part 2. RELAX NG implementations may differ in what datatypes they support. You must use datatypes that are supported by the implementation you plan to use. The data pattern matches a string that represents a value of a named datatype. The datatypeNamespace attribute contains a URI identifying the collection of datatypes being used. The datatype collection defined by W3C XML Schema Part 2 would be identified by the URI http://www.w3.org/2001/XMLSchema-datatypes. The type attribute specifies the name of the datatype in the collection identified by the datatypeNamespace attribute. For example, if a RELAX NG implementation supported the built-in datatypes of W3C XML Schema Part 2, you could use: <element name="number"> <data type="integer" datatypeNamespace="http://www.w3.org/2001/XMLSchema-datatypes"/> </element>. It is inconvenient to specify the datatypeNamespace attribute on every data element, so RELAX NG allows the datatypeNamespace attribute to be inherited. The datatypeNamespace attribute can be specified on any RELAX NG element. If a data element does not have a datatypeNamespace attribute, it will use the value from the closest ancestor that has a datatypeNamespace attribute. Typically, the datatypeNamespace attribute is specified on the root element of the RELAX NG pattern...
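The inheritance rule for datatypeNamespace can be illustrated with the Python standard library. The sketch below walks a schema fragment and reports, for each data element, the datatypeNamespace it would effectively use under the "closest ancestor" rule quoted above. The function name effective_datatype_namespaces and the sample schema are assumptions for the example; the attribute name follows the June 2001 draft quoted here and may differ in other versions of the specification.

```python
import xml.etree.ElementTree as ET

XSD_DATATYPES = "http://www.w3.org/2001/XMLSchema-datatypes"

# Hypothetical schema fragment: the namespace is declared once on the root,
# and one data element overrides it locally.
SAMPLE_SCHEMA = f"""
<element name="point" datatypeNamespace="{XSD_DATATYPES}">
  <element name="x"><data type="integer"/></element>
  <element name="label">
    <data type="token" datatypeNamespace="urn:example:other-types"/>
  </element>
</element>
"""


def effective_datatype_namespaces(schema_xml):
    """Return [(type name, effective datatypeNamespace)] for each data
    element, in document order. None means no ancestor declared one."""
    results = []

    def walk(elem, inherited):
        # An element's own attribute wins; otherwise use the inherited value.
        current = elem.get("datatypeNamespace", inherited)
        # Accept both un-namespaced and namespace-qualified 'data' elements.
        if elem.tag == "data" or elem.tag.endswith("}data"):
            results.append((elem.get("type"), current))
        for child in elem:
            walk(child, current)

    walk(ET.fromstring(schema_xml), None)
    return results


resolved = effective_datatype_namespaces(SAMPLE_SCHEMA)
```

Passing the inherited value down the recursion avoids the need for parent pointers, which ElementTree elements do not provide.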
[June 13, 2001] Jing is a validator for RELAX NG implemented in Java. As a command-line tool, it validates an XML instance against a RELAX NG schema and reports (any) errors in a file; one may specify multiple XML files for validation in a single command. Jing is written on top of SAX2, and represents an adaptation of James Clark's validator for TREX. Jing supports validation of datatypes from W3C XML Schema Part 2. The version 2001-06-11 implementation is available for download as a JAR file and as a Win32 executable for use with the Microsoft Java VM; the sources are also available.
RELAX NG Formal Semantics. James Clark created an "inference-rule style formal semantics (like XML Schema Formal Description) for RELAX NG; this is the approximate equivalent of section 4 of the TREX specification... [He] used semantic markup so it will be easy to completely change the notation. The inference rule notation can look a little daunting if you haven't seen it before, but it's really quite easy. An inference rule says that if all the judgements above the line are true, then the judgment below the line is. All the variables occurring in the rule are implicitly universally quantified...." See the explanation in the associated posting of 2001-06-02.
[May 30, 2001] "RELAX NG is a simple schema language for XML, based on RELAX and TREX. A RELAX NG schema specifies a pattern for the structure and content of an XML document. A RELAX NG schema thus identifies a class of XML documents consisting of those documents that match the pattern. A RELAX NG schema is itself an XML document... RELAX NG Non-features: The role of RELAX NG is simply to specify a class of documents, not to assist in interpretation of the documents belonging to the class. It does not change the infoset of the document. In particular, RELAX NG does not allow defaults for attributes to be specified, does not allow entities to be specified, does not allow notations to be specified, [and] does not specify whether white-space is significant. Also, RELAX NG does not define a way for an XML document to associate itself with a RELAX NG pattern." Note tutorial section 17, 'Differences from TREX': "(1) the concur pattern has been removed; (2) the string pattern has been replaced by the value pattern; (3) the anyString pattern has been renamed to text; (4) the namespace URI is different; (5) pattern elements must be namespace qualified; (6) anonymous datatypes have been removed; (7) the data pattern can have parameters specified by param child elements; (8) zeroOrMoreTokens patterns have been added for matching whitespace-separated sequences of tokens; (9) the data pattern can have a keyRef attribute; (10) the group values for the combine attribute have been removed; (11) an include element in a grammar may contain define elements that replace included definitions." [from the RELAX NG Tutorial 2001-05-25]
Articles, Papers, News, History
[November 17, 2008] James Clark 2008-11: Working on Jing and Trang. Blog article. November 09, 2008. I've been back to working on Jing (A RELAX NG Validator in Java) and Trang (Multi-format Schema Converter Based on RELAX NG) for about a month now. It would be something of an understatement to say that they were badly in need of some maintenance love: It's been five years since the last release... I started a jing-trang project on Google Code to host future development. There are new releases of both Jing and Trang in the downloads section of the project site. The code base for Jing and Trang had evolved over a number of years, incorporating various bits of functionality that were independent of each other to various degrees; its structure only made any sense from a historical perspective. The current structure is now nicely modular. I converted my CVS repository to subversion before I started moving things around, so the complete history is available in the project repository. For people who want to stay on the bleeding edge, it's now really easy to check out and build from subversion. My natural tendencies are much more to the cathedral than to the bazaar, but I'm trying to be more open. I'm pleased to say that there are already two committers in addition to myself. There's a commercial XML editor called <oXygen/>, which uses Jing and Trang to support RELAX NG. The main guy behind that, George Bina, had made a number of useful improvements. In particular, he upgraded Jing's support for the Namespace Routing Language to its ISO-standardized version, which is called NVDL (you might want to start with this NVDL tutorial rather than the spec). This is now on the trunk. The other committer is Henri Sivonen, who has been using Jing in his Validator.nu service.
My goals for the next release are: (1) complete support for NVDL (I think the only missing feature is inline schemas); (2) support for the ISO-standardized version of Schematron; (3) customizable resource resolution support (so that, for example, you can use XML catalogs); (4) support standard JAXP XML validation API (javax.xml.validation); (5) more code cleanup. Please use the issue tracker to let me know what you would like. Google Code has a system that allows you to vote for issues: if you are logged in, which you can do with a regular Google account, each issue will be displayed with a check box next to a star; checking this box "stars" the issue for you, which both adds a vote for the issue and gets you email notifications about changes to it...
[June 29, 2005] "Documenting Relax NG." By Sebastian Rahtz. Posting to 'rng-users' Discussion List. June 30, 2005. Sebastian Rahtz (Information Manager, Oxford University Computing Services) reports that the Text Encoding Initiative 'P5' XML schemas "are maintained using a module of the TEI called 'tagdocs', which allows for describing elements, attributes, classes of elements, 'entities', and so on. The whole of the large TEI Guidelines are written in the same single document; hence the name of 'ODD' for this system (One Document Does it all). The ODD format reverts to Relax NG markup for specifying the content model of elements. From this we derive, as needed, (1) documentation (HTML, PDF, LaTeX, TEI XML etc); (2) Relax NG schemas (RNC using trang); (3) XSD schemas (using trang); (4) DTDs (direct translation). There is a complex and powerful system for writing customizations (additions, deletions, changes, internationalization), expressed in the same language."
[November 23, 2004] "RELAX NG With Custom Datatype Libraries. Define New Types With Java Technology." By Elliotte Rusty Harold (Adjunct Professor, Polytechnic University). From IBM developerWorks (November 23, 2004). "The RELAX NG XML schema language has achieved huge success over the past three years; this is due in large part to its incredibly clean and straightforward syntax, especially compared to the W3C XML Schema language. Numerous groups, including OpenOffice, DocBook, and the Text Encoding Initiative, have adopted the RELAX NG schema language. RELAX NG has even begun to replace W3C schemas within the W3C, where both the SVG and XHTML working groups are writing their schemas in RELAX NG, then translating them to DTDs and W3C XML Schemas. While RELAX NG doesn't mandate support for XML schema datatypes, in practice, major implementations such as Jing and Sun's Multischema Validator do support them. However, in all the excitement over how much better RELAX NG does the same things as the W3C XML Schema language, the fact that it can actually do quite a bit more has been overlooked. In particular, unlike the W3C XML Schema language, RELAX NG is not limited to one preordained collection of primitive data types with a limited set of facets for extension. RELAX NG enables developers to define custom type libraries that can assert any constraints a program can verify... You can write libraries that contain more than one simple type, and you can define types that have more complex validation rules; but all any type library requires is a few classes to define the type and set up the factories to load it. Here I've demonstrated validation with a command-line user interface (UI), but you can also validate with a graphical user interface (GUI) tool or integrate validation into your own programs using the Java API for RELAX Verifiers (JARV) or the Java API for XML Processing (JAXP) 1.3 validation package. 
Because type libraries are loaded dynamically using the services API, you don't need to change your Java code at all. Simply place the type library JAR in the classpath, then reference the types in your schemas. You are no longer limited to the W3C simple data types. You can validate absolutely any string that conforms to any decidable set of rules. You can mold the type library to fit your business rules instead of trimming the business rules to fit the schema language..."
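The idea of a custom datatype library, a named collection of types whose membership test is arbitrary program code, can be sketched independently of the Java interfaces the article uses. The Python below is a loose analog only and does not reproduce the Java datatype API or the services mechanism mentioned above; the registry, the function names register_datatype and is_valid, the library URI urn:example:business-types, and the evenInteger type are all invented for illustration.

```python
import re

# Registry keyed by (library URI, type name); each entry is a predicate
# that decides whether a whitespace-trimmed string belongs to the type.
DATATYPE_LIBRARIES = {}


def register_datatype(library_uri, type_name, predicate):
    """Add one type to a (hypothetical) datatype library."""
    DATATYPE_LIBRARIES[(library_uri, type_name)] = predicate


def is_valid(library_uri, type_name, text):
    """Return True if text is a valid lexical form of the named type."""
    try:
        predicate = DATATYPE_LIBRARIES[(library_uri, type_name)]
    except KeyError:
        raise LookupError(
            f"no datatype {type_name!r} in library {library_uri!r}") from None
    return bool(predicate(text.strip()))


EXAMPLE_LIBRARY = "urn:example:business-types"


def _is_even_integer(text):
    # Any decidable rule can go here; this one accepts even integers only.
    return re.fullmatch(r"[+-]?\d+", text) is not None and int(text) % 2 == 0


register_datatype(EXAMPLE_LIBRARY, "evenInteger", _is_even_integer)
```

A schema would then name the library URI and the type, and a validator would call the registered predicate for each matching string; the schema language itself needs no knowledge of the rule.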
[June 19, 2003] Namespace Routing Language (NRL) Supports Multiple Independent Namespaces. James Clark has announced the publication of a Namespace Routing Language (NRL) specification. NRL is "an XML language for combining schemas for multiple namespaces; it allows the schemas that it combines to use arbitrary schema languages." The release includes a tutorial and specification document and a sample implementation in the Jing (RELAX NG Validator in Java) distribution. NRL "is the successor to Clark's Modular Namespaces (MNS) language and is intended to be another step on the path towards Document Schema Definition Languages (DSDL) Part 4." The W3C XML Namespaces Recommendation itself "allows an XML document to be composed of elements and attributes from multiple independent namespaces: each of these namespaces may have its own schema and the schemas for different namespaces may be in different schema languages. The problem then arises of how the schemas can be composed in order to allow validation of the complete document." The Namespace Routing Language attempts to solve this problem. Among the features and benefits of NRL: it supports schema language coexistence, allows extension of schemas not designed to be extended, makes authoring of extensible schemas easier, supports 'transparent' namespaces, allows contextual control of extension, and allows concurrent validation. "For RELAX NG, it can be used to provide some of the namespace-based modularity features that are built-in to XSD. NRL is designed to allow an implementation to stream, and the sample implementation does so. The sample implementation has a SAX-based plug-in architecture that allows new schema languages to be added dynamically. It comes with support for RELAX NG (both XML and compact syntax), W3C XML Schema (via a wrapper around Xerces-J), Schematron, and (recursively) NRL; it can also use any schema language with an implementation that supports the JARV interface."
[May 26, 2003] "XHTML 2.0." W3C Working Draft 6-May-2003. Edited by Jonny Axelsson (Opera Software), Beth Epperson (Netscape/AOL), Masayasu Ishikawa (W3C), Shane McCarron (Applied Testing and Technology), Ann Navarro (WebGeek, Inc), and Steven Pemberton (CWI - HTML Working Group Chair). Latest version URL: http://www.w3.org/TR/xhtml2. "XHTML 2 is a general purpose markup language designed for representing documents for a wide range of purposes across the World Wide Web. To this end it does not attempt to be all things to all people, supplying every possible markup idiom, but to supply a generally useful set of elements. It provides the possibility of extension using the span and div elements in combination with stylesheets... This version includes an early implementation of XHTML 2.0 in RELAX NG, but does not include the implementations in DTD or XML Schema form. Those will be included in subsequent versions, once the content of this language stabilizes. This version also does not address the issues revolving around the use of XLINK by XHTML 2..."
[May 26, 2003] "XML Matters: Kicking Back with RELAX NG, Part 3. Compact Syntax and XML Syntax." By David Mertz, Ph.D. (Facilitator, Gnosis Software, Inc). From IBM developerWorks, XML zone. May 14, 2003. See also Part 1 and Part 2 in the series 'Kicking Back with RELAX NG'. ['The RELAX NG compact syntax provides a much less verbose, and easier to read, format for describing the same semantic constraints as RELAX NG XML syntax. This installment looks at tools for working with and transforming between the two syntax forms.'] "Readers of my earlier installments on RELAX NG will have noticed that I chose to provide many of my examples using compact syntax rather than XML syntax. Both formats are semantically equivalent, but the compact syntax is, in my opinion, far easier to read and write. Moreover, readers of this column in general will have a sense of how little enamored I am of the notion that everything vaguely related to XML technologies must itself use an XML format. XSLT is a prominent example of this XML-everywhere tendency and its pitfalls -- but that is a rant for a different column. Later in this article, I will discuss the format of the RELAX NG compact syntax in more detail than the prior installments allowed... On the downside, since the RELAX NG compact syntax is newer -- and not 100% settled at its edges -- tool support for this syntax is less complete than for the XML syntax. For example, even though the Java tool trang supports conversion between compact and XML syntax, the associated tool jing will only validate against XML syntax schemas. Obviously, it is not overly difficult to generate the XML syntax RELAX NG schema to use for validation, but direct usage of the compact syntax schema would be more convenient. Likewise, the Python tools xvif and 4xml validate only against XML syntax schemas. 
To help remedy the gaps in direct support for compact syntax, I have produced a Python tool for parsing RELAX NG compact schemas, and for outputting them to XML format. While my rnc2rng tool only does what trang does, Eric van der Vlist and Uche Ogbuji have expressed their interest in including rnc2rng in xvif and 4xml, respectively. Ideally, in the near future direct validation against compact syntax schemas will be included in these tools... In some corner cases, rnc2rng differs from trang. For example, both tools force an annotation to occur inside a root element in XML syntax, even if the annotation line occurs before the root element in the compact syntax. Since well-formed XML documents are single-rooted, this is a necessity. But trang also moves comments in a similar manner, while rnc2rng does not. At a minimum, the two tools use whitespace in a slightly different manner. Most likely, a few other variations exist, but ideally none that are semantically important..." Article also in PDF format. See the column listing for other articles in 'XML Matters'.
[March 28, 2003] "XML Matters: Kicking Back with RELAX NG, Part 2. Tools and Special Issues." By David Mertz, Ph.D. (Facilitator, Gnosis Software, Inc). From IBM developerWorks, XML zone. March 26, 2003. ['RELAX NG schemas provide a more powerful, concise, and semantically straightforward means of describing classes of valid XML instances than do W3C XML Schemas. In this installment, David continues the discussion of RELAX NG begun in part 1 of this series by addressing a few additional semantic issues and looking at tools for working with RELAX NG.'] "In the last installment I gave you a fairly complete overview of both the syntax and semantics of RELAX NG schemas. However, a few issues were glossed over, and are worth looking at more closely. Both DTDs and W3C XML Schemas allow for infoset augmentation, while RELAX NG does not. James Clark, one of the creators of RELAX NG (and many other widely used XML tools), argues vehemently that infoset augmentation violates modularity in the roles of XML instance documents and schemata. In other words, for Clark, RELAX NG has a feature where DTDs and W3C Schemas have a bug. My own feelings on the matter are mixed, but I can understand his intuition... Unfortunately, XML editors do not yet support RELAX NG as widely as they do W3C XML Schemas. Of course, DTDs remain much more widely supported than either of these schema styles. This is a shame because it would actually be far easier to include customizations around RELAX NG in an editor because of the simple conceptual framework of RELAX NG validation. Ideally, a custom XML editor would utilize a RELAX NG schema to direct and assist a user in the insertion of attributes and elements in ways that maintain validity. One compromise would be to use a tool like trang to convert a RELAX NG schema into a W3C XML Schema or DTD that approximates it, then use those within a GUI XML editor. But doing so would help only to a limited extent. 
One XML editor is built around RELAX NG -- the Java technology-based XML Operator... I played with it a little, and found that it could be potentially useful, but it would fall on the low end of the XML editors I have previously reviewed; XML Operator implements just a few features here and there, and provides neither XML Spy's huge array of tools, nor oXygen's simple elegance... In part 1 and here in part 2, I have looked at most of the elements of RELAX NG, and included a summary of tools for working with it. The third and final installment will touch briefly on how RELAX NG lets you include external schemas in your schema, and selectively merge the specifications of different schemas. But part 3 will primarily look at the RELAX NG compact syntax in more detail, and explain the exact correspondences between compact syntax and XML syntax..."
[February 26, 2003] "XML Matters: Kicking back with RELAX NG, Part 1. Doing Better Than the W3C XML Schema." By David Mertz, Ph.D. (Idempotentate, Gnosis Software, Inc). From IBM developerWorks, XML zone. February 2003. ['RELAX NG schemas provide a more powerful, more concise, and semantically more straightforward means of describing classes of valid XML instances than do W3C XML Schemas. The virtue of RELAX NG is that it extends the well-proven semantics of DTDs while allowing orthogonally extensible datatypes and easy composition of related instance models. David Mertz takes a first look at RELAX NG in this, the first installment of a three-part series.'] "I have long been wary of W3C XML Schemas, and to some extent of XML itself. A jumble of companies and groups with divergent interests and backgrounds cobbled together the W3C XML Schema specification by throwing in a little bit of everything each party wanted, creating a typical committee-designed, difficult-to-understand standard. In fact, I have so many reservations that I generally recommend sticking with DTDs for validation needs, and filling any gaps strictly at an application level. About a month ago, however, I started taking a serious look at RELAX NG. Like many readers, I had heard of this alternative schema language previously, but I had assumed that RELAX NG would be pretty much more of the same, with slightly different spellings. How wrong I was. RELAX NG is simply better than either W3C XML Schemas or DTDs in nearly every way! In fact, RELAX NG's ability to support unordered (or semi-ordered) content models answers most of my prior concerns about the mismatch between the semantic models of OOP datatypes and the linearity of XML elements. This article is the first of three XML Matters installments that discuss RELAX NG. This installment will look at the general semantics of RELAX NG, and touch on datatyping. The second installment will look at tools and libraries for working with RELAX NG. 
The final installment will discuss the RELAX NG compact syntax in more detail... The semantics of RELAX NG are enormously straightforward -- in this respect, they are a natural extension of DTD semantics. What a RELAX NG schema describes is patterns that consist of quantifications, orderings, and alternations. In addition, RELAX NG introduces a pattern for unordered collection, which neither DTDs nor W3C XML Schemas support (SGML does, but less flexibly than RELAX NG). Moreover, RELAX NG treats elements and attributes in an almost uniform manner. Element/attribute uniformity corresponds much better with the conceptual space of XML than does the rigid separation in both DTDs and W3C XML Schemas. In actual design, the choice between use of an attribute and an element body is frequently underdetermined by design considerations and/or is contextually sensitive..."
[February 26, 2003] "Relax NG." By Eric van der Vlist. Website for the online book. 2002-2003. Work in progress, with substantial content as of 2003-02-26. "Relax NG is a book in progress written by Eric van der Vlist for O'Reilly and submitted to an open review process. The result of this work will be freely available on the World Wide Web under a Free Documentation Licence (FDL). The subject of this book, Relax NG (http://relaxng.org), is an XML schema language developed by the OASIS RELAX NG Technical Committee and recently accepted as Draft International Standard 19757-2 by the Document Description and Processing Languages subcommittee (DSDL) of the ISO/IEC Joint Technical Committee 1 (ISO/IEC JTC 1/SC 34/WG 1)..." See also: (1) "Document Schema Definition Languages (DSDL)"; (2) general references in "XML Schemas."
[January 23, 2003] Trang Multi-Format Schema Converter Supports DTD to W3C XML Schema Conversion. A posting from James Clark to the XML-DEV List announces a new release of Trang, Clark's Multi-Format Schema Converter based on RELAX NG. The conversion tool supports several schema languages for XML: RELAX NG (XML syntax), RELAX NG compact syntax, XML 1.0 DTDs, and W3C XML Schema. With one exception, Trang will convert between any of these formats (W3C XML Schema is supported for output only, not for input). "Trang is written in Java, and available under a BSD-style license. In this release, [Clark has] added an input module for DTDs based on his DTDinst program; this implies that Trang can now convert directly from DTDs to W3C XML Schema (XSD)." Clark identifies three unique features of Trang: "(1) it can reliably turn parameter entities into the higher-level semantic constructs available in XSD (simple types, groups, attribute groups) -- even in the presence of arbitrarily deep nesting of parameter entity references within parameter entity declarations; (2) it supports namespaces, including DTDs that mix multiple namespaces; (3) it can create good-quality, idiomatic XSD, which takes advantage of features such as substitution groups."
[December 09, 2002] "RELAX NG: DTDs On Warp Drive." By John Cowan (Senior Internet Systems Developer, Reuters Health Information, USA). Portions written by James Clark (used by permission). From the tutorial given at XML 2002. 114 slides. Also in derived PDF format. Abstract: "In this tutorial you will learn how to use the RELAX NG schema language, an alternative schema language for XML. RELAX NG allows easy and intuitive descriptions of just what is and what is not allowed in an XML document. It is simple enough to learn in a few hours, and rich and flexible enough to support the design and validation of every kind of document from the very simple to the very complex. Once RELAX NG's concepts have crossed the blood-brain barrier, you will never be able to take any other schema language very seriously again... RELAX NG is an evolution and generalization of XML DTDs, and it shares the same basic paradigm. Based on experience with SGML and XML, RELAX NG both adds and subtracts features from DTDs. XML DTDs can be automatically converted into RELAX NG. Experts in designing SGML and XML DTDs will find their skills transfer easily to designing RELAX NG. Design patterns that are used in XML DTDs can be used in RELAX NG. Overall, RELAX NG is much more mature (and it is possible to have a higher degree of confidence in its design) than it would be if it were based on a completely new and different paradigm... A major goal of RELAX NG is that it be easy to learn and easy to use. Schemas can be patterned after the structure of the documents they describe, but need not be: definitions can be composed from other definitions in a variety of ways. Attributes and elements are treated uniformly as much as possible. RELAX NG supports pluggable simple datatype libraries, from a trivial one that describes only strings and tokens to the full XML Schema Part 2; new ones can be readily designed and built as needed. RELAX NG provides full support for namespaces. 
RELAX NG provides two interconvertible syntaxes, an XML one for processing, and a compact non-XML one for human authoring. RELAX NG is being standardized in OASIS by the RELAX NG Technical Committee, and is a major component of ISO DSDL, the Document Schema Definition Languages umbrella..." [cache .PPT]
[January 08, 2003] "Converting RELAX NG to W3C XML Schema." By James J. Clark (Director, Thai Open Source Software Center Ltd, Thailand). Presentation given at the XML 2002 Conference, Baltimore, MD, USA. December 11, 2002. ['These are the slides for a talk given at the XML 2002 conference in Baltimore. They have been combined into a single HTML file. The talk was designed to assess how well RELAX NG can be made to work as a mechanism for creating W3C XML Schemas.'] See also the prepared abstract for the presentation (in part): "RELAX NG, especially in its compact syntax, provides a very easy to learn and easy to use schema language for XML. On the other hand, W3C XML Schemas currently enjoys much more widespread industry support. Automatic conversion of RELAX NG to W3C XML Schema allows users to have the best of both worlds. RELAX NG is more expressive than W3C XML Schema. Thus there are RELAX NG schemas that it is impossible to exactly translate into W3C XML Schemas. However, such schemas can be 'approximated' by generating a W3C XML Schema that allows a superset of what the RELAX NG schema allows. When generating W3C XML Schema, the goal is not simply to produce a schema that validates the same documents as the original schema. It is also desirable to preserve the way that the original RELAX NG schema used defines and includes, so that the resulting W3C XML Schema is as human-understandable as possible. Ideally, the resulting schema should be similar to something that might be produced by somebody authoring directly in W3C XML Schema. Some examples of the challenges to be confronted in performing the conversion are: (1) Handling multi-namespace documents: RELAX NG allows elements and attributes from multiple namespaces to be freely mixed, whereas W3C requires a rigid segmentation of the schema into separate namespaces. 
(2) Wildcards: RELAX NG handles elements whose names are specified by wildcards in a way that is relatively uniform with other elements, whereas in W3C XML Schema wildcards are handled quite differently. (3) Attribute constraints: RELAX NG integrates attributes into content models allowing very expressive constraints, whereas W3C XML Schema supports only optional/required attributes; this requires approximation. (4) Definitions: RELAX NG provides one kind of top-level definition (using the <define> element), whereas W3C XML Schema provides many kinds of top-level definitions/declarations (element, attribute, group, attributeGroup, complexType, simpleType); the conversion has to intelligently select the appropriate kind to use..." Trang is a tool developed by Clark for translating schemas written in RELAX NG into other formats; for example, it will translate a RELAX NG schema in either the XML or the compact syntax into a DTD or into a W3C XML Schema. See "Update of Jing and Trang from James Clark."
[December 26, 2002] A posting from MURATA Makoto provides a provisional reference for an XML 2002 presentation entitled "RELAX NG Validator on Mobile Phones": "A web page created from my slides is temporarily available... My validator does not support interleave, list, value, and data yet. Its Jar file is 27KB, including kXML2 and a table created from a small RELAX NG schema. I will improve my validator and disclose its source..." See the abstract for "Implementing RELAX NG Validators as State Machines" on the IDEAlliance website: "We propose a two-phase validation for RELAX NG. The first phase creates a state machine from a RELAX NG schema. This phase can be performed without having any instance documents. The second phase validates instance documents by using this state machine. This phase does not require schemas in their original forms. This two-phase validation has three advantages. First, it becomes easier to implement RELAX NG validators on many platforms, since only the second phase has to be ported. Second, validators become lightweight, since the first phase is not necessary at run-time. Third, validation is expected to become faster, because a state machine works in a way much simpler than the schema validation. Although this approach can be easily applied to DTD-based validation, expressiveness of RELAX NG imposes significant challenges as below. (1) Non-deterministic tree (or hedge) automata: Since RELAX NG can capture any tree (or hedge) regular language, state machines are required to mimic non-deterministic tree (or hedge) automata. (2) Attribute-element constraints: RELAX NG allows elements and attributes to be freely combined in one content model. Although such compound content models are powerful in expressing constraints between attributes and elements, our state machines have to handle attributes as well as elements. (3) Interleaving: RELAX NG supports fully-generalized unordered content models (interleaving). 
Naive construction of state machines for interleaving leads to exponential blowup. We first introduce state machines for handling RELAX NG in its entirety. Our state machines meet these challenges: (1) our state machines capture non-deterministic tree (or hedge) automata by maintaining a set of (possibly multiple) states per element or attribute; (2) state machines handle attribute-element constraints by converting attribute-element automata to element automata at run time; and (3) state machines simulate variations of shuffle automata for handling interleave patterns without exponential blowup. Next, we demonstrate how a RELAX NG schema is compiled into such a state machine. This process is done by computing derivatives of content models. We also present optimization techniques for reducing the size of state machines. Finally, we show our open-source implementations of RELAX NG validators. As of this writing, two schema compilers (the first-phase) have been implemented, and two run-time systems (Java and Win32/Visual C++) have been implemented. We also show how these run-time systems interface with other components in their respective environments..."
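The two-phase idea described in this abstract can be illustrated with a toy sketch in Python. It is not MURATA's construction: it assumes a content model that is only a flat sequence of element names with the quantifiers 1, ?, * and +, and it ignores attributes, interleave, and nested content. The names compile_model and validate are hypothetical. Phase one turns the model into a plain transition table; phase two needs only that table and tracks a set of states, because the table may be non-deterministic.

```python
def compile_model(items):
    """Phase 1: compile [(element_name, quantifier), ...] into a table.

    State i means "items before position i have been matched".
    Quantifiers: "1" (exactly one), "?" (optional), "*" and "+" (repeat).
    """
    n = len(items)
    raw = {s: {} for s in range(n + 1)}   # moves before skip handling
    skippable = set()                     # positions that may be skipped
    for i, (name, quant) in enumerate(items):
        raw[i].setdefault(name, set()).add(i + 1)
        if quant in ("+", "*"):
            # Repetition: stay in the state reached after the first match.
            raw[i + 1].setdefault(name, set()).add(i + 1)
        if quant in ("?", "*"):
            skippable.add(i)

    def closure(state):
        # All states reachable from `state` by skipping optional items.
        reached = {state}
        while state in skippable:
            state += 1
            reached.add(state)
        return reached

    table = {}
    for s in range(n + 1):
        row = {}
        for t in closure(s):
            for name, targets in raw[t].items():
                row.setdefault(name, set()).update(targets)
        table[s] = {name: sorted(targets) for name, targets in row.items()}
    accepting = sorted(s for s in range(n + 1) if n in closure(s))
    return {"table": table, "accepting": accepting}


def validate(machine, names):
    """Phase 2: run the table over a sequence of child element names."""
    current = {0}
    for name in names:
        current = {t for s in current
                   for t in machine["table"][s].get(name, ())}
        if not current:
            return False  # no state can consume this element
    return any(s in machine["accepting"] for s in current)


# Hypothetical content model: title, author+, keyword*
machine = compile_model([("title", "1"), ("author", "+"), ("keyword", "*")])
```

Because the compiled machine is plain data (integers, strings, lists), a runtime in another language could load it without access to the original schema, which is the portability argument the abstract makes.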
[September 23, 2002] Bali: RELAX NG Validatelet Compiler. A posting from Kohsuke Kawaguchi (Sun Microsystems) announces the release of a RELAX NG validatelet compiler named "Bali." Bali is "a Java tool that reads a RELAX NG grammar and produces source code for a validator which is specialized for that grammar. For example, Bali can read an XHTML schema at compile time, and it produces XHTMLValidatelet.java, which is Java source code. Then you compile this file along with your other source code, and at runtime this class can be used to validate XHTML documents before you process them. The concept of transforming a RELAX NG grammar into a compact table was invented primarily by MURATA Makoto. Bali uses the Multi-Schema XML Validator (MSV) from Sun Microsystems to parse RELAX NG grammars. Compared to general-purpose validators such as MSV and Jing, this approach has the following benefits: (1) Bali can produce a validator in various programming languages, which makes it easy to use RELAX NG in those platforms that don't have general purpose RELAX NG validator implementations. For this release, Bali supports Win32/VC++ and Java. (2) The generated validator is usually small compared to a general-purpose RELAX NG validator, both in terms of the runtime memory consumption and the code size. (3) The generated validator (is expected to) run faster than general-purpose validators. Bali is fully conforming to RELAX NG; the Win32 and Java implementations pass all of the James Clark test suite. The source code and binaries of the compiler/runtime are covered by the BSD license."
[August 02, 2002] "RELAX NG: The Power Is in the Patterns." By Tom Gaven. In XML Journal Volume 3, Issue 7 (July 2002). "Schema languages are languages that allow you to specify the structure of XML instance documents. RELAX NG is an XML schema language that is considered to be simple, yet powerful. This article gives an overview of an important concept of the RELAX NG schema language called patterns. The power of RELAX NG can be found in its patterns. Schema languages also describe the allowed names of elements and attributes that are found in XML instance documents. And they allow you to specify element ordering, occurrence, and allowed content, like simple text, or datatypes, like integers. Some examples of schema languages are W3C XML Schema, RELAX NG, Schematron, and DTD. RELAX NG differs from other schema languages in that it's built around the concept of patterns. To understand the power of RELAX NG, you must first understand the basic RELAX NG patterns and how they can be combined. Let's begin by taking a look at the following XML instance document..."
[July 29, 2002] "Introduction to RELAX NG." By Michael Classen. In WebReference.com (July 23, 2002). ['Whether you prefer compact or full size definitions, one recent schema specification has you covered. Michael Classen introduces you to both the short and long forms of RELAX NG syntax.'] "In the last installment we discussed the different approaches to schema definition put forward by the W3C and OASIS. More specifically, we followed the criticism surrounding XML Schema, and looked at some improvements offered in the alternative, RELAX NG. Today we'll explain the basics of RELAX NG by example..."
[July 23, 2002] A posting from James Clark announces an update for RELAX NG resources, available from the Thai Open Source Software Center. From the posting: "I've updated jing, trang and dtdinst. Trang now has experimental support for generating W3C XML Schema. DTDinst has a new option -i for inlining attribute list declarations; this makes its generated output work better as input for generating W3C XML Schema. Jing has a couple of minor bug fixes... There are still lots of things I want to add to the trang XSD output module. Feedback on what improvements are most needed is welcome... The XML Schema support (provisional) has several limitations..."
[June 21, 2002] "RELAX NG's Compact Syntax." By Michael Fitzgerald. From XML.com. June 19, 2002. ['The RELAX NG schema language for XML offers a simple approach to writing schemas for XML documents, and is seen as a competitive alternative to W3C XML Schema for many applications. RELAX NG is currently being developed by an OASIS Technical Committee. One of the most recent things to emerge from that committee has been the RELAX NG Compact Syntax. If you've ever written a schema by hand, you'll know all the XML tags are tedious to write and can obscure the meaning of the schema. RELAX NG Compact is a non-XML syntax that makes schemas a lot more readable. In our main feature this week Michael Fitzgerald provides an introduction to this new syntax and the tools that use it.'] "Working with XML Schema is like driving a limousine. It's true that it has some nice appointments (datatypes come to mind), but the wheelbase is a bit on the long side, making it difficult to turn corners easily, and I am inclined to let somebody else do the driving for me. Using RELAX NG, on the other hand, is like driving a sports car. It holds corners amazingly well, and I am much less interested in handing over the keys to anyone. You may prefer to drive a limo over a sports car. But I'll take the sports car any day. You are probably familiar with XML Schema and RELAX NG. Both are schema languages for XML. The former was released by the W3C in May 2001, while the latter was released in December 2001 by OASIS. RELAX NG, which was developed by a small technical committee led by James Clark, merges Murata Makoto's RELAX and Clark's TREX. It is a simple, yet elegant evolution of the DTD, which is also easy to learn. It is modular in design. The main core of RELAX NG is focused on validation alone and doesn't modify the infoset in the process of validation; in other words, no PSVI. RELAX NG is also part of an ISO draft standard, ISO/IEC DIS 19757-2. 
RELAX NG schemas were originally written in XML, but there's also a compact, non-XML syntax. While this article doesn't contain an exhaustive review of all the features of RELAX NG, it will give you a good idea of how to use the main parts of the compact syntax. If you don't know much about RELAX NG, I suggest that you read Eric van der Vlist's RELAX NG Compared before finishing this article. I think you'll find the compact syntax quite readable and easy to learn. In some respects, a RELAX NG schema in compact form looks like a context-free grammar, which provides a familiar view of the language, is readily comprehensible, and amenable to parsing..."
[May 24, 2002] RELAX NG Published as ISO/IEC DIS 19757-2 (DSDL Part 2). A posting from James Clark to the RELAX NG mailing list announces that ISO/IEC JTC 1/SC 34 [Document Description and Processing Languages] has voted to send out the edited text of RELAX NG as an ISO Draft International Standard (DIS). The text prepared by James Clark and Murata-san contains "no technical changes" vis-à-vis the OASIS specification, but has been changed editorially to meet ISO publication requirements. James indicates that the next stage is for the ISO national member bodies to vote on the DIS; if the draft is approved without comment, it may then be sent out for approval as a full-fledged International Standard; otherwise, there may be another round involving a Final DIS (FDIS). ISO/IEC 19757 (Document Schema Definition Languages - DSDL) is planned as a ten-part specification, of which RELAX NG is Part 2. [Full context]
[May 08, 2002] "RELAX NG Compact Syntax." Edited by James Clark, for the OASIS RELAX NG Technical Committee. Working Draft 8-May-2002. The document "specifies a compact, non-XML syntax for RELAX NG. The semantics of this syntax are specified by specifying how the syntax can be translated into the XML syntax. The goals of this syntax are: (1) maximize readability; (2) support all features of RELAX NG; it must be possible to translate a schema from the XML syntax to the compact syntax and back without losing significant information; (3) support separate translation; a RELAX NG schema may be spread amongst multiple files; it must be possible to represent each of the files separately in the compact syntax; the representation of each file must not depend on the other files. The syntax has similarities to XQuery Formal Semantics, to XDuce and to the DTD syntax of XML 1.0." The most recent update: (1) fixes the way annotations get attached to patterns and name-classes; (2) the BNF is annotated with references to applicable constraints (like WFCs and VCs in XML 1.0).
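As an illustration of the translation this draft describes, the sketch below pairs a small schema in compact syntax with its XML-syntax equivalent. The address-book schema is a commonly used teaching example and is not quoted from the draft; the helper name declared_elements is hypothetical. Python's standard library only parses the XML form here, to show that a schema in XML syntax is an ordinary XML document; it does not perform the compact-to-XML conversion, which a tool such as Trang handles.

```python
import xml.etree.ElementTree as ET

RNG_NS = "http://relaxng.org/ns/structure/1.0"

# Compact syntax: braces, commas for sequence, "*" for zero-or-more.
COMPACT = """\
element addressBook {
  element card {
    element name { text },
    element email { text }
  }*
}
"""

# The same schema in XML syntax.
XML_SYNTAX = """\
<element name="addressBook" xmlns="http://relaxng.org/ns/structure/1.0">
  <zeroOrMore>
    <element name="card">
      <element name="name"><text/></element>
      <element name="email"><text/></element>
    </element>
  </zeroOrMore>
</element>
"""


def declared_elements(rng_source):
    """Return the element names a schema in XML syntax declares, in order."""
    root = ET.fromstring(rng_source)
    element_tag = "{%s}element" % RNG_NS
    return [el.get("name") for el in root.iter(element_tag)]


names = declared_elements(XML_SYNTAX)
```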
[April 05, 2002] Topologi Collaborative Markup Editor Supports RELAX NG. A posting from Rick Jelliffe announces support for RELAX NG in the Topologi Collaborative Markup Editor. The Topologi application is "a high-productivity XML and SGML editor for professional publishing teams; it is written in 100% pure Java and uses the Jing native interface. The editor also supports XML DTDs, XML Schemas, Schematron (including phases), and the Topologi NII (NamedInformationItem) schema formats. All these schemas can be put in an XAR file (a ZIP format for distributing document types and application code), and the editor will upload them over a network or between peers, so deploying schemas to systems should be pretty easy. The editor will be shipping with RELAX NG as one of the supplied applications; however it is not an IDE but targeted at data capture for the same kinds of publishing uses that SGML has succeeded in. This beta version is not considered feature-complete, but is being released with the goal of soliciting early feedback from users. Subsequent development is expected to provide undo, spell-checking, red-lining, context-sensitive sidebars, better collaborative authoring features, better support for ODRL (Open Digital Rights Language), etc." Interested parties may register for the beta program. [Full context]
"Zvon Relax NG Reference." November 21, 2001 or later.
RELAX NG 1.0 DTD. Posted by Bob DuCharme (Consulting Software Engineer, LexisNexis) December 6, 2001. "I put together a RELAX NG DTD to make it easier to create RNG schemas with DTD-driven XML editors... please let me know of any bugs, suggestions, 1.0 schemas that won't validate against it, etc... There's nothing outside of the DTD itself; I just did it as an exercise to see how hard it would be, and the section 3 grammar made it very easy.... The 'relaxng.dtd' is based on grammar in section 3 of RELAX NG 1.0 (3 December 2001). While entity declarations such as those for param, start, and define may look redundant, I wanted this to reflect the BNF grammar as closely as possible..." [cache 2001-12-07]
[November 26, 2002] A posting from Sebastian Rahtz (Oxford University Computing Services Information Manager) announces updated Relax NG Schemas for the TEI. "There are RelaxNG schemas for MathML and SVG and a demonstration of how to include them in a TEI Relax NG schema and document. I have devised a crude way to 'flatten' a Relax NG schema to remove inclusions and redundant definitions, yielding a single portable file with no dependencies. For each of my example TEI Schemas, I have used James Clark's trang program to generate a W3C Schema (.xsd schema file). The next stage in this exercise will be to rewrite the TEI 'pizzachef' tool to work with the RelaxNG version of the TEI, and generate DTD, Relax and W3C constraints according to the user's specifications. Comments on any of the above very welcome... [The relevant directory] contains a set of Relax NG Schema specifications corresponding to TEI P4. They were created automatically from the ODDs source of the TEI, and are kept in sync; you can download all the .rng files in a zip file."
[March 23, 2002] "RELAX NG Schemas for TEI P4." Prepared by Sebastian Rahtz (OUCS Information Manager). See also the ZIP package.
RELAX NG Specification. Updated 2001-07-18. Sources. Cache spec and sources.
- RELAX NG Specification. Updated 2001-07-13. Sources (.rng, .xml, .xsl) are also available. See the update notification. Cache spec and sources.
- Previous RELAX NG Specification. Working Draft 5-July-2001. [cache 2001-07-05]
- RELAX NG Project on SourceForge. "RELAX NG [project] is a public space for test cases and other ancillary software related to the construction of the RELAX NG language and its implementations." [Fabio Arciniegas]
- [August 17, 2001] RELAX NG FAQ document. Under development by members of the OASIS RELAX NG TC, 2001-08-17.
[February 19, 2002] "XML Schema and RELAX NG Element Comparison." By Michael Fitzgerald. Reference posted to the RELAX NG TC list. "This document briefly compares XML Schema's 42 elements with RELAX NG's 28 elements. In the table that follows, the first column lists all the XML Schema elements while the second column lists any RELAX NG elements that have a one-to-one relationship, a comparable purpose, or only a roughly similar purpose to XML Schema elements. Elements unique to each language are also listed in separate tables below..." ['I have made an attempt to briefly compare the purpose of XML Schema's elements with RELAX NG's elements. The comparison appears in three tables totaling about 2 and 1/2 pages printed. I would appreciate any comments you have about this document...'] Note also the relax ng links on the Wy'east Communications web site.
[February 14, 2002] Clark Updates Jing - A RELAX NG Validator in Java. James Clark has announced a new version of Jing with significant changes and revised documentation. Jing version '2002-02-13' implements the final RELAX NG 1.0 Specification and also implements parts of RELAX NG DTD Compatibility, specifically checking of ID/IDREF/IDREFS. James has "almost completely rewritten the validator using an improved algorithm. In the old algorithm, the state of the validation was represented by a stack of sets of patterns; in the new algorithm, the state is represented by a single pattern..." The new release includes a documented API for Jing; in fact there are two APIs, a native API and JARV. James has rewritten the description of derivative-based validation to correspond to what's been implemented and to incorporate feedback received on the previous version from Murata-san and Kawaguchi-san... The Jing implementation is available for download as a JAR file and as a Win32 executable for use with the Microsoft Java VM. [Full context]
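The phrase "the state is represented by a single pattern" can be made concrete with a small Python sketch of derivative-based matching. This is a heavily simplified illustration, not Jing's code or API: it assumes patterns over a flat sequence of child element names only, with no attributes, text, nested content, or name classes, and every function name here is hypothetical. The derivative of a pattern with respect to an element name is the pattern that must match whatever follows; validation repeatedly replaces the current pattern with its derivative.

```python
EMPTY = ("empty",)
NOT_ALLOWED = ("notAllowed",)


def element(name):
    return ("element", name)


def choice(a, b):
    if a == NOT_ALLOWED:
        return b
    if b == NOT_ALLOWED:
        return a
    return ("choice", a, b)


def _pair(kind, a, b):
    # Shared simplification rules for ordered and unordered pairs.
    if NOT_ALLOWED in (a, b):
        return NOT_ALLOWED
    if a == EMPTY:
        return b
    if b == EMPTY:
        return a
    return (kind, a, b)


def group(a, b):          # a followed by b
    return _pair("group", a, b)


def interleave(a, b):     # a and b in any interleaved order
    return _pair("interleave", a, b)


def one_or_more(p):
    return NOT_ALLOWED if p == NOT_ALLOWED else ("oneOrMore", p)


def optional(p):
    return choice(p, EMPTY)


def zero_or_more(p):
    return optional(one_or_more(p))


def nullable(p):
    """True if the pattern matches the empty sequence."""
    kind = p[0]
    if kind == "empty":
        return True
    if kind in ("group", "interleave"):
        return nullable(p[1]) and nullable(p[2])
    if kind == "choice":
        return nullable(p[1]) or nullable(p[2])
    if kind == "oneOrMore":
        return nullable(p[1])
    return False  # element, notAllowed


def deriv(p, name):
    """Pattern that must match the rest after consuming element `name`."""
    kind = p[0]
    if kind == "element":
        return EMPTY if p[1] == name else NOT_ALLOWED
    if kind == "choice":
        return choice(deriv(p[1], name), deriv(p[2], name))
    if kind == "group":
        first = group(deriv(p[1], name), p[2])
        if nullable(p[1]):
            return choice(first, deriv(p[2], name))
        return first
    if kind == "interleave":
        return choice(interleave(deriv(p[1], name), p[2]),
                      interleave(p[1], deriv(p[2], name)))
    if kind == "oneOrMore":
        return group(deriv(p[1], name), optional(p))
    return NOT_ALLOWED  # empty, notAllowed


def matches(pattern, names):
    state = pattern  # the whole validation state is one pattern
    for name in names:
        state = deriv(state, name)
        if state == NOT_ALLOWED:
            return False
    return nullable(state)


# Hypothetical model: title, then author and keyword* in any order.
schema = group(element("title"),
               interleave(element("author"), zero_or_more(element("keyword"))))
```

The interleave case shows why unordered content fits this style naturally: the next element may be consumed by either operand, and the choice of the two outcomes is again just a pattern.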
[February 04, 2002] "Relaxing into 2002." By Sean McGrath. In XML In Practice (January 10, 2002). ['RELAX NG is a blend of simplicity and pragmatism aimed at providing a validation system for XML documents. By separating the validation from other XML processing features, RELAX NG keeps DTDs from doing more than they are intended to do.'] "From time-to-time, a markup technology comes along without much fanfare that really changes the way we think about and build XML systems. The SAX API, developed under the tutelage of David Megginson on the XML-DEV mailing list, springs to mind. Without a big brouhaha or marketing budget, without the imprimatur of any august institution or consortium, SAX has quietly become an indispensable part of the XML application development landscape. I strongly suspect that RELAX NG is following in the footsteps of SAX. SAX was a humble blend of simplicity and pragmatism aimed at providing an event-oriented API for XML processing. Similarly, RELAX NG is a humble blend of simplicity and pragmatism aimed at providing a validation system for XML documents. RELAX NG schemas are themselves XML documents. Now those of you who work with XSLT know that can be a mixed blessing. On one hand, you can throw all your XML processing tools at them; on the other hand, they can be somewhat legibility challenged. RELAX NG is a pure delight in this regard, being truly readable after an hour or two of practice. Furthermore, you can intermingle your own markup at will into a RELAX NG schema, which makes adding your own annotations for documentation and module maintenance very simple. Anyone familiar with DTDs that wish to play around with Relax should get DTDinst, a Java based tool that converts DTDs into RELAX NG notation..."
[January 24, 2002] "Relax NG, Compared." By Eric van der Vlist. From XML.com. January 23, 2002. ['The RELAX NG schema language explained and compared to W3C XML Schemas.'] "This article is a companion to two different works already published on XML.com: my introduction to W3C XML Schema is a tutorial introducing the language's main features, with a progression which I hope is intuitive; and my comparison between the main schema languages, an attempt to provide an objective and practical feature-by-feature comparison between XML schema languages. In this new article, I have taken the same approach as the one used in the W3C XML Schema tutorial but this time I've implemented the schemas using RELAX NG... it provides a good starting point for those of us who know W3C XML Schema and want to quickly point out the differences with RELAX NG. Links are provided throughout to the corresponding sections of the W3C XML Schema tutorial, and you are encouraged to follow both simultaneously... Throughout this comparison, we have seen that one of the main differences between the two languages is a matter of style: while RELAX NG focuses on generic 'patterns', W3C XML Schema has differentiated these patterns into a set of distinct components (elements, attributes, groups, complex and simple types). The result is on one side a language which is lightweight and flexible (RELAX NG) and on the other side a language which gives more 'meaning' or 'semantic' to the components that it manipulates (W3C XML Schema). The question of whether the added features are worth the price in terms of complexity and rigidity is open, and the answer probably depends on the applications.
Independently of this first difference between the two, the different positions regarding 'non-determinism' between RELAX NG, which accepts most of the constructs a designer can imagine, and W3C XML Schema, which is very strict, mean that a number of vocabularies which can be described by RELAX NG cannot be described by W3C XML Schema. A way to summarize this is to notice that an implementation such as MSV (the 'Multi Schema Validator' developed by Kohsuke Kawaguchi for Sun Microsystems) uses a RELAX NG internal representation as a basis to represent the grammar described in W3C XML Schema and DTD schemas. This seems to indicate that RELAX NG can be used as the base on which object oriented features such as those of W3C XML Schema can be implemented. The value of an XML-specific object-oriented layer is still to be determined, though, since generic object-oriented tools should be able to generate RELAX NG schemas directly..." See W3C XML Schema and "RELAX NG."
[January 07, 2002] "An Algorithm for RELAX NG Validation." By James Clark. January 07, 2002. Author's note to XML-DEV: 'I have written a paper describing one possible algorithm for implementing RELAX NG validation. This is the algorithm used by Jing, which I believe has also been adopted by MSV... If you try to use this to implement RELAX NG and something isn't clear, let me know and I'll try to improve the description.' From the introduction: "This document describes an algorithm for validating an XML document against a RELAX NG schema. This algorithm is based on the idea of what's called a derivative (sometimes called a residual). It is not the only possible algorithm for RELAX NG validation. This document does not describe any algorithms for transforming a RELAX NG schema into simplified form, nor for determining whether a RELAX NG schema is correct. We use Haskell to describe the algorithm. Do not worry if you don't know Haskell; we use only a tiny subset which should be easily understandable." Jing is a validator for RELAX NG implemented in Java; it represents an adaptation of the validator for TREX. Jing is written on top of SAX2.
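As a rough illustration of the derivative idea mentioned in the introduction quoted above, the following Python sketch keeps the validation state as a single pattern and replaces it with its derivative after each input token. It is not Jing's code and not the Haskell from Clark's paper. It is a minimal sketch under stated assumptions: content is modelled as a flat list of element names (no attributes, text, nesting, or interleave), and every class and function name here (Empty, NotAllowed, Name, Choice, Group, OneOrMore, choice, group, nullable, deriv, matches) is chosen for illustration only.

```python
from dataclasses import dataclass


# A tiny pattern algebra. Names are illustrative, not taken from any library.
@dataclass(frozen=True)
class Empty:            # matches only the empty sequence
    pass


@dataclass(frozen=True)
class NotAllowed:       # matches nothing at all
    pass


@dataclass(frozen=True)
class Name:             # matches exactly one token with this name
    name: str


@dataclass(frozen=True)
class Choice:           # matches whatever left or right matches
    left: object
    right: object


@dataclass(frozen=True)
class Group:            # matches left followed by right
    left: object
    right: object


@dataclass(frozen=True)
class OneOrMore:        # matches body repeated one or more times
    body: object


def choice(p, q):
    """Build a choice, dropping branches that can never match."""
    if isinstance(p, NotAllowed):
        return q
    if isinstance(q, NotAllowed):
        return p
    return Choice(p, q)


def group(p, q):
    """Build a sequence, simplifying trivial cases."""
    if isinstance(p, NotAllowed) or isinstance(q, NotAllowed):
        return NotAllowed()
    if isinstance(p, Empty):
        return q
    if isinstance(q, Empty):
        return p
    return Group(p, q)


def nullable(p):
    """True if the pattern matches the empty sequence."""
    if isinstance(p, Empty):
        return True
    if isinstance(p, Choice):
        return nullable(p.left) or nullable(p.right)
    if isinstance(p, Group):
        return nullable(p.left) and nullable(p.right)
    if isinstance(p, OneOrMore):
        return nullable(p.body)
    return False  # Name and NotAllowed


def deriv(p, token):
    """Pattern describing what may still follow after consuming token."""
    if isinstance(p, Name):
        return Empty() if p.name == token else NotAllowed()
    if isinstance(p, Choice):
        return choice(deriv(p.left, token), deriv(p.right, token))
    if isinstance(p, Group):
        first = group(deriv(p.left, token), p.right)
        if nullable(p.left):
            return choice(first, deriv(p.right, token))
        return first
    if isinstance(p, OneOrMore):
        return group(deriv(p.body, token), choice(p, Empty()))
    return NotAllowed()  # Empty and NotAllowed accept no token


def matches(p, tokens):
    """Validate by repeatedly taking derivatives; the state is one pattern."""
    for token in tokens:
        p = deriv(p, token)
    return nullable(p)


# Hypothetical content model: one title followed by zero or more para.
schema = group(Name("title"), choice(OneOrMore(Name("para")), Empty()))
print(matches(schema, ["title", "para", "para"]))
print(matches(schema, ["para"]))
```

The simplifying constructors `choice` and `group` keep the state pattern from growing with dead branches; a real validator would also need to handle the tree structure of XML, which this flat sketch leaves out.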
[October 15, 2001] "The Design of RELAX NG." By James Clark. [October 15, 2001 or later] draft version of paper to be presented at XML 2001 in Orlando in December, 2001. URL: http://www.thaiopensource.com/relaxng/design.html. Abstract: "RELAX NG is a new schema language for XML. This paper discusses various aspects of the design of RELAX NG including the treatment of attributes, datatyping, mixed content, unordered content, namespaces, cross-references and modularity." Excerpt: "RELAX NG is a schema language for XML, based on TREX and RELAX. At the time of writing, RELAX NG is being standardized in OASIS by the RELAX NG Technical Committee (TC). A tutorial and language specification have been published by the TC. This paper describes the thinking behind the design of RELAX NG. It represents the personal views of the author and is not the official position of the TC. RELAX NG is an evolution of XML DTDs. It shares the same grammar-based paradigm. Based on experience with SGML and XML, RELAX NG both adds and subtracts features relative to XML DTDs. The evolutionary nature of RELAX NG has a number of advantages. XML DTDs can be automatically converted into RELAX NG. Experts in designing SGML and XML DTDs will find their skills transfer to designing RELAX NG. Design patterns that are used in XML DTDs can be used in RELAX NG. Overall, RELAX NG is much more mature and it is possible to have a higher degree of confidence in its design than it would be if it were based on a completely different paradigm. A major goal of RELAX NG is that it be easy to learn and easy to use. One aspect of RELAX NG that promotes this is that the schema can follow the structure of the document. Nesting of patterns in the schema can be used to model nesting of elements in the instance.
There is no need to flatten the natural hierarchical structure of the document into a list of element declarations, as you would have to do with DTDs (although RELAX NG allows such flattening if the schema author chooses). An XML DTD consists of a number of top-level declarations. Each declaration associates a name (the left hand side of the declaration) with some kind of object (the right hand side of the declaration). With some kinds of declaration (e.g., ELEMENT, ATTLIST) the name on the left hand side occurs in the instance, for others (parameter entity declarations) the name is purely internal to the DTD. Similarly, W3C XML Schema distinguishes between definitions and declarations. The name of a declaration occurs in an instance, whereas names of definitions are internal to the schema. RELAX NG avoids this complexity. RELAX NG has, in the terminology of W3C XML Schema, only definitions. There is no concept of a declaration. Names on the left hand side of a definition are always internal to the schema. Names occurring in the instance always occur only within the right hand side of a definition. This approach comes from XDuce..."
[April 2002] "The Sun RELAX NG Converter is a tool to convert schemas written in various schema languages to their equivalent in RELAX NG. It supports schemas written in XML DTD, RELAX Core, RELAX namespace, TREX, W3C XML Schema, and RELAX NG itself. This software relies on Sun Multi-Schema Validator (MSV). Therefore any limitations of MSV apply also to this converter." See following entry for the earlier version.
[October 17, 2001] Sun Microsystems Releases Generalized Schema-Related Tools for Validation and Conversion. A posting from Kohsuke KAWAGUCHI (Sun Microsystems) announces the availability of an updated version of Sun's Multi-Schema XML Validator (MSV), along with three new schema-related tools. The new Sun XML Instance Generator "is a Java technology tool to generate various XML instances from several kinds of schemas; it supports DTD, RELAX Namespace, RELAX Core, TREX, and a subset of XML Schema Part 1. The RELAX NG Converter is a tool to convert schemas written in various schema languages to their equivalent in RELAX NG. The new Multi-Schema XML Validator Schematron add-on is a Java tool to validate XML documents against RELAX NG schemas annotated with Schematron schemas. By using this tool, you can embed Schematron constraints into RELAX NG schemas, making it easy to write many constraints that are difficult to achieve by RELAX NG alone." [Full context]
[September 03, 2001] "Guidelines for using W3C XML Schema Datatypes with RELAX NG." OASIS [RELAX NG TC] Working Draft 3-September-2001. Edited by James Clark and MURATA Makoto. [James Clark: 'I have written a first draft of the guidelines for using XML Schema Datatypes with RELAX NG...'] Abstract: "This document specifies guidelines for using the datatypes defined by W3C XML Schema Datatypes with RELAX NG."
[September 03, 2001] "RELAX NG DTD Compatibility." OASIS [RELAX NG TC] Working Draft 3-September-2001. Edited by James Clark and MURATA Makoto. Abstract: "This specification defines datatypes and annotations for use in RELAX NG schemas. The purpose of these datatypes and annotations is to support some of the features of XML 1.0 DTDs that are not supported directly by RELAX NG."
[August 17, 2001] "RELAX NG Non-XML Syntax." By James Clark. Reference posted to the OASIS RELAX NG Mailing List, 'firstname.lastname@example.org', 17-August-2001, with subject 'A non-XML syntax for RELAX NG'. "I've developed an experimental non-XML syntax for RELAX NG. There's an implementation in Java (using JavaCC) that translates into RELAX NG. The RELAX NG schema for RELAX NG in the non-XML syntax is 64 lines (2107 bytes), versus 342 lines (8187 bytes) for the XML syntax. It's quite similar in many ways to the type syntax of the current XQuery 1.0 Formal Semantics WD..." From the web site description: "[This document references] a description of the non-XML syntax for RELAX NG. There is a Java program that translates from this non-XML syntax to RELAX NG's XML syntax; this is available packaged as a ZIP file containing source, documentation and a jar file, and as a Win32 executable; there is also documentation on how to use the translator; also an example showing the schema for RELAX NG's XML syntax written in the non-XML syntax. This syntax is not a part of RELAX NG, and is not a product of the OASIS RELAX NG TC..."
- "RELAX NG Tutorial." Also available from the Thai Open Source Software Center.
- Jing, a validator for RELAX NG implemented in Java. Relaxer 0.14.1 also partially supports RELAX NG; see the entry of June 25, 2001 below.
- RELAX NG Resources [schemas, transformation stylesheets, software]
- RELAX NG Formal Semantics. 2001-06-02 or later. See also XML format and corresponding XSL stylesheet. [alt URL]
- RELAX NG schema for RELAX NG. relaxng.rng. Draft 2001-06-02 or later. [alt URL, cache 2001-06-02]
- RELAX NG Issues List.
- [June 02, 2001] "RELAX NG Tutorial." Working Draft 1 June 2001 (or later). From thaiopensource.com. Edited by James Clark for the OASIS RELAX NG Technical Committee. [cache 2001-06-02]
- [June 01, 2001] "RELAX NG Tutorial." Working Draft 1-June-2001. Edited by James Clark for the OASIS RELAX NG Technical Committee. Comments on the working draft may be sent to email@example.com.
- [May 24, 2001] "RELAX NG Tutorial." Edited by James Clark [for the OASIS TC]. Draft/Version: 2001-05-25.
[August 16, 2001] "RELAX NG Shorthand Guide." By Kohsuke KAWAGUCHI (Sun Microsystems). Posted to the OASIS RELAX NG mailing list 2001-08-16 ('firstname.lastname@example.org'). ['I've written a simple XSLT stylesheet that allows you to write a RELAX NG schema in concise way, and produce a fully compliant RELAX NG schema automatically...'] "This document describes the functionality of RELAX NG short-hand processor. RELAX NG is a nice schema language, but sometimes it is painful to type all tags by hand. For example, if you want to write an optional attribute (which is IMO very common), you need to type in [...] it becomes especially hard if you are using normal text editor. The RELAX NG short-hand processor partially addresses this problem by providing several "short-hand" notations that makes schema authoring easier. I wrote a RELAX NG schema for VoiceXML by using this short-hand processor and it took 690 lines. After the processing, RELAX NG schema becomes 1036 lines. So in this case, it saves nearly 1/3 of the typing. Your experience will vary, but I hope you find this processor useful... As you see, it's almost like normal RELAX NG, but you'll notice that the namespace URI is different and there are unfamiliar attributes (@occurs and @type). The current processor is written in XSLT, so once you completed the schema, use XSLT processor to produce a normal RELAX NG schema. If you are using Windows, you can use msxsl tool as: c:\>msxsl myschema.srng shortRNG.xsl > myschema.rng And the produced myschema.rng file can be used with any RELAX NG compliant processor." [cache 2001-08-16]
[September 05, 2001] VoiceXML in RELAX NG. 2001-09-05 or later. From Kohsuke KAWAGUCHI. "I have translated the VoiceXML 1.0 DTD into RELAX NG syntax. I originally wrote it in my short-hand syntax and then used my tool to convert it to full RELAX NG syntax. I've never tested it, so it may well contain several translation errors. All the files are available in one zip file..." [cache]
[September 04, 2001] DTDinst Tool Converts XML DTDs into XML Instance Format. A posting from James Clark announces the availability of a DTD converter 'DTDinst' which converts XML DTDs into XML instance format. "The XML instance can be in either a format specific to DTDinst or can be in RELAX NG format." DTDinst-specific output format is documented in RELAX NG non-XML syntax and in RELAX NG format. The key feature of DTDinst "is its handling of parameter entities: it is able to reliably turn parameter entity declarations and references into a variety of higher-level semantic constructs. It can do this even in the presence of arbitrarily deep nesting of parameter entity references within parameter entity declarations. At the same time, it accurately follows XML 1.0 rules on parameter entity expansion, so that any valid XML 1.0 DTD can be handled. If a parameter entity is used in a way that does not correspond to any of the higher-level semantics constructs supported by DTDinst, then references to that parameter entity will be expanded in the DTDinst output. DTDinst is available as a precompiled JAR file; the source is also available." Clark provides an XSLT stylesheet that "converts DTDinst format to RELAX NG; it has many more limitations than the converter builtin to DTDinst, but it may be useful as a basis for XSLT-based processing of DTDinst format." James writes: "Feedback is welcome, especially on any DTDs it doesn't handle well and on additional features that you would like to see..." [Full context]
- [July 17, 2001] "RELAX NG: Unification of RELAX Core and TREX." By MURATA Makoto (International University of Japan, currently visiting IBM Tokyo Research Lab.) Paper [to be] presented at Extreme Markup Languages 2001, August 12-17, 2001, Montréal, Canada. "RELAX Core and TREX are schema languages for XML. RELAX Core was designed in Japan and has recently been approved as an ISO Technical Report (ISO TR 22250-1); TREX was designed by James Clark. RELAX Core and TREX are similar: they are based on tree automata and do not change information sets. On the other hand, there are some significant differences: attributes, unordered content models, namespaces, wild cards, the syntax, and the underlying implementation techniques. At OASIS, it was decided to unify these two languages and the new language is called RELAX NG. This talk shows how differences between RELAX Core and TREX are resolved in RELAX NG."
- [June 20, 2001] "RELAX NG schema for W3C XML Schema." Prepared by Jeni Tennison. Posted to 'email@example.com' on 20-Jun-2001. Comments: "I think that the XML Schema vocabulary is quite a neat showcase for RELAX NG because there are so many co-dependencies between attributes and between attributes and elements. This RELAX NG schema follows the XML Schema for XML Schema to a certain extent (using the same kind of naming scheme) to facilitate comparison between the two. I have also added comments about the ease with which the two handle different aspects of the vocabulary. I've tested it with Jing against various XML Schemas, and it seems to be working, though obviously if anyone spots any bugs please get in touch..." [cache 2001-06-20]
[June 25, 2001] Relaxer 0.14.1 with partial support for RELAX NG. "...This version also supports a preliminary implementation for RELAX NG. Please check sampleNG directory of the distribution. Although the version supports just small subset of RELAX NG functionality, next generation Relaxer is scheduled to support all the functionality of RELAX NG... To use for a RELAX NG schema, please specify -grammarVerify:none... Relaxer is a Java class generator that addresses a XML document complied with the XML model defined by RELAX." See the posting of ASAMI Tomoharu.
[June 20, 2001] "Simplified XML Syntax for RDF." By Jonathan Borden (Tufts University School of Medicine, The Open Healthcare Group). June 17, 2001 or later. "The XML syntax for RDF 1.0 can be described in terms of a tree regular expression. This form can be thought of as expressing constraints on the XML Infoset which arises when parsing an RDF document. The advantage of expressing the syntax in this form over EBNF, is that a tree regular expression (e.g., RELAXNG/TREX schema http://relaxng.org) already takes into account the rules of XML syntax + XML namespaces, e.g., correctly handles namespace prefixes, empty elements, mixed content, whitespace, attribute ordering etc. Such schemata are also described as 'hedge regular expressions' or 'hedge automata' [http://www.oasis-open.org/cover/hedgeAutomata.html]. The tree regular expression schema for RDF 1.0 is available [online]. This schema handles several proposed updates such as the requirement that the "rdf:about" and "rdf:ID" attributes be prefixed/qualified. A tree regular expression for the proposed syntax is available [online]..."
- [June 04, 2001] "A Triumph of Simplicity: James Clark on Markup Languages and XML. Markup Languages, the Standardization Process, and the Importance of Simplicity. [DDJ Interviews James Clark. Feature.]" By Eugene Eric Kim and James Clark. In Dr. Dobb's Journal Issue 326 (July 2001), pages 56-60. ['Clark served as the technical lead of the original W3C XML Working Group and as the editor of the XSLT and XPath recommendations. He recently founded Thai Open Source Software Center. His latest project is TREX, an XML schema language. Clark sat down with Eugene Eric Kim to discuss markup languages, the standardization process, and the importance of simplicity.'] Note: With the decision to merge RELAX Core and TREX under the name 'RELAX NG', we may assume that much of what Clark writes about TREX applies largely to RELAX NG as well. E.g., "[TREX:] ...You can think of it as DTDs in XML syntax minus some things and plus some others. TREX just does validation. DTDs mush together both validation and interpretation of the documents, providing various things like entities and notations. Mushing them together is problematic because often you want one thing but not the other. My work with XML and SGML has convinced me that what you need is good separation between these different things. I wanted to remove from DTDs the things that augment the information in the XML document. And I wanted to add in some of the things that I think XML DTDs have always been missing. One of the things XML DTDs removed from SGML DTDs was AND groups, which allow you to have unordered content. The SGML AND groups had a bad reputation, and don't have quite the right semantics. TREX adds them back and tries to do them right. XML also radically simplified the kinds of mixed content that you're allowed because there's a problem with the way SGML does it. Instead of restricting it, TREX solves the problem...
Another big difference is that TREX tries to treat attributes and elements as uniformly as possible. If you're designing an XML or SGML markup language, it's often pretty much arbitrary whether you represent some bit of information as an attribute or as a child element. In my view, XML processing tools and languages should try to minimize the differences between elements and attributes and should try to treat them as uniformly as possible. You can see that in XSLT and XPath. I wanted to apply that idea to schema languages. In TREX, attributes are integrated into the content model, so it makes it easy to say, 'You can have this element or you can have this attribute.' It's very common. For example, W3C's XML Schemas, in the restriction element, you can either have a base attribute that names the base type or you can have a simpleType child element that describes the base type directly rather than by referring to it by name. So you want to say, 'Either I have the base attribute or I have the simpleType child element.' And in TREX, you say exactly that. It's just as easy to say that as it is to say, 'Either I have a foo element or a bar element'..."
- Announcement 2001-06-05: "OASIS Unites TREX and RELAX to Create Lightweight XML Language Validation Specification." [source]
- See: Tree Regular Expressions for XML (TREX)
- See: REgular LAnguage description for XML (RELAX)
- See: "XML Schemas" - Main reference page.
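To make concrete the constraint Clark describes in the Dr. Dobb's interview above (a restriction element carries either a base attribute or a simpleType child element), here is a small hand-written Python check using the standard library's xml.etree.ElementTree. It is only a sketch of what the either/or rule means, not a RELAX NG or TREX validator. The function name `has_exactly_one_base` and the sample documents are hypothetical, and namespaces are omitted for brevity, whereas real W3C XML Schema elements live in the XML Schema namespace.

```python
import xml.etree.ElementTree as ET


def has_exactly_one_base(restriction):
    """True if the element has a base attribute or a simpleType child, not both."""
    has_attribute = "base" in restriction.attrib
    has_child = restriction.find("simpleType") is not None
    # Exclusive or: exactly one of the two forms must be present.
    return has_attribute != has_child


# Hypothetical sample instances, without namespaces.
by_name = ET.fromstring('<restriction base="string"/>')
inline = ET.fromstring('<restriction><simpleType/></restriction>')
both = ET.fromstring('<restriction base="string"><simpleType/></restriction>')
neither = ET.fromstring('<restriction/>')

for label, element in [("by_name", by_name), ("inline", inline),
                       ("both", both), ("neither", neither)]:
    print(label, has_exactly_one_base(element))
```

A schema language that integrates attributes into the content model lets the schema author state this rule declaratively as a choice between an attribute pattern and an element pattern, instead of writing procedural checks like the one above.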