[Cache from http://www.thaiopensource.com/relaxng/spec.html 2001-07-05; please use this canonical URL/source if possible.]


RELAX NG Specification

Working Draft 5 July 2001

This version:
Working Draft: 5 July 2001
Editors:
James Clark <jjc@jclark.com>, Makoto MURATA <mura034@attglobal.net>

Copyright © The Organization for the Advancement of Structured Information Standards [OASIS] 2001. All Rights Reserved.

This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works. However, this document itself may not be modified in any way, such as by removing the copyright notice or references to OASIS, except as needed for the purpose of developing OASIS specifications, in which case the procedures for copyrights defined in the OASIS Intellectual Property Rights document must be followed, or as required to translate it into languages other than English.

The limited permissions granted above are perpetual and will not be revoked by OASIS or its successors or assigns.

This document and the information contained herein is provided on an "AS IS" basis and OASIS DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.


Abstract

This is the definitive specification of RELAX NG, a simple schema language for XML, based on [RELAX] and [TREX]. A RELAX NG schema specifies a pattern for the structure and content of an XML document. A RELAX NG schema is itself an XML document.

Status of this Document

This is a working draft constructed by the editors. It is not an official committee work product and may not reflect the consensus opinion of the committee. Comments on this document may be sent to relax-ng-comment@lists.oasis-open.org.

Table of Contents

1 Introduction
2 Data model
3 Full syntax
4 Simplification
4.1 Annotations
4.2 Whitespace
4.3 href attribute
4.4 externalRef element
4.5 name attribute on start element
4.6 include element
4.7 datatypeLibrary attribute
4.8 name attribute of element and attribute elements
4.9 ns attribute
4.10 QNames
4.11 div element
4.12 Number of child elements
4.13 mixed element
4.14 optional element
4.15 zeroOrMore element
4.16 combine attribute
4.17 grammar element
4.18 define and ref elements
4.19 notAllowed element
4.20 empty element
5 Simple syntax
6 Semantics
6.1 Name classes
6.2 Patterns
6.2.1 choice pattern
6.2.2 group pattern
6.2.3 empty pattern
6.2.4 text pattern
6.2.5 oneOrMore pattern
6.2.6 interleave pattern
6.2.7 attribute pattern
6.2.8 element pattern
6.2.9 data and value pattern
6.2.10 Builtin datatype library
6.2.11 list pattern
6.2.12 key and keyRef pattern
6.3 Keys
6.4 Validity
7 Restrictions
7.1 Composition
7.2 Duplicate attributes
7.3 Interleave
7.4 key and keyRef
8 Conformance

Appendixes

A RELAX NG schema for RELAX NG
References

1. Introduction

This document specifies

  • when an XML document is a correct RELAX NG schema
  • when an XML document is valid with respect to a correct RELAX NG schema.

An XML document that is being validated with respect to a RELAX NG schema is referred to as an instance.

The structure of this document is as follows. Section 2 describes the data model, which is the abstraction of an XML document used throughout the rest of the document. Section 3 describes the syntax of a RELAX NG schema; any correct RELAX NG schema must conform to this syntax. Section 4 describes a sequence of transformations that are applied to simplify a RELAX NG schema; applying the transformations also involves checking certain restrictions that must be satisfied by a correct RELAX NG schema. Section 5 describes the syntax that results from applying the transformations; this simple syntax is a subset of the full syntax. Section 6 describes the semantics of a correct RELAX NG schema that uses the simple syntax; the semantics specify when an element is valid with respect to a RELAX NG schema. Section 7 describes restrictions in terms of the simple syntax; a correct RELAX NG schema must be such that, after transformation into the simple form, it satisfies these restrictions. Finally, Section 8 describes conformance requirements for RELAX NG validators.

2. Data model

RELAX NG deals with XML documents representing both schemas and instances through an abstract data model. XML documents representing schemas and instances must be well-formed in conformance with [XML 1.0] and must conform to the constraints of [XML Namespaces].

An XML document is represented by an element. An element consists of

  • a name
  • a context
  • an unordered collection of attributes
  • an ordered sequence of zero or more children; each child is either an element or a non-empty string; the sequence never contains two consecutive strings

A name consists of

  • a string representing the namespace URI; the empty string has special significance, representing the absence of any namespace
  • a string representing the local name; this string matches the NCName production of [XML Namespaces]

A context consists of

  • a base URI
  • a namespace map; this maps prefixes to namespace URIs, and also may specify a default namespace URI (as declared by the xmlns attribute)

An attribute consists of

  • a name
  • a string representing the value

A string consists of a sequence of zero or more characters, where a character is an integer in the range 0 to #x10FFFF.
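
As an informal illustration, the data model above could be represented with the following types. This is a minimal sketch, not part of the specification; the class and field names are assumptions chosen for readability.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Union


@dataclass(frozen=True)
class Name:
    namespace_uri: str  # "" represents the absence of any namespace
    local_name: str     # expected to match the NCName production


@dataclass
class Context:
    base_uri: str
    namespace_map: Dict[str, str] = field(default_factory=dict)  # prefix -> URI
    default_namespace: Optional[str] = None  # as declared by xmlns, if any


@dataclass
class Attribute:
    name: Name
    value: str


@dataclass
class Element:
    name: Name
    context: Context
    # Stored as a list for convenience; the order carries no meaning.
    attributes: List[Attribute] = field(default_factory=list)
    # Ordered; each child is an Element or a non-empty string.
    children: List[Union["Element", str]] = field(default_factory=list)
```

The sketch does not enforce the constraints on children (no empty strings, no two consecutive strings); a constructor from an infoset would be expected to guarantee them.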

The element for an XML document is constructed from an instance of the [XML Infoset] as follows. We use the notation [x] to refer to the value of the x property of an information item. An element is constructed from a document information item by constructing an element from the [document element]. An element is constructed from an element information item by constructing the name from the [namespace name] and [local name], the context from the [base URI] and [in-scope namespaces], the attributes from the [attributes], and the children from the [children]. The attributes of an element are constructed from the unordered set of attribute information items by constructing an attribute for each attribute information item. The children of an element are constructed from the list of child information items first by removing information items other than element information items and character information items, and then by constructing an element for each element information item in the list and a string for each maximal sequence of character information items. An attribute is constructed from an attribute information item by constructing the name from the [namespace name] and [local name], and the value from the [normalized value]. When constructing the name of an element or attribute from the [namespace name] and [local name], if the [namespace name] property is not present, then the name is constructed from an empty string and the [local name]. A string is constructed from a sequence of character information items by constructing a character from the [character code] of each character information item.

3. Full syntax

The following grammar summarizes the syntax of RELAX NG. Although we use a notation based on the XML representation of a RELAX NG schema as a sequence of characters, the grammar must be understood as operating at the data model level. For example, although the syntax uses <text/>, an instance or schema can use <text></text> instead, because they both represent the same object at the data model level. All elements shown in the grammar are qualified with the namespace URI:

http://relaxng.org/ns/structure/0.9

In addition to the attributes shown explicitly, any element can have an ns attribute and any element can have a datatypeLibrary attribute.

Any element can also have foreign attributes in addition to the attributes shown in the grammar. A foreign attribute is an attribute with a name whose namespace URI is neither the empty string nor the RELAX NG namespace URI. Any element that cannot have string children (i.e. any element other than value and param) may have foreign child elements in addition to the child elements shown in the grammar. A foreign element is an element with a name whose namespace URI is not the RELAX NG namespace URI. There are no constraints on the relative position of foreign child elements with respect to other child elements.

Any element can also have as children strings that consist entirely of whitespace characters, where a whitespace character is one of #x20, #x9, #xD or #xA. There are no constraints on the relative position of whitespace string children with respect to child elements.

pattern  ::=  <element name="QName"> pattern+ </element>
| <element> nameClass pattern+ </element>
| <attribute name="QName" global="boolean"?> pattern? </attribute>
| <attribute> nameClass pattern? </attribute>
| <group> pattern+ </group>
| <interleave> pattern+ </interleave>
| <choice> pattern+ </choice>
| <optional> pattern+ </optional>
| <zeroOrMore> pattern+ </zeroOrMore>
| <oneOrMore> pattern+ </oneOrMore>
| <list> pattern+ </list>
| <mixed> pattern+ </mixed>
| <ref name="NCName"/>
| <parentRef name="NCName"/>
| <key name="NCName"> pattern </key>
| <keyRef name="NCName"> pattern </keyRef>
| <empty/>
| <text/>
| <value type="NCName"?> string </value>
| <data type="NCName"> param* exceptPattern? </data>
| <notAllowed/>
| <externalRef href="anyURI"/>
| <grammar> grammarContent* </grammar>
param  ::=  <param name="NCName"> string </param>
exceptPattern  ::=  <except> pattern+ </except>
grammarContent  ::=  start
| define
| <div> grammarContent* </div>
| <include href="anyURI"> includeContent* </include>
includeContent  ::=  start
| define
| <div> includeContent* </div>
start  ::=  <start name="NCName"? combine="method"?> pattern+ </start>
define  ::=  <define name="NCName" combine="method"?> pattern+ </define>
method  ::=  choice
| interleave
nameClass  ::=  <name> string </name>
| <anyName> exceptNameClass? </anyName>
| <nsName> exceptNameClass? </nsName>
| <choice> nameClass+ </choice>
exceptNameClass  ::=  <except> nameClass+ </except>

QName and NCName are defined in [XML Namespaces]. A boolean must be true, false, 1 or 0. The anyURI symbol has the same meaning as the anyURI datatype of [W3C XML Schema Datatypes]: it indicates a string which, after escaping of disallowed values as described in Section 5.4 of [XLink], is a URI reference as defined in [RFC 2396] (as modified by [RFC 2732]).

Leading and trailing whitespace is allowed for strings matching QName, NCName, boolean or the value of the combine attribute.

4. Simplification

The full syntax given in the previous section is transformed into a simpler syntax by applying the following transformation rules in order. The effect must be as if each rule was applied to all elements in the schema before the next rule is applied. A transformation rule may also specify constraints that must be satisfied by a correct schema.

4.1. Annotations

Foreign attributes and elements are removed.

4.2. Whitespace

For each element other than value and param, each child that is a string containing only whitespace characters is removed.

Leading and trailing whitespace characters are removed from the value of each name, type, global and combine attribute and from the content of each name element.
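
The whitespace rule can be sketched as follows. This is an illustration only: children are assumed to be held in a Python list in which strings are `str` values and elements are any other objects, and the function names are hypothetical. The caller is assumed to skip value and param elements, as the rule requires.

```python
WHITESPACE = " \t\r\n"  # #x20, #x9, #xD, #xA


def is_whitespace_only(s: str) -> bool:
    """True if every character of s is one of the four whitespace characters."""
    return all(ch in WHITESPACE for ch in s)


def drop_whitespace_children(children):
    """Remove string children that contain only whitespace characters."""
    return [c for c in children
            if not (isinstance(c, str) and is_whitespace_only(c))]


def trim(value: str) -> str:
    """Remove leading and trailing whitespace characters from a value."""
    return value.strip(WHITESPACE)
```

Note that only the four characters listed in Section 3 count as whitespace here; other Unicode space characters are left untouched.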

4.3. href attribute

The value of the href attribute on an externalRef or include element is transformed into an absoluteURI as defined in [RFC 2396] (as modified by [RFC 2732]). First, disallowed characters are escaped as specified in Section 5.4 of [XLink]. Next, the URI reference is resolved into an absolute form of URI reference as described in section 5.2 of [RFC 2396] using the base URI from the context of the element that bears the href attribute. The resulting URI reference must not have a fragment identifier (i.e. must not contain a # character).
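
A rough sketch of this rule, using the Python standard library, is shown below. It rests on two assumptions that should be checked before relying on it: `urllib.parse.urljoin` follows a later URI resolution algorithm than section 5.2 of [RFC 2396] and may differ in edge cases, and the set of characters left unescaped is our reading of Section 5.4 of [XLink]. The function name `resolve_href` is hypothetical.

```python
from urllib.parse import quote, urljoin

# Characters left unescaped: RFC 2396 reserved and mark characters, plus
# "#", "%", "[" and "]". Letters, digits and "_.-~" are never escaped by quote().
_SAFE = ";/?:@&=+$,!*'()#%[]"


def resolve_href(base_uri: str, href: str) -> str:
    """Escape disallowed characters, resolve against base_uri, reject fragments."""
    escaped = quote(href, safe=_SAFE)
    resolved = urljoin(base_uri, escaped)
    if "#" in resolved:
        raise ValueError("href must not contain a fragment identifier: " + resolved)
    return resolved
```

For example, resolving `sub dir/inc.rng` against `http://example.org/schemas/main.rng` escapes the space before the reference is made absolute.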

4.4. externalRef element

The value of the href attribute is dereferenced and parsed into an instance of the data model. The resulting element must match the syntax for pattern. The element is transformed by recursively applying the rules from previous sections and this section. This must not result in a loop. In other words, the transformation of the referenced element must not require the dereferencing of an externalRef element with an href attribute with the same value.

Any ns or datatypeLibrary attribute on the externalRef element is transferred to the referenced element. The externalRef element is then replaced by the referenced element.

4.5. name attribute on start element

If a start element has a name attribute, then the name attribute is removed and a define element is added as a sibling of the start element. Thus,

<start name="n">
  p
</start>

is transformed into

<start>
  p
</start>
<define name="n">
  p
</define>

The new define element has all the attributes that the old start element had. The new start element has all the attributes that the old start element had other than the name attribute. Thus,

<start name="n" combine="c">
  p
</start>

is transformed into

<start combine="c">
  p
</start>
<define name="n" combine="c">
  p
</define>

4.6. include element

An include element is transformed as follows. The value of the href attribute is dereferenced and parsed into an instance of the data model. The resulting element must be a grammar element, matching the syntax for grammar.

This grammar element is transformed by recursively applying the rules from previous sections and this section. This must not result in a loop. In other words, the transformation of the grammar element must not require the dereferencing of an include element with an href attribute with the same value.

Define the components of an element to be the children of the element together with the components of any div child elements. If the include element has a start component, then the grammar element must have a start component. If the include element has a start component, then all start components are removed from the grammar element. If the include element has a define component, then the grammar element must have a define component with the same name. For every define component of the include element, all define components with the same name are removed from the grammar element.

The include element is transformed into a div element. The attributes of the div element are the attributes of the include element other than the href attribute. The children of the div element are the children of the include element followed by the children of the grammar element (after the removal of the start and define components described by the preceding paragraph).

4.7. datatypeLibrary attribute

For any data or value element that does not have a datatypeLibrary attribute, a datatypeLibrary attribute is added. The value of the added datatypeLibrary attribute is the value of the datatypeLibrary attribute of the nearest ancestor element that has a datatypeLibrary attribute, or the empty string if there is no such ancestor. Then, any datatypeLibrary attribute that is on an element other than data or value is removed.

4.8. name attribute of element and attribute elements

The name attribute on an element or attribute element is transformed into a name child element.

If an attribute element has a name attribute but no ns attribute and does not have a global attribute with the value true or 1, then an ns="" attribute is added to the name child element.

4.9. ns attribute

For any name, nsName, key, keyRef or value element that does not have an ns attribute, an ns attribute is added. The value of the added ns attribute is the value of the ns attribute of the nearest ancestor element that has an ns attribute, or the empty string if there is no such ancestor. Then, any ns attribute that is on an element other than name, nsName, key, keyRef or value is removed.
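
The inheritance described above can be sketched with `xml.etree.ElementTree`. For brevity the sketch assumes unqualified element names (a real schema uses the RELAX NG namespace URI), and the function name `propagate_ns` is hypothetical. The same shape would apply to the datatypeLibrary rule of Section 4.7 with a different set of target elements.

```python
import xml.etree.ElementTree as ET

NS_BEARERS = {"name", "nsName", "key", "keyRef", "value"}


def propagate_ns(elem: ET.Element, inherited: str = "") -> None:
    """Push ns values down to the elements that keep them; drop them elsewhere."""
    own = elem.get("ns")
    # Value seen by descendants: this element's ns if present, else what it inherited.
    for_children = own if own is not None else inherited
    if elem.tag in NS_BEARERS:
        if own is None:
            elem.set("ns", inherited)
    elif own is not None:
        del elem.attrib["ns"]
    for child in elem:
        propagate_ns(child, for_children)
```

The value passed to the children is computed before the attribute is deleted, so the result is the same as adding all ns attributes first and removing the unwanted ones afterwards.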

4.10. QNames

For any name element containing a prefix, the prefix is removed and an ns attribute is added replacing any existing ns attribute. For any name attribute on a key or keyRef element containing a prefix, the prefix is removed and an ns attribute is added replacing any existing ns attribute. The value of the added ns attribute is the value to which the prefix is mapped by the namespace map of the context of the name element or the element bearing the name attribute. The context must have a mapping for the prefix.

4.11. div element

Each div element is replaced by its children.

4.12. Number of child elements

A define, start, oneOrMore, zeroOrMore, optional or mixed element is transformed so that it has exactly one child element. If it has more than one child element, then its child elements are wrapped in a group element. Similarly, an element element is transformed so that it has exactly two child elements, the first being a name class and the second being a pattern. If it has more than two child elements, then the child elements other than the first are wrapped in a group element.

An except element is transformed so that it has exactly one child element. If it has more than one child element, then its child elements are wrapped in a choice element.

If an attribute element has only one child element (a name class), then a text element is added.

A choice, group or interleave element is transformed so that it has exactly two child elements. If it has one child element, then it is replaced by its child element. If it has more than two child elements, then the first two child elements are combined into a new element with the same name as the parent element and with the first two child elements as its children. For example,

<choice> p1 p2 p3 </choice>

is transformed to

<choice> <choice> p1 p2 </choice> p3 </choice>

This reduces the number of child elements by one. The transformation is applied repeatedly until it has exactly two child elements.
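
The repeated application can be sketched as a left fold. In this illustration a pattern is assumed to be a `(tag, children)` pair and the function name `binarize` is hypothetical.

```python
def binarize(tag, children):
    """Reduce a choice, group or interleave to exactly two children.

    Patterns are (tag, children) pairs; a one-child element collapses to its child.
    """
    if not children:
        raise ValueError("the full syntax requires at least one child")
    if len(children) == 1:
        return children[0]
    # Combine the first two children, then fold in the rest one at a time.
    result = (tag, [children[0], children[1]])
    for child in children[2:]:
        result = (tag, [result, child])
    return result
```

Applied to a choice with children p1, p2 and p3, this yields the nesting shown in the example above.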

4.13. mixed element

A mixed element is transformed into an interleaving with a text element:

<mixed> p </mixed>

is transformed into

<interleave> p <text/> </interleave>

4.14. optional element

An optional element is transformed into a choice with empty:

<optional> p </optional>

is transformed into

<choice> p <empty/> </choice>

4.15. zeroOrMore element

A zeroOrMore element is transformed into a choice between oneOrMore and empty:

<zeroOrMore> p </zeroOrMore>

is transformed into

<choice> <oneOrMore> p </oneOrMore> <empty/> </choice>
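
The rewrites of Sections 4.13 to 4.15 can be sketched together. The sketch assumes patterns are nested tuples of the form `(tag, child, ...)` and that Section 4.12 has already left each mixed, optional and zeroOrMore element with a single child; the function name `desugar` is hypothetical.

```python
EMPTY = ("empty",)
TEXT = ("text",)


def desugar(pattern):
    """Rewrite mixed, optional and zeroOrMore, assuming each has a single child."""
    tag, *children = pattern
    # Rewrite nested patterns first; non-tuple children (e.g. names) are kept as-is.
    children = [desugar(c) if isinstance(c, tuple) else c for c in children]
    if tag == "mixed":
        return ("interleave", children[0], TEXT)
    if tag == "optional":
        return ("choice", children[0], EMPTY)
    if tag == "zeroOrMore":
        return ("choice", ("oneOrMore", children[0]), EMPTY)
    return (tag, *children)
```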

4.16. combine attribute

For each grammar element, all define elements with the same name are combined together. For any name, there must not be more than one define element with that name that does not have a combine attribute. For any name, if there is a define element with that name that has a combine attribute with the value choice, then there must not also be a define element with that name that has a combine attribute with the value interleave. A pair of definitions

<define name="n">
  p1
</define>
<define name="n">
  p2
</define>

is combined into

<define name="n">
  <c>
    p1
    p2
  </c>
</define>

where c is the value of the combine attribute.

Similarly, for each grammar element all start elements are combined together. There must not be more than one start element that does not have a combine attribute. If there is a start element that has a combine attribute with the value choice, there must not also be a start element that has a combine attribute with the value interleave.
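
The combination of define elements, including the two constraints, can be sketched as follows. The input representation (a list of `(name, combine, pattern)` triples, with `combine` set to `None` when the attribute is absent) and the function name are assumptions made for this illustration.

```python
def combine_defines(defines):
    """Merge (name, combine, pattern) triples that share a name.

    combine is "choice", "interleave" or None. Returns {name: pattern}, where
    merged patterns are nested (method, left, right) tuples.
    """
    grouped = {}
    for name, combine, pattern in defines:
        grouped.setdefault(name, []).append((combine, pattern))

    result = {}
    for name, entries in grouped.items():
        methods = {c for c, _ in entries if c is not None}
        if sum(1 for c, _ in entries if c is None) > 1:
            raise ValueError(f"more than one define for {name!r} lacks combine")
        if len(methods) > 1:
            raise ValueError(f"define {name!r} mixes choice and interleave")
        merged = entries[0][1]
        for _, pattern in entries[1:]:
            # methods is non-empty here: at most one entry may lack combine.
            merged = (next(iter(methods)), merged, pattern)
        result[name] = merged
    return result
```

The same procedure would apply to start elements by treating them as definitions that all share one reserved name.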

4.17. grammar element

In this rule, the schema is transformed so that its top-level element is grammar and so that it has no other grammar elements.

Define the in-scope grammar for an element to be the nearest ancestor grammar element. A ref element refers to a define element if the value of their name attributes is the same and their in-scope grammars are the same. A parentRef element refers to a define element if the value of their name attributes is the same and the in-scope grammar of the in-scope grammar of the parentRef element is the same as the in-scope grammar of the define element. Every ref or parentRef element must refer to a define element. A grammar must have a start child element.

First, transform the top-level pattern p into <grammar><start>p</start></grammar>. Next, rename define elements so that no two define elements anywhere in the schema have the same name. To rename a define element, change the value of its name attribute and change the value of the name attribute of all ref and parentRef elements that refer to that define element. Next, move all define elements to be children of the top-level grammar element and replace each nested grammar element by the child of its start element.

4.18. define and ref elements

In this rule, the grammar is transformed so that every element element is the child of a define element, and the child of every define element is an element element.

First, remove any define element which does not have any ref element referring to it. Now, for each element element that is not the child of a define element, add a define element to the grammar element, and replace the element element by a ref element referring to the added define element. The value of the name attribute of the added define element must be different from the value of the name attribute of all other define elements. The child of the added define element is the element element.

Define a ref element to be expandable if it refers to a define element whose child is not an element element. For each ref element that is expandable and is a descendant of a start element or an element element, expand it by replacing the ref element by the child of the define element to which it refers and then recursively expanding any expandable ref elements in this replacement. This must not result in a loop. In other words, expanding the replacement of a ref element having a name with value n must not require the expansion of a ref element also having a name with value n. Finally, remove any define element whose child is not an element element.

4.19. notAllowed element

In this rule, the grammar is transformed so that a notAllowed element occurs only as the child of a start or element element. An attribute, list, group, interleave, oneOrMore, key or keyRef element that has a notAllowed child element is transformed into a notAllowed element. A choice element that has two notAllowed child elements is transformed into a notAllowed element. A choice element that has one notAllowed child element is transformed into its other child element.
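
A bottom-up rewrite implementing this rule might look like the following sketch. Patterns are assumed to be nested tuples of the form `(tag, child, ...)` in which choice has exactly two children; the names `remove_not_allowed` and `PROPAGATING` are hypothetical.

```python
NOT_ALLOWED = ("notAllowed",)
PROPAGATING = {"attribute", "list", "group", "interleave", "oneOrMore", "key", "keyRef"}


def remove_not_allowed(pattern):
    """Rewrite a (tag, *children) pattern bottom-up so notAllowed bubbles upward."""
    tag, *children = pattern
    children = [remove_not_allowed(c) if isinstance(c, tuple) else c for c in children]
    if tag in PROPAGATING and NOT_ALLOWED in children:
        return NOT_ALLOWED
    if tag == "choice" and NOT_ALLOWED in children:
        remaining = [c for c in children if c != NOT_ALLOWED]
        # Both children notAllowed: the choice itself is notAllowed.
        return remaining[0] if remaining else NOT_ALLOWED
    return (tag, *children)
```

Because element is not in the propagating set, a notAllowed that reaches an element element stops there, which matches the stated goal of the rule.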

4.20. empty element

In this rule, the grammar is transformed so that an empty element does not occur as a child of a group, interleave, or oneOrMore element or as the second child of a choice element. A group, interleave or choice element that has two empty child elements is transformed into an empty element. A group or interleave element that has one empty child element is transformed into its other child element. A choice element whose second child element is an empty element is transformed by interchanging its two child elements. A oneOrMore element that has an empty child element is transformed into an empty element.
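
The empty rule can be sketched in the same style. Patterns are again assumed to be nested tuples `(tag, child, ...)` in which group, interleave and choice have exactly two children; the function name `remove_empty` is hypothetical.

```python
EMPTY = ("empty",)


def remove_empty(pattern):
    """Rewrite a (tag, *children) pattern bottom-up, following the empty rule."""
    tag, *children = pattern
    children = [remove_empty(c) if isinstance(c, tuple) else c for c in children]
    if tag in ("group", "interleave", "choice") and children == [EMPTY, EMPTY]:
        return EMPTY
    if tag in ("group", "interleave") and EMPTY in children:
        # Exactly one child is empty here; keep the other one.
        return children[1] if children[0] == EMPTY else children[0]
    if tag == "choice" and children[1] == EMPTY:
        # Interchange the children so that empty comes first.
        return (tag, children[1], children[0])
    if tag == "oneOrMore" and children == [EMPTY]:
        return EMPTY
    return (tag, *children)
```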

5. Simple syntax

After applying all the rules in Section 4, the schema will match the following grammar:

grammar  ::=  <grammar> <start> top </start> define* </grammar>
define  ::=  <define name="NCName"> <element> nameClass top </element> </define>
top  ::=  <notAllowed/>
| pattern
pattern  ::=  <empty/>
| nonEmptyPattern
nonEmptyPattern  ::=  <text/>
| <data type="NCName" datatypeLibrary="anyURI"> param* exceptPattern? </data>
| <value datatypeLibrary="anyURI" type="NCName" ns="anyURI"> string </value>
| <list> pattern </list>
| <key name="NCName" ns="anyURI"> pattern </key>
| <keyRef name="NCName" ns="anyURI"> pattern </keyRef>
| <attribute> nameClass pattern </attribute>
| <ref name="NCName"/>
| <oneOrMore> nonEmptyPattern </oneOrMore>
| <choice> pattern nonEmptyPattern </choice>
| <group> nonEmptyPattern nonEmptyPattern </group>
| <interleave> nonEmptyPattern nonEmptyPattern </interleave>
param  ::=  <param name="NCName"> string </param>
exceptPattern  ::=  <except> pattern </except>
nameClass  ::=  <anyName> exceptNameClass? </anyName>
| <nsName ns="anyURI"> exceptNameClass? </nsName>
| <name ns="anyURI"> NCName </name>
| <choice> nameClass nameClass </choice>
exceptNameClass  ::=  <except> nameClass </except>

With this grammar, no elements or attributes are allowed other than those explicitly shown.

6. Semantics

In this section, we define the semantics of a correct RELAX NG schema that has been transformed into the simple syntax. The semantics of a RELAX NG schema consist of a specification of what XML documents are valid with respect to that schema. The semantics are described formally as a proof system. A proof system consists of axioms and inference rules. Axioms are propositions that are provable unconditionally. An inference rule consists of one or more antecedents and exactly one consequent. If the antecedents of an inference rule are all provable, then the consequent of the inference rule is also provable. An XML document is valid with respect to a RELAX NG schema if and only if the proposition that it is valid is provable in the proof system described in this section.

The notation for inference rules separates the antecedents from the consequent by a horizontal line: the antecedents are above the line; the consequent is below the line. Both axioms and inference rules may use variables. A variable has a name and optionally a subscript. The name of a variable is italicized. Each variable has a range that is determined by its name. Axioms and inference rules are implicitly universally quantified over the variables they contain. We explain this further below.

6.1. Name classes

The main semantic concept for name classes is that of a name belonging to a name class. A name class is an element that matches the production nameClass. A name is as defined in Section 2: it consists of a namespace URI and a local name.

We use the following notation:

n
is a variable that ranges over names
nc
ranges over name classes
n in nc
asserts that name n is a member of name class nc

We are now ready for our first axiom, which is called "anyName 1":

(anyName 1)
n in <anyName/>

This says for any name n, n belongs to the name class <anyName/>, in other words <anyName/> matches any name. Note the effect of the implicit universal quantification over the variables in the axiom: this is what makes the rule apply for any name n.

Our first inference rule is almost as simple:

(anyName 2)
not(n in nc)
-----------------------------------------------
n in <anyName> <except> nc </except> </anyName>

This says that for any name n and for any name class nc, if n does not belong to nc, then n belongs to <anyName> <except> nc </except> </anyName>. In other words, <anyName> <except> nc </except> </anyName> matches any name that does not match nc.

We now need the following additional notation:

ln
ranges over local names (NCNames, names without prefixes)
u
ranges over URIs
name( u, ln )
constructs a name with URI u and local name ln

The remaining axioms and inference rules for name classes are as follows:

(nsName 1)
name( u, ln ) in <nsName ns="u"/>

(nsName 2)
not(name( u, ln ) in nc)
----------------------------------------------------------------
name( u, ln ) in <nsName ns="u"> <except> nc </except> </nsName>

(name)
name( u, ln ) in <name ns="u"> ln </name>

(name choice 1)
n in nc1
-------------------------------
n in <choice> nc1 nc2 </choice>

(name choice 2)
n in nc2
-------------------------------
n in <choice> nc1 nc2 </choice>
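
Read operationally, the axioms and inference rules for name classes amount to a recursive membership test. The following sketch is illustrative only; the tuple encoding of name classes and the function name `contains` are assumptions.

```python
def contains(nc, name):
    """Return True if name (a (uri, local_name) pair) belongs to name class nc.

    Name classes are tuples: ("anyName", except_or_None),
    ("nsName", uri, except_or_None), ("name", uri, local_name),
    ("choice", nc1, nc2).
    """
    uri, local_name = name
    kind = nc[0]
    if kind == "anyName":
        return nc[1] is None or not contains(nc[1], name)
    if kind == "nsName":
        return uri == nc[1] and (nc[2] is None or not contains(nc[2], name))
    if kind == "name":
        return (uri, local_name) == (nc[1], nc[2])
    if kind == "choice":
        return contains(nc[1], name) or contains(nc[2], name)
    raise ValueError(f"unknown name class: {kind!r}")
```

The negated antecedents of rules (anyName 2) and (nsName 2) correspond to the `not contains(...)` tests on the except child.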

6.2. Patterns

The axioms and inference rules for patterns use the following notation:

cx
ranges over contexts (as defined in Section 2)
a
ranges over bags (unordered collections) of attributes; a bag with a single member is considered the same as that member
m
ranges over sequences of elements and strings; a sequence with a single member is considered the same as that member
p
ranges over patterns (elements matching the pattern production)
k
ranges over bags of keys; a key is an object generated by matching a key element
kr
ranges over bags of key references; a key reference is an object generated by matching a keyRef element
cx |- a; m =~ p => k; kr
asserts that with respect to context cx, the attributes a and the sequence of elements and strings m match the pattern p generating the collection of keys k and key references kr

6.2.1. choice pattern

The semantics of the choice pattern are as follows:

(choice 1)
cx |- a; m =~ p1 => k; kr
-----------------------------------------------
cx |- a; m =~ <choice> p1 p2 </choice> => k; kr

(choice 2)
cx |- a; m =~ p2 => k; kr
-----------------------------------------------
cx |- a; m =~ <choice> p1 p2 </choice> => k; kr

6.2.2. group pattern

We use the following additional notation:

a1 + a2
represents the bag union of a1 and a2 (the number of occurrences of any member of a1 + a2 is the sum of the number of its occurrences in a1 and a2)
m1, m2
represents the concatenation of the sequences m1 and m2

The semantics of the group pattern are as follows:

(group)
cx |- a1; m1 =~ p1 => k1; kr1    cx |- a2; m2 =~ p2 => k2; kr2
---------------------------------------------------------------------
cx |- a1 + a2; m1, m2 =~ <group> p1 p2 </group> => k1 + k2; kr1 + kr2

Note

The restriction in Section 7.2 ensures that the bag of attributes constructed in the consequent will not have multiple attributes with the same name.

6.2.3. empty pattern

We use the following additional notation:

( )
represents an empty sequence
{ }
represents an empty bag

The semantics of the empty pattern are as follows:

(empty)
cx |- { }; ( ) =~ <empty/> => { }; { }

6.2.4. text pattern

We use the following additional notation:

s
ranges over strings

The semantics of the text pattern are as follows:

(text 1)
cx |- { }; s =~ <text/> => { }; { }

(text 2)
cx |- { }; m =~ <text/> => { }; { }
--------------------------------------
cx |- { }; m, s =~ <text/> => { }; { }

The effect of the above rule is that a text element matches zero or more strings.

6.2.5. oneOrMore pattern

We use the following additional notation:

disjoint(a1, a2)
asserts that there is no name that is the name of both an attribute in a1 and of an attribute in a2

The semantics of the oneOrMore pattern are as follows:

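(oneOrMore 1)
cx |- a; m =~ p => k; kr
-------------------------------------------------
cx |- a; m =~ <oneOrMore> p </oneOrMore> => k; kr

(oneOrMore 2)
cx |- a1; m1 =~ p => k1; kr1    cx |- a2; m2 =~ <oneOrMore> p </oneOrMore> => k2; kr2    disjoint(a1, a2)
--------------------------------------------------------------------------------------------------------
cx |- a1 + a2; m1, m2 =~ <oneOrMore> p </oneOrMore> => k1 + k2; kr1 + kr2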

6.2.6. interleave pattern

We use the following additional notation:

m1 interleaves m2; m3
asserts that m1 is an interleaving of m2 and m3

The semantics of interleaving are defined by the following rules.

(interleaves 1)
( ) interleaves ( ); ( )

(interleaves 2)
m1 interleaves m2; m3
-----------------------------
m4, m1 interleaves m4, m2; m3

(interleaves 3)
m1 interleaves m2; m3
-----------------------------
m4, m1 interleaves m2; m4, m3
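
The three interleaves rules above can be checked mechanically. The sketch below decides whether m1 is an interleaving of m2 and m3 by consuming one member at a time, which is equivalent to applying rules (interleaves 2) and (interleaves 3) with single-member prefixes. The function name and the use of Python sequences are assumptions.

```python
from functools import lru_cache


def interleaves(m1, m2, m3):
    """True if sequence m1 is an interleaving of sequences m2 and m3."""
    m1, m2, m3 = tuple(m1), tuple(m2), tuple(m3)
    if len(m1) != len(m2) + len(m3):
        return False

    @lru_cache(maxsize=None)
    def go(i, j):
        # i members of m2 and j members of m3 have been consumed so far.
        k = i + j
        if k == len(m1):
            return True
        if i < len(m2) and m1[k] == m2[i] and go(i + 1, j):
            return True
        return j < len(m3) and m1[k] == m3[j] and go(i, j + 1)

    return go(0, 0)
```

Memoizing on the pair of positions keeps the check polynomial even when m2 and m3 share members.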

The semantics of the interleave pattern are as follows:

(interleave)
cx |- a1; m1 =~ p1 => k1; kr1    cx |- a2; m2 =~ p2 => k2; kr2    m3 interleaves m1; m2
----------------------------------------------------------------------------------------
cx |- a1 + a2; m3 =~ <interleave> p1 p2 </interleave> => k1 + k2; kr1 + kr2

Note

The restriction in Section 7.2 ensures that the bag of attributes constructed in the consequent will not have multiple attributes with the same name.

6.2.7. attribute pattern

We use the following additional notation:

v
ranges over strings and the empty sequence; this is a subset of the range of m
toString( v )
returns an empty string if v is the empty sequence and otherwise returns v
attribute( n, s )
constructs an attribute with name n and value s

The semantics of the attribute pattern are as follows:

(attribute)
cx |- { }; v =~ p => k; kr    n in nc
----------------------------------------------------------------------------------
cx |- attribute( n, toString( v ) ); ( ) =~ <attribute> nc p </attribute> => k; kr

6.2.8. element pattern

We use the following additional notation:

normalized(m)
asserts that the mixed sequence m is normalized: it does not contain any member that is an empty string, nor does it contain two consecutive members that are both strings
element( n, cx, a, m )
constructs an element with name n, context cx, attributes a and mixed sequence m as children
stripSpace( m )
returns the sequence m after removing any member that is a string consisting entirely of whitespace
deref(ln) = <element> nc p </element>
asserts that the grammar contains <define name="ln"> <element> nc p </element> </define>
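
A minimal sketch of normalized and stripSpace follows. It assumes that a mixed sequence is a Python list whose string members are str objects and whose element members are any other objects, and that whitespace means the four XML 1.0 whitespace characters. Both are assumptions of this sketch.

```python
XML_WHITESPACE = " \t\r\n"  # assumed: space, tab, carriage return, line feed


def is_whitespace_only(s):
    """True if every character of s is XML whitespace (also true for "")."""
    return all(ch in XML_WHITESPACE for ch in s)


def normalized(m):
    """True if m has no empty string and no two consecutive strings."""
    previous_was_string = False
    for member in m:
        is_string = isinstance(member, str)
        if is_string and (member == "" or previous_was_string):
            return False
        previous_was_string = is_string
    return True


def strip_space(m):
    """Return m without the members that are whitespace-only strings."""
    return [
        member
        for member in m
        if not (isinstance(member, str) and is_whitespace_only(member))
    ]
```

Note that strip_space removes whole members only; a string such as "x " keeps its trailing space.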

The semantics of the element pattern are as follows:

(element)
cx1 |- a; stripSpace( m ) =~ p => k; kr    n in nc    normalized(m)    deref(ln) = <element> nc p </element>

cx2 |- { }; element( n, cx1, a, m ) =~ <ref name="ln"/> => k; kr

6.2.9. data and value pattern

RELAX NG relies on datatype libraries to perform datatyping. A datatype library is identified by a URI. A datatype within a datatype library is identified by an NCName. A datatype library provides two services.

  • It can determine whether a string is a legal representation of a datatype. This service accepts a list of zero or more parameters. For example, a string datatype might have a parameter specifying the length of a string. The datatype library determines what parameters are applicable for each datatype.
  • It can determine whether two strings represent the same value of a datatype. This service does not have any parameters.

Both services may make use of the context of a string. For example, a datatype representing a QName would use the namespace map.
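
The two services might be modelled as an interface with two methods. The sketch below is hypothetical throughout: the names DatatypeLibrary, ToyLibrary, allows and equal, the representation of parameters as (name, value) pairs, and the single "length" parameter are all invented for illustration and are not defined by this specification.

```python
from typing import Mapping, Protocol, Sequence, Tuple

Param = Tuple[str, str]      # assumed: (parameter name, parameter value)
Context = Mapping[str, str]  # assumed: namespace prefix -> namespace URI


class DatatypeLibrary(Protocol):
    """Hypothetical interface for the two services of a datatype library."""

    def allows(self, ln: str, params: Sequence[Param], s: str, cx: Context) -> bool:
        """Is s, interpreted in context cx, a legal value of datatype ln?"""
        ...

    def equal(self, ln: str, s1: str, cx1: Context, s2: str, cx2: Context) -> bool:
        """Do s1 and s2 represent the same value of datatype ln?"""
        ...


class ToyLibrary:
    """A toy library with one datatype, "string", and one parameter, "length"."""

    def allows(self, ln, params, s, cx):
        if ln != "string":
            return False
        for name, value in params:
            if name != "length":
                return False  # this toy library rejects parameters it does not know
            if len(s) != int(value):
                return False
        return True

    def equal(self, ln, s1, cx1, s2, cx2):
        # This toy datatype ignores the contexts; a QName datatype would not.
        return ln == "string" and s1 == s2
```

For example, ToyLibrary().allows("string", [("length", "3")], "abc", {}) holds, whereas the same call with "abcd" does not.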

We use the following additional notation:

datatypeAllows(u, ln, params, s, cx)
asserts that in the datatype library identified by URI u, the string s interpreted with context cx is a legal value of datatype ln with parameters params
datatypeEqual(u, ln, s1, cx1, s2, cx2)
asserts that in the datatype library identified by URI u, string s1 interpreted with context cx1 represents the same value of the datatype ln as the string s2 interpreted in the context of cx2
params
ranges over sequences of parameters
[cx]
within the start-tag of a pattern refers to the context of the pattern element
""
represents an empty string
context( u, cx )
constructs a context which is the same as cx except that the default namespace is u; if u is the empty string, then there is no default namespace in the constructed context

The semantics of the data and value patterns are as follows:

(value)
datatypeEqual(u1, ln, s1, cx1, s2, context( u2, cx2 ))

cx1 |- { }; s1 =~ <value datatypeLibrary="u1" type="ln" ns="u2" [cx2]> s2 </value> => { }; { }
(data 1)
datatypeAllows(u, ln, params, s, cx)

cx |- { }; s =~ <data datatypeLibrary="u" type="ln"> params </data> => { }; { }
(data 2)
datatypeAllows(u, ln, params, s, cx)    not(cx |- a; s =~ p => k; kr)

cx |- { }; s =~ <data datatypeLibrary="u" type="ln"> params <except> p </except> </data> => { }; { }
(empty string)
cx |- { }; "" =~ p => k; kr

cx |- { }; ( ) =~ p => k; kr

Note

In the data model, an empty element such as <foo></foo> will have an empty sequence as its children, whereas an empty attribute such as occurs in <foo bar=""></foo> will have an empty string as its value. The "empty string" inference rule ensures that if a datatype allows an empty string and so matches the value of an empty attribute, then it will also match the content of an empty element.

6.2.10. Builtin datatype library

The empty URI identifies a special builtin datatype library. This provides two datatypes, string and token. No parameters are allowed for either of these datatypes.

s1 = s2
asserts that s1 and s2 are identical
normalizeWhiteSpace( s )
returns the string s, with leading and trailing whitespace characters removed, and with each other maximal sequence of whitespace characters replaced by a single space character

The semantics of the two builtin datatypes are as follows:

(string allows)
datatypeAllows("", string, ( ), s, cx)
(string equal)
datatypeEqual("", string, s, cx1, s, cx2)
(token allows)
datatypeAllows("", token, ( ), s, cx)
(token equal)
normalizeWhiteSpace( s1 ) = normalizeWhiteSpace( s2 )

datatypeEqual("", token, s1, cx1, s2, cx2)
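
The equality rules for the two builtin datatypes can be sketched as below. Whitespace is assumed here to mean the four XML 1.0 whitespace characters, and the function names are illustrative only.

```python
import re

# Assumed: XML 1.0 whitespace is space, tab, carriage return, line feed.
_WHITESPACE_RUN = re.compile(r"[ \t\r\n]+")


def normalize_white_space(s):
    """Collapse each whitespace run to one space, then trim both ends."""
    return _WHITESPACE_RUN.sub(" ", s).strip(" ")


def string_equal(s1, s2):
    """(string equal): the two strings must be identical."""
    return s1 == s2


def token_equal(s1, s2):
    """(token equal): compare after whitespace normalization."""
    return normalize_white_space(s1) == normalize_white_space(s2)
```

Under these definitions " a  b" and "a b" are equal as token values but not as string values.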

6.2.11. list pattern

We use the following additional notation:

split( s )
returns a sequence of strings one for each whitespace delimited token of s; each string in the returned sequence will be non-empty and will not contain any whitespace

The semantics of the list pattern are as follows:

(list)
cx |- { }; split( s ) =~ p => k; kr

cx |- { }; s =~ <list> p </list> => k; kr
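
A minimal sketch of split, again assuming that whitespace means the four XML 1.0 whitespace characters:

```python
import re


def split(s):
    """Return the whitespace-delimited tokens of s, in order.

    Every returned string is non-empty and contains no whitespace.
    Assumed whitespace: space, tab, carriage return, line feed.
    """
    return [token for token in re.split(r"[ \t\r\n]+", s) if token]
```

A string that is empty or consists only of whitespace yields the empty sequence, so the pattern inside the list element must then match an empty sequence.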

6.2.12. key and keyRef pattern

We use the following additional notation:

key( n, u, ln, s, cx )
constructs a key in symbol space n with datatype ln in the datatype library u with lexical value s interpreted with respect to context cx
keyref( n, u, ln, s, cx )
constructs a key reference in symbol space n with datatype ln in the datatype library u with lexical value s interpreted with respect to context cx
p :d u, ln
asserts that the pattern p has datatype ln from library u
exceptPattern
ranges over elements matching the exceptPattern production

The semantics of the key and keyRef patterns are as follows:

(key)
cx |- { }; s =~ p => { }; { }    p :d u1, ln1

cx |- { }; s =~ <key ns="u2" name="ln2"> p </key> => key( name( u2, ln2 ), u1, ln1, s, cx ); { }
(keyRef)
cx |- { }; s =~ p => { }; { }    p :d u1, ln1

cx |- { }; s =~ <keyRef ns="u2" name="ln2"> p </keyRef> => { }; keyref( name( u2, ln2 ), u1, ln1, s, cx )
(datatype data)
<data type="ln" datatypeLibrary="u"> params exceptPattern </data> :d u, ln
(datatype value)
<value type="ln" datatypeLibrary="u"> s </value> :d u, ln
(datatype choice 1)
p1 :d u, ln

<choice> p1 p2 </choice> :d u, ln
(datatype choice 2)
p2 :d u, ln

<choice> p1 p2 </choice> :d u, ln

6.3. Keys

There are two concepts relating to bags of keys and key references. One is that a bag of keys has no conflicting keys. The other is that a bag of keys has a key to satisfy every key reference in a bag of key references.

We use the following additional notation:

keyConflict(k)
asserts that there are conflicting keys in the bag of keys k
k1 subset k2
asserts that k1 is a subset of k2
keyComplete(k, kr)
asserts that the bag of keys k has a key that satisfies every key reference in the bag of key references kr
(keyConflict)
datatypeEqual(u, ln, s1, cx1, s2, cx2)    key( n, u, ln, s1, cx1 ) + key( n, u, ln, s2, cx2 ) subset k

keyConflict(k)
(keyComplete 1)
keyComplete(k + key( n, u, ln, s1, cx1 ), kr)    datatypeEqual(u, ln, s1, cx1, s2, cx2)

keyComplete(k + key( n, u, ln, s1, cx1 ), kr + keyref( n, u, ln, s2, cx2 ))
(keyComplete 2)
keyComplete(k, { })
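
The two checks can be sketched as follows. This is an illustration under stated assumptions, not a definitive implementation: bags are Python lists, keys and key references are records with the five fields used by key( n, u, ln, s, cx ) and keyref( n, u, ln, s, cx ), and datatype_equal covers only the builtin library (the empty URI). The names Key, KeyRef, key_conflict and key_complete are invented for this sketch.

```python
import re
from itertools import combinations
from typing import NamedTuple


class Key(NamedTuple):
    n: str             # symbol space name
    u: str             # datatype library URI
    ln: str            # datatype name
    s: str             # lexical value
    cx: object = None  # context (ignored by the builtin datatypes)


class KeyRef(NamedTuple):
    n: str
    u: str
    ln: str
    s: str
    cx: object = None


def datatype_equal(u, ln, s1, cx1, s2, cx2):
    """Builtin library only: string and token equality."""
    if u != "":
        raise ValueError("only the builtin library is sketched here")
    if ln == "string":
        return s1 == s2
    if ln == "token":
        def tokens(s):
            return [t for t in re.split(r"[ \t\r\n]+", s) if t]
        return tokens(s1) == tokens(s2)
    raise ValueError("unknown builtin datatype: " + ln)


def key_conflict(k):
    """(keyConflict): two members of the bag k have equal values."""
    return any(
        a[:3] == b[:3] and datatype_equal(a.u, a.ln, a.s, a.cx, b.s, b.cx)
        for a, b in combinations(k, 2)
    )


def key_complete(k, kr):
    """(keyComplete 1, 2): every reference in kr is satisfied by a key in k."""
    return all(
        any(
            key[:3] == ref[:3]
            and datatype_equal(key.u, key.ln, key.s, key.cx, ref.s, ref.cx)
            for key in k
        )
        for ref in kr
    )
```

Because combinations pairs list positions rather than values, a key that occurs twice in the bag counts as a conflict, which matches the bag semantics of the (keyConflict) rule.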

6.4. Validity

We use the following additional notation:

e
ranges over elements
valid(e)
asserts that the element e is valid with respect to the grammar
start() = p
asserts that the grammar contains <start> p </start>

The following inference rule defines when an element is valid:

(valid)
start() = p    cx |- { }; e =~ p => k; kr    keyComplete(k, kr)    not(keyConflict(k))

valid(e)

7. Restrictions

The following constraints are all checked after the grammar has been transformed to the simple form described in Section 5. The purpose of these restrictions is to catch user errors and to facilitate implementation.

7.1. Composition

A correct RELAX NG schema must be such that when transformed into the simple syntax, it matches the following grammar.

grammar  ::=  <grammar> <start> top </start> define* </grammar>
define  ::=  <define name="NCName"> <element> nameClass top </element> </define>
top  ::=  <notAllowed/>
| pattern
pattern  ::=  <empty/>
| nonEmptyPattern
nonEmptyPattern  ::=  attributeGroup
| value
| repeatable
| <choice> pattern nonEmptyPattern </choice>
| <group> attributeGroup nonEmptyPattern </group>
| <group> nonEmptyPattern attributeGroup </group>
| <interleave> attributeGroup nonEmptyPattern </interleave>
| <interleave> nonEmptyPattern attributeGroup </interleave>
attributeGroup  ::=  singleAttributeGroup
| <oneOrMore> singleAttributeGroup </oneOrMore>
| <group> attributeGroup attributeGroup </group>
| <interleave> attributeGroup attributeGroup </interleave>
| <choice> (<empty/> | attributeGroup) attributeGroup </choice>
singleAttributeGroup  ::=  <attribute> nameClass (<empty/> | value) </attribute>
| <attribute> nameClass </attribute>
| <choice> (<empty/> | value) singleAttributeGroup </choice>
repeatable  ::=  mixed
| singleAttributeGroup
| <choice> (<empty/> | repeatable) repeatable </choice>
| <oneOrMore> repeatable </oneOrMore>
mixed  ::=  <ref name="NCName"/>
| <text/>
| <group> mixed mixed </group>
| <interleave> mixed mixed </interleave>
| <choice> (<empty/> | mixed) mixed </choice>
| <oneOrMore> mixed </oneOrMore>
value  ::=  token
| <text/>
| <list> (<empty/> | tokens) </list>
| <choice> (<empty/> | value) value </choice>
tokens  ::=  token
| <oneOrMore> tokens </oneOrMore>
| <group> tokens tokens </group>
| <interleave> tokens tokens </interleave>
| <choice> (<empty/> | tokens) tokens </choice>
token  ::=  dataValue
| <key ns="anyURI" name="NCName"> keyAtts dataValueChoice </key>
| <keyRef ns="anyURI" name="NCName"> keyAtts dataValueChoice </keyRef>
dataValueChoice  ::=  dataValue
| <choice> dataValueChoice dataValueChoice </choice>
dataValue  ::=  <data datatypeLibrary="anyURI" type="NCName"> params exceptPattern </data>
| <value datatypeLibrary="anyURI" type="NCName" ns="anyURI"> string </value>
exceptPattern  ::=  <except> dataValueChoice </except>?
nameClass  ::=  anyName
| nsName
| name
| <choice> nameClass nameClass </choice>
nsNameClass  ::=  name
| nsName
| <choice> nsNameClass nsNameClass </choice>
finiteNameClass  ::=  name
| <choice> finiteNameClass finiteNameClass </choice>
anyName  ::=  <anyName> <except> nsNameClass </except>? </anyName>
nsName  ::=  <nsName ns="anyURI"> <except> finiteNameClass </except>? </nsName>
name  ::=  <name ns="anyURI"> NCName </name>

Note

The above grammar is ambiguous. This does not matter because the only use that is made of this grammar is to test whether a schema is in the language generated by the grammar.

7.2. Duplicate attributes

Duplicate attributes are not allowed. More precisely, for a pattern <group> p1 p2 </group> or <interleave> p1 p2 </interleave>, there must not be any attribute name that matches both an attribute pattern occurring in p1 and an attribute pattern occurring in p2.

7.3. Interleave

Note

The TC believes it is desirable to have a constraint on the use of the interleave pattern in order to facilitate implementation, but does not yet have consensus on exactly what this constraint should be. Feedback from implementors is solicited.

7.4. key and keyRef

Every key symbol space must have a unique datatype. More precisely, for every local name n and namespace URI uri, there must be a unique datatype name d in datatype library L, such that for every key and keyRef element with name attribute equal to n and ns attribute equal to uri, all data and value descendant elements have a type attribute with value d and a datatypeLibrary attribute with value L.

It must be possible to determine for any element or attribute whether it is a key or key reference and, if so, the symbol space of the key or key reference, by examining just the name of the element or attribute and the names of the ancestors of that element or attribute. A RELAX NG schema that does not satisfy this constraint is said to be key-ambiguous. A RELAX NG schema that is key-ambiguous is not correct.

We now formalize key-ambiguity. First, we need a concept of containment. We use the following notation:

p1 contains p2
asserts that pattern p1 contains pattern p2

The following rules define when one pattern contains another.

(self)
p contains p
(oneOrMore)
p1 contains <oneOrMore> p2 </oneOrMore>

p1 contains p2
(group 1)
p1 contains <group> p2 p3 </group>

p1 contains p2
(group 2)
p1 contains <group> p2 p3 </group>

p1 contains p3
(interleave 1)
p1 contains <interleave> p2 p3 </interleave>

p1 contains p2
(interleave 2)
p1 contains <interleave> p2 p3 </interleave>

p1 contains p3
(choice 1)
p1 contains <choice> p2 p3 </choice>

p1 contains p2
(choice 2)
p1 contains <choice> p2 p3 </choice>

p1 contains p3
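
Read together, these rules say that p1 contains p2 when p2 can be reached from p1 by descending only through oneOrMore, group, interleave and choice elements. A minimal sketch, assuming patterns are represented as nested tuples whose first item is the element name (a representation chosen for this sketch only):

```python
# Element names through which containment descends, per the rules above.
TRANSPARENT = ("oneOrMore", "group", "interleave", "choice")


def contained(p):
    """Yield every pattern p2 such that p contains p2."""
    yield p  # (self)
    if p[0] in TRANSPARENT:
        for child in p[1:]:
            yield from contained(child)


def contains(p1, p2):
    """Return True if pattern p1 contains pattern p2."""
    return any(q == p2 for q in contained(p1))
```

Containment does not descend into attribute, list, key or keyRef patterns, so with p = ("group", ("oneOrMore", ("ref", "a")), ("attribute", "x", ("text",))), contains(p, ("ref", "a")) holds but contains(p, ("text",)) does not.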

Next, we need the concept of a key-type. A key-type is one of notKey, key, keyRef, keyList or keyRefList. Each key-type other than notKey has a parameter which is a name. We use the following notation:

notKey( )
constructs a notKey key-type
key( n )
constructs a key key-type with parameter n
keyRef( n )
constructs a keyRef key-type with parameter n
keyList( n )
constructs a keyList key-type with parameter n
keyRefList( n )
constructs a keyRefList key-type with parameter n
kt
ranges over key-types
p :k kt
asserts that kt is a possible key-type for pattern p

The following rules define what the key-types of a pattern are.

(data)
<data datatypeLibrary="u" type="ln"> params </data> :k notKey( )
(value)
<value datatypeLibrary="u1" type="ln" ns="u2"> s </value> :k notKey( )
(text)
<text/> :k notKey( )
(key)
<key name="ln" ns="u"> p </key> :k key( name( u, ln ) )
(keyRef)
<keyRef name="ln" ns="u"> p </keyRef> :k keyRef( name( u, ln ) )
(keyList)
p :k key( n )

<list> p </list> :k keyList( n )
(keyRefList)
p :k keyRef( n )

<list> p </list> :k keyRefList( n )
(list)
p :k notKey( )

<list> p </list> :k notKey( )
(contains)
p1 :k kt    p2 contains p1

p2 :k kt

Next, we need the concept of a path (from the root of the tree down to an element or attribute). A path is either an element path or an attribute path. An element path is either the root path or is constructed from an element path and a name. An attribute path is constructed from an element path and a name. We use the following notation:

h
ranges over paths
eh
ranges over element paths
/
denotes the root path
eh/n
constructs an element path from eh and n
eh/@n
constructs an attribute path from eh and n
h => p
asserts that pattern p is a feasible pattern for path h

The following rules define when a pattern is feasible for a path:

(start)
start() = p

/ => p
(element)
n in nc    eh => p1    p1 contains <ref name="ln"/>    deref(ln) = <element> nc p2 </element>

eh/n => p2
(attribute)
n in nc    eh => p1    p1 contains <attribute> nc p2 </attribute>

eh/@n => p2

Finally, we can define key-ambiguity. We use the following notation:

keyAmbig()
asserts that the schema is key-ambiguous
kt1 = kt2
asserts that kt1 is identical to kt2

The following rule defines when a schema is key-ambiguous:

(keyAmbig)
h => p1    p1 :k kt1    h => p2    p2 :k kt2    not(kt1 = kt2)

keyAmbig()

8. Conformance

A conforming RELAX NG validator must be able to determine for any XML document whether it is a correct RELAX NG schema. A conforming RELAX NG validator must be able to determine for any XML document and for any correct RELAX NG schema whether the document is valid with respect to the schema.

A. RELAX NG schema for RELAX NG

<grammar datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes"
  ns="http://relaxng.org/ns/structure/0.9"
  xmlns="http://relaxng.org/ns/structure/0.9">

  <start name="pattern">
    <choice>
      <element name="element">
        <choice>
          <attribute name="name">
            <data type="QName"/>
          </attribute>
          <ref name="open-name-class"/>
        </choice>
        <ref name="common-atts"/>
        <ref name="open-patterns"/>
      </element>
      <element name="attribute">
        <ref name="common-atts"/>
        <choice>
          <group>
            <attribute name="name">
              <data type="QName"/>
            </attribute>
            <optional>
              <attribute name="global">
                <data type="boolean"/>
              </attribute>
            </optional>
          </group>
          <ref name="open-name-class"/>
        </choice>
        <interleave>
          <ref name="other"/>
          <optional>
            <ref name="pattern"/>
          </optional>
        </interleave>
      </element>
      <element name="group">
        <ref name="common-atts"/>
        <ref name="open-patterns"/>
      </element>
      <element name="interleave">
        <ref name="common-atts"/>
        <ref name="open-patterns"/>
      </element>
      <element name="choice">
        <ref name="common-atts"/>
        <ref name="open-patterns"/>
      </element>
      <element name="optional">
        <ref name="common-atts"/>
        <ref name="open-patterns"/>
      </element>
      <element name="zeroOrMore">
        <ref name="common-atts"/>
        <ref name="open-patterns"/>
      </element>
      <element name="oneOrMore">
        <ref name="common-atts"/>
        <ref name="open-patterns"/>
      </element>
      <element name="list">
        <ref name="common-atts"/>
        <ref name="open-patterns"/>
      </element>
      <element name="mixed">
        <ref name="common-atts"/>
        <ref name="open-patterns"/>
      </element>
      <element name="ref">
        <attribute name="name">
          <data type="NCName"/>
        </attribute>
        <ref name="common-atts"/>
      </element>
      <element name="parentRef">
        <attribute name="name">
          <data type="NCName"/>
        </attribute>
        <ref name="common-atts"/>
      </element>
      <element name="empty">
        <ref name="common-atts"/>
        <ref name="other"/>
      </element>
      <element name="text">
        <ref name="common-atts"/>
        <ref name="other"/>
      </element>
      <element name="value">
        <optional>
          <attribute name="type">
            <data type="NCName"/>
          </attribute>
        </optional>
        <ref name="common-atts"/>
        <text/>
      </element>
      <element name="data">
        <attribute name="type">
          <data type="NCName"/>
        </attribute>
        <ref name="common-atts"/>
        <interleave>
          <ref name="other"/>
          <group>
            <zeroOrMore>
              <element name="param">
                <attribute name="name">
                  <data type="NCName"/>
                </attribute>
                <text/>
              </element>
            </zeroOrMore>
            <optional>
              <element name="except">
                <ref name="common-atts"/>
                <ref name="open-patterns"/>
              </element>
            </optional>
          </group>
        </interleave>
      </element>
      <element name="key">
        <attribute name="name">
          <data type="QName"/>
        </attribute>
        <ref name="common-atts"/>
        <ref name="open-pattern"/>
      </element>
      <element name="keyRef">
        <attribute name="name">
          <data type="QName"/>
        </attribute>
        <ref name="common-atts"/>
        <ref name="open-pattern"/>
      </element>
      <element name="notAllowed">
        <ref name="common-atts"/>
        <ref name="other"/>
      </element>
      <element name="externalRef">
        <attribute name="href">
          <data type="anyURI"/>
        </attribute>
        <ref name="common-atts"/>
        <ref name="other"/>
      </element>
      <element name="grammar">
        <ref name="common-atts"/>
        <ref name="grammar-content"/>
      </element>
    </choice>
  </start>

  <define name="grammar-content">
    <interleave>
      <ref name="other"/>
      <zeroOrMore>
        <choice>
          <ref name="start-element"/>
          <ref name="define-element"/>
          <element name="div">
            <ref name="common-atts"/>
            <ref name="grammar-content"/>
          </element>
          <element name="include">
            <attribute name="href">
              <data type="anyURI"/>
            </attribute>
            <ref name="common-atts"/>
            <ref name="include-content"/>
          </element>
        </choice>
      </zeroOrMore>
    </interleave>
  </define>

  <define name="include-content">
    <interleave>
      <ref name="other"/>
      <zeroOrMore>
        <choice>
          <ref name="start-element"/>
          <ref name="define-element"/>
          <element name="div">
            <ref name="common-atts"/>
            <ref name="include-content"/>
          </element>
        </choice>
      </zeroOrMore>
    </interleave>
  </define>

  <define name="start-element">
    <element name="start">
      <optional>
        <attribute name="name">
          <data type="NCName"/>
        </attribute>
      </optional>
      <ref name="combine-att"/>
      <ref name="common-atts"/>
      <ref name="open-patterns"/>
    </element>
  </define>

  <define name="define-element">
    <element name="define">
      <attribute name="name">
        <data type="NCName"/>
      </attribute>
      <ref name="combine-att"/>
      <ref name="common-atts"/>
      <ref name="open-patterns"/>
    </element>
  </define>

  <define name="combine-att">
    <optional>
      <attribute name="combine">
        <choice>
          <value>choice</value>
          <value>interleave</value>
        </choice>
      </attribute>
    </optional>
  </define>

  <define name="open-patterns">
    <interleave>
      <ref name="other"/>
      <oneOrMore>
        <ref name="pattern"/>
      </oneOrMore>
    </interleave>
  </define>

  <define name="open-pattern">
    <interleave>
      <ref name="other"/>
      <ref name="pattern"/>
    </interleave>
  </define>

  <define name="name-class">
    <choice>
      <element name="name">
        <ref name="common-atts"/>
        <text/>
      </element>
      <element name="anyName">
        <ref name="common-atts"/>
        <ref name="except-name-class"/>
      </element>
      <element name="nsName">
        <ref name="common-atts"/>
        <ref name="except-name-class"/>
      </element>
      <element name="choice">
        <ref name="common-atts"/>
        <ref name="open-name-classes"/>
      </element>
    </choice>
  </define>

  <define name="except-name-class">
    <interleave>
      <ref name="other"/>
      <optional>
        <element name="except">
          <ref name="open-name-classes"/>
        </element>
      </optional>
    </interleave>
  </define>

  <define name="open-name-classes">
    <interleave>
      <ref name="other"/>
      <oneOrMore>
        <ref name="name-class"/>
      </oneOrMore>
    </interleave>
  </define>

  <define name="open-name-class">
    <interleave>
      <ref name="other"/>
      <ref name="name-class"/>
    </interleave>
  </define>

  <define name="common-atts">
    <optional>
      <attribute name="ns">
        <data type="anyURI"/>
      </attribute>
    </optional>
    <optional>
      <attribute name="datatypeLibrary">
        <data type="anyURI"/>
      </attribute>
    </optional>
    <zeroOrMore>
      <attribute>
        <anyName>
          <except>
            <nsName/>
            <nsName ns=""/>
          </except>
        </anyName>
      </attribute>
    </zeroOrMore>
  </define>

  <define name="other">
    <zeroOrMore>
      <element>
        <anyName>
          <except>
            <nsName/>
          </except>
        </anyName>
        <zeroOrMore>
          <choice>
            <attribute>
              <anyName/>
            </attribute>
            <text/>
            <ref name="any"/>
          </choice>
        </zeroOrMore>
      </element>
    </zeroOrMore>
  </define>

  <define name="any">
    <element>
      <anyName/>
      <zeroOrMore>
        <choice>
          <attribute>
            <anyName/>
          </attribute>
          <text/>
          <ref name="any"/>
        </choice>
      </zeroOrMore>
    </element>
  </define>

</grammar>

References

Normative

Tim Bray, Jean Paoli, C. M. Sperberg-McQueen, and Eve Maler, editors. Extensible Markup Language (XML) 1.0 Second Edition. W3C (World Wide Web Consortium), 2000.

Tim Bray, Dave Hollander, and Andrew Layman, editors. Namespaces in XML. W3C (World Wide Web Consortium), 1999.

John Cowan, Richard Tobin, editors. XML Information Set. W3C (World Wide Web Consortium), 2001.

T. Berners-Lee, R. Fielding, L. Masinter. RFC 2396: Uniform Resource Identifiers (URI): Generic Syntax. IETF (Internet Engineering Task Force), 1998.

R. Hinden, B. Carpenter, L. Masinter. RFC 2732: Format for Literal IPv6 Addresses in URL's. IETF (Internet Engineering Task Force), 1999.

Non-Normative

Paul V. Biron, Ashok Malhotra, editors. XML Schema Part 2: Datatypes. W3C (World Wide Web Consortium), 2001.

James Clark. TREX - Tree Regular Expressions for XML. Thai Open Source Software Center, 2001.

MURATA Makoto. RELAX (Regular Language description for XML). INSTAC (Information Technology Research and Standardization Center), 2001.

Allen Brown, Matthew Fuchs, Jonathan Robie, Philip Wadler, editors. XML Schema: Formal Description. W3C (World Wide Web Consortium), 2001.