The OASIS Cover Pages: The Online Resource for Markup Language Technologies
Last modified: November 30, 2006
XML Schemas

XML Schema Definition Language: W3C XML Schema Working Group and Schema Specifications

WG Chairs

For up-to-date information, see Description from the XML Activity Statement.

[Fall 1999] The co-chairs of the XML Schema Working Group are Dave Hollander (CommerceNet) and C. M. Sperberg-McQueen (W3C).

Expected WG Deliverables

"The Schema Working group plans to deliver Requirements, Working Drafts, and Proposed Recommendations on data typing and schema language in 1999."

WG Description

"While XML 1.0 supplies a mechanism, the Document Type Definition (DTD) for declaring constraints on the use of markup, automated processing of XML documents requires more rigorous and comprehensive facilities in this area. Requirements are for constraints on how the component parts of an application fit together, the document structure, attributes, data-typing, and so on. The XML Schema Working Group is addressing means for defining the structure, content and semantics of XML documents." See XML Schema Requirements Comments mailing list: www-xml-schema-comments@w3.org and archive.

XML Schema Developers List: The xmlschema-dev@w3.org discussion list is publicly archived on the W3C server. Henry Thompson announced this public list on April 07, 2000 with a message "'XML Schema Developers List Launched': To accompany the XML Schema Last Call drafts, the W3C is pleased to announce the opening of a public mailing list for XML Schema implementation developers, xmlschema-dev@w3.org. To subscribe, send mail to xmlschema-dev-request@w3.org with 'subscribe' as the subject."

See also "XML Schema: Tools - Usage - Development".

WG Published Deliverables

XML Schema Requirements

  • "XML Schema Requirements." W3C Note 15 February 1999. Edited by Ashok Malhotra (IBM) and Murray Maloney (Veo Systems Inc.). The document 'specifies the purpose, basic usage scenarios, design principles, and base requirements for an XML schema language.' [local archive copy]

XML Schema Formalization

[March 21, 2001] W3C Publishes XML Schema Formalization. A communiqué from Matthew Fuchs (Commerce One) highlights the technical significance of W3C's recent XML Schema formalization, published in a W3C Working Draft as XML Schema: Formal Description. The document supplies a formal description of XML types and validity as specified by the recently-issued Proposed Recommendation XML Schema Part 1: Structures. From the Introduction: "This formalization is a formal, declarative system for describing and naming XML Schema information, specifying XML instance type information, and validating instances against schemas. The goals of the formalization are to: (1) Provide a semantic framework for software systems that use the W3C XML Schema specification, such as the W3C XML Query Algebra; (2) Specify names for all components of an XML Schema, so that they can be uniquely identified by URIs. Such unique identifiers may be useful to XML Query, RDF, and topic maps, among others; (3) Formally define validation at a declarative level; (4) Define the mapping from the current XML Schema syntax onto the structures described here, as well as the mapping between the XML Schema component model and our component model. Many potential applications of XML Schema may benefit from the definition of a formal model. We have focused on the material in Part I (Structures), as this is the most complex; a basic understanding of first-order predicate logic, which is part of most computer science curricula, is adequate to understand this document." [Full context]

XML Schema Definition Language - W3C Recommendation

  • [May 03, 2001] W3C XML Schema Published as a W3C Recommendation. The World Wide Web Consortium (W3C) has announced the publication of the W3C XML Schema specification as a W3C Recommendation. A W3C 'Recommendation' "indicates that a specification is stable, contributes to Web interoperability, and has been reviewed by the W3C Membership, who are in favor of supporting its adoption by academic, industry, and research communities. XML Schemas define shared markup vocabularies, the structure of XML documents which use those vocabularies, and provide hooks to associate semantics with them. With over two years of development and testing through implementation, XML Schema provides an essential piece for XML to reach its full potential. The XML Schema specification consists of three parts. One part defines a set of simple datatypes, which can be associated with XML element types and attributes; this allows XML software to do a better job of managing dates, numbers, and other special forms of information. The second part of the specification proposes methods for describing the structure and constraining the contents of XML documents, and defines the rules governing schema-validation of documents. The third part is a primer, which explains what schemas are, how they differ from DTDs, and how someone builds a schema. XML Schema introduces new levels of flexibility that may accelerate the adoption of XML for significant industrial use. For example, a schema author can build a schema that borrows from a previous schema, but overrides it where new unique features are needed. XML Schema allows the author to determine which parts of a document may be validated, or identify parts of a document where a schema may apply. XML Schema also provides a way for users of ecommerce systems to choose which XML Schema they use to validate elements in a given namespace, thus providing better assurance in ecommerce transactions and greater security against unauthorized changes to validation rules. Further, as XML Schema are XML documents themselves, they may be managed by XML authoring tools, or through XSLT." [Full context]

  • In connection with the release of the Schema Recommendation, W3C has also provided for the creation of a W3C XML Schema Test Collection, announced by Henry S. Thompson (University of Edinburgh and W3C; Oriol Carbo, University of Edinburgh and W3C). "Goals and Objectives: The W3C XML Schema Test Collection work aims at coordinating test suites for W3C XML Schema processors created by different developers." The main objectives as announced 2001-05-02 are: (1) to integrate existing tests for W3C XML Schema processors in a common environment so they can be accessed publicly and shared among developers; (2) to establish a standard approach to test material IPR which meets the needs of both contributors and users; (3) to collect and develop tools to automate the execution and presentation of the test suites; (4) to offer a standard description of tests related to W3C XML Schema processors: [...]; (5) [to provide test descriptions] understandable by a developer without the need to actually view the test file(s) themselves; (6) to offer a standard presentation of test results; (7) to design additional tests and add/regularise descriptions of the existing tests; (8) in due course, to provide an XSLT-based approach to comparing XML representations of the post schema-validation infoset as produced by different processors; we will shortly announce the availability of XML Schemas for both the ordinary Infoset and the PSVInfoset. "The W3C expects to author only a small part of the collection -- we are counting on Member organisations and others to contribute the majority. To offer materials for the collection, please send e-mail to www-xml-schema-tests@w3.org." Note from Henry Thompson: "...the www-xml-schema-tests@w3.org mailing address is not a mailing list; it's for potential contributors to use to initiate discussions about contributions. For discussions of testing, I don't think we need a new mailing list; I'd expect xmlschema-dev@w3.org to be used for discussing W3C XML Schema testing..."

  • Announcement: "World Wide Web Consortium Issues XML Schema as a W3C Recommendation. Two Years of Development Produces Comprehensive Solution for XML Vocabularies."
  • Testimonials for XML Schema Recommendation - From Altova, Inc., Commerce One, IBM, IPR Systems, Lotus Development Corporation, Microsoft Corporation, Oracle Corporation, Reuters, Inc., SAP AG, webMethods, and University of Edinburgh.
  • XML Schema Part 1: Structures. W3C Recommendation 02-May-2001. [Default] namespace: http://www.w3.org/2001/XMLSchema. Latest version URL: http://www.w3.org/TR/xmlschema-1/. Edited by Henry S. Thompson (University of Edinburgh), David Beech (Oracle Corporation), Murray Maloney (for Commerce One), and Noah Mendelsohn (Lotus Development Corporation). Available in XML format (with XML DTD and XSL stylesheet); see also the separate XML Schema and XML DTD for Part 1. [cache .ZIP, local copy]
  • XML Schema Part 2: Datatypes. W3C Recommendation 02-May-2001. Latest version URL: http://www.w3.org/TR/xmlschema-2/. Edited by Paul V. Biron (Kaiser Permanente, for Health Level Seven) and Ashok Malhotra (Microsoft, formerly of IBM). Available in XML; also with a schema for built-in datatypes only. [cache]
  • XML Schema Part 0: Primer. W3C Recommendation 02-May-2001. Latest version URL: http://www.w3.org/TR/xmlschema-0/. Edited by David C. Fallside (IBM). Appendices include: A. Acknowledgements; B. Simple Types & Their Facets; C. Using Entities; D. Regular Expressions; E. Index. [cache]
  • Namespaces. [Default] namespace: http://www.w3.org/2001/XMLSchema. Also: "To facilitate usage in specifications other than the XML Schema definition language, such as those that do not want to know anything about aspects of the XML Schema definition language other than the datatypes, each 'built-in' datatype is also defined in the namespace whose URI is: http://www.w3.org/2001/XMLSchema-datatypes. This applies to both 'built-in primitive' and 'built-in derived' datatypes."
  • Mail Archives for W3C XML Schema development 'xmlschema-dev@w3.org'. Send subscription requests to xmlschema-dev-request@w3.org.
  • Mail Archives for 'www-xml-schema-comments'
  • XML Schema type library - Sample. See the Primer (Part 0) for reference/description. [cache]
  • Errata for W3C XML Schema Rec
  • Translations of W3C XML Schema Rec
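
As a rough illustration of how the Recommendation namespace and the built-in datatypes listed above show up in practice, the following Python sketch parses a small schema document and lists its top-level element declarations. The sample schema and the function name `declared_elements` are invented for this illustration; only the namespace URI comes from the entries above. It uses the standard-library ElementTree parser and does not perform schema validation.

```python
import xml.etree.ElementTree as ET

# Namespace URI of the XML Schema Recommendation, as cited above.
XSD_NS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical sample schema document, invented for illustration.
SAMPLE = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="shipDate" type="xs:date"/>
  <xs:element name="comment" type="xs:string"/>
</xs:schema>"""


def declared_elements(schema_text):
    """Return {name: type} for the top-level element declarations."""
    root = ET.fromstring(schema_text)
    # ElementTree spells qualified names as "{namespace-uri}local-name".
    if root.tag != f"{{{XSD_NS}}}schema":
        raise ValueError("root is not a schema element in the 2001 namespace")
    return {
        el.get("name"): el.get("type")
        for el in root.findall(f"{{{XSD_NS}}}element")
    }


elements = declared_elements(SAMPLE)
print(elements)
```

The check on the root element's namespace is what lets a tool tell a Recommendation-era schema document apart from one written against an earlier draft namespace.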

XML Schema Definition Language - Proposed Recommendation

[March 31, 2001] XML Schema. W3C Proposed Recommendation 30-March-2001. "This version of this Proposed Recommendation replaces that published on 16-March-2001. The only change from that draft is that the type there called 'number' is here renamed 'decimal'. This type was called 'decimal' up until the draft of 16 March 2001, so this change simply restores the original name of this type." The deadline for review of this document is Monday 16 April 2001. XML Schema Part 0: Primer; XML Schema Part 1: Structures; XML Schema Part 2: Datatypes.

[March 16, 2001]   W3C Publishes XML Schema as a Proposed Recommendation.    The W3C XML Schema specification has advanced to the 'Proposed Recommendation' stage, indicating that "the specification is stable and that implementation experience has been gathered, showing that each feature of the specification can be implemented." The three-part document has been produced as part of the W3C XML Activity. This PR version replaces the Candidate Recommendation of October 24, 2000. The deadline for review of the PR specification is Monday April 16, 2001. Review comments may be sent to the publicly archived 'xmlschema-dev' mailing list. As with the Candidate Recommendation, "The XML Schema specification consists of three parts. One part defines a set of simple datatypes, which can be associated with XML element types and attributes; this allows XML software to do a better job of managing dates, numbers, and other special forms of information. The second part of the specification proposes methods for describing the structure and constraining the contents of XML documents, and defines the rules governing schema-validation of documents. The third part is a primer, which explains what schemas are, how they differ from DTDs, and how someone builds a schema." [Full context]

See also:

  • DTD changes in XML Schema from CR to PR. Provided by Robin LaFontaine, based on DeltaXML schema comparison software. See DeltaXML XML Schema [comparison] software. [source]
  • [March 31, 2001] Revised Online Validator for XML Schema (XSV) and XML Schema Update Tool (XSU). Henry S. Thompson (HCRC Language Technology Group, University of Edinburgh) has announced a revised 'beta test' release of his XSV 'Validator for XML Schema' service and corresponding XSU update tool for XML Schema CR -> PR. This version of XSV supports checking XML schema documents with the namespace URI http://www.w3.org/2001/XMLSchema, corresponding to the W3C Proposed Recommendation for XML Schema. XSV is an open source "GPLed work-in-progress attempt at a conformant schema-aware processor." The online XSV interface provides two forms for W3C XML schema checking: "(1) one for checking a schema which is accessible via the Web, and/or schema-validating an instance with a schema of your own; (2) another form for use if you are behind a firewall or have a schema to check which is not accessible via the Web." In addition to source code distributions (Python), the latest version of XSV is available in a self-installing package for Win32 platforms. The XSU transformation tool provides for automated update of XML Schema documents from the XML Schema 20000922 version to the Proposed Recommendation (20010316) version. It is a service in 'beta test' which "attempts to convert valid XML schema documents with the namespace URI http://www.w3.org/2000/10/XMLSchema to valid schema documents with the namespace URI http://www.w3.org/2001/XMLSchema" using a transform sheet. In this connection, the XSU developers request sample XML schemas to provide a testing pool: they ask that users grant permission to W3C to retain input of tested XML schemas (just tick the checkbox). [Full context]
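
The namespace change that XSU handles can be pictured with a small Python sketch. This is not the XSU tool or its transform sheet; it only retargets element names from the Candidate Recommendation namespace to the Proposed Recommendation namespace, and it ignores any other syntax differences between the two versions. The sample schema and the function name `retarget_namespace` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace URIs quoted in the XSU description above.
OLD_NS = "http://www.w3.org/2000/10/XMLSchema"
NEW_NS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical CR-era schema document, invented for illustration.
OLD_SCHEMA = """<xsd:schema xmlns:xsd="http://www.w3.org/2000/10/XMLSchema">
  <xsd:element name="comment" type="xsd:string"/>
</xsd:schema>"""


def retarget_namespace(schema_text):
    """Move every element in OLD_NS into NEW_NS; nothing else is changed."""
    root = ET.fromstring(schema_text)
    old_prefix = "{" + OLD_NS + "}"
    for el in root.iter():
        if el.tag.startswith(old_prefix):
            el.tag = "{" + NEW_NS + "}" + el.tag[len(old_prefix):]
    # Keep the "xsd" prefix on output so QName values such as
    # type="xsd:string" still resolve. This registration is process-wide.
    ET.register_namespace("xsd", NEW_NS)
    return ET.tostring(root, encoding="unicode")


converted = retarget_namespace(OLD_SCHEMA)
print(converted)
```

A real CR-to-PR conversion would also need to account for renamed or restructured constructs, which is why the entry above describes XSU as a transformation service rather than a simple search and replace.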

XML Schema Definition Language - Candidate Recommendation

  • [October 24, 2000] Testimonials for XML Schema Candidate Recommendation. [cache]

  • [October 24, 2000] Announcement (in part): A W3C press release announces the publication of XML Schema as a W3C Candidate Recommendation. "The World Wide Web Consortium (W3C) has issued XML Schema as a W3C Candidate Recommendation. Advancement of the document to Candidate Recommendation is an invitation to the Web development community at large to make implementations of XML Schema and provide technical feedback. Simply defined, XML Schemas define shared markup vocabularies and allow machines to carry out rules made by people. They provide a means for defining the structure, content and semantics of XML documents. 'Databases, ERP and EDI systems all know the difference between a date and a string of text, but before today, there was no standard way to teach your XML systems the difference. Now there is,' declared Dave Hollander, co-chair of the W3C XML Schema Working Group and CTO of Contivo, Inc. 'W3C XML Schemas bring to XML the rich data descriptions that are common to other business systems but were missing from XML. Now, developers of XML ecommerce systems can test XML Schema's ability to define XML applications that are far more sophisticated in how they describe, create, manage and validate the information that fuels B2B ecommerce.' By bringing datatypes to XML, XML Schema increases XML's power and utility to the developers of electronic commerce systems, database authors and anyone interested in using and manipulating large volumes of data on the Web. By providing better integration with XML Namespaces, it makes it easier than it has ever been to define the elements and attributes in a namespace, and to validate documents which use multiple namespaces defined by different schemas. XML Schema introduces new levels of flexibility that may accelerate the adoption of XML for significant industrial use. For example, a schema author can build a schema that borrows from a previous schema, but overrides it where new unique features are needed. This principle, called inheritance, is similar to the behavior of Cascading Style Sheets, and allows the user to develop XML Schemas that best suit their needs, without building an entirely new vocabulary from scratch. XML Schema allows the author to determine which parts of a document may be validated, or identify parts of a document where a schema may apply. XML Schema also provides a way for users of ecommerce systems to choose which XML Schema they use to validate elements in a given namespace, thus providing better assurance in ecommerce transactions and greater security against unauthorized changes to validation rules. Further, as XML Schema are XML documents themselves, they may be managed by XML authoring tools, or through XSLT. . . Candidate Recommendation is W3C's public call for implementation, an explicit invitation for W3C members and the developer community at large to review the XML Schema specification and build their own XML Schemas. This period of implementations and reporting allows the editors to learn how developers outside of the Working Group might use them, and where there may be ambiguities for implementors. Public testing and implementation contribute to a more robust XML Schema, and to more widespread use." See the full text of the announcement: "World Wide Web Consortium Issues XML Schema as a Candidate Recommendation. Implementation testing the key to Interoperability." [cache]

  • [October 24, 2000] XML Schema Part 1: Structures. W3C Candidate Recommendation 24-October-2000, edited by Henry S. Thompson (University of Edinburgh), David Beech (Oracle Corp.), Murray Maloney (for Commerce One), and Noah Mendelsohn (Lotus Development Corporation). Part 1 defines the XML Schema definition language, "which offers facilities for describing the structure and constraining the contents of XML 1.0 documents, including those which exploit the XML Namespace facility. The schema language, which is itself represented in XML 1.0 and uses namespaces, substantially reconstructs and considerably extends the capabilities found in XML 1.0 document type definitions (DTDs). This specification depends on XML Schema Part 2: Datatypes." Appendix A supplies a normative "Schema for Schemas"; Appendix F contains a non-normative "DTD for Schemas"; Appendix J gives brief summaries of the substantive changes to this specification since the public working draft of 7 April 2000.

  • XSD for XML Schema, [cache]

  • DTD for XML Schema, [cache]

  • [October 24, 2000] XML Schema Part 2: Datatypes. W3C Candidate Recommendation 24-October-2000, edited by Paul V. Biron (Kaiser Permanente, for Health Level Seven) and Ashok Malhotra (IBM). Part 2 of the specification for the XML Schema language "defines facilities for defining datatypes to be used in XML Schemas as well as other XML specifications. The datatype language, which is itself represented in XML 1.0, provides a superset of the capabilities found in XML 1.0 document type definitions (DTDs) for specifying datatypes on elements and attributes." Appendix A provides the normative "Schema for Datatype Definitions" and Appendix B gives the non-normative "DTD for Datatype Definitions."

  • [October 24, 2000] XML Schema Part 0: Primer. W3C Candidate Recommendation 24-October-2000, edited by David C. Fallside (IBM). "XML Schema Part 0: Primer is a non-normative document intended to provide an easily readable description of the XML Schema facilities and is oriented towards quickly understanding how to create schemas using the XML Schema language. XML Schema Part 1: Structures and XML Schema Part 2: Datatypes provide the complete normative description of the XML Schema language -- this primer describes the language features through numerous examples which are complemented by extensive references to the normative texts."

  • [October 24, 2000] Commentary. See the longer memo from Henry S. Thompson (Janet Daly) with an explanation of why the (I18N) WG dissented from the specification's treatment of dates and times, and the CR exit criteria. Also, in connection with this CR publication, Henry Thompson announced the availability of a self-installing version of XSV, the W3C/University of Edinburgh XML Schema validator; 'WIN32 for now, UN*X coming soon'.

XML Schema Definition Language - Seventh Working Draft

  • [September 22, 2000] XML Schema Part 1: Structures. Reference: W3C Working Draft 22-September-2000, edited by Henry S. Thompson (University of Edinburgh), David Beech (Oracle Corp.), Murray Maloney (for Commerce One), and Noah Mendelsohn (Lotus Development Corporation). The most important changes for the 22-September-2000 release are found in this Structures document. The non-normative 'Appendix H' in the Structures document supplies a "Description of changes" to the working draft since the previous public version of 07-April-2000. Some eighteen (18) changes are identified here. For example: [H1] "'Equivalence classes' have been renamed 'substitution groups', to reflect the fact that their semantics is not symmetrical"; [H2] "The content model of the complexType element has been significantly changed, allowing for tighter content models and a better fit between the abstract component and its XML Representation"; [H3] "Empty content models are now signalled by an explicit empty content particle, mixed content by specifying the value true for the mixed attribute on complexType or complexContent"; [H6] "A new form of schema composition operation, similar to that provided by include but allowing constrained redefinition of the included components has been added, using a redefine element"; [H8] "The defaulting for the minOccurs and maxOccurs attributes of element has been simplified: it is now 1 in both cases, with no interdependencies"; [H9] "The content model for the group element when it occurs at the top level has been tightened, to allow only a single all, choice, group, or sequence child"; [H13] "Abstract types in element declarations are now allowed." See the main news entry and editorial notes provided in Henry Thompson's announcement 'New Pre-CR Public Working Drafts of XML Schema Released'. [cache]

  • [September 22, 2000] XML Schema Part 2: Datatypes. Reference: W3C Working Draft 22-September-2000, edited by Paul V. Biron (Kaiser Permanente, for Health Level Seven) and Ashok Malhotra (IBM). "XML Schema: Datatypes is part 2 of a two-part draft of the specification for the XML Schema definition language. This document proposes facilities for defining datatypes to be used in XML Schemas as well as other XML specifications. The datatype language, which is itself represented in XML 1.0, provides a superset of the capabilities found in XML 1.0 document type definitions (DTDs) for specifying datatypes on elements and attributes." [cache]

  • [September 22, 2000] XML Schema Part 0: Primer. Reference: W3C Working Draft, 22-September-2000, edited by David C. Fallside (IBM). "XML Schema Part 0: Primer is a non-normative document intended to provide an easily readable description of the XML Schema facilities and is oriented towards quickly understanding how to create schemas using the XML Schema language." [cache]
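
Change [H8] in the Structures draft above (minOccurs and maxOccurs each default to 1, independently of each other) is easy to picture in code. The Python sketch below is only an illustration: the sample schema, the type name `OrderType`, and the helper `occurrence_bounds` are invented, and the sample uses the namespace URI of the later 2001 Recommendation rather than the one in force for the September 2000 draft.

```python
import xml.etree.ElementTree as ET

# Namespace of the later Recommendation; an assumption for this sketch.
XSD_NS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical schema fragment, invented for illustration.
SAMPLE = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="OrderType">
    <xs:sequence>
      <xs:element name="id" type="xs:string"/>
      <xs:element name="note" type="xs:string" minOccurs="0"/>
      <xs:element name="item" type="xs:string" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>"""


def occurrence_bounds(element_decl):
    """Return (min, max) for an element declaration; max is None if unbounded.

    Each attribute defaults to 1 on its own, with no dependency on the other.
    """
    min_occurs = int(element_decl.get("minOccurs", "1"))
    raw_max = element_decl.get("maxOccurs", "1")
    max_occurs = None if raw_max == "unbounded" else int(raw_max)
    return min_occurs, max_occurs


root = ET.fromstring(SAMPLE)
bounds = {
    el.get("name"): occurrence_bounds(el)
    for el in root.iter(f"{{{XSD_NS}}}element")
}
print(bounds)
```

Under this rule, an element declared with neither attribute must appear exactly once, and setting only one attribute leaves the other at 1.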

XML Schema Definition Language - Sixth Working Draft

XML Schema Definition Language - Fifth Working Draft

  • [February 26, 2000] XML Schema Part 0: Primer. Reference: W3C Working Draft, 25-February-2000, edited by David C. Fallside (IBM). The primer has been issued in conjunction with a new working draft [25 February 2000] of the normative tomes on XML Schema Structures and XML Schema Datatypes. Abstract: XML Schema Part 0: Primer is a non-normative document intended to provide an easily readable description of the XML Schema facilities and is oriented towards quickly understanding how to create schemas using the XML Schema language. XML Schema Part 1: Structures and XML Schema Part 2: Datatypes provide the complete normative description of the XML Schema definition language, and the primer describes the language features through numerous examples which are complemented by extensive references to the normative texts. This 'Second Torah' commentary is officially a part of the W3C XML Activity. Discrepancies between the sacred text and its commentary are noted in the Primer. Description: "This document, XML Schema Part 0: Primer, provides an easily approachable description of the XML Schema definition language, and should be used alongside the formal descriptions of the language contained in Parts 1 and 2 of the XML Schema specification. The intended audience of this document includes application developers whose programs read and write schema documents, and schema authors who need to know about the features of the language, especially features that provide functionality above and beyond what is provided by DTDs. The text assumes that you have a basic understanding of XML 1.0 and XML-Namespaces. Each major section of the primer introduces new features of the language, and describes the features in the context of concrete examples. Section 2 covers the basic mechanisms of XML Schema. 
It describes how to declare the elements and attributes that appear in XML documents, the distinctions between simple and complex types, defining complex types, the use of simple types for element and attribute values, schema annotation, a simple mechanism for re-using element and attribute definitions, and null values. Section 3 covers some of XML Schema's advanced features, and in particular, it describes mechanisms for deriving types from existing types, and for controlling these derivations. The section also describes mechanisms for merging together fragments of a schema from multiple sources, and for element substitution. Section 4 covers more advanced features, including a powerful mechanism for specifying uniqueness among attributes and elements, a mechanism for using types across namespaces, a mechanism for extending types based on namespaces, and a description of how documents are checked for conformance. In addition to the sections just described, the primer has a number of appendices that contain detailed reference information on simple types and an associated regular expression language. The primer is a non-normative document, which means that it does not provide a definitive (from the W3C's point of view) specification of the XML Schema language."

  • [February 26, 2000] XML Schema Part 1: Structures. Reference: W3C Working Draft 25-February-2000; edited by Henry S. Thompson (University of Edinburgh), David Beech (Oracle Corp.), Murray Maloney (Commerce One), and Noah Mendelsohn (Lotus Development Corporation). The specification itself is available in XML as well as HTML format. The release includes a formal description of schema 'structures' facilities in schema and in XML DTD notation. "Following a period of review and polishing, it is the WG's intent to issue a Last Call for Review by other W3C working groups sometime during March, 2000, and to submit this specification thereafter for publication as a Candidate Recommendation." Document abstract: "XML Schema: Structures specifies the XML Schema definition language, which offers facilities for describing the structure and constraining the contents of XML 1.0 documents. The schema language, which is itself represented in XML 1.0, provides a superset of the capabilities found in XML 1.0 document type definitions (DTDs). This specification depends on XML Schema Part 2: Datatypes." Description: "The purpose of XML Schema: Structures is to define the nature of XML schemas and their component parts, provide an inventory of XML markup constructs with which to represent schemas, and define the application of schemas to XML documents. The purpose of an XML Schema: Structures schema is to define and describe a class of XML documents by using schema components to constrain and document the meaning, usage and relationships of their constituent parts: datatypes, elements and their content and attributes and their values. Schemas may also provide for the specification of additional document information, such as default values for attributes and elements. Schemas have facilities for self-documentation. Thus, XML Schema: Structures can be used to define, describe and catalogue XML vocabularies for classes of XML documents. 
Any application that consumes well-formed XML can use the XML Schema: Structures formalism to express syntactic, structural and value constraints applicable to its document instances. The XML Schema: Structures formalism allows a useful level of constraint checking to be described and validated for a wide spectrum of XML applications. However, the language defined by this specification does not attempt to provide all the facilities that might be needed by any application. Some applications may require constraint capabilities not expressible in this language, and so may need to perform their own additional validations."

  • [February 26, 2000] XML Schema Part 2: Datatypes. Reference: W3C Working Draft 25-February-2000; edited by Paul V. Biron (Kaiser Permanente, for Health Level Seven) and Ashok Malhotra (IBM). The release includes separate documents containing the corresponding formal schema notation, an XML DTD, and schema for built-in datatypes only; there is an XML version as well. XML Schema: Datatypes is part 2 of a two-part draft of the specification for the XML Schema definition language. This document proposes facilities for defining datatypes to be used in XML Schemas and other XML specifications. The datatype language, which is itself represented in XML 1.0, provides a superset of the capabilities found in XML 1.0 document type definitions (DTDs) for specifying datatypes on elements and attributes. Rationale for Part 2: "... validity constraints exist on the content of [XML document] instances that are not expressible in XML DTDs. The limited datatyping facilities in XML have prevented validating XML processors from supplying the rigorous type checking required in these situations. The result has been that individual applications writers have had to implement type checking in an ad hoc manner. This specification addresses the need of both document authors and applications writers for a robust, extensible datatype system for XML which could be incorporated into XML processors. As discussed below, these datatypes could be used in other XML-related standards as well." The concrete requirements to be fulfilled by this specification are articulated in the XML Schema Requirements document; it states that the XML Schema Language must: (1) provide for primitive data typing, including byte, date, integer, sequence, SQL & Java primitive data types, etc.; (2) define a type system that is adequate for import/export from database systems (e.g., relational, object, OLAP); (3) distinguish requirements relating to lexical data representation vs. 
those governing an underlying information set; (4) allow creation of user-defined datatypes, such as datatypes that are derived from existing datatypes and which may constrain certain of its properties (e.g., range, precision, length, format). "Although the Working Group does not anticipate further substantial changes to the functionality described here, this is still a working draft, subject to change based on experience and on comment by the public and other W3C working groups. Following a period of review and polishing, it is the WG's intent to issue a Last Call for Review by other W3C working groups sometime during March, 2000, and to submit this specification thereafter for publication as a Candidate Recommendation."
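
Requirement (4) above, user-defined datatypes derived from existing ones by constraining properties such as range, can be sketched as follows. This is a simplified illustration in Python, not a conforming validator. The type name `percentage` and the helper functions are invented, and the sample uses the element names and namespace of the final 2001 Recommendation, which differ from the February 2000 draft syntax.

```python
import re
import xml.etree.ElementTree as ET

# Namespace of the later Recommendation; an assumption for this sketch.
XSD_NS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical derived type: an integer restricted to the range 0..100.
SAMPLE = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:simpleType name="percentage">
    <xs:restriction base="xs:integer">
      <xs:minInclusive value="0"/>
      <xs:maxInclusive value="100"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>"""

# Simplified lexical form for an integer: optional sign, then ASCII digits.
INTEGER_LEXICAL = re.compile(r"[+-]?[0-9]+")


def load_range(schema_text, type_name):
    """Read the inclusive range facets of a named simple type."""
    root = ET.fromstring(schema_text)
    for simple_type in root.findall(f"{{{XSD_NS}}}simpleType"):
        if simple_type.get("name") == type_name:
            restriction = simple_type.find(f"{{{XSD_NS}}}restriction")
            low = restriction.find(f"{{{XSD_NS}}}minInclusive").get("value")
            high = restriction.find(f"{{{XSD_NS}}}maxInclusive").get("value")
            return int(low), int(high)
    raise KeyError(type_name)


def is_valid(text, bounds):
    """Check the lexical form first, then the range constraint."""
    if not INTEGER_LEXICAL.fullmatch(text):
        return False
    low, high = bounds
    return low <= int(text) <= high


percentage_range = load_range(SAMPLE, "percentage")
results = [is_valid(v, percentage_range) for v in ("42", "100", "101", "4.2", "abc")]
print(results)
```

The two-step check mirrors requirement (3) as well: the regular expression concerns the lexical representation, while the range comparison concerns the underlying value.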


XML Schema Definition Language - Fourth Working Draft

  • [December 17, 1999] XML Schema Part 1: Structures (W3C Working Draft 17-December-1999). Edited by Henry S. Thompson (University of Edinburgh), David Beech (Oracle Corp.), Murray Maloney (Commerce One), and Noah Mendelsohn (Lotus Development Corporation). XML Schema: Structures represents "part 1 of a two-part draft of the specification for the XML Schema definition language. This document proposes facilities for describing the structure and constraining the contents of XML 1.0 documents. The schema language, which is itself represented in XML 1.0, provides a superset of the capabilities found in XML 1.0 document type definitions (DTDs)." The purpose of a Structures schema document is to "define and describe a class of XML documents by using these constructs to constrain and document the meaning, usage and relationships of their constituent parts: datatypes, elements and their content, attributes and their values. Schema constructs may also provide for the specification of additional information such as default values. Schemas are intended to document their own meaning, usage, and function through a common documentation vocabulary. Thus, XML Schema: Structures can be used to define, describe and catalogue XML vocabularies for classes of XML documents." [local archive copy]

  • XML schema - 'Schema for Schemas'; [local archive copy]

  • XML DTD - 'DTD for Schemas'; [local archive copy]

  • XML Schema: Structures in XML format

  • [December 20, 1999] Henry S. Thompson has made available the slides and some additional documentation from an intensive (full-day) tutorial session on the XML Schema Definition Language, presented at XML '99 in Philadelphia. The materials have been updated to match the 1999-12-17 versions of the Schema WD documents. Available: (1) presentation slides in HTML, and (2) additional materials in HTML. Original/canonical formats: PowerPoint slides [local archive copy] and additional notes (.doc) [local archive copy]. Announced on XML-DEV 20-Dec-1999.

  • dtddiff of 1999-12-17 schema proposal vis-à-vis 1999-11 WD. "Below is the result of running Earl Hood's dtddiff script to compare the DTD's from appendix B of the November and December drafts of the W3C Schema Proposal." From Bob DuCharme

  • XML Schema Part 2: Datatypes (W3C Working Draft 17-December-1999) has been edited by Paul V. Biron (Kaiser Permanente, for Health Level Seven) and Ashok Malhotra (IBM). XML Schema: Datatypes presents "part 2 of a two-part draft of the specification for the XML Schema definition language. This document proposes facilities for defining datatypes to be used in XML Schemas and other XML specifications. The datatype language, which is itself represented in XML 1.0, provides a superset of the capabilities found in XML 1.0 document type definitions (DTDs) for specifying datatypes on elements and attributes." It is well known that "validity constraints exist on the content of [XML document] instances that are not expressible in XML DTDs. The limited datatyping facilities in XML have prevented validating XML processors from supplying the rigorous type checking required in these situations. The result has been that individual applications writers have had to implement type checking in an ad hoc manner. This [Datatypes] specification addresses the need of both document authors and applications writers for a robust, extensible datatype system for XML which could be incorporated into XML processors." At the moment, these datatypes can be specified only "for element content that would be specified as #PCDATA and attribute values of various types in a DTD." [local archive copy]

  • XML schema - Schema for datatypes; [local archive copy]

  • XML DTD - DTD for datatypes; [local archive copy]

  • XML Schema: Datatypes in XML Format


XML Schema Definition Language - Third Working Draft


XML Schema Definition Language - Second Working Draft

  • XML Schema Part 1: Structures. Reference: W3C Working Draft 24-September-1999, edited by Henry S. Thompson (University of Edinburgh), David Beech (Oracle), Murray Maloney (Commerce One), and Noah Mendelsohn (Lotus). Part 1 "proposes facilities for describing the structure and constraining the contents of XML 1.0 documents. The schema language, which is itself represented in XML 1.0, provides a superset of the capabilities found in XML 1.0 document type definitions (DTDs). . . The purpose of the Structures specification is to provide an inventory of XML markup constructs with which to write schemas." Such a schema is used "to define and describe a class of XML documents by using these constructs to constrain and document the meaning, usage and relationships of their constituent parts: datatypes, elements and their content, attributes and their values, entities and their contents and notations. Schema constructs may also provide for the specification of additional information such as default values. Schemas are intended to document their own meaning, usage, and function through a common documentation vocabulary. Thus, the Structures specification can be used to define, describe and catalogue XML vocabularies for classes of XML documents. Any application that consumes well-formed XML can use the XML Schema: Structures formalism to express syntactic, structural and value constraints applicable to its document instances. The Structures formalism will allow a useful level of constraint checking to be described and validated for a wide spectrum of XML applications. However, the language defined by this specification does not attempt to provide all the facilities that might be needed by any application. Some applications may require constraint capabilities not expressible in this language, and so may need to perform their own additional validations."

  • XML Schema Part 2: Datatypes. Reference: W3C Working Draft 24-September-1999, edited by Paul V. Biron (Kaiser Permanente, for Health Level Seven) and Ashok Malhotra (IBM). The Datatypes document "specifies a language for defining datatypes to be used in XML Schemas and possibly elsewhere." As explained in the W3C's XML Schema Requirements document, the Datatypes design is motivated by the recognition that "document authors, including authors of traditional documents and those transporting data in XML, often require a high degree of type checking to ensure robustness in document understanding and data interchange." In many cases, "validity constraints exist on the content of the [XML] instances that are not expressible in XML DTDs. The limited datatyping facilities in XML have prevented validating XML processors from supplying the rigorous type checking required in these situations. The result has been that individual applications writers have had to implement type checking in an ad hoc manner. This [Datatypes] specification addresses the need of both document authors and applications writers for a robust, extensible datatype system for XML which could be incorporated into XML processors." The facilities in the Datatypes WD have been designed in light of the formal requirements, which stipulate that the XML Schema Language must: (1) provide for primitive data typing, including byte, date, integer, sequence, SQL & Java primitive data types, etc.; (2) define a type system that is adequate for import/export from database systems (e.g., relational, object, OLAP); (3) distinguish requirements relating to lexical data representation vs. those governing an underlying information set; and (4) allow creation of user-defined datatypes, such as datatypes that are derived from existing datatypes and which may constrain certain of its properties (e.g., range, precision, length, format).

XML Schema Definition Language - First Working Draft


Related XML Schema Proposals

XML-Data (XDR)

  • XML-Data. W3C Note 05-Jan-1998. "This paper describes an XML vocabulary for schemas, that is, for defining and documenting object classes. It can be used for classes which [are] strictly syntactic (for example, XML) or those which indicate concepts and relations among concepts (as used in relational databases, KR graphs and RDF). The former are called 'syntactic schemas;' the latter 'conceptual schemas'."

  • XML-Data Reduced. Draft Version 3-July-1998, version 0.21. By Charles Frankston (Microsoft) and Henry S. Thompson (University of Edinburgh). "The XML-Data submission contained many new ideas that an XML schema language could support. This document refines and subsets those ideas down to a more manageable size in order to allow faster progress toward adopting a new schema language for XML. Some of the inconsistencies in the XML-Data submission are cleaned up, and some changes have been made based on comments received since the XML-Data submission was posted. This note is a refinement of the January 1998 XML-Data submission http://www.w3.org/TR/1998/NOTE-XML-data-0105/." [local archive copy]

  • "[Microsoft] XML Schema Developer's Guide" Andrew Layman [XML-DEV 18-May-1999]: "'XML-Data Reduced' (XDR) refers to a trimmed and improved version of the XML-Data schema syntax. The original XML-Data submission can be found at http://www.w3.org/TR/1998/NOTE-XML-data/. Information on XML-Data Reduced is at http://msdn.microsoft.com/xml/XMLGuide/schema-overview.asp." Note: 'XDR' in this context is not to be confused with XDR: External Data Representation Standard (Network Working Group, Request for Comments: 1832, August 1995).

  • [February 09, 2001] Microsoft XDR-XSD Converter. An XDR-XSD Converter is available for download from Microsoft's [XML] MSDN Online Development Center. The transformation tool is made available in its current beta state for experimentation purposes only. "This XSLT stylesheet transforms XML Data Reduced (XDR) Schemas as supported in MSXML to XML Schemas -- conformant to the W3C XML Schema Candidate Recommendation of October 25, 2000. MSXML 3.0 itself does not support the W3C XML Schema format. Notes on XDR-XSD Converter: XDR schemas using open content models allow more attribute extensions than the XML Schema resulting from this style sheet. Specifically, under XDR, when model="open" attributes from the target namespace may be added to an element, as long as they conform to the validity constraints for that attribute. Attributes from other namespaces may be added to an element, whether or not there are validity constraints for those attributes. It is not possible in XSD to treat attribute validation differently for attributes from the target namespace (<xsd:anyAttribute namespace="##targetNamespace" processContents="lax"/>) than for attributes from other namespaces (<xsd:anyAttribute namespace="##other" processContents="strict"/>). This is represented in the XML Schema DTD as only allowing one <xsd:anyAttribute/> element. The transformed XSD schema will not allow attributes from the target namespace. You may want to adjust the resulting schema to accommodate for this, by either not requiring attributes from the target namespace to be validated, or by adding the allowed attributes explicitly to the content model." The developers welcome feedback on problems encountered or other suggestions that would help them improve the transformation tool.
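
The converter note above turns on how an attribute wildcard's namespace constraint selects attributes. The Python below is a minimal sketch of that matching rule, assuming the namespace-constraint semantics of the XML Schema 1.0 Recommendation (the Candidate Recommendation the converter targets may differ in detail). The function name `wildcard_matches` and the example namespace URIs are hypothetical.

```python
def wildcard_matches(constraint, attr_ns, target_ns):
    """Return True if an attribute in namespace attr_ns is matched by a wildcard.

    constraint is the value of the wildcard's namespace attribute;
    attr_ns is None for an unqualified attribute.
    """
    if constraint == "##any":
        return True
    if constraint == "##other":
        # Any namespace except the target namespace; unqualified names excluded.
        return attr_ns is not None and attr_ns != target_ns
    for token in constraint.split():
        if token == "##targetNamespace" and attr_ns == target_ns:
            return True
        if token == "##local" and attr_ns is None:
            return True
        if token == attr_ns:
            return True
    return False


TNS = "urn:example:orders"      # hypothetical target namespace
EXT = "urn:example:extension"   # hypothetical foreign namespace

# With a single "##other" wildcard, foreign attributes are admitted...
print(wildcard_matches("##other", EXT, TNS))
# ...but target-namespace attributes are not, unless declared explicitly.
print(wildcard_matches("##other", TNS, TNS))
```

Because an element's type carries only one attribute wildcard, and that wildcard has a single processContents setting, the two groups of attributes cannot be given different validation treatment. This is the limitation the note describes.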

Document Content Description (DCD)

  • DCD. NOTE-dcd-19980731, Submission to the World Wide Web Consortium 31-July-1998. "This document proposes a structural schema facility, Document Content Description (DCD), for specifying rules covering the structure and content of XML documents. The DCD proposal incorporates a subset of the XML-Data Submission and expresses it in a way which is consistent with the ongoing W3C RDF (Resource Description Framework) effort; in particular, DCD is an RDF vocabulary. DCD is intended to define document constraints in an XML syntax; these constraints may be used in the same fashion as traditional XML DTDs. DCD also provides additional properties, such as basic datatypes."

Schema for Object-oriented XML (SOX)

  • SOX. Schema for Object-oriented XML. NOTE-SOX-19980930, Submitted to W3C 19980915. "This document proposes a schema facility, Schema for Object-oriented XML (SOX), for defining the structure, content and semantics of XML documents to enable XML validation and higher levels of automated content checking. The SOX proposal is informed by the XML 1.0 [XML] specification as well as the XML-Data submission (XML-Data), the Document Content Description submission (DCD) and the EXPRESS language reference manual (ISO-10303-11). SOX provides an alternative to XML DTDs for modeling markup relationships to enable more efficient software development processes for distributed applications."

  • "SOX was created by Commerce One in anticipation of the coming W3C standard XML Schema Language. SOX leverages the object-relationships between data structures to allow for the easy management of a component library, and the use of type relationships within schema-based e-commerce applications. It also provides strong datatyping capabilities, like other XML Schema languages..."

Document Definition Markup Language (DDML)

  • DDML. Document Definition Markup Language (DDML) Specification, Version 1.0. NOTE-ddml-19990119, W3C Note, 19-Jan-1999. "This document proposes Document Definition Markup Language (DDML), a schema language for XML documents. DDML [previously 'XSchema'] encodes the logical (as opposed to physical) content of DTDs in an XML document. This allows schema information to be explored and used with widely available XML tools. DDML is deliberately simple, providing an initial base for implementations. While introducing as few complicating factors as possible, DDML has been designed with future extensions, such as data typing and schema reuse, in mind."

Schematron

See the separate document for information on Schematron. Development takes place on SourceForge.

Datatypes for DTDs (DT4DTD)

  • [January 13, 2000] The W3C has acknowledged a submission request from Extensibility, Inc for publication of a NOTE: Datatypes for DTDs (DT4DTD) 1.0. Reference: W3C Note 13-January-2000, by Lee Buck (Extensibility), Charles F. Goldfarb (The XML Handbook), and Paul Prescod (ISOGEN International). The document abstract: "The presented specification allows legacy systems that may presently be unable to convert their DTD markup declarations to XML Schema, to utilize XML Schema conformant datatypes. With it, DTD creators can specify datatypes for attribute values and data content, thereby providing the foundation for a smoother future transition path. NOTE: Free open-source code that supports this specification for both SAX and DOM is available at www.extensibility.com/dt4dtd." [And:] "XML 1.0, using DTDs, provides a strong foundation for validating the syntax of a document and ensuring that all the necessary pieces of information are present (i.e. necessary elements are included, inappropriate ones are not, attributes are supplied when required, etc.). DTDs do not, however, offer much help in constraining the value of a particular attribute or element: a.k.a. datatypes to those with programming backgrounds. DT4DTD brings this important capability to XML. Specifically it: (1) Provides compatibility with XML Schema data types; (2) Provides compatibility with XML-Data data types; (3) Provides programmatic extensions for DOM and SAX; (4) Provides an extensible architecture for custom datatypes; (5) Provides runtime support for data typed schemas created in XML Authority... The techniques specified in DT4DTD are already in commercial use in several places, including the Financial Products Markup Language from JPMorgan and PriceWaterhouseCoopers, and versions of the Common Business Library from Commerce One, among others. 
Design-time support for DT4DTD is also available in leading commercial schema design tools such as XML Authority, which is produced by this submitting member organization." See also the W3C staff comment from Dan Connolly, which reads (in part): "W3C is pleased to receive the DT4DTD submission from Extensibility. During this transitional period of the W3C XML Activity while the XML Schema Working Group develops and deploys their work, this submission provides a valuable mechanism for addressing the lack of data types such as integer, date, etc. in XML 1.0 DTD syntax in a way that is compatible with legacy systems. However, it relies on a global convention for the interpretation of the unqualified names e-dtype and a-dtype, while use of XML Namespaces would make this unnecessary..." Local archive copy: NOTE, W3C comment, acknowledgement.

  • [December 01, 1999] Datatypes for DTDs. A draft specification residing on the Extensibility Web site (and looking suspiciously like a W3C 'NOTE') proposes a mechanism that allows the declaration of datatypes for XML content (PCDATA) and attributes. To wit: Datatypes for DTDs (DT4DTD) Version 1.0 (November 1999) is a 'Public Specification' edited by Lee Buck (Extensibility), Charles F. Goldfarb, and Paul Prescod (ISOGEN International). Initially posted as a public document: '10/31/99'. "Abstract: The presented specification allows legacy systems that may presently be unable to convert their DTD markup declarations to XML Schema, to utilize XML Schema conformant datatypes. With it, DTD creators can specify datatypes for attribute values and data content, thereby providing the foundation for a smoother future transition path... Free open-source code that supports this specification for both SAX and DOM is available at www.extensibility.com/dt4dtd." According to one of the authors, the specification represents just "a little convention for getting around the limitations of notations as applied to attributes and contents for datatyping." The markup facility uses NOTATIONs, and relies upon two 'fixed' attributes, e-dtype and a-dtype. Declaring a datatype for an element is permitted only if the associated element type's content allows data but no sub-elements. The background: "XML 1.0, using DTDs, provides a strong foundation for validating the syntax of a document and ensuring that all the necessary pieces of information are present (i.e., necessary elements are included, inappropriate ones are not, attributes are supplied when required, etc.). DTDs do not, however, offer much help in constraining the value of a particular attribute or element: a.k.a. datatypes to those with programming backgrounds. DT4DTD brings this important capability to XML. Specifically it: (1) Provides compatibility with XML Schema data types; (2) Provides compatibility with XML-Data data types; (3) Provides programmatic extensions for DOM and SAX; (4) Provides an extensible architecture for custom datatypes; (5) Provides runtime support for data typed schemas created in XML Authority. The DT4DTD [package] consists of two major parts: a) The draft specification, and b) The SDK [in Java]." Note(?): similar datatype declaration mechanisms (appropriate for 'architecture-engine' and 'notation-engine' processing aka 'handwaving') are available in SGML, and particularly in Web SGML, where the added 'data specification declared value' allows arguments to be passed to notation processors.

  • DT4DTD Draft Specification, October/November 1999 (local archive copy)
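
The e-dtype / a-dtype convention described above can be sketched as a small checker. The Python below is an illustration only, not the DT4DTD SDK (which is in Java). It assumes that e-dtype names the datatype of an element's data content and that a-dtype holds whitespace-separated pairs of attribute name and datatype name; that reading, the `CHECKERS` table, and the datatype names "int" and "boolean" are assumptions made for this example.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical datatype table; a real implementation would cover many more types.
CHECKERS = {
    "int": lambda s: re.fullmatch(r"[+-]?\d+", s.strip()) is not None,
    "boolean": lambda s: s.strip() in ("true", "false", "0", "1"),
}


def is_valid(dtype, text):
    """Unknown datatype names are treated as failures in this sketch."""
    checker = CHECKERS.get(dtype)
    return checker is not None and checker(text)


def check_element(elem):
    """Return a list of datatype problems found on one element."""
    problems = []
    etype = elem.get("e-dtype")
    if etype is not None:
        if len(elem) > 0:
            # A datatype is permitted only for data content without sub-elements.
            problems.append(f"<{elem.tag}> has sub-elements, so e-dtype is not permitted")
        elif not is_valid(etype, elem.text or ""):
            problems.append(f"content of <{elem.tag}> is not a valid {etype}")
    # Assumed a-dtype layout: "attrname datatype attrname datatype ..."
    pairs = (elem.get("a-dtype") or "").split()
    for attr, dtype in zip(pairs[0::2], pairs[1::2]):
        value = elem.get(attr)
        if value is not None and not is_valid(dtype, value):
            problems.append(f"attribute {attr}={value!r} is not a valid {dtype}")
    return problems


# In a DTD these two attributes would be declared #FIXED; they are written
# inline here so the example does not depend on DTD attribute defaulting.
DOC = '<item e-dtype="int" a-dtype="qty int gift boolean" qty="3" gift="maybe">12</item>'
for problem in check_element(ET.fromstring(DOC)):
    print(problem)
```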

Document Structure Description (DSD)

  • [February 24, 2000] A recent communiqué from Anders Møller reports on a new release of the Document Structure Description (DSD) specification in which free source code and Win32 executables are available for download. The Document Structure Description (DSD) is an "XML schema language developed by AT&T Labs Research and BRICS, University of Aarhus. DSDs require no specialized XML/SGML insights. The technology is based on simple, general, and familiar concepts that allow much stronger document descriptions than possible with DTDs or the current XML Schema proposal. DSD provides an alternative to XML DTDs and other XML schema languages. It adds expressive power, increases readability, and contains support for default attributes and contents. Furthermore, it guarantees linear time processing in the size of the application document. The relationship between DSDs and XML Schema is briefly described in a FAQ document."

  • [November 19, 1999] Document Structure Description (DSD): An XML Schema Language. A communiqué from Nils Klarlund forwards an announcement for Document Structure Description (DSD) as "a new and very effective way of describing XML documents." According to the text of the announcement, "This new DSD schema language is result of a research collaboration between AT&T Labs, NJ and BRICS at the University of Aarhus, Denmark. The DSD 1.0 language has been designed by Nils Klarlund, Anders Møller, and Michael I. Schwartzbach. A prototype DSD processor has been implemented. It is freely available for experimentation and further development. The DSD language has arisen out of a need to describe XML documents to Web programmers with an elementary background in computer science. DSDs have also been expressively designed to further W3C sponsored XML technologies such as Cascading Style Sheets (CSS) and XSL Transformations (XSLT). CSS is an essential part of modern HTML, but has so far not been formulated as a general style sheet mechanism for XML that works with any semantic domain. DSDs provide both a generalized semantics for a CSS-like style sheet mechanism and document processing instructions that provide the abstraction benefits of CSS in any XML document. XSLT 1.0 is a programming notation that allows transformations of classes of XML documents into semantic domains like HTML. XSLT programs are easy to write, especially if assumptions can be made about the input documents. The expressive power of DSDs allow declarative and readable specifications of XML documents that are to be subjected to XSLT processing. DSDs require no specialized XML/SGML insights. The technology is based on general and familiar concepts that allow much stronger document descriptions than possible with DTDs or the current XML Schema proposal." The Web site references the DSD 1.0 specification, an overview article, and a DSD description of DSDs.

TREX (Tree Regular Expressions for XML)

Information on "Tree Regular Expressions for XML (TREX)" is provided in a separate document. On 2001-04-24, James Clark posted a clarification confirming that he and Murata-san were working on the unification of TREX and RELAX.

RELAX NG

RELAX Core and TREX (Tree Regular Expressions for XML) are to be unified, since the two are very similar as structure-validation languages. The unified TREX/RELAX language is called RELAX NG [for "Relax Next Generation," pronounced "relaxing"]. This design work is now being conducted within the OASIS TREX Technical Committee, where a (first) specification is expected by July 1, 2001. The OASIS TC has also been renamed 'RELAX NG' [mailing list: 'relax-ng@lists.oasis-open.org'] to reflect the new name of the unified TREX/RELAX language. The RELAX NG development team plans to submit the OASIS specification to ISO, given the importance of ISO standards in Europe. See Minutes for RELAX NG TC 2001-05-17 and RELAX Core as ISO TR.

Examplotron

The Examplotron specification edited by Eric van der Vlist (Dyomedea) uses instance documents "as a lightweight schema language -- eventually adding the information needed to guide a validator in the sample documents. Examplotron may be used either as a validation language by itself, or to improve the generation of schemas expressed using other XML schema languages by providing more information to the schema translators. An XSLT transformation [in compile.xsl] transforms examplotron documents into XSLT sheets that validate 'similar' documents."
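
The idea of treating a sample instance as the schema can be sketched in a few lines. The Python below is not Examplotron and does not use its compile.xsl; it is a deliberately crude stand-in in which two documents count as 'similar' when they have the same element names, attribute names, and child order. The `shape` and `similar` helpers and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET


def shape(elem):
    """Describe an element by its tag, attribute names, and the shapes of its children."""
    return (elem.tag, tuple(sorted(elem.attrib)), tuple(shape(child) for child in elem))


def similar(sample_xml, candidate_xml):
    """True if the candidate has the same structure as the sample (values are ignored)."""
    return shape(ET.fromstring(sample_xml)) == shape(ET.fromstring(candidate_xml))


SAMPLE = '<order id="1"><item>pen</item><qty>2</qty></order>'

print(similar(SAMPLE, '<order id="7"><item>ink</item><qty>10</qty></order>'))
print(similar(SAMPLE, '<order id="7"><item>ink</item></order>'))
```

A real example-driven validator would also need a way to mark optional or repeatable parts of the sample, which is the kind of added information the specification refers to.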

Hook: A One-element validation language

Hook: A One-Element Language for Validation of XML Documents based on Partial Order. By Rick Jelliffe, Academia Sinica Computing Centre. "The Hook validation language is a thought experiment in minimalism in XML schema languages. The purpose of such a minimal language would be to provide useful but ultra-terse success/fail validation for basic incoming QA, especially of datagrams. It is like a checksum for a schema. The validation it performs can be characterized as 'Does this element have a feasible name, ancestry, previous-siblings and contents?', there being some tradeoff between how fully the later criteria are tested..." [cache]

Other XML Schema Language References

[These references need reorganization - soon, I hope. -rcc]

  • [August 2005] "How Schema-Validity is Different from Being Married." By C. M. Sperberg-McQueen. Presented at Extreme Markup Languages 2005 (August 1 - 5, 2005, Best Western Europa-Downtown, Montréal, Québec, Canada). "Validation with some schema languages (e.g. XML 1.0 and SGML DTDs) is a black-or-white question: either the document is (wholly) valid or it's not valid (at all). There are no gray areas: a document cannot be 'mostly valid' any more than one can be 'a little bit married'. The theoretical information content of a validation result in these systems is thus exactly one bit: yes/valid or no/invalid. In practice good validators try to provide a little more information about errors, but quality of error diagnostics varies and error handling is not standardized. In other languages (e.g., XML Schema), validity assessment is designed to provide more than one binary digit of useful information. XML Schema allows various forms of partial validation: Validation can start at an element other than the root. Wildcards can specify that elements they match should not be validated but skipped (aka 'black-box processing' — the data must be well-formed XML, but don't look inside). And wildcards can specify 'lax' validation (matching elements must be well-formed XML, and if the schema has declarations for them, they'll be validated, but the absence of declarations doesn't make the container invalid). In XML Schema, schema-validity is captured by three properties: The first is [validation attempted]: Did we try to validate this item? Its values are full, none, partial. The second is [validity]: Is the element valid? It takes the values valid, invalid, notKnown. The third is [schema error code]: If the item is not valid, then a list of error codes (references to XML Schema validation rules) explaining why. 
This paper will talk about why it can be dangerous and unhelpful to reduce validity to a single bit of information, and how it can be helpful to take a richer view of validity as a property with several values, a property not just of the document as a whole but of each element and attribute in the document..."

  • [September 07, 2005] "Change to the public schema document for the XML namespace (xml.xsd)." Posted by Henry S. Thompson to the 'public-xml-core-wg' list on behalf of the W3C XML Core Working Group. The schema document at http://www.w3.org/2001/xml.xsd has changed, in order to (belatedly) track the change to xml:lang in XML 3rd edition, which now allows the empty string as well as a language code. Per the standard change policy, the old version is still available [at http://www.w3.org/2004/10/xml.xsd] and will not be changed. A copy of the new version which will never change is also available [at http://www.w3.org/2005/08/xml.xsd]. Send comments to public-xml-core-wg@w3.org.

  • [May 24, 2005] W3C Workshop to Address Improved Interoperability of Schema-Aware Software. W3C has issued a Call for Participation in connection with the June 21-22, 2005 "Workshop on XML Schema 1.0 User Experiences," to be held at the Oracle Conference Center in Redwood Shores, California. The deadline for submission of a user experience report has been extended through May 27, 2005. The purpose of this W3C Workshop is to "gather concrete reports of user experience with XML Schema 1.0, and examine the full range of usability, implementation, and interoperability problems around the specification and its test suite. Topics of discussion include, but are not limited to, the use of XML Schema in vocabulary design, Web Services description and toolkits, XHTML, XML Query, and XML Schema editors." The W3C XML Schema specification was released in a Second Edition Recommendation on October 28, 2004. This Second Edition incorporated the changes dictated by the corrections to errors found in the first edition, published as a W3C Recommendation on May 2, 2001. Since its approval as a W3C Recommendation, XML Schema 1.0 "has been widely adopted by vendors and as a foundation for other specifications in the Web Services area, in XML query systems, and elsewhere." The W3C Workshop on XML Schema 1.0 User Experiences will provide an opportunity for users to identify usability problems, to document the most serious interoperability problems users have experienced with schema-aware software, to design improvements to the XML Schema test suite, and to discuss future work to improve interoperability of schema-aware software. As with other W3C Workshops, this "Workshop on XML Schema 1.0 User Experiences" is open to the public, but will be limited to 60 attendees. Participants are required to submit a user experience report (by May 27, 2005); these papers will be included in the published proceedings of the Workshop. See the Workshop Program.

  • [April 24, 2005] "WS-I Submission for the W3C Workshop on XML Schema 1.0 Specification User Experiences." By Erik Johnson (Epicor) for the Web Services Interoperability Organization (WS-I). Version: 1.0. "The Web Services Interoperability Organization (WS-I) herein offers a submission to the W3C Workshop on XML Schema 1.0 User Experiences. The WS-I appreciates this opportunity to contribute to the Workshop and looks forward to working with the W3C in fostering broad adoption of the XML Schema 1.0 Specification. Unlike other specifications relevant to web services, the WS-I had initially felt that there were no clear ambiguities or feature pathways of the W3C XML Schema 1.0 Specification 2nd Edition itself that merited development of a WS-I XML Schema profile. In fact, the WS-I Basic Profile 1.0 expressly allows the use of all W3C XML Schema 1.0 Specification constructs and types. In 2003 however, the WS-I commissioned a Working Group to study interoperability issues with XML Schema raised by WS-I Community members, specifically end-user organizations. The XML Schema Work Plan Working Group (WS-I SWPWG) was then chartered to produce a recommendation for possible further action to the WS-I Board. The WS-I SWPWG began work in November of 2004 to study the issue claims and define how the WS-I might in fact take action. This submission summarizes portions of the conversation and consensus from the work of the WS-I XML Schema Work Plan Working Group... It defeats the purpose of XML web services if developers creating or consuming services have to understand the toolkit and platform assumptions of their counterparts. So, toolkit support of XML Schema needs to be measured in the context of suitability to purpose. But there are many permutations of platform stacks, programming languages, and toolkits in use and the idea of suitability is clearly subjective. 
WS-I members have discussed these issues from two viewpoints: The first is the need for guidance and clarification of the W3C XML Schema 1.0 Specification, especially around best practices for extensibility, versioning, and type composition (modularity). The second is the need for a testing capability that covers XML Schema constructs found in real-world schemas (good, bad, and ugly) rather than academic coverage of XML Schema features..." For details on the workshop, see the news story "W3C Workshop to Address Improved Interoperability of Schema-Aware Software." [cache]

  • [January 23, 2004] IBM and X-Hive Present XML Schema API as a W3C Member Submission. W3C has acknowledged receipt of a member submission entitled XML Schema API. The technology was submitted by IBM Corporation and X-Hive Corporation B.V. and provides API access to properties of the XML Information Set. Specifically, the document "defines an XML Schema API, a platform- and language-neutral interface that allows programs and scripts to dynamically access and query the post-schema-validation infoset (PSVI) defined in the Normative Appendix C (Outcome Tabulations) of the W3C Recommendation XML Schema Part 1: Structures, 'C.2 Contributions to the post-schema-validation infoset.' The specification is implemented in Apache Xerces2 Java Parser; there is also a C++ binding and implementation for this specification in Apache Xerces C++ Parser." Section 1.2 defines interfaces which allow developers to access the XML Schema components which follow as a consequence of validation and/or assessment; Section 1.3 defines a set of interfaces for accessing the PSVI from an instance document; Section 1.4 defines a set of interfaces for loading XML Schema documents. The W3C Staff comment on the XML Schema API notes that the proposal "provides a substantial and useful addition to the DOM API, or to other existing event/pull parsing APIs such as SAX/XNI. However, the XML Query and XSL Working Groups have been working on extending the work of XML Schema and adding more properties, and are working on a new version of XML Schema 1.1; therefore, while the proposal addresses today's needs, it should be noted that future extensions will still be needed to follow additions to the XML Architecture."

  • [December 2003] "Using Finite State Automata to Implement W3C XML Schema Content Model Validation and Restriction Checking." By Henry S. Thompson and Richard Tobin (University of Edinburgh Division of Informatics Scotland). Presented at XML Europe 2003. "Implementing validation and restriction checking for W3C XML Schema content models is harder than for DTDs. This paper gives complete details on how to convert W3C XML Schema content models to Finite State Automata, including handling of numeric exponents and wildcards. Enforcing the Unique Particle Attribution constraint and implementing restriction checking in polynomial time using these FSAs is also described..." See also XSV (XML Schema Validator).
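
As a rough illustration of the idea behind the Thompson and Tobin paper (and not the authors' own construction), a sequence content model with numeric occurrence bounds can be checked by simulating a set of automaton states, where each state records the current particle and how many times it has matched so far. The particle tuples and the names `closure` and `matches` are assumptions made for this sketch; wildcards, choice groups, and Unique Particle Attribution checking are left out.

```python
from typing import List, Optional, Set, Tuple

# A particle is (element name, minOccurs, maxOccurs); maxOccurs=None means unbounded.
Particle = Tuple[str, int, Optional[int]]
State = Tuple[int, int]  # (index of current particle, matches counted so far)


def closure(states: Set[State], model: List[Particle]) -> Set[State]:
    """Add every state reachable without input, i.e. by leaving a satisfied particle."""
    result = set(states)
    stack = list(states)
    while stack:
        index, count = stack.pop()
        if index < len(model) and count >= model[index][1]:
            nxt = (index + 1, 0)
            if nxt not in result:
                result.add(nxt)
                stack.append(nxt)
    return result


def matches(model: List[Particle], children: List[str]) -> bool:
    """Return True if the child-name sequence satisfies the sequence content model."""
    states = closure({(0, 0)}, model)
    for name in children:
        moved: Set[State] = set()
        for index, count in states:
            if index == len(model):
                continue  # already past the last particle; nothing more may match
            pname, pmin, pmax = model[index]
            if pname == name and (pmax is None or count < pmax):
                new_count = count + 1
                if pmax is None:
                    # Cap the counter for unbounded particles so the state set stays finite.
                    new_count = min(new_count, pmin)
                moved.add((index, new_count))
        states = closure(moved, model)
        if not states:
            return False
    return (len(model), 0) in states


# Hypothetical model: title, then 1 to 3 authors, then any number of notes.
book = [("title", 1, 1), ("author", 1, 3), ("note", 0, None)]
```

With this model, `matches(book, ["title", "author", "note"])` accepts, while a sequence with four `author` children is rejected because the counter for that particle cannot exceed its upper bound.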

  • [December 23, 2003] "XML and Information Integration: Conceptual Modeling of XML Schemas." By Bernadette Farias Lóscio, Ana Carolina Salgado, and Luciano do Rêgo Galvão (Centro de Informática, Universidade Federal de Pernambuco, Brasil). In Proceedings of the Fifth International Workshop on Web Information and Data Management (WIDM 2003) (November 7-8, 2003). "XML has become the standard format for representing structured and semi-structured data on the Web. To describe the structure and content of XML data, several XML schema languages have been proposed. Although being very useful for validating XML documents, an XML schema is not suitable for tasks requiring knowledge about the semantics of the represented data. For such tasks it is better to use a conceptual schema. This paper presents an extension of the Entity Relationship (ER) model, called X-Entity, for conceptual modeling of XML schemas. We also present the process of converting a schema, defined in the XML Schema language, to an X-Entity schema. The conversion process is based on a set of rules that consider element declarations and type definitions and generates the corresponding conceptual elements. Such representation provides a cleaner description for XML schemas by focusing only on semantically relevant concepts. The X-Entity model has been used in the context of a Web data integration system with the goal of providing a concise and semantic description for local schemas defined in XML Schema... The X-Entity representation provides a cleaner description for XML schemas hiding implementation details and focusing on semantically relevant concepts. The X-Entity model extends the ER model so that one can explicitly represent important features of XML schemas, including: element and subelement relationships, occurrence constraints of elements and attributes and choice groups. Due to space limitations, some X-Entity features were not presented in this paper. 
Other issues were not considered in our approach, including: hierarchy of elements and attributes, cardinality of group of elements, elements with mixed content and order of elements imposed by a sequence compositor. However, our model can be easily extended with additional features and new rules can be developed for the conversion process. We already implemented a prototype to generate XEntity schemas from XML Schemas..."

  • [November 24, 2003] "An Introduction to Schematron." By Eddie Robertsson. From XML.com (November 12, 2003). "The Schematron schema language differs from most other XML schema languages in that it is a rule-based language that uses path expressions instead of grammars. This means that instead of creating a grammar for an XML document, a Schematron schema makes assertions applied to a specific context within the document. If the assertion fails, a diagnostic message that is supplied by the author of the schema can be displayed. One advantage of a rule-based approach is that in many cases modifying the wanted constraint written in plain English can easily create the Schematron rules. In order to implement the path expressions used in the rules in Schematron, XPath is used with various extensions provided by XSLT. Since the path expressions are built on top of XPath and XSLT, it is also trivial to implement Schematron using XSLT, which is shown later in the section Schematron processing. Schematron makes various assertions based on a specific context in a document. Both the assertions and the context make up two of the four layers in Schematron's fixed four-layer hierarchy: phases (top-level), patterns, rules (defines the context), and assertions... This introduction covers only three of these layers (patterns, rules and assertions); these are most important for using embedded Schematron rules in RELAX NG... Version 1.5 of Schematron was released in early 2001 and the next version is currently being developed as an ISO standard. The new version, ISO Schematron, will also be used as one of the validation engines in the DSDL (Document Schema Definition Languages) initiative..." See: "Schematron: XML Structure Validation Language Using Patterns in Trees."
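
The pattern, rule, and assertion layering described in the article can be sketched in a few lines. This is an illustration only, not Schematron itself: real Schematron evaluates full XPath expressions, whereas the Python standard library supports only a limited path subset, so here an assertion is reduced to "this path must select at least one node from the context". The `Rule` and `Assertion` types and the sample document are assumptions made for the example.

```python
import xml.etree.ElementTree as ET
from typing import List, NamedTuple


class Assertion(NamedTuple):
    test: str      # ElementTree path that must match at least one node
    message: str   # diagnostic message supplied by the schema author


class Rule(NamedTuple):
    context: str               # ElementTree path selecting the context nodes
    assertions: List[Assertion]


def validate(root: ET.Element, pattern: List[Rule]) -> List[str]:
    """Run every rule of a pattern and collect the messages of failed assertions."""
    failures = []
    for rule in pattern:
        for node in root.findall(rule.context):
            for assertion in rule.assertions:
                if node.find(assertion.test) is None:
                    failures.append(assertion.message)
    return failures


# Hypothetical instance document: the second item has no price.
doc = ET.fromstring(
    "<order><item><name>pen</name><price>2</price></item>"
    "<item><name>ink</name></item></order>"
)
pattern = [Rule(context=".//item", assertions=[
    Assertion(test="name", message="An item must have a name."),
    Assertion(test="price", message="An item must have a price."),
])]
print(validate(doc, pattern))  # prints ['An item must have a price.']
```

No grammar for the whole document is written; each rule states only the constraint of interest for its context, which is the property the article highlights.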

  • [August 05, 2003] "WSDL Tales From the Trenches, Part 3." By Johan Peeters. From O'Reilly WebServices.xml.com (August 05, 2003). ['Continuing the focus on sound design, we have the third and final installment of Johan Peeters' "WSDL Tales from the Trenches." Peeters concentrates on the importance of modeling the data elements involved in web services, and explains the best strategies for using W3C XML Schema to model this data.'] "I examine the type definitions and element declarations in the types element of a WSDL document. Such types and elements are for use in the abstract messages, the message elements in a WSD. WSDL does not constrain data definitions to W3C XML Schema (WXS). However, alternatives to WXS are not covered in this article: the goal of the series is to provide help and guidance with current real-world problems, and I have not seen any of the alternatives to WXS being used for web services on a significant scale to date. This may change in the future: while only the WXS implementation is discussed in the WSDL 1.1 spec, it was always the intention of the WSDL designers to provide several options. The WSDL 1.2 draft's appendix on Relax NG brings this closer to realization. Data modeling with WXS is not for the faint-hearted. It presents a lot of pitfalls. This article will point some of these out and helps you avoid them..." On WSDL 1.2, see the announcement "W3C Releases Three Web Services Description Language (WSDL) 1.2 Working Drafts." The non-normative Appendix E ('Examples of Specifications of Extension Elements for Alternative Schema Language Support') in the WSDL Part 1: Core WD includes a section on RELAX NG: "A RELAX NG schema may be used as the schema language for WSDL. It may be embedded or imported; import is preferred. A namespace must be specified; if an imported schema specifies one, then the [actual value] of the namespace attribute information item in the import element information item must match the specified namespace. 
RELAX NG provides both type and element definitions which appear in the {type definitions} and {element declarations} properties of [Section 2.1.1] 'Definitions Component' respectively..."

  • [July 16, 2003] "Logic Grammars and XML Schema." By C. M. Sperberg-McQueen (World Wide Web Consortium / MIT Laboratory for Computer Science, Cambridge MA). Draft version of paper prepared for Extreme Markup Languages 2003, Montréal. "This document describes some possible applications of logic grammars to schema processing as described in the XML Schema specification. The term logic grammar is used to denote grammars written in logic-programming systems; the best known logic grammars are probably definite-clause grammars (DCGs), which are a built-in part of most Prolog systems. This paper works with definite-clause translation grammars (DCTGs), which employ a similar formalism but which more closely resemble attribute grammars as described by [D. Knuth, 'Semantics of Context-Free Languages,' 1968] and later writers; it is a bit easier to handle complex specifications with DCTGs than with DCGs. Both DCGs and DCTGs can be regarded as syntactic sugar for straight Prolog; before execution, both notations are translated into Prolog clauses in the usual notation... Any schema defines a set of trees, and can thus be modeled more or less plausibly by a grammar. Schemas defined using XML Schema 1.0 impose some constraints which are not conveniently represented by pure context-free grammars, and the process of schema-validity-assessment defined by the XML Schema 1.0 specification requires implementations to produce information that goes well beyond a yes/no answer to the question 'is this tree a member of the set?' For both of these reasons, it is convenient to use a form of attribute grammar to model a schema; logic grammars are a convenient choice. In [this] paper, I introduce some basic ideas for using logic grammars as a way of animating the XML Schema specification / modeling XML Schema... 
The paper attempts to make plausible the claim that a similar approach can be used with the XML Schema specification, in order to provide a runnable XML Schema processor with a very close tie to the wording of the XML Schema specification. Separate papers will report on an attempt to make good on the claim by building an XML Schema processor using this approach; this paper will focus on the rationale and basic ideas, omitting many details..." See also the abstract for the Extreme Markup paper [Tuesday, August 5, 2003]: "The XML Schema specification is dense and sometimes hard to follow; some have suggested it would be better to write specifications in formal, executable languages, so that questions could be answered just by running the spec. But programs are themselves often even harder to understand. Representing schemas as logic grammars offers a better approach: logic grammars can mirror the wording of the XML Schema specification, and at the same time provide a runnable implementation of it. Logic grammars are formal grammars written in logic-programming systems; in the implementation described here, logic grammars capture both the general rules of XML Schema and the specific rules of a particular schema." Note: the paper is described as an abbreviated version of "Notes on Logic Grammars and XML Schema: A Working Paper Prepared for the W3C XML Schema Working Group"; this latter document (work in progress 2003-07) provides "an introduction to definite-clause grammars and definite-clause translation grammars and to their use as a representation for schemas."

  • [June 21, 2003] "Mobile Subset of XML Schema Part 2." ISO Document for information and review. Produced by SC34 Japan for ISO/IEC JTC 1/SC34/WG1: Information Technology -- Document Description and Processing Languages -- Information Presentation. Project 19757-5 (Project Editor, Martin Bryan). ISO Reference: ISO/IEC JTC 1/SC34 N 0410. April 22, 2003. Excerpts: "We propose to create a compact and reliable subset of W3C XML Schema Part 2 and publish it as an ISO standard. The main target of this subset is mobile devices (such as cellular phones). Mobile devices are expected to use XML in the near future. Small XML parsers have been developed already. Validators for schema languages are expected to follow, and a prototypical validator for RELAX NG on mobile phones has been developed. Such parsers and validators will hopefully be used for implementing XForms and Web Service on mobile devices. Part 2 of W3C XML Schema provides a set of datatypes and facets. Although it might not be perfect, it is likely to be widely used by many XML applications including mobile applications. We just cannot believe that an incompatible set of general-purpose datatype (e.g., int) libraries will be accepted by the market. However, datatypes and facets of W3C XML Schema Part 2 are too complicated for mobile devices. Some specifications such as XForms have already created their own subsets of W3C XML Schema Part 2. However, if different specifications introduce different subsets, incomparability will be significantly spoiled. It would be much nicer if one subset is internationally standardized... [In the] choice of datatypes we omit: (1) datatypes requiring infinite precision; (2) datatypes that do not have obvious mapping to J2ME; (3) archaic datatypes such as IDREFS, ENTITY, ENTITIES, and NOTATION; (4) unsolid datatypes -- dateTime and so forth; (5) datatypes such that validity depends on namespace declarations... 
[In the] choice of facets we omit: [1] the pattern facet, which requires the property list of Unicode characters; [2] whitespace, which does not affect validity but controls PSVI; [3] totalDigits and fractionDigits. Implementation considerations: We have studied the source code of Jing implementation by James Clark. We believe that if the above restrictions are accepted, an implementation of the remaining datatypes and facets will require less than 20KB as the size of a JAR file." See details for the proposed list of datatypes and facets in 'Table 1: The list of datatypes' and 'Table 2: The list of facets'. [cache]

  • [June 19, 2003]   Namespace Routing Language (NRL) Supports Multiple Independent Namespaces.    James Clark has announced the publication of a Namespace Routing Language (NRL) specification. NRL is "an XML language for combining schemas for multiple namespaces; it allows the schemas that it combines to use arbitrary schema languages." The release includes a tutorial and specification document and a sample implementation in the Jing (RELAX NG Validator in Java) distribution. NRL "is the successor to Clark's Modular Namespaces (MNS) language and is intended to be another step on the path towards Document Schema Definition Languages (DSDL) Part 4." The W3C XML Namespaces Recommendation itself "allows an XML document to be composed of elements and attributes from multiple independent namespaces: each of these namespaces may have its own schema and the schemas for different namespaces may be in different schema languages. The problem then arises of how the schemas can be composed in order to allow validation of the complete document." The Namespace Routing Language attempts to solve this problem. Among the features and benefits of NRL: it supports schema language coexistence, allows extension of schemas not designed to be extended, makes authoring of extensible schemas easier, supports 'transparent' namespaces, allows contextual control of extension, and allows concurrent validation. "For RELAX NG, it can be used to provide some of the namespace-based modularity features that are built-in to XSD. NRL is designed to allow an implementation to stream, and the sample implementation does so. The sample implementation has a SAX-based plug-in architecture that allows new schema languages to be added dynamically. It comes with support for RELAX NG (both XML and compact syntax), W3C XML Schema (via a wrapper around Xerces-J), Schematron, and (recursively) NRL; it can also use any schema language with an implementation that supports the JARV interface."
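
The composition problem NRL addresses can be made concrete with a small sketch. The code below is not NRL and does not use Jing; it only shows, under simplifying assumptions, how a document might be divided into single-namespace sections so that each section could be handed to the schema registered for its namespace. The function names and the sample namespaces are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Optional


def namespace_of(element: ET.Element) -> str:
    """Return the namespace URI of an element, or '' if it has none."""
    tag = element.tag
    return tag[1:].split("}", 1)[0] if tag.startswith("{") else ""


def section_roots(root: ET.Element) -> Dict[str, List[str]]:
    """Group the root elements of single-namespace sections by namespace URI."""
    routed: Dict[str, List[str]] = {}

    def walk(element: ET.Element, parent_ns: Optional[str]) -> None:
        ns = namespace_of(element)
        if ns != parent_ns:
            # A namespace change starts a new section, which would be routed
            # to whichever schema is registered for that namespace.
            routed.setdefault(ns, []).append(element.tag)
        for child in element:
            walk(child, ns)

    walk(root, None)
    return routed


# Hypothetical document mixing two namespaces.
sample = ET.fromstring(
    '<a:doc xmlns:a="urn:a" xmlns:b="urn:b">'
    "<a:p><b:note/></a:p><b:meta><a:p/></b:meta></a:doc>"
)
sections = section_roots(sample)
```

Here `sections` maps `urn:a` to the section roots `doc` and the `p` nested inside `meta`, and `urn:b` to `note` and `meta`; a validator front end could then dispatch each section to a schema in whatever language is registered for its namespace.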

  • [June 10, 2003] "Introducing Examplotron: The Fastest Road to Schema." By Uche Ogbuji (Principal Consultant, Fourthought, Inc). From IBM developerWorks, XML zone. June 10, 2003. ['A zoo of XML schema languages is out there, and although some of the beasts are bigger than others none is as friendly as Examplotron. With Examplotron, your example XML document is your schema, for the most part. It requires you to learn very little new syntax, and most of the core features of XML can be specified by providing representative examples in the source. In this article, Uche Ogbuji introduces Examplotron, providing plenty of examples.'] "At first XML had the Document Type Definition (DTD). XML 1.0 came bundled with the schema technology inherited from SGML. However, numerous XML users complained about DTDs including the fact that they use a different syntax from XML itself. The W3C developed a successor technology to DTD, W3C XML Schema, but some complained that it was too complex, and that it showed every sign of design-by-committee. Separate groups developed schema technologies that became RELAX NG and Schematron. These technologies all have their strengths and weaknesses, and their attendant factions. But for the developer with deadlines to mind, crafting schemata is often too much of an additional burden. Without a doubt, it is always a good idea to develop a schema. If for no other reason, it provides documentation of the format. But in the real world, the most common course for harassed developers is to develop a sample of the XML format to serve all purposes of a proper schema. But what if the example itself could provide the benefits of a formal schema? In particular, what if the example could be used to validate documents? Eric van der Vlist set out to develop a system that allows example documents to serve as formal schemata, and his invention is Examplotron. In this article, I introduce Examplotron. 
This system is simple to use, so I encourage you to follow along by downloading Examplotron 0.7 (compile.xsl) and use your favorite XSLT and RELAX NG processors... On a recent project, a client who had many XML formats hired me, through my company Fourthought, to develop schemata for documentation and validation for these XML formats. All they had to start with were sample XML documents for each format. Using Examplotron to generate the production RELAX NG schemata from these sample documents saved me perhaps over a hundred hours of effort, and thus saved them tens of thousands of dollars. I did have to augment Examplotron with document generation and other refinement code; I hope to cover the non-proprietary aspects of this refinement code in a future article. Examplotron produces RELAX NG schemata, but if you must produce W3C XML Schema, all is still well: You can use James Clark's excellent Trang tool to convert RELAX NG to WXS. I know from my overall consulting experience that sample documents are the most common form of schema in the real world, so I expect that Examplotron will be of great help to a lot of folks right away." See Examplotron above.

  • [April 30, 2003] "Clean Up Your Schema for SOAP. Updating XML Schemas to be SOAP-Friendly." By Shane Curcuru (Advisory Software Engineer, IBM Research). From IBM developerWorks, Web services. April 29, 2003. ['More and more projects are using XML schemas to define the structure of their data. As your repository of schemas grows, you need tools to manipulate and manage your schemas. The Eclipse XSD Schema Infoset Model has powerful querying and editing capabilities. In this article, Shane Curcuru will show how you can update a schema for use with SOAP by automatically converting attribute uses into element declarations.'] "If you've built a library of schemas, you might want to reuse them for new applications. If you already have a data model for an internal purchase order, as you move towards Web services you may need to update it for use with SOAP. SOAP allows you to transport an xml message across a network; the xml body can also be constrained with a schema. However a SOAP message typically uses element data for its xml body, not attribute data. You'll explore a program that can automatically update an existing schema document to convert any attribute declaration into roughly 'equivalent' element declarations... Given the complexity of XML schemas, you certainly don't want to use Notepad to edit the .xsd files. A good XML editor is not much of a step up -- while it may organize your elements and attributes nicely, it can't show the many abstract Infoset relationships that are defined in the Schema specification. That's where the Schema Infoset Model comes in; it expresses both the concrete DOM representation of a set of schema documents, and the full abstract Infoset model of a schema. Both of these representations are shown through the programmatic API of the Model as well as in the Model's built-in sample editor for schemas... 
If you've installed the XSD Schema Infoset Model and Eclipse Modeling Framework (EMF) plugins into Eclipse, you can see the sample editor at work in your Workbench... performing a conceptually simple editing operation on schema documents (turning attributes into elements) can entail a fair amount of work. However the power of the Schema Infoset Model's representation of both the abstract Infoset of a schema and its concrete representation of the schema documents makes this a manageable task. The Model also includes simple tools for loading and saving schema documents to a variety of sources, making it a complete solution for managing your schema repository programmatically. Some users might ask, 'Why not use XSLT or another XML-aware application to edit schema documents?' While XSLT can easily process the concrete model of a set of schema documents, it can't easily see any of the abstract relationships within the overall schema that they represent. For example, suppose that you need to update any enumerated simpleTypes to include a new UNK enumeration value meaning unknown. Of course, you only want to update enumerations that fit this format of using strings of length of three; you would not want to update numeric or other enumerations... This article presupposes an understanding of schemas in XML and how SOAP works. The sample code included in the zip file works standalone or in an Eclipse workbench..."
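
To give a feel for the attribute-to-element conversion the article describes, here is a minimal sketch that works on the concrete schema document only. It does not use the Eclipse XSD Schema Infoset Model, and for exactly the reason the article gives it cannot see abstract relationships: attribute groups, `ref=` attributes, and derived types are skipped. The function name `attributes_to_elements` and the mapping of optional attributes to `minOccurs="0"` are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
ET.register_namespace("xs", XS)


def q(local: str) -> str:
    """Build an ElementTree qualified name in the XML Schema namespace."""
    return f"{{{XS}}}{local}"


# Content models this sketch does not try to rewrite.
UNSUPPORTED = {q(n) for n in ("choice", "all", "group", "simpleContent", "complexContent")}


def attributes_to_elements(schema: ET.Element) -> None:
    """Rewrite local attribute declarations as element declarations, in place."""
    for ctype in list(schema.iter(q("complexType"))):
        attrs = [a for a in ctype.findall(q("attribute")) if "name" in a.attrib]
        if not attrs or any(child.tag in UNSUPPORTED for child in ctype):
            continue
        seq = ctype.find(q("sequence"))
        if seq is None:
            # A new sequence goes after an optional leading xs:annotation.
            pos = 1 if len(ctype) and ctype[0].tag == q("annotation") else 0
            seq = ET.Element(q("sequence"))
            ctype.insert(pos, seq)
        for attr in attrs:
            elem = ET.SubElement(seq, q("element"), name=attr.get("name"))
            if attr.get("type"):
                elem.set("type", attr.get("type"))
            # An attribute is optional unless declared with use="required".
            if attr.get("use") != "required":
                elem.set("minOccurs", "0")
            ctype.remove(attr)
```

A call such as `attributes_to_elements(ET.parse("po.xsd").getroot())` (the file name is hypothetical) would append one element declaration per named attribute to each eligible type's sequence. The result is only roughly equivalent to the original, since elements are ordered and attributes are not.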

  • [March 24, 2003] Schema Unit Test (SUT) Framework for Testing XML Schema. Gavin Kingsley (Invensys Energy Systems Limited) announced the availability of a SourceForge project Schema Unit Test (SUT) which introduces a framework for testing XML Schema. SUT incorporates the Schematron reference implementation developed by Rick Jelliffe of the Academia Sinica Computing Centre. Problem statement: "W3C Schema can quickly become complex and difficult to determine if they are validating the correct vocabulary. The addition of embedded Schematron schema only makes this problem worse... The SUT framework has two parts. The first is a namespace and vocabulary for embedding test cases into sample XML documents, designed to highlight what is legal and what is not legal in the vocabulary defined in the schema under test. This aspect is independent of what schema language is used and can in theory be applied to any schema language with automatic validation tools. The second part is a Java implementation using JUnit for testing a W3C Schema with embedded Schematron schema. This implementation reads SUT test suite descriptions written in XML with embedded test cases and then creates a JUnit test suite that can be executed inside JUnit in the usual way. Although SUT is written to use JUnit, no specialised Java or JUnit knowledge is required to run SUT test suites. An example is provided based on the purchase order schema from the W3C primer... A SUT Test Suite is a well-formed XML file containing an example of a file to be validated. Test cases are identified by additional elements in the SUT namespace, http://www.powerware.com/nz/XMLSchemaUnitTest. The case element identifies test cases created by adding or removing elements. The attribute element identifies test cases created by adding, removing or changing attributes; detailed descriptions of these elements are available (case; attribute)." SUT has free, open source code.

  • [February 21, 2003] "Mapping Between UML and XSD." By David Carlson (Ontogenics Corp). From XMLmodeling News Volume One, Issue Two (January 28, 2003). "One of the principal advantages of using UML when designing XML vocabularies is that the model can serve as a specification which is independent of a particular schema language implementation. W3C XML Schema is the most common choice right now, but we hope that business vocabularies (and other non-business technical markup languages) have a long life and will be implemented using alternative new schema languages. To achieve this goal, we need to define a complete and flexible mapping between UML and each implementation language. Given that UML was originally intended for object-oriented analysis and design, the mapping is most straightforward for languages that have an object-oriented flavor... A bi-directional mapping between UML and schemas is specified in the form of a UML Profile. The purpose of a UML profile for this or any other use is to extend the UML modeling language with constructs unique to an implementation language, analysis method, or application domain. The profile extension mechanism is part of the UML standard; it was expanded in the recent UML version 1.4 and will be further expanded when UML 2.0 is adopted this year. A UML profile (pre version 2.0) is composed of three constructs: stereotypes, tagged value properties, and constraints. A stereotype defines a specialized kind of UML element; for example, the XSDcomplexType stereotype defines a specialized kind of UML Class, and XSDschema defines a specialized kind of UML package. Tagged values define properties of these stereotyped elements. So the XSDschema stereotype includes a targetNamespace property. By assigning this stereotype to a UML package and setting a value for this property, we have augmented the UML modeling language with information used to generate a complete XML Schema document from an abstract vocabulary model. 
Similar stereotypes and properties are defined for all XML Schema constructs. A profile constraint specifies rules about how and where stereotypes and their tagged values can be used in a model. These rules should include what are often called co-constraints: how the value of one property constrains the values of other properties..."
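
The profile mechanism described above can be sketched as a small generator. The stereotype names `XSDschema` and `XSDcomplexType` and the `targetNamespace` tagged value come from the article; everything else here is an assumption made for illustration, including the `UmlPackage` and `UmlClass` data structures, the choice to map each UML attribute to a child element, and the sample namespace. It is not Carlson's tooling.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import Dict, List

XS = "http://www.w3.org/2001/XMLSchema"
ET.register_namespace("xs", XS)


@dataclass
class UmlClass:
    name: str
    stereotype: str                                            # e.g. "XSDcomplexType"
    attributes: Dict[str, str] = field(default_factory=dict)   # name -> XSD type


@dataclass
class UmlPackage:
    name: str
    stereotype: str                                            # e.g. "XSDschema"
    tagged_values: Dict[str, str] = field(default_factory=dict)
    classes: List[UmlClass] = field(default_factory=list)


def generate_schema(package: UmlPackage) -> ET.Element:
    """Generate an xs:schema element from a package stereotyped as XSDschema."""
    if package.stereotype != "XSDschema":
        raise ValueError("package is not stereotyped as XSDschema")
    schema = ET.Element(f"{{{XS}}}schema")
    target = package.tagged_values.get("targetNamespace")
    if target:
        schema.set("targetNamespace", target)
    for cls in package.classes:
        if cls.stereotype != "XSDcomplexType":
            continue  # other stereotypes are outside this sketch
        ctype = ET.SubElement(schema, f"{{{XS}}}complexType", name=cls.name)
        seq = ET.SubElement(ctype, f"{{{XS}}}sequence")
        for attr_name, xsd_type in cls.attributes.items():
            ET.SubElement(seq, f"{{{XS}}}element", name=attr_name, type=xsd_type)
    return schema


# Hypothetical vocabulary model.
orders = UmlPackage(
    name="Orders",
    stereotype="XSDschema",
    tagged_values={"targetNamespace": "urn:example:orders"},
    classes=[UmlClass("Address", "XSDcomplexType",
                      {"street": "xs:string", "city": "xs:string"})],
)
schema = generate_schema(orders)
```

Serializing `schema` with `ET.tostring(schema, encoding="unicode")` yields a schema document whose `targetNamespace` comes from the tagged value and whose single complex type comes from the stereotyped class. A second generator could target another schema language from the same model, which is the independence the article argues for.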

  • [January 23, 2003]   Trang Multi-Format Schema Converter Supports DTD to W3C XML Schema Conversion.    A posting from James Clark to the XML-DEV List announces a new release of Trang, Clark's Multi-Format Schema Converter based on RELAX NG. The conversion tool supports several schema languages for XML, including RELAX NG (XML syntax), RELAX NG compact syntax, XML 1.0 DTDs, and W3C XML Schema. With one exception, Trang will convert between any of these formats (W3C XML Schema is supported for output only, not for input). "Trang is written in Java, and available under a BSD-style license. In this release, [Clark has] added an input module for DTDs based on his DTDinst program; this implies that Trang can now convert directly from DTDs to W3C XML Schema (XSD)." Clark identifies three unique features of Trang: "(1) it can reliably turn parameter entities into the higher-level semantic constructs available in XSD (simple types, groups, attribute groups) -- even in the presence of arbitrarily deep nesting of parameter entity references within parameter entity declarations; (2) it supports namespaces, including DTDs that