Members of the OASIS Entity Resolution Technical Committee have voted to approve the latest revision of the XML Catalogs specification as a Committee Specification and to submit the document for public review. The XML Catalogs specification describes an interoperable method for mapping the information in an XML external identifier into a URI reference for the XML external resource. An entity catalog is defined for this purpose, designed to handle "two simple cases: (1) mapping an external entity's public identifier and/or system identifier to a URI reference; (2) mapping the URI reference of a resource (a namespace name, stylesheet, image, etc.) to another URI reference." Three non-normative appendices provide formal definitions for the XML Catalog, including W3C XML Schema, RELAX NG Grammar, and XML DTD. The OASIS TC was chartered in October 2000 to provide an XML syntax for a simple entity catalog format, as envisioned in an earlier OASIS Technical Resolution. A 30-day public review of the XML Catalogs specification will take place from June 18, 2003 through July 18, 2003 in preparation for consideration of the specification as an OASIS Open standard.
Bibliographic Information
XML Catalogs. Edited by Norman Walsh (Sun Microsystems, Inc.). OASIS Committee Specification 1.0. June 03, 2003. Document identifier: cs-entity-xml-catalogs-1.0. 36 pages. Produced by members of the OASIS Entity Resolution Technical Committee. Available in HTML, XML, and PDF formats. Principal contributors to the Committee Specification include Paul Grosso (Arbortext), David Leland (Elsevier Science London), Normand Montour (IBM), Norman Walsh (Sun Microsystems, Editor), and Lauren Wood (Individual Member, Chair). During the development of previous versions, Tony Coates (Reuters) and John Cowan (Reuters Health) were TC participants.
Specification Abstract
"The requirement that all external identifiers in XML documents must provide a system identifier has unquestionably been of tremendous short-term benefit to the XML community. It has allowed a whole generation of tools to be developed without the added complexity of explicit entity management.
However, the interoperability of XML documents has been impeded in several ways by the lack of entity management facilities:
- External identifiers may require resources that are not always available. For example, a system identifier that points to a resource on another machine may be inaccessible if a network connection is not available.
- External identifiers may require protocols that are not accessible to all of the vendors' tools on a single computer system. An external identifier that is addressed with the ftp: protocol, for example, is not accessible to a tool that does not support that protocol.
- It is often convenient to access resources using system identifiers that point to local resources. Exchanging documents that refer to local resources with other systems is problematic at best and impossible at worst.
The problems involved with sharing documents, or packages of documents, across multiple systems are large and complex. While there are many important issues involved and a complete solution is beyond the current scope, the OASIS membership agrees upon the enclosed set of conventions to address a useful subset of the complete problem. To address these issues, this Committee Specification defines an entity catalog that maps both external identifiers and arbitrary URI references to URI references."
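As an illustration of the two mappings the abstract describes, the following Python sketch loads a small catalog and resolves an external identifier and a URI reference against it. It is a minimal sketch under stated assumptions, not the resolution algorithm defined by the specification: the catalog content, the example.org identifiers, and the function names load_catalog, resolve_external_id, and resolve_uri are hypothetical, and the sketch ignores the prefer attribute, rewrite and delegate entries, nextCatalog chaining, and identifier normalization. It assumes only the catalog namespace and the public, system, and uri entry types.

```python
import xml.etree.ElementTree as ET

# Namespace name used by XML Catalogs entry files.
CATALOG_NS = "urn:oasis:names:tc:entity:xmlns:xml:catalog"

# Hypothetical catalog: the identifiers and target paths are made up.
SAMPLE_CATALOG = """\
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <public publicId="-//Example//DTD Sample V1.0//EN" uri="dtd/sample.dtd"/>
  <system systemId="http://example.org/dtd/sample.dtd" uri="dtd/sample.dtd"/>
  <uri name="http://example.org/style/sample.xsl" uri="style/sample.xsl"/>
</catalog>
"""

# Attribute that holds the lookup key for each entry type.
KEY_ATTRIBUTE = {"public": "publicId", "system": "systemId", "uri": "name"}


def load_catalog(text):
    """Parse catalog text into one lookup table per entry type."""
    root = ET.fromstring(text)
    tables = {kind: {} for kind in KEY_ATTRIBUTE}
    for kind, attr in KEY_ATTRIBUTE.items():
        for entry in root.iter("{%s}%s" % (CATALOG_NS, kind)):
            tables[kind][entry.get(attr)] = entry.get("uri")
    return tables


def resolve_external_id(tables, public_id=None, system_id=None):
    """Case 1: map a public and/or system identifier to a URI reference.

    Simplification: an exact system match is tried first, then an exact
    public match. Returns None when the catalog has no matching entry.
    """
    if system_id is not None and system_id in tables["system"]:
        return tables["system"][system_id]
    if public_id is not None and public_id in tables["public"]:
        return tables["public"][public_id]
    return None


def resolve_uri(tables, name):
    """Case 2: map a URI reference (stylesheet, image, ...) to another one."""
    return tables["uri"].get(name)


if __name__ == "__main__":
    tables = load_catalog(SAMPLE_CATALOG)
    print(resolve_external_id(tables, public_id="-//Example//DTD Sample V1.0//EN"))
    print(resolve_uri(tables, "http://example.org/style/sample.xsl"))
```

A caller that receives None would normally fall back to the system identifier or URI as written in the document, which is the behavior a tool has when no catalog is configured.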
XML Catalog Sample Implementation
See "XML Catalog Implementation on Unix-like Systems", edited by Mark Johnson, 23-April-2003 or later. The document provides a sample implementation based on the draft policy for the Debian GNU/Linux implementation of XML catalogs. A snapshot version was posted to the OASIS Entity Resolution TC website by Lauren Wood because the Entity Resolution Technical Committee is "considering writing an implementation guide or tutorial, and this [XML Catalog Implementation document] is input to that discussion..."
Principal references:
- Announcement 2003-06-18: "Public Review of OASIS Entity Resolution TC Specification"
- "XML Catalogs." OASIS Committee Specification 1.0. PDF. See also the HTML format.
- OASIS Entity Resolution Technical Committee
- TC discussion list archives for 'entity-resolution'
- Contact: Lauren Wood (TC Chair, Textuality) or Norman Walsh (Sun Microsystems).
- See: Entity Management. OASIS Technical Resolution 9401:1997 (Amendment 2 to TR 9401). September 10, 1997.
- See also: "XML Catalog Implementation on Unix-like Systems." Edited by Mark Johnson.
- See also: XML Entity and URI Resolvers, by Norman Walsh. Related tools supporting the XML Catalog are referenced on the TC website, including RXP, Apache Java Resolver Classes (part of the Apache xml-commons project), the XML C library for Gnome, ElCel Technology's C++ XML Toolkit, John Cowan's TR 9401:1997 OASIS Catalog Specification to XML Catalog Converter in Perl, and John Cowan's public identifier to/from URN conversion code in Perl.
- Earlier topical references: