[From: http://www.ccil.org/~cowan/XML/XCatalog.html, 980803]


XCatalog proposal, draft 0.2

1. Abstract

This is a proposal for XCatalogs, a system based on SGML/Open catalogs (Socats) for translating XML public identifiers to XML system identifiers, which are Uniform Resource Identifiers [URI].

XCatalogs are Web resources (anything from local files on up) which contain mappings from public identifiers to system identifiers (and optionally mappings from system identifiers to other system identifiers), plus references to other XCatalogs. They come in two syntaxes: one which is a subset of Socat [TR9401] syntax, and one which is an XML document instance [XML1.0].

2. Example

Here's an example XCatalog in both syntaxes, for those who learn best from examples:

-- catalog for "-//John Cowan" public Ids --
BASE "http://www.ccil.org/~cowan/"
PUBLIC "-//John Cowan//ConScript Unicode Registry//EN"
    "csur/"
PUBLIC "-//John Cowan//Essentialist Explanations//(EN,X-BRITHENIG)"
    "essential.html"
PUBLIC "-//John Cowan//Lojban"
    "http://xiron.pc.helsinki.fi/lojban/"
DELEGATE "-//John Cowan//LOC Diacritics"
    "elsie/xcatalog.soc"
CATALOG "http://www.w3.org/xcatalog/mastercat.soc"

<XCatalog>catalog for "-//John Cowan" public Ids
  <Base HRef="http://www.ccil.org/~cowan/"/>
  <Map PublicId="-//John Cowan//ConScript Unicode Registry//EN"
    HRef="csur/"/>
  <Map PublicId="-//John Cowan//Essentialist Explanations//(EN,X-BRITHENIG)"
    HRef="essential.html"/>
  <Map PublicId="-//John Cowan//Lojban"
    HRef="http://xiron.pc.helsinki.fi/lojban/"/>
  <Delegate PublicId="-//John Cowan//LOC Diacritics"
    HRef="elsie/xcatalog.xml"/>
  <Extend HRef="http://www.w3.org/xcatalog/mastercat.xml"/>
</XCatalog>

3. Socat syntax

The BNF for the Socat syntax is:

Document ::= Comment? WS? (Entry (WS Entry)*)? WS? Comment?

Entry ::= Map | Remap | Delegate | Extend | Base | Other

Map ::= "PUBLIC" WS PubidLiteral WS SystemLiteral

Remap ::= "SYSTEM" WS SystemLiteral WS SystemLiteral

Delegate ::= "DELEGATE" WS PubidLiteral WS SystemLiteral

Extend ::= "CATALOG" WS SystemLiteral

Base ::= "BASE" WS SystemLiteral

Other ::= Name (WS Name)? (WS SystemLiteral)*

WS ::= S (Comment S)*

Comment ::= "--" ([^--])* "--"

where Name, PubidLiteral, SystemLiteral, and S are as in XML 1.0.
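
As an illustration of the grammar above, here is a minimal Python sketch of a reader for the Socat syntax. It is not part of the proposal: the names TOKEN, ARITY, and parse_socat are hypothetical, a comment is assumed to end at the next "--", and tokens belonging to "Other" entries are simply skipped rather than validated against XML's Name production.

```python
import re

# A token is a "--" comment, a quoted literal, or a bare name/keyword.
# Assumption: a comment runs from "--" to the next "--".
TOKEN = re.compile(r'''--.*?--|"[^"]*"|'[^']*'|[^\s"']+''', re.S)

# Number of quoted literals that follow each recognized keyword.
ARITY = {"PUBLIC": 2, "SYSTEM": 2, "DELEGATE": 2, "CATALOG": 1, "BASE": 1}

def parse_socat(text):
    """Return a list of (keyword, literal, ...) tuples in document order."""
    tokens = [t for t in TOKEN.findall(text) if not t.startswith("--")]
    entries, i = [], 0
    while i < len(tokens):
        keyword = tokens[i]
        i += 1
        n = ARITY.get(keyword)
        if n is None:
            # Part of an "Other" entry (unknown name or its literals): skip it.
            continue
        args = tokens[i:i + n]
        if len(args) < n or not all(a[0] in "\"'" for a in args):
            raise ValueError(f"{keyword} needs {n} quoted literals")
        # Strip the surrounding quotes from each literal.
        entries.append((keyword, *(a[1:-1] for a in args)))
        i += n
    return entries
```

Applied to the Socat example in section 2, this sketch would yield one tuple per BASE, PUBLIC, DELEGATE, and CATALOG entry, with the leading comment dropped.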

4. DTD

The DTD for the XML instance syntax is:

<!ELEMENT Map EMPTY>
<!ATTLIST Map
  PublicId CDATA #REQUIRED
  HRef CDATA #REQUIRED>

<!ELEMENT Remap EMPTY>
<!ATTLIST Remap
  SystemId CDATA #REQUIRED
  HRef CDATA #REQUIRED>

<!ELEMENT Delegate EMPTY>
<!ATTLIST Delegate
  PublicId CDATA #REQUIRED
  HRef CDATA #REQUIRED>

<!ELEMENT Extend EMPTY>
<!ATTLIST Extend
  HRef CDATA #REQUIRED>

<!ELEMENT Base EMPTY>
<!ATTLIST Base
  HRef CDATA #REQUIRED>

In the XML instance syntax, any #PCDATA content is considered comment, and any other elements that may be present are beyond the scope of this specification. The use of an element named "XCatalog" to contain the XCatalog elements is recommended but not required.

For uniformity below, the Map, Remap, Delegate, Extend, and Base elements are referred to as "entries".
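
The rules above (character data is comment, unrecognized elements are out of scope, and the container element is optional) can be illustrated with a short Python sketch using the standard xml.etree.ElementTree module. The names ENTRY_TAGS and read_entries are hypothetical, and the sketch assumes the catalog is a well-formed XML document.

```python
import xml.etree.ElementTree as ET

# Element names defined by the DTD above; anything else is ignored.
ENTRY_TAGS = {"Map", "Remap", "Delegate", "Extend", "Base"}

def read_entries(xml_text):
    """Return (tag, attributes) pairs for recognized entries, in document order."""
    root = ET.fromstring(xml_text)
    # iter() walks the root and all descendants in document order, so the
    # name of the containing element does not matter and text is never visited.
    return [(el.tag, dict(el.attrib)) for el in root.iter() if el.tag in ENTRY_TAGS]
```

Because XML names are case-sensitive, an attribute spelled Href would appear in the returned dictionary under that spelling and would not be found by code looking up HRef.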

5. Entry semantics

Map entries (which use the keyword "PUBLIC" in the Socat syntax for backward compatibility) mean that a public identifier which exactly matches the entry's PublicId attribute should be translated into the entry's HRef attribute.

Remap entries (which use the keyword "SYSTEM" in the Socat syntax for backward compatibility) mean that a system identifier which exactly matches the entry's SystemId attribute should be translated into the entry's HRef attribute. Support for Remap entries is recommended but not required. Remap entries should be used with caution in publicly accessible XCatalogs.

Delegate entries are used to delegate groups of public identifiers to other catalogs. Public identifiers of which the entry's PublicId attribute is an exact prefix are listed in the XCatalog specified by the entry's HRef attribute.

Extend entries (which use the keyword "CATALOG" in the Socat syntax for backward compatibility) allow additional catalogs to be specified as extensions to this catalog. The HRef attribute specifies an XCatalog.

Base entries are used in the same way as BASE elements in HTML: to specify the base URL for any relative URLs in SystemId and HRef attributes.

6. Search algorithm

The algorithm for translating a public identifier into a URI is as follows. Any other algorithm that always produces the same results may also be used.

The set of XCatalogs to be searched comprises an implementation-defined list of XCatalogs, plus all other XCatalogs referred to by Extend entries from the original list, directly or indirectly, ordered breadth-first. Within an XCatalog, Map entries are always treated as if they precede Delegate entries.

The first matching Map entry (where the public identifier being searched for is exactly the same as the PublicId attribute) causes the search to terminate: the HRef attribute specifies the desired system identifier.

The first matching Delegate entry (where the PublicId attribute is an exact prefix of the public identifier being searched for) causes the search to begin again in the XCatalog specified by the HRef attribute and XCatalogs referred to by Extend entries, directly or indirectly, from it.
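
The search described so far can be sketched in Python as follows. This is one reading of the text, not a reference implementation. It assumes catalogs have already been loaded into a dictionary mapping each catalog URL to a list of (kind, key, href) tuples with absolute hrefs. The names search_order and resolve_public are hypothetical, and the max_delegations guard against delegation loops is an addition that the draft does not mention.

```python
from collections import deque

def search_order(roots, catalogs):
    """Breadth-first list of catalog URLs reachable from roots via Extend entries."""
    order, seen, queue = [], set(), deque(roots)
    while queue:
        url = queue.popleft()
        if url in seen or url not in catalogs:
            continue
        seen.add(url)
        order.append(url)
        queue.extend(href for kind, _, href in catalogs[url] if kind == "Extend")
    return order

def resolve_public(public_id, roots, catalogs, max_delegations=16):
    """Return the system identifier for public_id, or None if nothing matches."""
    for _ in range(max_delegations):  # loop guard: an assumption, not in the draft
        delegated = None
        for url in search_order(roots, catalogs):
            entries = catalogs[url]
            # Map entries are treated as preceding Delegate entries.
            for kind, key, href in entries:
                if kind == "Map" and key == public_id:
                    return href
            for kind, key, href in entries:
                if kind == "Delegate" and public_id.startswith(key):
                    delegated = href
                    break
            if delegated is not None:
                break
        if delegated is None:
            return None
        # Begin again in the delegated catalog and its extensions.
        roots = [delegated]
    return None
```

Checking all Map entries of a catalog before any of its Delegate entries is how this sketch realizes the rule that Map entries are treated as preceding Delegate entries within an XCatalog.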

SystemId and HRef attributes which are relative URLs are understood relative to the absolute URL of the XCatalog within which they appear, unless a Base entry precedes them, in which case the HRef attribute of the most recent Base entry provides the absolute URL.
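
A minimal Python sketch of this resolution rule, using urljoin from the standard urllib.parse module, might look like the following. The function name resolve_hrefs and the (kind, key, href) tuple layout are hypothetical. The sketch also assumes that a relative HRef on a Base entry is itself resolved against the base in effect at that point, which the draft does not state.

```python
from urllib.parse import urljoin

def resolve_hrefs(catalog_url, entries):
    """Yield (kind, key, absolute_href) for every non-Base entry, in order.

    entries is a sequence of (kind, key, href) tuples; Base entries carry
    key=None and only change the base URL for the entries after them.
    """
    base = catalog_url  # default base: the URL of the XCatalog itself
    for kind, key, href in entries:
        if kind == "Base":
            # Assumption: a relative Base HRef resolves against the current base.
            base = urljoin(base, href)
            continue
        if kind == "Remap":
            # SystemId attributes may be relative too.
            key = urljoin(base, key)
        yield kind, key, urljoin(base, href)
```

With the Base entry from the example in section 2, the relative HRef "csur/" would resolve to "http://www.ccil.org/~cowan/csur/", while an HRef that is already absolute is returned unchanged.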

If the implementation supports Remap entries, then the whole process is repeated, except that Map and Delegate entries are ignored, and the first matching Remap entry (where the system identifier being searched for is exactly the same as the SystemId attribute) causes the search to terminate: the HRef attribute specifies the desired system identifier.
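
The Remap pass can be sketched in Python in the same spirit. The name remap_system is hypothetical, and ordered_catalogs is assumed to be the entry lists of the XCatalogs already arranged in search order, each entry being a (kind, key, href) tuple. Returning the original system identifier when no Remap entry matches is an assumption; the draft does not say what happens in that case.

```python
def remap_system(system_id, ordered_catalogs):
    """Return the HRef of the first Remap entry whose SystemId equals system_id."""
    for entries in ordered_catalogs:
        for kind, key, href in entries:
            # Map and Delegate entries are ignored in this pass.
            if kind == "Remap" and key == system_id:
                return href
    # Assumption: with no matching Remap entry, the identifier is left as-is.
    return system_id
```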

References