From owner-xml-dev@ic.ac.uk Fri Jul 10 11:31:19 1998
Date: Fri, 10 Jul 1998 12:20:11 -0400
From: John Cowan <cowan@locke.ccil.org>
To: XML Dev <xml-dev@ic.ac.uk>
Subject: XCatalog proposal draft 0.1

This is a proposal for XCatalogs, a system based on SGML/Open catalogs (Socats) for translating public identifiers to system identifiers in XML.

1. Introduction

XCatalogs are Web resources (anything from local files on up) which contain mappings from public identifiers to system identifiers, plus references to other XCatalogs. They come in two syntaxes: one which is a subset of Socat syntax, and one which is an XML document instance.

2. Example

Here's an example XCatalog in both syntaxes, for those who learn best from examples:

    -- catalog for "-//John Cowan" public IDs --
    BASE "http://www.ccil.org/~cowan/"
    PUBLIC "-//John Cowan//ConScript Unicode Registry//EN" "csur/"
    PUBLIC "-//John Cowan//Essentialist Explanations//(EN,X-BRITHENIG)"
        "essential.html"
    PUBLIC "-//John Cowan//Lojban" "http://xiron.pc.helsinki.fi/lojban/"
    DELEGATE "-//John Cowan//LOC Diacritics" "elsie/xcatalog.soc"
    CATALOG "http://www.w3.org/xcatalog/mastercat.soc"

    <XCatalog>catalog for "-//John Cowan" public IDs
    <Base HRef="http://www.ccil.org/~cowan/"/>
    <Map PublicID="-//John Cowan//ConScript Unicode Registry//EN"
        HRef="csur/"/>
    <Map PublicID="-//John Cowan//Essentialist Explanations//(EN,X-BRITHENIG)"
        HRef="essential.html"/>
    <Map PublicID="-//John Cowan//Lojban"
        HRef="http://xiron.pc.helsinki.fi/lojban/"/>
    <Delegate PublicID="-//John Cowan//LOC Diacritics"
        HRef="elsie/xcatalog.xml"/>
    <Extend HRef="http://www.w3.org/xcatalog/mastercat.xml"/>
    </XCatalog>

3. Socat syntax

The BNF for the Socat syntax is:

    Document ::= Comment? WS? (Entry (WS Entry)*)? WS? Comment?
    Entry    ::= Map | Delegate | Extend | Base | Other
    Map      ::= "PUBLIC" WS PubidLiteral WS SystemLiteral
    Delegate ::= "DELEGATE" WS PubidLiteral WS SystemLiteral
    Extend   ::= "CATALOG" WS SystemLiteral
    Base     ::= "BASE" WS SystemLiteral
    Other    ::= Name (WS Name)? (WS SystemLiteral)*
    WS       ::= S (Comment S)*
    Comment  ::= "--" ([^--])* "--"

where Name, PubidLiteral, SystemLiteral, and S are as in XML 1.0.

4. DTD

The DTD for the XML instance syntax is (where an XCatalog element is the root):

    <!ELEMENT XCatalog ANY>
    <!ATTLIST XCatalog
        Version CDATA #FIXED "1.0">
    <!ELEMENT Map EMPTY>
    <!ATTLIST Map
        PublicID CDATA #REQUIRED
        HRef CDATA #REQUIRED>
    <!ELEMENT Delegate EMPTY>
    <!ATTLIST Delegate
        PublicID CDATA #REQUIRED
        HRef CDATA #REQUIRED>
    <!ELEMENT Extend EMPTY>
    <!ATTLIST Extend
        HRef CDATA #REQUIRED>
    <!ELEMENT Base EMPTY>
    <!ATTLIST Base
        HRef CDATA #REQUIRED>

In the XML instance syntax, any #PCDATA content is considered comment, and any other elements that may be present are beyond the scope of this specification. For uniformity below, the Map, Delegate, Extend, and Base elements are referred to as "entries".

5. Entry semantics

Map entries (which use the keyword "PUBLIC" in the Socat syntax for backward compatibility) mean that a public identifier which exactly matches the public-identifier attribute should be translated into the entry's system-identifier attribute.

Delegate entries are used to delegate groups of public identifiers to other catalogs. Public identifiers for which the public-identifier attribute is an exact prefix are listed in the XCatalog specified by the system-identifier attribute.

Extend entries (which use the keyword "CATALOG" in the Socat syntax for backward compatibility) allow additional catalogs to be specified as extensions to this catalog. The system-identifier attribute specifies an XCatalog.

Base entries are used in the same way as BASE elements in HTML: to specify the base URL for any relative URLs in system-identifier attributes.

7. Search algorithm

The process of searching catalogs in order to translate a public identifier into a URI is as follows. A queue of XCatalog URIs is maintained, which is initialized with a system-dependent list of URIs. A current base URL is also maintained, initially null. A catalog URI is dequeued and the specified XCatalog is fetched.
The current base URL is set to the base of the catalog by removing the least significant part of the URI. The XCatalog is then searched from beginning to end looking for a matching Map or Delegate entry. A matching Map entry (exact match) causes the process to terminate, returning the system-identifier attribute (modified if necessary by the current base URL). A matching Delegate entry (prefix match) causes the current queue to be cleared and the system-identifier attribute (modified if necessary by the current base URL) entered as the only entry; the rest of the current XCatalog is ignored. As Extend entries are seen, their system-identifier attributes are appended to the catalog URI queue. As Base entries are seen, the current base URL is reset to the system-identifier attribute. Comments and Others are ignored.

When an XCatalog has been completely scanned, the next XCatalog URI is dequeued and fetched and the current base URL reset, and the process repeated. When no further XCatalog URIs remain in the queue, the process fails: the public identifier cannot be translated.

8. Open questions

Should compliance require support for both syntaxes? I think the Socat syntax is essential for backward compatibility with existing tools, and it is more compact (important for huge catalogs full of Delegate entries), but the XML instance syntax is more in the spirit of XML (and XSchema, etc.).

If both syntaxes are supported, should Delegate and Extend entries be allowed to refer from one syntax to another, or should Socat catalogs refer only to other Socat catalogs and ditto for XML instance catalogs?
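
The search procedure of section 7 can be sketched in Python roughly as follows. This is an illustration under stated assumptions, not part of the proposal: it handles only the XML instance syntax, the names parse_xcatalog, resolve, and fetch are invented for the sketch, relative HRefs (including those on Base entries) are assumed to resolve against the current base URL, and Extend targets are resolved the same way. The catalog at http://example.org/catalogs/main.xml and the "latin.html" mapping are hypothetical test data. Cycles between catalogs are not detected, since the text above does not address them.

```python
from collections import deque
from urllib.parse import urljoin
import xml.etree.ElementTree as ET

ENTRY_TAGS = ("Map", "Delegate", "Extend", "Base")


def parse_xcatalog(xml_text):
    """Return (kind, public_id, href) for each entry, in document order."""
    root = ET.fromstring(xml_text)
    # #PCDATA (comment) and any other child elements are skipped.
    return [(e.tag, e.get("PublicID"), e.get("HRef"))
            for e in root if e.tag in ENTRY_TAGS]


def resolve(public_id, start_uris, fetch):
    """Translate public_id into a URI, or return None if the search fails.

    fetch is a caller-supplied function (an assumption of this sketch)
    that takes a catalog URI and returns the catalog's XML text.
    """
    queue = deque(start_uris)
    while queue:
        catalog_uri = queue.popleft()
        # Resolving against the catalog URI itself has the effect of
        # removing its least significant part.
        base = catalog_uri
        for kind, pubid, href in parse_xcatalog(fetch(catalog_uri)):
            target = urljoin(base, href)
            if kind == "Map" and pubid == public_id:
                return target                 # exact match: done
            if kind == "Delegate" and public_id.startswith(pubid):
                queue.clear()                 # prefix match: only the
                queue.append(target)          # delegated catalog remains
                break                         # rest of this catalog ignored
            if kind == "Extend":
                queue.append(target)
            elif kind == "Base":
                base = target
    return None


# Hypothetical in-memory catalogs standing in for Web resources.
START = "http://example.org/catalogs/main.xml"
CATALOGS = {
    START: """<XCatalog>demo catalog
  <Base HRef="http://www.ccil.org/~cowan/"/>
  <Map PublicID="-//John Cowan//ConScript Unicode Registry//EN" HRef="csur/"/>
  <Delegate PublicID="-//John Cowan//LOC Diacritics" HRef="elsie/xcatalog.xml"/>
  <Extend HRef="http://www.w3.org/xcatalog/mastercat.xml"/>
</XCatalog>""",
    "http://www.ccil.org/~cowan/elsie/xcatalog.xml": """<XCatalog>
  <Map PublicID="-//John Cowan//LOC Diacritics//Latin" HRef="latin.html"/>
</XCatalog>""",
    "http://www.w3.org/xcatalog/mastercat.xml": "<XCatalog/>",
}


def fetch(uri):
    return CATALOGS[uri]


print(resolve("-//John Cowan//ConScript Unicode Registry//EN", [START], fetch))
```

A lookup of an identifier that begins with "-//John Cowan//LOC Diacritics" is handed to the delegated catalog only, while an identifier matched by no entry falls through to the Extend target and finally yields None.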
[Note: See more on the SGML Open (OASIS) 'CATALOG' and identifiers in the dedicated database entry: Catalogs, Formal Public Identifiers, Formal System Identifiers -rcc]
John Cowan http://www.ccil.org/~cowan
You tollerday donsk? N. You tolkatiff scowegian? Nn.
You spigotty anglease? Nnn. You phonio saxo? Nnnn.
Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)