Architectures, Schemas, and XML: Proposed Amendment to ISO/IEC 10744:1997

[Note: The post by Eliot and the follow-up by David Megginson both incorporate a couple of corrections, per (what I understand to be intended in) Eliot's re-post with a correction for the explicit "name" attribute. -rcc]

Subject:      Architectures, Schemas, and XML: Proposed Amendment to ISO/IEC 10744:1997
From:         "W. Eliot Kimber" <eliot@isogen.com>
Date:         1997/12/12
Message-ID:   <349190F6.D0C6102D@isogen.com>
Newsgroups:   comp.text.sgml


   ---------------------------------------------------------------

At last week's WG4 (SGML) standards meeting, James Clark and I put
together a one-page proposed amendment to ISO/IEC 10744:1997 (HyTime)
that provides a PI-based syntax for declaring the use of SGML
architectures (schemas). The proposal has been submitted for immediate
ballot, which should be completed in the next three or four months.

This amendment is an implementation of the various "PI for
architectures" proposals made recently (e.g., David Megginson's paper
presented at the XML Developers' Day).  The design provides more
meaningful (and obvious) names for the architecture configuration
attributes. You can find the full text of the amendment at
http://www.ornl.gov/sgml/wg8/document/1957.htm

A typical architecture use declaration within an XML document would look
like this:

<?XML version="1.0" ?>
<?IS10744:arch name="isobase"
   public-id="-//ISOGEN International Corp.//NOTATION 
              ISOGEN Base Architecture//EN"
   dtd-system-id="http://www.isogen.com/archs/isobase/isobase.dtd"
?>
<mydoc isobase="isogen-document"/>

The minimal declaration simply provides an architecture (schema) name:

<?XML version="1.0" ?>
<?IS10744:arch name="isobase" ?>
<mydoc isobase="isogen-document"/>

However, you would normally want to point to at least the DTD
declarations, if not the public name for the schema, just so it's clear
what the architecture name really refers to.  Of course, in the case of
well-known or widely-used schemas the name may be sufficient (e.g., RDF,
HyTime, etc.).

By default, the architecture name (the value of the "name" attribute in
the "IS10744:arch" declaration) is also the name of the attribute used
to map elements in the document to elements in the architecture.

The attributes shown in the example are interpreted as follows:

public-id
   The globally-unique name for the architecture as an abstract concept, 
   that is, as a set of rules that govern documents.  These rules could
   be defined in whole or in part using any schema mechanism, including
   something like the XML-Data proposal.

dtd-system-id
   The system ID for the DTD-syntax declarations for the architecture.
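
As a rough illustration of how a processor might read these
pseudo-attributes, the following Python sketch pulls them out of the
declaration with a regular expression. The helper name parse_arch_pi and
the whitespace normalization of the values are assumptions made for this
example only; the amendment text remains the authority on the syntax.

```python
import re

# Matches name="value" or name='value' pairs inside the PI data.
PSEUDO_ATTR = re.compile(r"""([A-Za-z][\w.\-]*)\s*=\s*(?:"([^"]*)"|'([^']*)')""")


def parse_arch_pi(pi_text):
    """Return the pseudo-attributes of an IS10744:arch PI as a dict.

    Hypothetical helper: returns None if pi_text is not such a declaration.
    """
    match = re.fullmatch(r"<\?IS10744:arch\s+(.*?)\s*\?>", pi_text.strip(), re.DOTALL)
    if match is None:
        return None
    attrs = {}
    for key, double_quoted, single_quoted in PSEUDO_ATTR.findall(match.group(1)):
        value = double_quoted or single_quoted
        # Assumption: collapse runs of whitespace so a value that was
        # wrapped across lines (like the public-id below) reads as one string.
        attrs[key] = " ".join(value.split())
    return attrs


declaration = '''<?IS10744:arch name="isobase"
   public-id="-//ISOGEN International Corp.//NOTATION
              ISOGEN Base Architecture//EN"
   dtd-system-id="http://www.isogen.com/archs/isobase/isobase.dtd"
?>'''

attrs = parse_arch_pi(declaration)
print(attrs["name"])
print(attrs["public-id"])
print(attrs["dtd-system-id"])
```

The same helper accepts the minimal form, where only "name" is present.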

The architecture DTD declarations can be used by architecture-aware
processors to perform syntactic validation of the document according to
its architectural mapping.  For example, the SP parser does this today.

As this amendment simply provides an alternative syntax for an existing
facility of the standard and does not change the functionality in any
way, we do not anticipate any opposition to its approval.

----------------


Date: Sat, 13 Dec 1997 17:58:55 -0500
Message-Id: <199712132258.RAA00384@unready.microstar.com>
From: David Megginson <ak117@freenet.carleton.ca>
To: xml-dev Mailing List <xml-dev@ic.ac.uk>
Subject: XML Architectural Forms

I don't remember seeing an announcement here (apologies if I'm
mistaken), but Eliot Kimber and James Clark have announced on
comp.text.sgml a proposed amendment to ISO 10744 that will make it
possible to use Architectural Forms in XML.  You can find the text of
the amendment at the following URL:

  http://www.ornl.gov/sgml/wg8/document/1957.htm

Here's Eliot's example of a simple, well-formed XML document that uses
the base architecture "isobase":

  <?XML version="1.0" ?>
  <?IS10744:arch name="isobase" ?>
  <mydoc isobase="isogen-document"/>

This is very exciting, because if accepted, the amendment will make
it possible to solve the XML namespace problem with an International
Standard, instead of forcing the W3C to throw together a consortium
standard.  Base architectures also provide a simple and elegant
solution to multiple inheritance; for example, here's Eliot's example
modified to implement _two_ base architectures:

  <?XML version="1.0" ?>
  <?IS10744:arch name="isobase" ?>
  <?IS10744:arch name="mslbase" ?>
  <mydoc isobase="isogen-document" mslbase="microstar-document"/>

The element <mydoc> corresponds to <isogen-document> in the isobase
namespace and to <microstar-document> in the mslbase namespace at the
same time.
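
To make the double mapping concrete, here is a minimal Python sketch
built on the standard library's expat bindings. It collects the declared
architecture names and reports, per architecture, the form each element
maps to. The function name architectural_forms is hypothetical, the
sample omits the draft-era <?XML ...?> declaration because the
present-day parser used here does not accept it, and the sketch ignores
everything else a real architecture engine would do (validation,
attribute renaming, and so on).

```python
import re
from xml.parsers import expat

# Picks the value of the "name" pseudo-attribute out of the PI data.
NAME_ATTR = re.compile(r"""(?<![\w.\-])name\s*=\s*(?:"([^"]*)"|'([^']*)')""")


def architectural_forms(xml_text):
    """Map each declared architecture to a list of (element, form) pairs."""
    forms = {}

    def on_pi(target, data):
        if target == "IS10744:arch":
            match = NAME_ATTR.search(data)
            if match:
                forms[match.group(1) or match.group(2)] = []

    def on_start(element, attributes):
        # By default the architecture name doubles as the mapping attribute.
        for arch, mapped in forms.items():
            if arch in attributes:
                mapped.append((element, attributes[arch]))

    # No namespace processing is requested, so the colon in the
    # PI target is treated as an ordinary name character.
    parser = expat.ParserCreate()
    parser.ProcessingInstructionHandler = on_pi
    parser.StartElementHandler = on_start
    parser.Parse(xml_text, True)
    return forms


document = """<?IS10744:arch name="isobase" ?>
<?IS10744:arch name="mslbase" ?>
<mydoc isobase="isogen-document" mslbase="microstar-document"/>"""

for arch, mapped in architectural_forms(document).items():
    print(arch, mapped)
```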

Even more interesting is the ability to embed the architectural
attributes in a DTD, so that they do not appear in the document
instance at all.  For example, you can create an external DTD like
this:

  <?IS10744:arch name="isobase" ?>
  <?IS10744:arch name="mslbase" ?>
  <!ELEMENT mydoc EMPTY>
  <!ATTLIST mydoc
    isobase NMTOKEN #FIXED "isogen-document"
    mslbase NMTOKEN #FIXED "microstar-document">

Now, every XML document that uses this DTD will implement the two
architectures automatically, with no additional markup required:

  <?XML version="1.0" ?>
  <!DOCTYPE mydoc SYSTEM "mydoc.dtd">
  <mydoc/>

Authors won't even have to know that they're using architectural
forms.
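
The defaulting behaviour can be tried out with a small Python sketch. As
an assumption made purely to keep the example self-contained, the
declarations sit in an internal DTD subset rather than in an external
mydoc.dtd, since the standard library's expat bindings apply attribute
defaults from the internal subset without extra setup. The helper name
start_tag_attributes is hypothetical.

```python
from xml.parsers import expat

# Same declarations as the external DTD, moved into an internal subset
# (an assumption for this sketch) so that no second file is needed.
DOCUMENT = """<!DOCTYPE mydoc [
  <?IS10744:arch name="isobase" ?>
  <?IS10744:arch name="mslbase" ?>
  <!ELEMENT mydoc EMPTY>
  <!ATTLIST mydoc
    isobase NMTOKEN #FIXED "isogen-document"
    mslbase NMTOKEN #FIXED "microstar-document">
]>
<mydoc/>"""


def start_tag_attributes(xml_text):
    """Return (element, attributes) pairs as the parser reports them."""
    seen = []

    def on_start(element, attributes):
        seen.append((element, dict(attributes)))

    parser = expat.ParserCreate()
    parser.StartElementHandler = on_start
    parser.Parse(xml_text, True)
    return seen


# The instance contains a bare <mydoc/>, yet the parser reports both
# architectural attributes, supplied from the #FIXED declarations.
for element, attributes in start_tag_attributes(DOCUMENT):
    print(element, attributes)
```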

Congratulations are due to Eliot and James for taking the time to
start this process.

David
-- 
David Megginson                 ak117@freenet.carleton.ca
Microstar Software Ltd.         dmeggins@microstar.com
      http://home.sprynet.com/sprynet/dmeggins/
