[Mirror copy from http://www.csi.uottawa.ca/~dduchier/misc/wohler1.html]

From: wohler@vnet.IBM.COM (Wayne L. Wohler)
Date: Wed, 16 Feb 94 13:38:28 MST
Subject: Descriptions of DTD semantics


Descriptions of DTD semantics

As I mentioned a couple of days ago, the CApH group has been working on the problem of describing the semantics of elements. We have really been working on two related problems:
  1. how to communicate to other humans what the semantic for a particular SGML property is, and
  2. how to communicate to a program a specific semantic for processing.

When we started, we worked hard on providing syntax to associate semantics with elements and attributes. We now realize that the syntaxes we were using were simple query languages, and that their simplicity was a liability, since it was quite easy to theorize situations where the language would break down. Element context, attribute/element combinations, and many other useful combinations of properties can indicate special semantics. A general mechanism required a general property query language. HyTime contains such a query language, HyQ, which CApH proposes to use (Conventions for the Application of HyTime, after all), but the concept could be used with other query languages.

The actual mechanism is quite simple: it has a container, call it a declaration, which contains two parts, a description and a target location specification. The description contains any text that is necessary to describe the semantic. The target location specification gives the location query or queries that yield the targets whose semantic matches the description. For an architecture or set of conventions, like CApH, the defining group needs to define the semantics it wishes to provide a convention for. It may also wish to provide recommended queries. For example, CApH will provide a list of semantics and a set of attribute specifications that may be the target of the query. To use the semantic, using the conventional attribute specification is recommended but not required.
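
As a rough illustration of the two-part declaration described above, here is a minimal Python sketch. It is not part of the CApH work and does not implement HyQ: a "query" is modelled as a plain predicate over attribute specifications, and the names AttributeSpec, SemanticDeclaration, and targets, as well as the element name "para" and the value "abstract", are hypothetical. Only the attribute name CApHrole comes from the example in this note.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass(frozen=True)
class AttributeSpec:
    """One attribute specification: owning element, attribute name, value."""
    element: str
    name: str
    value: str


# Assumption: a query is just a predicate over attribute specifications.
# This stands in for a real location query language such as HyQ.
Query = Callable[[AttributeSpec], bool]


@dataclass
class SemanticDeclaration:
    """A container with a description and one or more location queries."""
    description: str
    locations: List[Query] = field(default_factory=list)

    def targets(self, candidates: List[AttributeSpec]) -> List[AttributeSpec]:
        """Return the candidates that satisfy at least one location query."""
        return [c for c in candidates if any(q(c) for q in self.locations)]


# A declaration whose single query selects the conventional attribute.
role_decl = SemanticDeclaration(
    description="Carries the semantic key name(s) applicable to this element.",
    locations=[lambda spec: spec.name == "CApHrole"],
)

specs = [
    AttributeSpec("para", "CApHrole", "abstract"),  # hypothetical values
    AttributeSpec("para", "id", "p1"),
]
matched = role_decl.targets(specs)
```

Because the declaration only holds queries, a different (non-conventional) attribute could carry the same semantic simply by supplying a different predicate.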

This approach has a number of advantages:

  1. it may be retrofitted to DTDs that cannot be changed or changed easily,
  2. semantics may be applied to very complex queries and so may be quite precise and detailed,
  3. simple cases can still be quite simple, as simple as a standard attribute and value for semantics which need to be applied to an element,
  4. the same semantic may be applied to multiple DTDs in the same definition.

There are a couple of interesting points here. First, the locations define semantics for all locations that satisfy the location specification. To work at the DTD level rather than the document instance level requires a different kind of resolution of the query. Usually, one thinks about document instance locations that satisfy the query, not the element declarations themselves that satisfy the query. Second, semantic definitions would be interchanged by passing a separate document containing the semantic declarations, not within the DTD itself.
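
To make the difference between the two kinds of resolution concrete, here is a small Python sketch. It is only an illustration under stated assumptions: the dictionaries below are hypothetical miniature stand-ins for a DTD's attribute declarations and for a document instance, not the output of any real SGML parser, and the element names and values are invented. The same question ("where is the attribute CApHrole?") yields element types at the DTD level and element occurrences at the instance level.

```python
# Hypothetical DTD model: element type -> names of its declared attributes.
dtd_attlists = {
    "chapter": ["id", "CApHrole"],
    "para": ["id"],
    "note": ["CApHrole"],
}

# Hypothetical instance model: element occurrences with specified attributes.
instance = [
    ("chapter", {"id": "c1", "CApHrole": "division"}),
    ("para", {"id": "p1"}),
    ("chapter", {"id": "c2"}),  # attribute declared in the DTD, not specified here
]


def resolve_in_dtd(attlists, attr_name):
    """Return the element types whose declarations include the attribute."""
    return sorted(name for name, attrs in attlists.items() if attr_name in attrs)


def resolve_in_instance(elements, attr_name):
    """Return positions of element occurrences that specify the attribute."""
    return [i for i, (_, attrs) in enumerate(elements) if attr_name in attrs]


dtd_hits = resolve_in_dtd(dtd_attlists, "CApHrole")
instance_hits = resolve_in_instance(instance, "CApHrole")
```

Here the DTD-level resolution selects the declarations for "chapter" and "note", while the instance-level resolution selects only the first element occurrence.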

Here is some very simple syntax to illustrate the principle (I'll omit some of the CApH for clarity). This example defines the attribute to be used to carry CApH-defined keywords. I won't define an architectural form for the elements defined here, but this could easily be done and is done in the CApH work:

<!element SemanticDeclarations - - (SemanticDeclaration)+ >
<!element SemanticDeclaration  - O (SemanticDescription,
                                    Location+) >
<!element SemanticDescription  - O ANY >
<!element Location             O O (Query) >
<!attlist Location
    ID       ID       #IMPLIED
    HyTime   CDATA    #FIXED    nameloc >
<!element Query                O O (#PCDATA) >
<!attlist Query
    Notation NOTATION HyQ
    HyTime   CDATA    #FIXED    nmquery >
<!notation WohlerQ system "" --my notation in lieu of the correct HyQ-->

Carries the semantic key name(s) applicable to this element according to
the Conventions for the Application of HyTime (CApH).
      <Query notation=WohlerQ>
        attribute name = 'CApHrole'

Wayne L. Wohler, Dept G82/025Z, Publishing Solutions Development, IBM Corporation, PO Box 1900, Boulder, Colorado 80301-9191

Internet: wohler@vnet.ibm.com
Phone: 1-303-924-5943