[Mirrored from: http://www.allette.com.au/allette/ricko/1861.htm]

N1861

ISO/IEC JTC1/SC18/WG8

Document Processing and Related Communication --

Document Description and Processing Languages

TITLE: Proposed TC for Extended Naming Rules and Development Principles for SGML
SOURCE: WG8
PROJECT: JTC1.18.15.1
PROJECT EDITOR: Charles F. Goldfarb
STATUS: WG8 approved draft
ACTION: For balloting as TC to ISO 8879:1986
SUMMARY OF MAJOR POINTS: This Technical Corrigendum adds brief annexes to ISO 8879 for the following purposes:
  1. To meet an urgent need for extended naming rules for non-Latin scripts in support of the following statements in clause 0.2:
    1. There must be no national language bias.

      The characters used for names can be augmented by any special national characters.

    This is contradicted by the restriction, in production [189] of the current specification, that only a single parameter literal, whose length may not exceed 240 characters, can be used to specify name characters. This means that, for characters outside the ISO 646 character set which have to be specified using numeric character references, no more than 40 additional name characters can be specified. Clearly this is insufficient to support most languages, especially those with large character sets such as Japanese, Chinese and Korean.

  2. To formalize the established principles for development of SGML (WG8 N1289) that are observed by WG8. These principles ensure that existing conforming SGML documents will remain conforming when the standard is revised.
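
The ceiling of 40 name characters cited in point 1 can be checked with a little arithmetic. The sketch below assumes the reference concrete syntax delimiters "&#" and ";" for numeric character references; the helper names char_ref and max_refs are hypothetical and used only for illustration.

```python
# Assumption: numeric character references use the reference concrete
# syntax delimiters "&#" (opening) and ";" (closing).
LITLEN = 240  # maximum parameter literal length cited in the text


def char_ref(number: int) -> str:
    """Return the numeric character reference for a character number."""
    return f"&#{number};"


def max_refs(number: int, litlen: int = LITLEN) -> int:
    """Return how many references of this width fit in one literal."""
    return litlen // len(char_ref(number))


# The shortest reference to a character outside ISO 646 has three digits
# and occupies six characters, which gives the upper bound of 40.
print(max_refs(128))
# Characters with five-digit numbers need eight characters each,
# so even fewer of them fit.
print(max_refs(12353))
```

Under these assumptions the first call yields 40 and the second yields 30, so the limit is tighter still for character sets with large character numbers.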

This TC does not affect existing SGML documents or products. It affects only those SGML documents and products that choose to support the extended naming rules option.

DATE: 24 May 1996
DISTRIBUTION: WG8 and Liaisons
REFER TO: WG8 N1854
REPLY TO: Dr. James D. Mason
(ISO/IEC JTC1/SC18/WG8 Convenor)
Oak Ridge National Laboratory
Information Management Services
Bldg. 2506, M.S. 6302, P.O. Box 2008
Oak Ridge, TN 37831-6302 U.S.A.
Telephone: +1 423 574-6973
Facsimile: +1 423 574-6983
Network: masonjd@ornl.gov
http://www.ornl.gov/sgml/wg8/wg8home.htm
ftp://ftp.ornl.€

Proposed TC for Extended Naming Rules and SGML Development Principles

Add the following two normative annexes to ISO 8879.

Annex J (normative)
Extended Naming Rules

This annex describes an optional extension of SGML known as the "Extended Naming Rules". The extension should be used only in SGML documents for which the normal naming rules are unsuitable (usually because of the size of the natural language character set). An SGML system need not support these Extended Naming Rules in order to be a conforming SGML system.

This annex is phrased in terms of revisions to be made to the body of this International Standard. However, these revisions are applicable only when the Extended Naming Rules are in use.

To distinguish SGML declarations that use this extension from those that do not, the minimum literal in productions [171] and [200] of ISO 8879:1986 must be modified to read "ISO 8879:1986 (ENR)". To accomplish this, add the following sentence to the paragraph immediately following production [171] and to the second paragraph following production [200]:

However, when extended naming rules are used, the minimum data must be "ISO 8879:1986 (ENR)".

Extended Naming Rules

For many languages, the distinction made in production [189] between uppercase and lowercase is not relevant. It is therefore necessary to modify clause 13.4.5 to allow both for an extended character set and for the use of character sets that do not have different cases. The changes required, in the order of their occurrence in 13.4.5, are:

  1. Replace production [189] with:
    [189] naming rules =
     "NAMING", ps+,
     "LCNMSTRT", (ps+, extended naming value)+,
     "UCNMSTRT", (ps+, extended naming value)+,
     ("NAMESTRT", (ps+, extended naming value)+)?,
     "LCNMCHAR", (ps+, extended naming value)+,
     "UCNMCHAR", (ps+, extended naming value)+,
     ("NAMECHAR", (ps+, extended naming value)+)?,
     "NAMECASE", ps+,
     "GENERAL", ps+, ("NO" | "YES"), ps+,
     "ENTITY", ps+, ("NO" | "YES")
  2. In the "where" list, change each occurrence of the phrase "in the literals (if any)" to "identified by the extended naming value (if any)".
  3. Add two new keywords to the "where" list:
    NAMESTRT
    means that each character identified by the extended naming value (if any) has the same effect as a character appearing in both UCNMSTRT and LCNMSTRT.
    NAMECHAR
    means that each character identified by the extended naming value (if any) has the same effect as a character appearing in both UCNMCHAR and LCNMCHAR.
  4. At the end of the clause, add:

    [189.1] extended naming value = parameter literal | character number | character range

    A character number may be used to specify a character that is defined in the syntax-reference character set but is not permitted in an SGML declaration.

    [189.2] character range = character number, ps*, "-", ps*, character number

    Specifying a character range is equivalent to specifying every character number from (and including) the character number that starts the range to (and including) the character number that ends the range.
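
The effect of productions [189.1] and [189.2], and of the NAMESTRT keyword, can be sketched as follows. This is an illustration only, not part of the corrigendum. The in-memory forms of the three alternatives, and the function names expand and name_start_sets, are hypothetical. The sketch also assumes that character numbers in the syntax-reference character set coincide with ISO/IEC 10646 code positions, and it rejects a descending range, a case on which the text above is silent.

```python
from typing import Iterable, Set, Tuple, Union

# Hypothetical in-memory forms of the three alternatives in [189.1]:
#   str         -> parameter literal (already interpreted)
#   int         -> character number
#   (int, int)  -> character range (start, end)
ExtendedNamingValue = Union[str, int, Tuple[int, int]]


def expand(values: Iterable[ExtendedNamingValue]) -> Set[int]:
    """Return the set of character numbers identified by the values."""
    numbers: Set[int] = set()
    for value in values:
        if isinstance(value, str):
            # Assumption: character numbers match ISO/IEC 10646 positions.
            numbers.update(ord(ch) for ch in value)
        elif isinstance(value, int):
            numbers.add(value)
        else:
            start, end = value
            if start > end:
                # Assumption: descending ranges are treated as errors.
                raise ValueError(f"descending range {start}-{end}")
            # Both end points are included, as [189.2] requires.
            numbers.update(range(start, end + 1))
    return numbers


def name_start_sets(
    lcnmstrt: Iterable[ExtendedNamingValue],
    ucnmstrt: Iterable[ExtendedNamingValue],
    namestrt: Iterable[ExtendedNamingValue] = (),
) -> Tuple[Set[int], Set[int]]:
    """Model NAMESTRT as adding its characters to both case sets."""
    caseless = expand(namestrt)
    return expand(lcnmstrt) | caseless, expand(ucnmstrt) | caseless


# Usage: empty case-specific literals plus one caseless range, here the
# hiragana positions 12353-12435 of ISO/IEC 10646.
lower, upper = name_start_sets([""], [""], [(12353, 12435)])
print(len(lower), lower == upper)
```

A single range here identifies 83 name start characters, more than would fit in one 240-character parameter literal when each character has to be written as a numeric character reference.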

Annex K (normative)
SGML Development Principles

The future development of ISO 8879 shall be consistent with the following principles:

  1. Any document that is a conforming SGML document according to the current standard shall continue to be a conforming document under the provisions of future versions of the standard.
  2. The results of parsing an SGML document (that is, the element structure information set, or "ESIS") that conforms to the current standard shall be unchanged when the document is parsed under the provisions of future versions of the standard.
  3. A document that is classified as a minimal or basic conforming SGML document under the current standard shall continue to be classified as such under the provisions of future versions of the standard.

    NOTE 1 -- These principles should not be construed to mean that no changes can be made to ISO 8879. To meet evolving user requirements, for example, some changes of the following types are possible without violating the above principles:

    1. Relaxing restrictions
    2. Adding new constructs
    3. Partitioning existing optional features
    4. Introducing options to allow the suppression of troublesome existing constructs, when experience indicates that the constructs tend to induce user errors with serious consequences
  4. Future versions of the standard shall require conforming SGML parsers and systems to support conforming SGML documents, minimal conforming SGML documents, and basic conforming SGML documents to at least the same extent as the current standard.

    NOTE 2 -- Future versions of this standard can introduce additional requirements as well.

    NOTE 3 -- These principles should not be construed to mean that the definition of a "conforming SGML document" cannot be changed, only that existing conforming SGML documents will continue to be classified as such.