SGML Report: Introduction

This report has been produced by The <SGML> Project based on information and expertise that has been gained over the three years of the project. It is being published jointly by UCSG and The <SGML> Project.

The <SGML> Project

The project was set up in January 1991 with a two-year grant from the Information Systems Committee of the UFC, and was extended in 1993 for a further 15 months. It is located at the University of Exeter and has one full-time Officer and a part-time Secretary. The aims of the project have been:
  1. To raise awareness of SGML in UK Academia: The project has done this by offering seminars, training courses, training materials, newsletter articles, and postings to bulletin boards, and by disseminating reports, information, and ISO standards.
  2. To collect and disseminate information on SGML and SGML-aware products: Over the three years of the project, the project staff have been collecting information on SGML-aware products from around the world and disseminating that information throughout academia; this report is the culmination of that activity.
  3. To become a centre of expertise on SGML and for the exchange of information about it: Project staff have spent much of the last three years answering queries about SGML and SGML-aware software by e-mail, letter, and phone, and putting prospective users of SGML in touch with other users and with relevant software suppliers.

    SGML — The (missing) link?

    It is assumed that readers of this report will be aware of SGML. However, from our experience, many who know a little about SGML tend to mistake it for a straight replacement for the many mark-up languages (à la RUNOFF) that we are all used to. It is not! The key aspects of SGML can be described as follows:

    Standard: SGML is an International Standard (ISO 8879:1986) supported by ISO, and maintained by an official ISO Working Group (ISO-IEC/JTC1/SC18/WG8). It has been adopted as a European Norm (EN 28879), by the U.K. and by many countries around the world. The associated standards and technical reports have also been recognised and accepted throughout the world.

    Generalized: SGML is independent of system, language, application, and industry. That independence is crucial and absolute.

    Markup: Like earlier mark-up languages, the mark-up languages developed by using SGML result in clear text `tags' being inserted into the data stream to separate the various elements. For more information about the format and range of the various tags and what they represent, readers are advised to study the standard, and/or the various works listed in the bibliography.

    Language: SGML is not a language; it is a meta-language. It is a language for defining mark-up languages. This difference is crucial to the understanding of the value of SGML as a means of linking the various uses for structured information. SGML is used to produce a `Document Type Definition' (DTD) which is included or referenced at the head of each `document instance'. The DTD rigorously defines the structure of all possible documents in its class, and defines the mark-up language that is going to be used on each document. Referencing the DTD at the head of each document means that software which processes that document (or data stream) has a complete understanding of the structure, and can undertake whatever processing is required. SGML is the link (there is nothing `missing' about it!) between different software packages, doing different things, on the same information.
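As a rough illustration of the idea that a DTD gives software a complete description of the permitted structure, the following Python sketch checks a small document tree against a set of content models. It is not SGML and not a parser: the `memo' document type, the element names, the `CONTENT_MODELS` table and the `is_valid` function are all hypothetical, invented for this example, and each content model is simplified to a regular expression over the names of an element's children.

```python
import re

TEXT = "#PCDATA"  # marker for character data, named after the SGML keyword

# Hypothetical content models for a tiny "memo" document type.
# A real DTD would express these as element declarations; here each
# model is a regular expression over the space-separated child names.
CONTENT_MODELS = {
    "memo": r"to from( para)+",   # one 'to', one 'from', then one or more 'para'
    "to": r"(#PCDATA)?",          # character data only
    "from": r"(#PCDATA)?",
    "para": r"(#PCDATA)?",
}


def is_valid(element):
    """Return True if the element and all its descendants match their models.

    An element is a (name, children) pair; each child is either another
    element or a plain string of character data.
    """
    name, children = element
    model = CONTENT_MODELS.get(name)
    if model is None:
        # Element type not declared in the (hypothetical) document type.
        return False
    child_names = " ".join(
        TEXT if isinstance(child, str) else child[0] for child in children
    )
    if re.fullmatch(model, child_names) is None:
        return False
    return all(is_valid(child) for child in children if not isinstance(child, str))


good = ("memo", [("to", ["All staff"]),
                 ("from", ["The Officer"]),
                 ("para", ["First paragraph."]),
                 ("para", ["Second paragraph."])])
bad = ("memo", [("from", ["The Officer"]),   # 'from' before 'to' breaks the model
                ("to", ["All staff"]),
                ("para", ["Text."])])

print(is_valid(good))  # True
print(is_valid(bad))   # False
```

Because the structural rules live in one table rather than in the checking code, a formatter, a database loader and an editor could all consult the same definition, which is the sense in which the DTD acts as the link between different packages.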

    Annex A contains a bibliography of SGML publications (including ISO Standards and Technical Reports) which are available, and a brief description of each publication and its relevance to academia.

    Structure of Section 2

    As should be clear from the preceding paragraphs, SGML defines nothing more than a means of encoding structured information that will be used in a variety of different ways. [Note: If the information were only going to be used in one way (e.g. created), there would be no point in taking on the overhead inherent in using SGML.] Hence, for any given project, decisions on which software to acquire will require a comparison of products to satisfy a wide set of uses. Section 2 of this report, which deals with individual software packages, has been sub-sectioned to cover various categories of product. The categorisation, and the allocation into categories, is never going to be completely satisfactory, because many packages would fit into more than one category. This became clear when we attempted to sub-divide software that claims to assist with the management of collections of documents. There is no clear and generally accepted definition of `document management', hence the decisions we took concerning categorising products must be considered arbitrary to a certain extent. Prospective users of SGML-aware software are advised to look at products in related categories to satisfy themselves that they have considered all the relevant products.

    For each product, we have provided information as follows:

    Product:
    As well as the name of the product, we have provided a version number wherever possible.
    Associated products:
    Many SGML software houses have developed a key product, and minor related products. This entry provides a list of other products which could be considered by prospective users, or would be available from the same developers. It is also used to indicate combinations of products that are required to work together even if the products are from different suppliers.
    Developer:
    North America, particularly Canada, realised the potential for SGML far ahead of the UK. As a result, the great majority of products were developed outside Europe. This entry is kept brief, but a list of the full names, addresses, phone and fax numbers is provided in Annex D.
    UK Supplier(s):
    See Annex D for an address for the companies named here.
    Price:
    An attempt has been made to obtain the latest pricing of each product, on each platform on which it is available, and to take account of any academic discounts that are offered. Some suppliers are reluctant to publish precise prices, particularly for products that have been announced but not yet distributed to customers.
    Platform:
    Only software running on PCs (Windows almost exclusively), Apple Macintoshes, and common flavours of Unix (mostly requiring Motif) has been considered. If your system is not shown for any particular product, contact the supplier or developer, as many indicate that they are considering further platforms.
    Description:
    The description of each product has been extracted from publicity material for that product or has been supplied by the developer following a request. The information supplied has been edited to remove unnecessary superlatives. It should be read carefully, for what is NOT said as much as for what is said, particularly for software that cannot be, or has not been, evaluated for whatever reason.
    Evaluation:
    The evaluation has been carried out by the authors as part of their normal work, or is a report from others. It has not been possible to evaluate all products, particularly those that require considerable setting up by the supplier or the user (e.g. document management systems).
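For readers who wish to hold the product entries in a machine-readable form, the fields listed above could be modelled along the following lines. This is only a sketch: the class name `ProductEntry`, its attribute names and types, and the sample product are assumptions made for illustration, not a format used by the report.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ProductEntry:
    """One product record, mirroring the headings used for each entry."""
    product: str                                   # name of the product
    version: Optional[str] = None                  # given wherever possible
    associated_products: List[str] = field(default_factory=list)
    developer: str = ""                            # brief; full details in Annex D
    uk_suppliers: List[str] = field(default_factory=list)
    price: Optional[str] = None                    # None if no price was published
    platforms: List[str] = field(default_factory=list)
    description: str = ""                          # edited supplier material
    evaluation: Optional[str] = None               # None if not evaluated

    def is_evaluated(self) -> bool:
        """True if an evaluation was recorded for this product."""
        return self.evaluation is not None


# A made-up entry; the product and supplier names are placeholders.
entry = ProductEntry(
    product="Example Editor",
    version="1.0",
    developer="Example Software Ltd",
    platforms=["PC (Windows)", "Unix (Motif)"],
)
print(entry.is_evaluated())  # False
```

Keeping the price and evaluation optional reflects the two cases noted above: suppliers who will not publish a price, and products that could not be evaluated.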

    ‘Concrete Reference Syntax’ and conformance

    ISO 8879 contains an abstract definition of the grammar of SGML and then, by way of example, defines a `Concrete Reference Syntax' for that grammar. This syntax has become the basis of very nearly all software products; those that allow a variation of syntax assume the concrete reference syntax if no variation is defined. Hence, throughout this report it is assumed that only the concrete reference syntax will be required, and no consideration will be given to the ability to process variant syntaxes.

    Suppliers can (and do!) claim conformance to ISO 8879 without specifying precisely what they mean. Conformance to SGML is laid down in ISO 8879. In addition to `conforming' there are three other classes of conformance: basic, minimal and variant. The definitions of the four classes of conformance are as follows:

    In addition, there are further sub-clauses in clause 15 concerning Conforming SGML Application (sub-clause 15.2) and Conforming SGML System (sub-clause 15.3), which relate conforming documents to the application and system. In reality, most SGML-aware systems will not allow a variation of the concrete syntax, and hence cannot be classed as creating or using `conforming' SGML documents. Within ISO 8879 there are no statements giving priorities to particular features, so the four classes of conformance tend to be used as `levels' rather than `classes'. It is crucial for users to confirm that all the aspects of SGML that they require are included in the product set that they select. Many systems conform to FIPS 152, which was developed by the US Government to define the minimum level of SGML conformance that it would require from an SGML-conforming system.