SGML 93 Appendix: The Year in Review

Authored by Tommie Usdin, Atlis Consulting Group and Yuri Rubinsky, SoftQuad Inc
Webified by Mark A. Gaither, HaL Software Systems
(c) Copyright 1993 by Tommie Usdin and Yuri Rubinsky. This document may be reproduced in whole or in part provided that this copyright notice is reproduced on each copy made.

As always, we begin this article with a disclaimer: This is a highly personal whirlwind tour of SGML activities in the last year and includes only things we've heard about. Thanks to all the individuals who contributed to this document, either through the Internet or at the SGML '93 Conference. Please note that, unlike previous years' reports, this one does not include vendors' announcements of new products.

SGML 93 conference proceedings are available for sale by calling the GCA at 703/519-8160.

What is in this report:

Standards Activity

The SGML review produced its first interim report in May, which was published in . In addition to reiterating the principle for future development -- which guarantees that existing SGML documents will continue to conform -- the report indicated that the review has progressed far enough that participants were comfortable stating that changes to SGML will be required. The group has included an arbitrary, non-representative sampling of requirements in the report. It is important to re-state that this activity is NOT the five-year review. That traditional ISO routine event was passed some time ago; the creators of SGML continue to meet with the SGML user community to consider future development based on present and anticipated uses.
HyTime, whose formal adoption as an International Standard was announced as the kickoff item in the 1992 Year in Review, is gaining momentum in a variety of places. IBM is incorporating HyTime structures into its IBMIDDOC DTD (which we'll hear about later in this talk); TechnoTeacher demonstrated some of the capabilities of its HyTime engine a few weeks ago at CALS Expo and will be releasing product in the first half of next year. And it seems at least three HyTime books are heading towards publication.
Yuan-Ze Institute of Technology and IBM Research have announced the formation of Project YAO, an international consortium for the production of free SGML system software. The consortium's first products are Object SGML, a C++ class library for SGML parsing with native HyTime support, and POEM, a Portable Object-oriented Entity Manager. The products are currently in alpha test and shipment is expected in the first quarter of 1994.
Sharon Adler reports that the second draft of DSSSL will be out by the end of January and will consist of three levels of complexity and conformance. It is expected that vendors will be able to move quite quickly to the first level, which is effectively equivalent to the screen and print display capabilities of most SGML authoring and browsing tools. The second level will offer full compatibility with FOSIs, but, in Sharon's words, "will be better." The third level will support arbitrarily complex specifications.
ISO 12083
This week marks the formal announcement of ISO 12083, the much-enhanced and internationalized version of the ANSI/NISO standard Z39.59-1988, the American National Standard for Electronic Manuscript Preparation and Markup (also known as the AAP DTD). The ISO standard differs from the ANSI standard in three ways: all ambiguities, redundancies and formatting-specific aspects of the original have been removed; element and attribute names and values have generally been lengthened for increased legibility (within the limits set by the Reference Concrete Syntax); and simple HyTime and ICADD constructs -- see separate item below -- have been incorporated into the DTDs.
Standard Page Description Language
Dr James Mason, convenor of ISO activity in SGML and related standards, reports that the SPDL now has the authorization to go to press; the committee is ironing out the small details.

User Group Activity

The Swedish SGML UG got off to a flying start in August 1993. A day with Eric van Herwijnen, several Swedish vendors and SGML pioneers attracted nearly 200 attendees/members.
Northern California
The first meeting of the Northern California SGML UG was held recently at Silicon Graphics in Mountain View, Calif.
The Japanese SGML Forum sponsored SGML Show '93 for the public and attracted an audience of 400 for product introductions, an advanced-user lecture, and product demonstrations. Participating companies (in alphabetical order) included: Aldus, Electronic Book Technologies, Fujitsu, Interleaf, Nihon Unitec, and Nissho Iwai.
The French SGML UG began in December 1992 in Paris. The President is Michel Biezunski. More than 35 people took part, including vendors and many people interested in implementing SGML. The plan is to organize four events each year dedicated to specific themes, such as one day of user experiences.
The first chapter located in the Middle East is the Israeli Chapter, which is being organized by Nary Ratberg in Tel Aviv.
Erik Naggum has succeeded Steve Newcomb as Chairman of the SGML SIGhyper group, the long name of which is "the SGML Users' Group's Special Interest Group on Hypertext and Multimedia."
Graham Tritt reports that a successful annual meeting of the SGML Users Group Switzerland was held in November, with attendance of 35, in his words, most still in the "interest" or "evaluation" phase.

Major Public Initiatives

SGML Open, the consortium of vendors and users, was founded this year to undertake both technical and marketing activities. With nearly 40 members, the group is rapidly moving to propose techniques to support multi-vendor interoperability beyond SGML itself. Meetings at the end of SGML '93 resulted in the creation (and beginning activity) of working committees to deal with creation of marketing materials and specifications for common support for entity management and handling.
Following recent meetings at the ACM Hypertext Conference in Seattle and between the Worldwide Web and TEI communities in Cork, Ireland, renewed and vigorous interest is being shown in creating a "seriously useful" DTD for the online browsers for the WWW. An ad hoc group from the ACM meeting has presented a proposed update to HTML and Dave Raggett of Hewlett Packard (who created HTML) is working on revision 2, which is known as HTMLplus.
A joint committee of the Newspaper Association of America (NAA) and the International Press Telecommunications Council (IPTC) continues to work on a device-independent file format for news service data transmission based on SGML. This standard, called the Universal Text Format (UTF), was drafted last year and known then as the NIML, or News Industry Markup Language. UTF is intended to be used in conjunction with the International Interchange Model, adopted several years ago by the NAA and IPTC. Most major news services in North America, Western Europe and Japan are involved, along with a number of large US newspapers. The working group is currently refining the UTF proposal to incorporate a DTD for news service files. The goal is to submit the standard for approval next summer to the parent organizations.
The Text Encoding Initiative reports that all major base tagsets and several additional publications, without the "Draft" status, are due shortly.
Pinnacles Group
The Electronic Component Industry group takes a giant `baby step' toward standardization of product information interchange. The Pinnacles Group, consisting of Hitachi, Intel, National Semiconductor, Philips Semiconductors, and Texas Instruments, last week (12/2/93) completed the document analysis and architectural phase of this effort. DTDs for review are expected in February 1994.
International Committee for Accessible Document Design
Last year, Texas passed legislation which specified SGML and the ICADD tagset as the favored encoding for mandatory delivery of all textbooks which have been adopted by the state education authorities. The electronic files are to be used to create Braille, large print and synthesized voice versions of the textbooks for use by the print-disabled. Eighteen other states have passed similar legislation and are expected to upgrade their legislation to match that of Texas. Exoterica has announced that it will make available an ICADD application -- for free -- which will transform any file marked up according to an ICADD-enabled DTD into the recommended tagset. Berger-Levrault/AIS has announced the availability of similar support in its Balise software.
Davenport Group
The Davenport Group discovered that it had at least two crash-priority agendas that were competing for the attention of its participants, and so it splintered quite amicably into two groups. The group that kept the name "Davenport" continues to work on the "DocBook" DTD, in collaboration with the Unix International DocSIG. The DTD is expected to allow the documentation of Unix (and Unix-like) software to be entirely portable.
The other Davenport successor group continues to work on the development of conventions for the use of HyTime. This group now functions under the aegis of the Graphic Communications Association Research Institute, and it is called the "Conventions for the Application of HyTime" (or "CApH", an acronym which is pronounced like the first syllable of the word "caffeine"). The meta-DTD that used to be called "SOFABED" under the aegis of Davenport is now called "Topic Relationships" under CApH; it suggests ways of representing indexes and other navigational information using HyTime. CApH is also developing conventions for using the HyTime "activity tracking" architectural forms to represent the wishes of information owners about how their information can be used, for example, after royalty payments, security clearances, etc.
In Canada, a joint venture of government and industry was formed to promote the CALS vision without reference to the military domain. The conceptual framework for the initiative has been termed FUSION, an acronym that stands for the "Focused Use of Standards for Integrating Organizations and Networks." The prototype applications currently under development deploy SGML as the central tool in managing shared information holdings.
New Drug Applications
The US Food and Drug Administration (FDA) is skeptical that SGML will really work for New Drug Applications (NDAs), so a small group headed by the Graphic Communications Association (GCA) has built a demonstration Computer Aided New Drug Application (CANDA) using real drug content. The FDA has agreed to "look at it" and in fact has people in attendance at this conference. Many pharmaceutical companies have decided they can't wait, so they're implementing SGML on their own.
Common Desktop Environment (CDE)
The help system that will be distributed with CDE Unix systems from six major vendors is based on an SGML file format called SDA.
The US Securities and Exchange Commission is now accepting corporate filings in SGML as part of its long-awaited second phase EDGAR -- Electronic Data Gathering, Analysis, and Retrieval -- project. By 1995, all public corporations in America will be expected to file their financial disclosure information this way. Sadly, the DTD is very limited and shows how the politics of trying to please absolutely everybody can play havoc with good application design.
Air Transport Association/Aerospace Industries Association
In the ATA world, British Airways reports that it will provide an SGML solution to handle aircraft maintenance and operational manuals. Phase I is for the introduction of the Boeing 777 into British Airways in March 1995.
The German airline Lufthansa has released for field use an SGML-based central document management system using ATA DTDs. The system, which supports any DTD, is named DocMaint and will be marketed by STEP, which developed it.
Pratt & Whitney is implementing a system to deliver technical publications in SGML for use in the commercial aviation industry. The first publication will be available in 1994 for the engine being developed for the Boeing 777 aircraft.
Aerospatiale, through the Airbus consortium, began this year to deliver SGML-coded maintenance and operations manuals to its customers on a routine basis. The Electronic Library System, in which SGML and associated standards will play a major role, was launched, with an initial focus on its ground-based component.
Boeing itself has developed an indexing system for Service Bulletins in which indexes are created dynamically from tagged Service Bulletins. This gives a very user-friendly search and navigation path to the desired bulletin. The system was developed under Unix and the materials are downloaded to PC machines.

Recent and Forthcoming Publications

Press Coverage
  1. North American coverage of SGML in the mainstream computer press continues to grow rapidly. Articles have appeared in PC Magazine, PC Week, MacWeek, Personal Computing, Washington Technology Week, and Forbes Magazine, among others.
  2. The Taiwanese November issue of BYTE includes a piece by Charles Goldfarb on HyTime.
  3. In a highly-directed public relations misfire, SGML was mentioned on radio in Birmingham, Ala. twice and Bakersfield, Calif. once by David Silverman of DCL. David reports that no new business was received as a result of this strategic promotional activity.
Prentice Hall
Prentice Hall has announced the formation of a new series of books and multimedia publications: "The Charles F. Goldfarb Series on Open Information Management." The series will support the development and deployment of information management solutions based on open standards such as SGML and HyTime. Initial titles will address document type design methodology, the benefits of SGML-based information management solutions, the development of SGML applications, and the HyTime standard for hypermedia application development. Under Dr. Goldfarb's guidance, the series will be geared to information specialists, engineers, IS managers, systems programmers, and other computer and publishing professionals seeking to implement open information management solutions.
Kluwer Academic Publishers
Kluwer Academic Publishers are announcing the Spring 1994 publication of "Making Hypermedia Work: A User's Guide to HyTime" by Steve DeRose of Electronic Book Technologies Inc. and David Durand of Boston University. "Making Hypermedia Work" is a user's handbook for representing hypertextual information in SGML. The book fully describes the most useful parts of HyTime while providing design guidelines to help users avoid pitfalls and build effective documents for hypertext and multimedia systems.
Also from Kluwer
Eric van Herwijnen's book "Practical SGML" has gone through ten printings, and the second edition will be available in February 1994. During the last year an Electronic Book (DynaText application) version appeared with an SGML parser so you can parse the examples.
Manager's Guide from VNR
Responding to the fact that the interest in SGML far outpaces the understanding of SGML, and that the lack of an accurate, non-technical explanation remains a major impediment to its wide-scale adoption, Van Nostrand Reinhold has contracted to publish a manager's guide to SGML. The book is in progress and scheduled for publication in the fall of 1994. This book is the first in a series of books on SGML by VNR and will be followed by books on DTD writing and other technical subjects.
STC News
The Society for Technical Communication devoted special sections in two consecutive issues of the quarterly journal "Technical Communication" to a series of eight articles on SGML, ranging from a beginners' tutorial to case studies to an overview of tools.
More Milestones
The SGML Handbook just reached a milestone of 3000 copies sold and is back to press for its third printing. Martin Bryan's Author's Guide to SGML is back for its sixth trip to the presses with 7500 copies in print. SoftQuad is pleased to announce that the "SGML Primer", its thirty-six page introduction to SGML, is now in its sixth printing with 3400 copies distributed to date.
The Compleat SGML
Exoterica released "The Compleat SGML" in August, 1993. This hypertext for Microsoft Windows links the SGML standard with 2348 SGML test documents. The SGML documents are created in accordance with ANSI's Conformance Testing for SGML Systems standard and serve to provide detailed illustration of the points being made in the standard. It also includes annotations that clarify some of the more esoteric areas of the standard. In all, the hypertext links number in the hundreds of thousands. Exoterica will also release "The SGML Conformance Test Suite" in January. The SCTS provides the test documents from "The Compleat SGML" in extractable form with a database extraction tool. Both the National Computing Centre in England and the National Institute of Standards and Technology have expressed interest in using the Exoterica Conformance Test Suite for SGML testing.
Kimber on HyQ
Eliot Kimber has written a reference guide to the all-purpose "HyQ" SGML query language that forms part of the HyTime international standard. This publication is available free via anonymous FTP from either of the two SIGhyper FTP sites (at the University of Oslo in Norway and at Florida State University in Tallahassee).

Government and Corporate Initiatives

Library of Congress
The American Memory Project of the Library of Congress is using SGML to create a text base of historical materials on subjects such as Women's Suffrage, the history of American Theatre, and abolitionism in the US.
University of Chicago Press
University of Chicago Press is implementing systems for translation to SGML, on-line editing in SGML, and output to typesetting and for electronic journals. The first implementation will be for the Astrophysical Journal.
The Information Development group within IBM is developing an internal SGML-conforming system to support the creation, management, and production of all IBM product documentation for all media types and delivery methods, using a single, comprehensive SGML application called IBMIDDOC.
OCLC and Information Dimensions Inc.
OCLC and its subsidiary, Information Dimensions, Inc. (IDI) have been selected to develop an electronic publishing system for ACM (the Association for Computing Machinery). The OCLC/IDI in-house electronic publishing system will integrate the various ACM publishing functions into a unified, automated system that will encompass the writing, editing, composition, production, archiving, and, eventually, distribution of documents and publications. ACM publishes an estimated 40,000 pages per year, including books, journals, conference proceedings, and internal publications.
British National Corpus
British National Corpus, a UK government-funded academic/industrial consortium, is developing a 100 million word corpus of modern English for use in lexicography and linguistic research. Due for completion in April 1994, this corpus is marked up in SGML, including part-of-speech codes, and will be freely available for research purposes, together with a high-performance SGML-aware browser/indexer developed for the project.
Springer Verlag
Springer Verlag is currently processing more than 50 journals using SGML. In 1994 the number of journals will be expanded to 150.
Brockhaus, the German encyclopaedia and dictionary publisher, recently went into production with an SGML-based editorial system.
US Government Printing Office
Document analysis is in progress to place the Federal Register on-line as the GPO continues to publish on paper. GPO's typesetting programs will be made to recognize SGML codes as well as its own set of codes, which are strictly procedural. The existing DTD for the Congressional Record on CD-ROM is now being used to create a database for daily retrieval on a bulletin board as well as the printed document. CD-ROM may follow later.
Uniscope has developed the Japanese Academic DTD for publishing Japanese Academic Information on-line as a full-text database of journals.
US Patent and Trademark Office
The US Patent and Trademark Office (of the Department of Commerce), in cooperation with its Trilateral Partners (European Patent Office and Japanese Patent Office), is creating a new, proposed revision of World Intellectual Property Organization (WIPO) Standard 32, specifying a list of tags and providing a DTD and instructions for the use of those tags in electronic exchange of SGML-coded patent and trademark data. The Trilateral Partners are also expecting final delivery early in 1994 of jointly developed SGML-based CD-ROM authoring, retrieval, display, and printing software for mixed-mode (text and images, on-the-fly page construction) patent and trademark data. The US Patent and Trademark Office is planning the conversion of its internal systems for electronic application, document markup, on-line database, document printing, CD-ROM production, and dissemination to SGML-based text and image storage.
European Patent Office
The European Patent Office scans, OCRs, tags, and publishes over a million pages of patent applications a year.


As a demonstration of the power of information structures to organize the brain, this year's Miscellaneous section is broken down into three sub-sections:

Last updated Tue 21 Jun 94 by