Credits

The following report was obtained from the Exeter SGML Project FTP server as Report No. 9, in UNIX "tar" and "compress" (.Z) format. It is unchanged here except for the conversion of SGML markup characters into entity references, in support of HTML.

Markup '92 Conference Report, by Michael Popham


THE SGML PROJECT                                         SGML/R12


CONFERENCE REPORT: INTERNATIONAL MARKUP `92
World Trade Center, Amsterdam, The Netherlands
10-13th May, 1992
                                                        Issued by
                                                 Michael G Popham
                                    Computing Development Officer
                                                 The SGML Project

                                                29 September 1992
_________________________________________________________________

BACKGROUND

This was the tenth annual International Conference to be
organized by the Graphic Communications Association (GCA).  In his
opening address, Norman Scharpf (President, GCA) said that this
was the largest attendance in the history of the conference --
with around 150 people present.  Scharpf attributed the high
attendance mostly to increased interest in SGML, but also to the
GCA's decision to arrange for the first in a series of
"Documentation Europe" conferences to run `back-to-back' with
International Markup.

List of Sessions Attended

1. "Up-to-speed with SGML"  -- Chair: Pam Gennusa (Consultancy
Director of Database Publishing Systems Limited; President - The
International SGML Users' Group)

2. "SGML: Changes today for tomorrow's requirements" -- Dr.
Charles F. Goldfarb (Senior Systems Analyst, IBM Research
Division).

3. "SGML and databases" -- Chair: Francois Chahuneau (Director,
Berger-Levrault/Advanced Information Systems)

    3.1 "SGML and databases: Implementation techniques, access
         methods and performance issues".

    3.2 "Relational database applications with heterogeneous SGML
         documents" -- Tibor Tscheke (President, STEP Sturtz 
         Electronic Publishing GmbH)

4. "HyTime" -- Chair: Steve Newcomb (President, TechnoTeacher)

    4.1 "HyTime workshop" -- Steve Newcomb

    4.2 "Space and Time in HyTime" -- Michel Biezunski (Consultant, 
         Moderato, Paris).

5. AAP Math/Tables Update Committee

6. "SGML: an ISO standard; an ISO tool" -- Anders Berglund
(Senior Adviser, Electronic Publishing, International
Organization for Standardization).

7. "SGML -- a patently good solution" -- Terry Rowlay
(Directorate General, European Patent Office).

8. "Encoding the English poetry full-text database applying
SGML to thirteen centuries of verse" -- Stephen Pocock (Senior
Projects Editor, Chadwyck-Healey Ltd).

9. "Is SGML Bad News to your Editorial Staff" -- Koen Mulder
(Wolters Kluwer Law Publishers, Deventer)

10. "SGML in the Software Development Environment" -- Shaun
Bagley (General Manager, Exoterica Inc.)

11. "Implementing SGML at Ericsson Telecom: two perspectives"  
-- Peter Dybeck (Project Manager, Docware Development, Ericsson
Telecom AB), Nick van Heist (Technical Consultant, TeXcel AS AB)

12. Reports from SGML Users' Groups Chapters, Special Interest
Groups (SIGs) and Affiliates -- Chair: Pam Gennusa (President,
SGML Users' Group)

    12.1  The European Workgroup on SGML (EWS) -- Holger Wendt

    12.2  The SGML Project -- Michael Popham

    12.3  CALS Update -- Michael Maziarka

    12.4  Dutch Chapter, SGML Users' Group -- Jan Masden

    12.5  Norwegian Chapter, SGML Users' Group -- Jan Ordahl

    12.6  SGML for the print-disabled -- Tom Wesley

    12.7  French Chapter, SGML Users' Group -- Michel Biezunski

    12.8  SGML Forum of New York, SGML Users' Group -- Joe Davidson

    12.9  SGML SIGhyper, SGML Users' Group -- Steve Newcomb

    12.10 UK Chapter, SGML Users' Group -- Nigel Bray

13. 1992 AGM SGML Users' Group -- Chair: Pam Gennusa (President,
SGML Users' Group)

14. Keynote Address -- Ken Thompson (Commission of the European
Communities)

15. "Technical Trends Affecting Decisions about Documentation
Generation" -- Andrew Tribute (Consultant, Seybold Limited)

16. "Providing a strategy for the practical implementation of
document management systems" -- Martin Pearson and Graham
Bassett (Rank Xerox Limited)

17. "Technical information as an integrated part of a product or
a function" -- Goran Ramfors (Marketing Director, Telub Inforum AB)

18. Documentation as support product -- Chair: Nick Arnold (OECD)

    18.1 "Synchronization of documentation and training for
          telecommunications products" -- Douglas N. Cross 
          (Member of Technical Staff, Technical Support
          Switching Group, Bell Communications Research Inc. (Bellcore))

    18.2 "Aerospace technical publications in the 90's on
          a multinational nacelles project" -- Paul Martin 
          (Technical Publications Coordinator Customer Support, 
          Nacelles Systems Division, Short Brothers PLC)

    18.3 "A publisher using databases: feelings and experiences"
         -- Joaquin Suarez Prado (Prepress Director, Librairie Larousse)

    18.4 "U.S. WEST's approach to object oriented information management" 
         -- Paul J Herbert and Diane H A Kaferly (U.S. WEST Communications)

    18.5 "Keeping track of law changes" -- Marc Woltering
         (Wolters Kluwer Law Publishers, Deventer)

19. Summary

Note:  I have attempted to report on the presentations that I
attended to the best of my ability, although nothing I have
written should necessarily be taken to represent the opinions of
the speakers or attributed to them.  Any mistakes are mine, and I
hope both readers and those being reported will make allowances
and accept my apologies.  Any remarks or comments that are
entirely my own are enclosed in square brackets where they occur
during the body of a report on a particular presentation.



PROGRAMME (Sunday 10th May)

1.  "Up-to-speed with SGML" -- Chair: Pam Gennusa (Consultancy
Director of Database Publishing Systems Limited; President - The
International SGML Users' Group)

This was an informal session to encourage new and experienced
users of SGML to meet and discuss with the leading authorities in
the SGML field.  The panel included Dr. Charles Goldfarb (Editor
ISO 8879:SGML, author "The SGML Handbook"), Sharon Adler (Editor
ISO/DIS 10179:DSSSL), Anders Berglund (Editor ISO/TR 9573),
Martin Bryan (Author "An Author's Guide to SGML") and Eric Van
Herwijnen (Author "Practical SGML").

PROGRAMME (Monday 11th May)

2.  "SGML: Changes today for tomorrow's requirements" -- Dr.
Charles F. Goldfarb (Senior Systems Analyst, IBM Research
Division).

Dr. Charles Goldfarb (hereafter CG) discussed the past, present
and future of SGML.  As part of the "past", he mentioned that
DSSSL (Document Style Semantics and Specification Language) has
passed its ballot, but that the committee involved felt that
recent technological changes merited making further changes to
DSSSL and resubmitting it as a Draft International Standard (DIS)
-- due for distribution later this year.  CG also remarked that
the name DSSSL will be changed.

CG mentioned the ISO's work on ISO/TR 9573 -- a technical report
on using SGML for publishing ISO documents.  He noted that ISO
has many special problems, such as having to publish in multiple
languages, for multiple uses, on a wide range of subjects.  CG
also announced that HyTime has now been passed as an International 
Standard.

Analysing the "present" SGML situation, CG stated that SGML is
now being used very widely, in nearly all technical publishers,
large government agencies, and so forth.  However, he pointed out
that SGML faces political problems because it is under-
represented by supporters on the ISO standard committees.  The
result is that important decisions are being taken in ignorance
of SGML.  CG encouraged people to get involved with their
national standards committees.

CG noted that now the ARCSGML parser materials are available
through the International SGML Users' Group there is no excuse
for people not starting to use SGML!  He also remarked on the
existence of useful and active facilities, such as the
comp.text.sgml newsgroup.

Looking towards the SGML "future", CG reminded attendees that
running concurrently with the conference was a meeting of the ISO
SGML special working group dealing with the five year review of
the ISO 8879 (the SGML standard).  He said they would welcome any
comments from those present.

CG then spoke about HyTime, and the way it extends the abilities
of SGML to deal with compound documents, into the areas of
hypertext and multimedia.  He then made the surprising assertion
that everyone is already familiar with hypertext and multimedia
-- and supported this claim with a slide taken from the C11th
Winchester Bible (showing a page of illuminated manuscript).  CG
asserted that the information structures are similar to those
that might be found in a multimedia document, with a relationship
linking the text and the graphics.  CG followed up with an
example of a modern newspaper, which he compared to a hypertext
document -- in the sense that it offers the reader numerous
points of access (i.e. different articles, specialist pages,
table of contents etc.).

CG suggested that perhaps one of the most notable features about
electronic hypertext/multimedia documents, is the "new" emphasis
on time (e.g. having animated graphics requires using time-based
concepts and mechanisms).  However, CG countered his own
suggestion by subsequently showing a slide of an early C11th
musical manuscript; he argued that the information in the
document was essentially time-based, with the text and notation
indicating the relative duration of words, the pitch of notes, and
so forth. He then showed a much later musical manuscript, which
used a different system of notation to capture the same concepts
of time and pitch.  CG reminded his audience that in music, all
values for duration and timing [and pitch?] are given relative to
one another; for example, the notation indicates not how long a
particular type of note should last in seconds, only that the
duration of this type of note is twice as long as a note of
another type.  (CG compared this relationship to the way in
which, in an SGML document, the elements of the logical structure
are all identified relative to one another).  CG concluded that
although the hypermedia model is very sophisticated, it actually
contains no concepts that are really new; the problems arise from
deciding how to implement these concepts technologically.

CG particularly wanted to emphasize that the owners of
information are the users, not the people who develop, understand
or control the technology that delivers the information.  He also
stressed that any information should not be tied to the
performance of the technology involved.  For example, your
information should not be closely tied into (the limitations of)
CD-ROM, because the technology might change but you would still
want your information to be accessible.

Using a series of slides, CG discussed some of the concepts
behind HyTime.  He outlined the notion of an event schedule in a
HyTime "finite co-ordinate space", defining it as an ordered list
of events which may be described in terms of virtual units.
Virtual units are then carefully mapped onto real units for
purposes of presentation.  CG identified the two major classes of
facilities supported by HyTime as those dealing with locations
and linking, and those concerned with scheduling and
presentation.  He gave an example of a truck driver training
system in which the underlying information/document structure is
coded using a HyTime-based approach, but this may be presented to
the trainee in a variety of different forms -- e.g. a combination
of text and simple graphics, or computer animation, or perhaps as
interactive video.
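
[To give a flavour of the event-schedule idea, here is my own
loose sketch of a HyTime-style instance -- the element and
attribute names are illustrative only, and are not quoted from
the standard or from CG's slides:

    <fcs axes="time">
      <evsched>
        <event exspec="0 60">show introductory animation</event>
        <event exspec="60 30">play narration</event>
      </evsched>
    </fcs>

The positions and extents ("0 60", etc.) are expressed in virtual
units; only at presentation time are they mapped onto real units
such as seconds or screen pixels.]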

CG said that it was important for people to appreciate the value
of the SGML concepts of "elements" and "entities".  When writing
complex DTDs, designers must use entities to contain groups of
elements; this is not only good practice, but it becomes vital
when documents have to be exchanged between hypertext systems.
CG stated that hypertext system designers must appreciate such
concepts -- and also those to do with the notions of virtual- and
real-time.  CG put up this diagram:

    Document ---> SGML <----------> HyTime <----> Application <---> User
                  parser            engine
                    ^                 ^
                    |                 |
                    +---> Entity <----+
                          Manager
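
[To illustrate CG's point about entities in DTD design: a
parameter entity lets a group of elements be declared once, reused
throughout the DTD, and redefined in a single place when documents
must move between systems.  The fragment is my own illustration,
not CG's:

    <!ENTITY % float "figure | table | footnote" >
    <!ELEMENT section - - (title, (para | %float;)+) >
    <!ELEMENT chapter - - (title, (para | %float;)*, section*) > ]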

CG ended his presentation by suggesting that, hopefully, we will
move into time-based document processing having learnt the
lessons of printing and handling traditional documents on
computer -- i.e. the lessons learnt from using SGML.



PROGRAMME - note.

On the first day, it was intended that the conference should
split into two separate "tracks".  One track, "Making the SGML
Decision", would essentially have a management focus and would
look at the following areas (taken from the conference
literature):

    "..Why do information processing and distribution 
     organizations choose to use SGML?  What are the 
     business and cultural reasons?  What processes
     do they follow to arrive at such a decision?"

Speakers from several large companies considered these questions.
In the afternoon, the track took a more practical turn, and
looked at issues of DTD design, documentation and testing, and
there was also a presentation on DSSSL.

The second track spent the morning looking at issues relating to
the use of SGML and databases, namely (ibid):

    "..What are the considerations when creating software 
     for an SGML implementation, specifically SGML used 
     with databases?  What are the tradeoffs for performance?  
     What are the major design options?  This workshop will 
     provide a forum  for implementors to hear and discuss
     the latest research in this discipline."

In the afternoon, this track focussed on HyTime -- providing an
introductory workshop, and discussion/news on the latest
developments.


3.  "SGML and databases" --  Chair: Francois Chahuneau (Director,
Berger-Levrault/Advanced Information Systems)

[Whether due to the subject matter, or Mr. Chahuneau's reputation
(or a combination of both!), this session was exceptionally well-
attended.  I would guess that about two-thirds of the delegates
were present at this session, which obliged the organizers to
hastily re-organize the allocation of rooms.]


3.1 "SGML and databases: Implementation techniques, access
methods and performance issues".

Francois Chahuneau (hereafter FC) opened the session by
presenting his own paper.  Direct extracts from this paper are
given in quotation marks -- however, no permission has been
sought and I hope the author will not sue!

FC began by remarking that:

    "It has always been said that using SGML to build structured
    documents was the best path towards optimal use of database
    technology to manipulate information stored in these documents.

    "However, this conviction has been interpreted in many
    different ways. One can distinguish at least four 
    kinds of applications:

       * STORING SGML documents in databases as atomic objects,
         with minimal extractions of information from the SGML 
         structure to serve as "indexing fields";

       * REPRESENTING SGML documents in databases (or turning SGML
         documents into databases), by partial or full mapping 
         of the SGML structures to database structures;

       * GENERATING SGML documents out of non-document databases,
         as a special form of report generation;

       * LINKING SGML documents to databases, to create so-called
         "active documents" (this approach is especially popular 
         in the technical documentation field).

"[In his presentation, FC restricted himself]..to the problem of
REPRESENTING SGML documents in databases in efficient ways, so
that parts of the documents can be independently accessed
(searched, retrieved and possibly modified).  This approach is
most useful in the case of large documents (such as Aircraft
Maintenance Manuals or legal textbooks), or large collections of
small homogeneous documents (such as dictionaries, collections of
forms or data sheets, etc.)

"Even with this restricted scope in mind, two additional
independent criteria must be considered to understand the variety
of existing implementation approaches:

    1.  Is the database meant to accept instances of a single
    well-known SGML DTD (or its minor variants) or instances 
    of multiple arbitrary DTDs?

    2.  Is the database to be used for consultation purposes 
    only (static database) or for information update (dynamic 
    database)?"

FC then went on to compare database systems based on specific
DTDs and those based on generic DTDs.  FC remarked that with a
specific DTD there is a strong temptation to map the DTD into the
conceptual schema for a database.  He suggested that it would be
difficult to characterise the performance of such systems, as
each would be so closely dependent upon the original DTD used,
and they would also be individually optimized to improve their
performance.  FC suggested that such systems are inherently
inflexible and over-specialized, and the approach should be
rejected if similar or better performance could be achieved using
systems based on generic DTDs.

FC suggested that generic SGML database systems could be built,
on the basis that "It is possible to consider an arbitrary SGML
document instance as a TREE OF TYPED NODES decorated with
attribute values... Mapping this tree abstraction into database
structures (possibly with some representation of the DTD itself)
is the main idea behind generic SGML systems".  The rest of FC's
presentation was based on the adoption of this approach.
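
[A rough illustration of the "tree of typed nodes" idea -- the
example is mine, not FC's.  The instance:

    <chapter id="c1"><title>Storage</title>
      <para>First paragraph...</para>
    </chapter>

could be stored generically, whatever the DTD, as one database
record per node, recording each node's type, parent, rank among
its siblings, and attribute values:

    node 1   type=chapter   parent=-   rank=1   atts: id="c1"
    node 2   type=title     parent=1   rank=1
    node 3   type=para      parent=1   rank=2  ]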

FC then compared and contrasted dynamic (editorial) databases
with static (consultation) systems.  It is possible to optimize
static databases for information search and retrieval.  Dynamic
databases consequently appear to be slower and larger than their
static counterparts, because they cannot rely on such
optimizations.  For example, when coping with SGML fragments
"..static databases tend to keep the SGML sequence unaltered in
the database, whereas dynamic databases scatter original document
content all around to allow independent updates of SGML elements:
the reconstruction of a sequential SGML fragment [in response to
a query] requires much more work".

FC reviewed some of the generic SGML database systems currently
available.  As examples of static database systems, he looked at
"SGML/Search" from AIS, and "DynaText" from Electronic Book
Technologies; as dynamic systems, he considered "BASIS PLUS
docXform" from Information Dimensions (IDI) and "SGML-DB" from
AIS.

"SGML/Search is a static database system for SGML documents (or
document collections) based on Open Text's PAT text search
engine.  It is described as an SGML object server engine, which
can be accessed either through a powerful, DSSSL-inspired query
language or through a C-callable API....An indexing module, which
includes an SGML parser, takes an SGML document with its DTD as
source data.....The database itself comprises the enriched SGML
file (which never needs to be accessed directly) and associated
full-text and structure indexes....As opposed to SQL, the
SGML/Search query language allows:

       * direct expression of the element nesting 
         relationships at any depth,

       * natural combination of primitives in a 
         functional programming style,

       * separation between SET DEFINITION (how many 
         elements of this type have such a property) 
         and DATA EXTRACTION (send me the third member
         of this set)...

"The SGML/Search query language is entirely set-based and does
not allow NAVIGATION in the SGML document."

FC said that "DynaText was initially designed as an 'electronic
book publishing system', and is rich in navigation and hypertext-
oriented features."  It includes a query language based on
similar principles to that used in SGML/Search, and "..accessible
from the Systems Integrator Toolkit (SIT)".   Thus "..the
DynaText system could be used to build general purpose SGML
object servers.  As opposed to SGML/Search, the SIT offers a full
set of navigational primitives".

FC stated that IDI will soon be releasing extensions to BASIS
PLUS -- known internally as "docXform" -- which are a set of
tools and methods that "..include a general approach to mapping
large SGML documents to the BASIS PLUS sectioned document model".
FC remarked that "IDI's method curiously mixes generic and
specific approaches", also noting that "..the method uses a
distinction between 'contextual content' and 'contextual
criteria', which is reminiscent of the traditional distinction in 
text retrieval systems between 'text' and 'structured data'.  The 
reason for this is not quite clear, but is probably motivated mostly 
by performance concerns;  it is however somewhat in contradiction
with the unifying approach of the SGML language to describing
document structures".  FC had no information on the performance
of IDI's prototype implementations.

FC described SGML-DB as "..a technology developed by B.L/AIS to
decompose large SGML documents (or document collections) in a
database, so that concurrent editing of SGML fragments is
possible."  He added that "..SGML-DB allows partial rollback of
any part of the document to any point in the past (up to the last
garbage collection).  In SGML-DB, the 'tree of typed objects'..is
generalized into a multi-temporal tree."  At present, this tree
is subsequently mapped into a relational database, but AIS are
considering mapping into a fully object-oriented system, such as
that offered by "O2".  Currently, the query language used in
SGML-DB is simpler than that offered within SGML/Search, but AIS
are working to improve its performance.  "SGML-DB allows full or
partial decomposition of the SGML structure in the database: some
elements of the DTD can be declared as 'terminal', so that they
will not be decomposed when found in the instance but stored as
text strings with tags embedded".

FC then presented the results of some bench tests that AIS had
performed with the packages available to them (i.e. excluding
"docXform").  They had used two test databases, a 15Mb legal text
database (comprising 301,491 SGML elements) and a 51Mb aircraft
maintenance manual database (comprising 1,606,000 SGML
elements).

                        SGML/Search     DynaText         SGML-DB
LEGAL (15Mb)
Time to load/index      11 min.         112 min.         90 min.
Expansion factor        1.9             1.8              4
Query search time       < 1 sec.        < 1 sec.         3 sec.
Time to extract a
100Kbyte fragment       < 1 sec.        -                9 sec.

MANUAL (51Mb)
Time to load/index      74 min.         476 min.         340 min.
Expansion factor        1.95            1.8              5
Query 1 search time     < 2 sec.        < 2 sec.         5 sec.
Query 2 search time     < 2 sec.        < 2 sec.         8 sec.
Time to extract a
100Kbyte fragment       < 1 sec.        -                11 sec.

SGML-DB took up more space (and took longer to load than
SGML/Search) because the creation of its dynamic indexes takes up
more space and time than the creation of static indexes.  SGML-DB
took longer to return results from queries and to extract
fragments because the texts had been stored using SGML-DB's
"maximal decomposition" option, which meant that each piece of
found text had to be rebuilt from SGML-DB's tables.  FC commented
that "Larger granularity significantly improves performance, but
also 'hides' some SGML structures which cannot be searched
without [using] the full-text option".  He also noted that "..for
DynaText, the notion of 'time to extract' an SGML fragment is
meaningless, because the browser directly reads information from
the database in binary form without transforming data to SGML
format.  Formatted fragment display is instantaneous in this
case".

FC's conclusions were as follows:

"Solutions begin to appear which allow close integration between
database technology and SGML concepts.  Through their life cycle,
large SGML documents will exist in two isomorphic states: as
sequential tagged files for exchange purposes, and as database
structures for sophisticated processing purposes.

"Compared to the sequential form, the database form may provide
many additional facilities such as direct access, navigation and
concurrent update.  Such facilities are needed for implementing
new applications such as hypertext, but also to renew traditional
applications such as typesetting.  In particular, evolving from
FOSI-style semantics to DSSSL-style semantics will require direct
access to the document abstract tree, which implies database
mapping as soon as large documents are considered."

Contact: Francois Chahuneau, B.L/A.I.S., 34 Av. du Roule, 92200
Neuilly, France.


3.2 "Relational database applications with heterogeneous SGML
documents" -- Tibor Tscheke (President, STEP Sturtz Electronic
Publishing GmbH)

Tibor Tscheke (hereafter, TT) outlined some of the problems and
requirements underlying the theme of his talk.  Very large
documents often have a deep structure (100 elements or more), but
much of the SGML markup is redundant in terms of database queries
(e.g. it is of little or no relevance that a word might be marked
as an "emphasized phrase").  Linking a database into the life
cycle of a document can be problematic -- for example there might
be different versions of the DTD, or large documents may only be
revised in part.  Many heterogeneous documents may be held in the
same database, but database users will not necessarily need or
want to know about the associated DTDs.  Whatever the complexity
of the database, users will want access and performance that is
optimized to meet their requirements.

TT suggested that it may be possible to define a "document class"
for collections of documents with 'similar' structures [i.e.
similar DTDs].  However, he noted that although documents may
share a common structure they may use different tag names and
attribute names to refer to the same thing (i.e. <chp> =
<kapitel> = <Chapter> ).  Also, a DTD may go through several
versions, in which, say, element declarations may be altered   
-- giving the 'same' element a different generic identifier or
content model.

TT then proposed that for each DTD, a "DTDspec" could be written
which would associate a document instance (DI) with a particular
DTD, and specify the normalization of data types and generic
identifiers, "selectable information units", "selectors", and
"switches".  "Selectable information units" are defined in a
specification which identifies which elements will become
selectable, and which piece of information will become the
reference/key to that selectable unit (e.g. the value assigned to
the attribute chapno within a tag such as <Chapter chapno=3>);
there should only be one constructed key per selectable unit.  A
"selector" is the element (content), or attribute (value) by
which a general/relative surrounding element becomes selectable;
many selectors may point to a selectable unit.  TT said there
would also need to be specifications for "switches", external and
internal references, linkable user-defined procedures, and so
forth.  There would also need to be one DCspec ("Document Class
specification") for each set of similar DTDs.
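
[TT showed no concrete syntax, so the following sketch of these
ideas is entirely my own.  Three similar DTDs might mark the same
unit as:

    <chp no="3">      <kapitel nr="3">      <Chapter chapno="3">

A DTDspec written for each DTD would normalize all three to a
single information unit of the document class -- say, CHAPTER --
and nominate the numbering attribute (no, nr or chapno) as the
constructed key by which that unit becomes selectable.]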

TT outlined some of the procedures which would be required in the
environment that he envisaged.  For every new class of documents,
it would be necessary to specify a new DCspec.  For each new DTD
that fell within that DCspec, it would be necessary to specify a
DTDspec.  Each new document instance (DI) that was to be stored
in the database would have to be passed through a tool that used
the DI, the associated DTDspec and DCspec as input.


TT said that an approach based on the formal specification of
document classes etc. would facilitate dynamic query interfaces
(using windows, buttons etc.) for database access.  Global
(database wide) queries would have ready access to the
information units available in different document classes; it
would also be possible to query information units that only occur
in a particular document class.  TT said that he was aiming for a
situation in which the database application would need to have no
knowledge of document structures.


Discussion:

There was time for limited discussion at the end of the session,
during which the following points were raised.

- TT said that his company were currently looking at HyTime's
concept of "architectural forms" to see if this would resolve
some of the problems of documents having DTDs that are similar
in terms of logical structure, but use different names for
generic identifiers.

- SGML-B was mentioned as a standard that is currently under
development which should provide direct access to SGML documents.
This would avoid many of the problems arising from having to map
SGML documents into database systems so that they can be searched
efficiently.

- FC doubted that SFQL will offer many additional benefits.  He
suggested that it was too close to SQL, which had not been
designed with the idea of accessing full-text documents in mind.



4.  "HyTime" -- Chair: Steve Newcomb (President, TechnoTeacher)

This session focussed entirely on HyTime (the Hypermedia and
Time-Based Structuring Language).  After an introduction to
HyTime, the various business advantages and implementation issues
were considered.  The session closed with a paper comparing
HyTime's space-time model and the space-time model used in
physics.

4.1 "HyTime workshop" -- Steve Newcomb

Steve Newcomb (hereafter, SN) said that HyTime had recently
completed the final stages required for its acceptance as a full
ISO/IEC standard.  He recommended that all interested parties
should approach their national standards bodies for copies of the
standard.

SN outlined the activities of SGML SIGhyper -- the special
interest group of the International SGML Users' Group that
concerns itself with hypertext and multimedia.  He stated that
anyone interested in joining SGML SIGhyper should contact either
SN himself, or go through the International SGML Users' Group.

SN spoke briefly about "HyMinder", a HyTime-conformant engine
which is currently under development and should be in beta-
testing by July.

SN then used a series of slides produced by Mary Laplante (for a
presentation at TechDoc Winter `92), to outline the business
advantages to be gained from adopting SGML.  First, he considered
why the development of something like HyTime was both desirable
and inevitable.  Too much information was being published on paper
in traditional form, and that information was too complex,
volatile and vital for paper to handle well.  SGML solved some of
the problems by allowing documents to
be stored and interchanged electronically, and facilitating their
delivery on-line.  However, SGML did not address the problem of
interchanging documents written with different DTDs, and a new
generation of on-line, interactive, multimedia documents is
beginning to appear.  HyTime offers "a set of internationally
agreed conventions for interchanging arbitrarily complex online
documents, that is neutral with respect to ... all multimedia
base technologies, all other proprietary and nonproprietary
technologies, and all user interaction metaphors".

SN briefly compared and contrasted the main concepts of
traditional  markup, SGML, and HyTime.  He then discussed why
HyTime is "hot", noting the following points:  heterogeneous
computing environments are the norm, but people want to be able
to exchange information between environments/applications easily;
most people recognize the advantages of adopting a recognized
standard; there is a strong interest in, and demand for, on-line
and interactive documents; HyTime's inherently object-oriented
design makes it attractive and even trendy.

SN listed the benefits of using HyTime: data can be automatically
ported to a variety of platforms; data represented in HyTime can
survive technological evolution; producers and users of
information can acquire the hardware and software that is most
appropriate for their needs; HyTime data remain available for
unforeseen future uses; the publishing cycle is shortened because
there is no need for data translation.

SN identified a number of groups who he believed would need to
know about HyTime.  These included software and hardware vendors,
technical documentation professionals, in-house publishers,
commercial publishers, authors (including educators, musicians,
and games programmers), and consumers.  He then gave some brief
application scenarios.

SN next described some "real-world" HyTime applications.  He
discussed how HyTime would facilitate the CALS requirement for
the electronic review of documents, and how it could support the
content data model for the revisable databases required to
produce interactive electronic technical manuals (IETMs).  SN
described the work of the Davenport Group to develop a HyTime-
like meta-DTD to enable easier production, combination and
sharing of on-line documentation.

The rest of SN's presentation dealt with the prerequisites for
the spread of HyTime, how to find more information about the
standard, and looked at the potential for CD-ROM and other forms
of electronic publishing.


4.2 "Space and Time in HyTime" -- Michel Biezunski (Consultant,
Moderato, Paris).

The full title for this presentation was "HyTime: Comparison
between the space-time model and the space-time model used in
physics".  Although Dr Biezunski began his presentation by saying
that it was not going to be overly scientific or technical in
nature, he was speaking from the point-of-view of a man who holds
a PhD in Physics!  Therefore, I will not attempt to summarize his
presentation here -- but suggest that interested readers contact
Dr Biezunski (through the GCA), and ask for copies of his paper
and transparencies.

5.  AAP Math/Tables Update Committee

This meeting was chaired by Paul Grosso (ArborText), who fulfils
this role on a voluntary basis;  the rest of the "Committee" is
composed of parties interested in the development of the AAP
DTDs, and membership is open to all.  The previous meeting had
been held at TechDoc Winter `92, and the purpose of this
gathering was to build upon earlier work and decisions --
focussing primarily on math.

Paul Grosso reported that at the previous meeting, they had
decided to take the DTD in ISO 9573 (part 7) as a starting point,
and to revise the AAP Math DTD in light of this.  They had also
decided to look closely at the efforts of the European Workgroup
on SGML (EWS) in relation to math.

Anders Berglund (ISO) said that ISO 9573 (part 7) is supposed to
deal with math and chemistry.  Currently, the DTD looks the same
as in ISO 9573:1988,  but he proposed that the committee should
aim to develop a base level math DTD which could be combined with
a set of possible extensions to make it more suitable for the
needs of disparate groups.

Eric van Herwijnen (CERN) said that he had already done a
comparison of the existing versions of the AAP DTD and ISO 9573,
to see how they handled math.  He felt that it should be possible
to produce a single DTD to cover the structured elements of math
-- however, when he had discussed this with the people working on
the Euromath Project they had resisted his suggestion.  Euromath
opinion was that it was impractical (if not impossible) to build
a math DTD based on structure, and a presentation-oriented
approach was the only method likely to succeed.  Eric noted that
this would mean that in practice there would be three math DTDs
in circulation -- AAP, ISO 9573, and Euromath -- which would be
bound to lead to confusion and dissatisfaction.

Taking up Anders Berglund's earlier remarks, Paul Grosso
suggested that perhaps a base math DTD could be presentation-
oriented,  with the DTD extensions developed to suit the
requirements of other types of processing.

Klaus Harbo [?] (Euromath) said that their investigations had
shown that capturing the semantics of math in a single DTD would
be too difficult, if only because the subject is developing so
rapidly.  This had been the main justification behind their
decision to adopt a presentation-oriented approach when writing
their DTD.  Even if it were possible, fully marking up the
semantics of math would necessitate putting in too much markup
(in terms of the demands placed on authors) -- and/or the
semantics of the math produced by a given group might be
"incorrect" in the opinion of the DTD designers (although
perfectly acceptable to the group members themselves).  Some
participants commented that "too much" markup need not
necessarily be a problem if it was automatically inserted into
the text by authoring tools, and kept concealed from authors who
did not wish to see it.

Jamie Wallace (Mathematical Braille Project) commented that the
notes circulated by the ISO and AAP to accompany their DTDs were
written largely on the premise that authors would be supported by
sophisticated authoring tools (i.e. they will not have to put in
much of the markup themselves).  However, Jamie's particular
concern was that many authoring tools seem to encourage a
visually-based, presentation-oriented approach to markup -- which
not only makes them even less accessible to the visually impaired
and print-disabled, but makes the automatic translation of math
texts for groups with special needs much more complex.  Paul
Grosso said that some of this could be attributed to the
different input and output forms represented by the tools
concerned -- and, say, the spaces that are inserted into a visual
presentation-oriented view of a text need not necessarily be
stored in the same tool's internal representation of that text.

Tom Wesley (Mathematical Braille Project) told the committee
about recent U.S. legislation requiring that all educational
texts should be available to the blind.  He noted the
implications that this would have for the publishers of math text
books, and the handling of math in electronic form.

It was agreed that the next meeting should be held in conjunction
with this summer's TechDoc conference [August 25-28th ?].  Prior
to that meeting, Eric van Herwijnen and Anders Berglund said that
they would hope to have some draft DTDs available for circulation
and comment.


PROGRAMME (Tuesday 12th May)

6.  "SGML: an ISO standard; an ISO tool" -- Anders Berglund
(Senior Adviser, Electronic Publishing, International
Organization for Standardization).

Anders Berglund (hereafter AB) described how the ISO operates,
and the considerations that lay behind their decision to replace
their traditional typesetting system.  The ISO's main
requirements were for a system that enabled fast, in-house
production of documents with automated generation of indexes and
tables of contents, a minimum amount of re-keying and maximum
support for the re-use of information.  A system based on SGML
seemed to meet all these requirements.

AB then discussed the design decisions behind the ISO's DTD.
They needed a set of elements rich enough to permit production of
current International Standards as hard-copy.  They chose element
names that permitted an SGML parser to verify the structure and
completeness of Standards documents (e.g. that a "Scope" section
was included in every Standard), and also permitted the easy
generation of "boiler plate" text.  Where possible, the designers
tried to use element names that reflected their information
content, with the intention that this would simplify the
subsequent production of secondary publications, database
applications, and so forth.  AB said that they were still
exploring the best ways to handle precise links between related
documents, and the tagging of tables to capture their logical
content rather than their presentation form.
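
[In the spirit of AB's description -- the element names below are
my guesses, not those of the actual ISO DTD -- a skeleton such as:

    <standard>
      <title> ... </title>
      <scope> ... </scope>
      <normrefs> ... </normrefs>
    </standard>

lets a parser verify completeness mechanically: a document lacking
its <scope> element simply fails to validate.]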

AB described the existing methodology for producing International
Standards, and the workload and throughput that this involves. He
said that the probable future for the ISO would be an SGML smart
data entry system for compositors, WYSIWYG formatting on
workstations, and network access permitting electronic
communication with project editors and secretariats.  The ISO
expects to gain by avoiding the current system of manual paste-
up, and by automating the generation of numbering and referencing
during the revision of documents.  However, the ISO expects to
see their major gains coming when documents are submitted with
SGML markup according to the ISO's DTD -- since this will greatly
speed up the production of secondary publications, and allow for
re-use of information in multimedia publications etc.

AB spoke of the requirements and goals for the tools which will
be used by Project Editors and Leaders. He also talked about the
increasing extent to which paper-based products will be
supplemented by electronic products -- enabling the searching of
on-line Standards documents, the production of hypertexts, and
the creation of a full-text document database.  AB called for
cooperation and coordination amongst participating member bodies,
to support the efforts of the ISO to produce a DTD and adopt
SGML.  He concluded with some remarks on granting external
network access to the electronic texts of International
Standards, and other forms in which electronic products might be
distributed (e.g. CD-ROM, on-line access to computers in Geneva
or distributed servers).

7.  "SGML -- a patently good solution" -- Terry Rowlay
(Directorate General, European Patent Office).

Terry Rowlay (hereafter TR) briefly described the organization
and function of the European Patent Office (EPO).  The EPO has two
major roles: (i) the searching, examination and granting of
European Patents, and (ii) the subsequent dissemination of patent
information.  Two documents are fundamental to the first role   
-- the original Patent Application filed by an inventor (known as
the "A-publication") and the granted Patent Specification
document (known as the "B-publication").

The EPO currently publish a great deal of their information on
CD-ROM.  The size of an average A-publication is about thirty
pages, and each Patent Application has to be compared against a
search file of over 60 million documents (hence the need to
automate the process!)  The EPO deals with over 50,000 Patent
Applications in a year, each of which requires additional
documentation to support its passage through the approval
procedure.  In total, the EPO produces 88.1 million pages each
year.

In the early 1980's, realizing the problems that lay ahead and
the vital need to automate the process of handling A- and B-
publications, the EPO set up the DATIMTEX Project (DATa, IMages,
TEXt).  Current practice meant that the quality of the A-
publication was totally dependent on the quality of the original
application submitted by the inventor (which was highly
variable).  Although the quality of the subsequent B-publication
was much higher, the A-publication was actually the document most
widely used in the patent world.  EPO wanted the DATIMTEX Project
to devise a means of producing a high-quality A-publication which
was also available as an electronic document complete with all
the bibliographic and search report data that was required.

Two contractors (Rank Xerox U.K., and Jouve [France]) are
responsible for capturing the text, images and data contained in
the Patent Applications, and putting them into machine-readable,
marked up form. TR described the procedures used by Rank Xerox,
who have attempted to automate the processes of capture and
markup wherever possible; manual intervention is only required to
markup irregular document components, complex tables, and for the
inclusion of bit mapped images.  The captured text is delivered
to the EPO on magnetic tape, accompanied by an image tape
containing all the associated embedded images and drawings not
captured as text. (The image tape also contains page images of
the final A-publication, which the EPO can re-use in its CD-ROM
publications).

SGML appealed to the EPO for the DATIMTEX Project because it gave
them a machine-independent way to mark up the structure and
content of captured texts, in a manner that facilitated the re-
use of information.  The World Intellectual Property Organization
(WIPO) -- an umbrella organization for the world's patent offices
-- adopted the EPO's SGML implementation, tag set and DTD, in its
own standard (WIPO ST.32).

The EPO still faces a number of problems.  The documents they
publish are highly technical, with a variety of tables,
mathematical and chemical formulae, and special characters; the
EPO uses a base character set of over 400 characters -- but many
inventors like to create their own!  This problem has been
overcome by additional tags which identify standard enhancements
to existing base characters ("floating accents"), and common
constructs used to combine base characters in new ways
("character fractions").  Tables constitute a high proportion of
the text of patent applications, so the EPO had to devise
satisfactory ways of handling them.  When DATIMTEX began, studies
on table markup using SGML were scarce; the approach devised by
the EPO is based upon the markup of complex tables suggested by
the Association of American Publishers (AAP), and allows the
contractors to mark up 80% of all the tables they encounter.  The
EPO is now participating in efforts to develop more sophisticated
approaches to table markup.
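
[My illustration of the "floating accent" idea, not the EPO's
actual tags: an inventor's invented character might be captured as
an existing base character plus an enhancement tag, e.g.

    x<fa type="tilde">

for an x with a tilde above it, rather than extending the base
character set itself.]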

Mathematics did not constitute a high proportion of patent
applications -- so the EPO has elected to adopt the tag set
devised by the ISO (in preference to the AAP's).  Capturing
bibliographic information has been resolved by creating a set of
SGML tags based on the WIPO International Identity (INID) codes.
The results of [patent?] Search Reports are now also captured
using a specially devised set of tags.  However, the EPO has yet
to find a satisfactory way of handling the markup of chemical
formulae, or a means of producing a mixed mode display that shows
the marked text with the associated graphics in-line.  Overall,
TR's feelings about SGML were very positive, and he felt that the
EPO had made the right decision.



8.  "Encoding the English poetry full-text database -- applying
SGML to thirteen centuries of verse" -- Stephen Pocock (Senior
Projects Editor, Chadwyck-Healey Ltd).

Chadwyck-Healey are a small, dynamic publishing company, about
twenty years old.  Their main market is libraries and academia,
and they specialize in data publishing (and are consequently
having to produce more material on CD-ROM).

Stephen Pocock (hereafter SP) said that Chadwyck-Healey's
decision to use SGML to code the poetry for their CD-ROM was
fundamentally a question of economics.  Using SGML meant that
the data would never be obsolete -- which was not only a selling
point to Chadwyck-Healey's clients (especially libraries), but
justified the efforts of setting up the original database.  SP
said that their encoding scheme had had to satisfy two basic
criteria:  practicality and utility.  They had to "..devise a
method of analysing and recording the structures of poetry that
could be applied consistently across the canon by intelligent
editorial staff with appropriate training and guidance".  The
more detailed their encoding scheme, the more work would have to
be done on marking up and interpreting the text; too much
complexity would increase the costs at every stage in the
production cycle. Their intention was to mark up four thousand
volumes in three years -- which meant that the markup should not
be too demanding to key, but it had to capture enough information
to support searching and display of the data.  The main bulk of
the keying has been contracted to outside agencies.

The utility of the marked up data was determined by the
requirements of the user groups.  Chadwyck-Healey took the
decision to provide markup that met the needs of the generalist
rather than the specialist, but which also did not preclude the
subsequent incorporation of additional, specialist tags.  They
consulted the first edition of the guidelines (TEI-P1) produced
by the Text Encoding Initiative (TEI), and also spoke to Lou
Burnard (of Oxford University Computing Service, and one of the
TEI co-ordinators).

SP noted that most texts can be viewed as multiple hierarchies,
and this is especially obvious with poetry -- which can be read
simultaneously in terms of both metrical and narrative
hierarchies.  Particular types of poetry, such as dramatic verse,
have additional levels of reading.  The TEI solution to marking
up multiple hierarchies in poetry involves the use of CONCUR, but
Chadwyck-Healey felt that this would be difficult for them to
implement.  Instead, they have opted for a system that uses SGML
attributes to define various types of poetry (for example, <poem
type=prologue>).  However, whilst it is clear that poetry
operates on many levels, Chadwyck-Healey have not been able to
identify a structural element below the level of the (metrical)
line that regularly occurs in their data.
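
[A sketch of the attribute-based approach as I understood it; SP
quoted only the <poem type=prologue> form, and the other element
names here are my own:

    <poem type=prologue>
      <stanza>
        <line>Whan that Aprill with his shoures soote</line>
        <line>The droghte of March hath perced to the roote</line>
      </stanza>
    </poem> ]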

In the process of marking up their data, Chadwyck-Healey have
encountered a number of non-standard characters which they have
had to encode as entity references -- although they have not yet
decided if or how they should be displayed on screen.  They have
also deferred any decision on the encoding of `graphical' or
`shape' poems, where the physical layout of the text on the
original printed page is apparently intended to relate to the
poem's content.  SP noted that Chadwyck-Healey's decision not to
include twentieth century poetry (fortunately!) meant that they
did not have to deal with the unusual typographical layouts of
some more recent works.
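
[For example -- my illustration, not SP's -- a Middle English
character such as yogh, which is absent from most keying character
sets, would be keyed as an entity reference:

    <line>&yogh;iftes of gold and fee</line>

leaving each display application to decide how, or whether, to
render it.]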


9.  "Is SGML Bad News to your Editorial Staff" -- Koen Mulder
(Wolters Kluwer Law Publishers, Deventer)

In his presentation, Koen Mulder (KM) examined how the
introduction of SGML would change the traditional publishing
process -- which KM suggested was product oriented (and tailored
to meet the demands of the market).  Changes in management
(mergers etc.), and changes in the market (e.g. demand for new
media), have caused changes in the management of publishing.  Now
there is a greater need to have more central information, more
reusable information, and more information interchange; all these
require a neutral information structure, and SGML is the obvious
answer.

KM said that when implementing SGML, it is impossible to foresee
all the consequences, and so there is sure to be a certain amount
of confusion.  Although he could offer no patent solution for
dealing with this confusion, KM suggested that managers should
pay particular attention to the psychology of their organization,
and to the adoption of good procedures.  Any changes to existing
routines are often perceived as a threat, so KM suggested that
new procedures should be introduced step by step, always
accompanied by education and training that focuses on users'
applications and working methods.  Changes in working methods
should also be carefully introduced, particularly with regard to
such issues as the separation of content and format, who decides
on the structures to be encoded, who actually does the encoding,
and who has to deal with file translation.  Many people will also
change their function, as new jobs and tasks are introduced; the
traditional publisher will become a manipulator of information
(rather than an information broker), text composition will become
a more automated process, and an integrated application will
require greater cooperation and open-mindedness from people.

KM also raised the issue of information management which,
following the introduction of SGML, becomes a much less tangible
process.  For example, how secure is information that is stored
centrally but intended for easy re-use and interchange?  Who will
physically manage the information files?  Then there is also the
question of what to do with existing information -- which might
be held in files that are inconsistent, format-oriented, and
intended for different typesetting systems or applications.  If
this existing information is to be incorporated into the new
system, how is this additional workload going to be dealt with
(and paid for!)?  KM also identified some questions to be
considered when dealing with external suppliers -- such as should
their role only be that of information input (and markup), or
can/should they take some of the responsibility for information management?
Also, assuming that a publisher has produced information that has
been marked up with SGML, how many typesetters are actually
capable of working with that information?  KM concluded by
suggesting that SGML is not a threat to editorial staff but a
challenge to the whole organization.



10. "SGML in the Software Development Environment" -- Shaun
Bagley (General Manager, Exoterica Inc.)

Shaun Bagley (SB) set out to describe how Exoterica Inc. are
using SGML as part of their internal software development
environment. He stated that "SGML is a BNF-like language which
allows its users to specify the structure of a language.
However, unlike BNF, the user does not have to worry about the
syntax of the language.  That is completely managed by SGML".  SB
described SGML's function as a meta-language, saying "SGML is a
language for describing arbitrary LL1 languages.  The syntax and
the grammar are designed from scratch by the user.  In this sense
SGML is a `BNF' for text-based languages."
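
[SB's "BNF for text-based languages" point in miniature -- my
example, not his.  An SGML content model plays the role of a
production rule, with the parser enforcing the grammar while the
user never defines the token-level syntax:

    <!ELEMENT recipe - - (title, ingredient+, step+) >

    cf. BNF:   recipe ::= title ingredient+ step+          ]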

SB proposed four "generations" of markup language.  The first was
characterized by typesetting codes and procedural markup, and the
second by macros and generic markup.  The third "generation" of
markup languages included WYSIWYG, hypermedia and generalized
markup -- whilst the fourth included "advanced language[s]
designed for precise expression of particular problem[s]".  SB
remarked that whereas in the third generation, documents could be
understood without access to their markup language, in the fourth
generation the markup is an essential part of the document (which
cannot be understood without it).

SB suggested that one of the major problems in software
engineering related to the integration of programs and their
documentation.  Programs start out under-documented, perhaps
because their design is incomplete, or because much of the detail
remains in the engineers' heads.  Programs also lose "synch" with
their documentation, largely because whilst the programs
themselves are maintained their documentation is not -- and once
documentation has got out of synch, it becomes very costly to
resynchronize it with the program code.  SB said that there is a
gap in the processing model because, for example, C compilers do
not interface with the desktop publishing systems used to produce
documentation, and the desktop publishing systems cannot easily
be linked to the work of the compilers.  He suggested that much
of what is now perceived as coding is in fact documentation, e.g.
data structures, interfaces, and control logic.  SB identified
his key considerations for a good software engineering
environment, and the benefits of using a fourth generation
"advanced language", namely: "The software can be structured from
the point of view of what it does, rather than how it is
built...Design documentation (the what) and Program documentation
(the how) talk in common terms without duplication...a finer
level of granularity [is encouraged] through a description of
what a system does....[an advanced language] allow[s] assignment
of real names to objects early on in the development cycle".

When Exoterica decided to develop a new parser (XGML Kernel) they
chose to do so using their own advanced language to abstract the
design and development procedure.  They developed the Exoterica
Coding Language (ECL), as a precise and concise language intended
for a specific purpose -- and thus inappropriate for other uses.
ECL was an application of SGML and OmniMark (Exoterica's SGML
translation tool) and it was used to describe the following:
names of procedures, data types and non-local variables; the form
in which procedure arguments and types are defined; the form in 
which data types and data structures are defined; the form in 
which variable and constant names and values are defined; the format 
of comments.
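
[SB showed no ECL source, so the fragment below is purely my own
guess at the flavour of such a language -- a procedure interface
captured as SGML markup, from which a tool like OmniMark could
generate both code skeletons and documentation:

    <proc name="load-entity" returns="status">
      <arg name="src" type="stream">the entity to be read</arg>
      <remark>Called once per external entity reference.</remark>
    </proc> ]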

SB concluded by outlining the business advantages to be gained
from developing advanced languages such as ECL.  Concurrency of
documentation and program code is maintained.  Communication is
carried out in terms that fit the problem, not the solution.
Risks are reduced, because both developer and client can be clear
when sign-off targets have been reached.  Products can be
developed with greater speed.  Savings can be made that cannot be
achieved by traditional development methods.


11. "Implementing SGML at Ericsson Telecom: two perspectives"  
-- Peter Dybeck (Project Manager, Docware Development, Ericsson
Telecom AB), Nick van Heist (Technical Consultant, TeXcel AS AB)

This presentation gave both the client's and the developer's
perspective on implementing SGML at Ericsson Telecom.  Peter
Dybeck, who was to give Ericsson's (i.e. the client's) half of
the presentation was unfortunately unable to be present, so his
place was taken by his colleague Helena Antback[?] (HA).

In February 1991, Ericsson decided to use SGML for text creation
and interchange.  They set up two nine month projects to
investigate the definition of the DTDs and the file conversions
that they would require.  They started by designing one DTD for
test purposes, but eventually decided to produce five DTDs to
cover all their document types (i.e. "one major procedural
document type, two source document types, and two auxiliary
document types").  File conversions fell into two main types: the
conversion of old procedural documents from non-structured forms
into Ericsson's own SGML DTDs, and the conversion of source
documents produced using the Ericsson Document Markup Language
(EDML) into the same target DTDs.  HA admitted that they had
expected the conversion of old documents to be difficult -- but
it turned out to be even more difficult than they had
anticipated.  She added that they had expected the conversion of
EDML documents to be quite straightforward -- but it became
extremely difficult because of the large number of documents
involved, and the highly variable quality of the EDML markup. HA
then summarized Ericsson's experiences.  Technical writers must
take part in designing the DTD, FOSI and working environment from
the beginning.  Introducing SGML and a new desktop publishing
tool affects the whole organization.  The conversion of non-SGML
documents into highly structured DTDs is very difficult and
requires manual clean-up.  An accurate FOSI is difficult to
create.  The entire working environment, the DTDs and the FOSIs
require extensive testing by experienced staff. She added that
they had not found any short-cuts on the way to a successful SGML
implementation (trial and error, flexibility and hard work are
the only ways to achieve this).  Also, knowledge of computer
systems, the SGML standard, and preferably the FOSI standard are
necessary in order to implement SGML.  HA concluded by saying
that Ericsson felt that SGML requires a good deal of work in the
initial stages but is very rewarding in a long-term perspective;
it creates brilliant possibilities for handling textual
information in a worldwide company.


Nick van Heist (NvH) began the second half of the presentation,
by outlining the project requirements given to the consultants
(TeXcel).  More than one hundred writers had to be moved from a
WYSIWYG publishing tool to an SGML authoring system; these
writers knew little about SGML and were happy with their existing
system.  TeXcel were being asked to provide an introduction to
SGML, develop an SGML application, and transfer knowledge about
the system's development and maintenance all within a short
timeframe.  NvH noted the following design goals: to migrate
to a new system whilst maintaining productivity; to make it
easier to handle structured data; to automate an environment for
shared and generated data; to use internationally accepted
standards (e.g. SGML, FOSI, and PostScript).

NvH had identified three keys to the successful completion of
TeXcel's task -- all stemming from Ericsson's management.
Management were fully committed to the project and were prepared
to be flexible because of the new technology that was being
introduced. Management were also prepared to allow access to end
users. TeXcel had originally relied on project managers as
intermediaries through to the writers, but as direct contact with
writers had evolved they had found it possible to get more
accurate information about user requirements, and early testing
and feedback on the application.  NvH said that interaction with
the writers had shortened development time, produced a system
that was better suited to the writers' environment, and
encouraged the writers to feel more involved with the system
(leading to its easier acceptance).  TeXcel's overall approach
had been to analyze both the easy and the difficult aspects of
the existing system, and then to concentrate on making the
difficult aspects easy in the new SGML system.

When TeXcel began to train users for the new system their initial
approach focused in general terms on SGML and the authoring
system; real applications were not part of the training, and
writers were not made aware of the reasons behind the decision to
move to SGML (which led to additional reluctance to change).  As
the project evolved, TeXcel found that using real applications
made the training more meaningful, as it gave the writers the
specifics that they needed to relate to.  Informing the writers
of the reasons for using SGML motivated them to learn.

NvH summarized what TeXcel had learnt from their experiences with
Ericsson in four points: (i) involve end-users early in the
development of the application and the system, (ii) shorten the
chain of communication from application developer to end-user,
(iii) make training specific to the application, (iv) help end-
users to understand the benefits of SGML and why they need to use
it.


12. Reports from SGML Users' Groups Chapters, Special Interest
Groups (SIGs) and Affiliates -- Chair: Pam Gennusa (President,
SGML Users' Group)

This was an open session in which people were encouraged to give
short summaries of their work with SGML, or the activities of
their local SGML User Group or particular SIG.  Many people
spoke, and I apologize in advance to anyone I may have
overlooked.

12.1 The European Workgroup on SGML (EWS) -- Holger Wendt

Holger Wendt (HW) described the rationale behind the setting up
of the EWS, and the results of their work.  They had begun by
producing the MAJOUR Header DTD which had been made available for
public comment at International Markup `91 in Lugano.  Much of
the early work had been achieved by a collaboration of Dutch and
German companies involved in publishing and typesetting.
Following Markup `91 the EWS had become increasingly multi-
national, and those involved had gone on to work on producing a
Body and Backmatter DTD which could be integrated with the
Header.  HW predicted that a complete DTD would be available for
comment by the summer.  The EWS had followed the example of
groups such as the AAP and TEI, and established sub-committees to
look at problem areas such as the markup of complex tables and
equations, document conversion, DTD documentation and
maintenance.  HW reported that some of the publishers involved in
the EWS have already begun to use the Header DTD.


12.2 The SGML Project -- Michael Popham

I reported on the work of the Project since its inception in May
1991.  Set up to explore, encourage and support SGML use
throughout the UK's academic and research communities, The SGML
Project is now half way through its allotted lifespan of two
years.  The Project collects and disseminates information on SGML
and its use in a variety of ways: by reviewing and evaluating the
available products, reports and documentation (although this does
not involve any formal testing); by offering a free programme of
lectures, seminars and workshops; by offering support and
consultation (on a limited basis) to users of SGML.  We are in
the process of building an archive of SGML (and related)
materials -- such as parsers, translators, DTDs, entity sets,
reports, etc. -- and making these available to anyone on the UK's
academic network (JANET);  the host machine is accessible via
anonymous ftp over INTERNET, and The SGML Project is keen to
foster links with the world-wide SGML community.  For people who
have difficulties with their network connections, we will also
distribute material on disk, tape, or paper, although we must
give preference to our target communities, and we may have to
recoup the costs of supplying to other users.  The SGML Project
is also creating a database of users (both within and without the
UK), and this is proving useful when evaluating software, books
etc. because it gives an impression of what users want to do with
SGML, what hardware they have available, what users are currently
working on, and so forth.

As hoped, awareness of The SGML Project within the UK is growing
fast.  Many people are coming to SGML (and thence the Project) by
diverse routes from across a tremendous range of subject areas
and backgrounds.  Most people who encounter SGML seem to have
become quickly aware of its inherent advantages -- however, many
of these same people are put off actually using SGML because of
the initial problems involved in its implementation.  Most people
seem to want to take advantage of SGML without having to get too
involved with learning about SGML itself.  They are seeking
software that does what they want at a price they can afford.
Many want to see working examples of SGML-based work being done
in their particular area of interest before they are prepared to
make a full commitment to SGML.  Many potential users also want
to get hold of examples of good DTDs, sample marked up documents
etc., so that they can begin to really grasp for themselves what
SGML is all about.

The SGML Project's terms of reference are exceptionally broad.
The UK's academic and research community consists of more than
one hundred institutions -- which represents a potential user
base of over 400,000 people!  Happily, usage of SGML (and related
standards such as HyTime and SMDL) is developing fast, although
this makes it hard to stay abreast of the latest developments
such as work on DSSSL, SGML-B, or the activities of the Davenport
Group;  ISO 8879 is also undergoing its five-year review.  There
is clearly a great deal of interest in SGML, and a growing
requirement for the sort of information and advice service
offered by The SGML Project -- however, initial funding will be
exhausted by May 1993, and unless additional sources can be found
it is likely that the Project will be wound down.


12.3 CALS Update -- Michael Maziarka

Michael Maziarka (MM) outlined the previous goals for CALS, as
given in MIL-M-28001: it was paper-based, offered a number of
DTDs (e.g. MIL-M-38784B, a general purpose DTD, and 11 other
military specification DTDs), detailed a baseline tag set from
which CALS DTDs must be built, and gave output specification[s].
The problems with the approach adopted in MIL-M-28001 were that
it did not take into account the movement towards the use of
electronic displays, that there had been a proliferation of DTDs,
that the size of the specification was increasing beyond reason,
and that DTD modifications took too long.  MM said
the goals of MIL-M-28001 would be revised to offer a guide for
developing SGML applications (catering for both paper and
electronic display, and detailing use of the CALS SGML Library
and CALS SGML Registry), an SGML Toolkit (for handling
architectural forms, providing electronic review capabilities,
and supporting delivery of partial documents), and output
specification[s]. MIL-M-28001B, which is currently under review,
offers most of these new goals, and includes 800 additional
elements that have been added to the baseline tag set, 14 U.S.
Air Force DTDs and 2 U.S. Navy DTDs.

MM discussed the problems of submitting a partial document (as
opposed to an entire document).  A partial document might consist
of an interim deliverable or a change package, where the
transmitted material contains only the document hierarchy and
element attributes which indicate inserted, deleted or changed
data.  Taking advantage of SGML's CONREF feature, a "stub"
attribute has been added to %bodyatt; so that when an element's
stub attribute is set equal to "stub" the element is EMPTY.
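
The declarations themselves were not reproduced, but the
following sketch shows how the mechanism described might be
expressed (only the "stub" attribute and the %bodyatt; entity
come from MM's account; the element and other attribute names
are my assumptions):

    <!ENTITY % bodyatt  "id    ID    #IMPLIED
                         stub  NAME  #CONREF" >
    <!ELEMENT section - - (title, para+) >
    <!ATTLIST section %bodyatt; >

Because "stub" is declared with the #CONREF default, an element
in a change package can be sent as <section stub="stub">, marking
its place in the document hierarchy without resending its
content.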

MM then spoke about the CALS approach to handling the electronic
submission of comments.  A DTD fragment for delivering review
comments in an SGML syntax has been developed, such that comments
may be submitted either separately from a document or within it.
The comments are related to the original text by reference to
element IDs. For every comment it is possible to identify the
comment source, provide a unique ID for future reference, record
comment priority, classification and category, and also comment
disposition [?].
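
The fragment itself was not shown, but a review-comment element
of the kind described might be declared along these lines (all
names here are hypothetical):

    <!ELEMENT revcmt - - (#PCDATA) >
    <!ATTLIST revcmt
              id       ID                     #REQUIRED
              target   IDREF                  #REQUIRED
              source   CDATA                  #REQUIRED
              priority (critical|major|minor) "minor"
              category CDATA                  #IMPLIED >

The target attribute ties each comment to the ID of the element
being commented upon, which is what allows comments to travel
either within the document or separately from it.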

The CALS SGML Registry (CSR) has been set up to provide the
authority and process for reviewing and approving DTDs, elements,
and their use.  The CSR will also establish the CALS SGML Library
(CSL), which will create a common repository for SGML objects,
including DTDs, elements, attributes (not values), entities,
output specification elements, FOSIs, and fragments.  The SGML
Registrar will carry out a number of tasks, including the
following: standardizing SGML tagging, providing the authority
to determine CALS-compliance and review emerging DTDs (for
conformance to MIL-M-28001 and the CSL, not general soundness),
and providing a common site and management that is independent of
any of the armed services.

The CSL provides on-line access to approved DTDs, elements,
attributes etc., and demonstrates guidelines for applying SGML to
CALS applications.  It will also facilitate the development of
new DTDs and help to avoid any duplication of effort.  The
registration of SGML objects at the CSL will involve a number of
issues.  For example, naming conventions will require the use of
consistent, short, meaningful names, and careful terminology;
aliases must be used to ensure a one-to-one relation between
objects and concepts; a decision must be taken on whether to use
structure or content tagging; similar content/structures across
the services should be named and described generically (whereas
unique content/structures should be named specifically).

MM concluded that MIL-M-28001 will become an SGML Toolkit which
will support partial documents, electronic review, and paper and
electronic display applications.  The CALS SGML Library and
Registry will provide a guide for developing new applications,
and a central repository for information on SGML applications.
Much of the work taking place under the CALS initiative should
prove useful in other, non-military, contexts.


12.4 Dutch Chapter, SGML Users' Group -- Jan Masden

Jan described the work of one of the oldest and most active
Chapters of the international SGML Users' Group.  They have
frequent and well-attended meetings, and have become closely
involved with the work of the European Workgroup on SGML (EWS).


12.5 Norwegian Chapter, SGML Users' Group -- Jan Ordahl

This Chapter had recently celebrated its first birthday (its
appearance having been announced at last year's SGML Users' Group
AGM).  Like the Dutch, they seem to be very active and well-
organized, with frequent meetings forming part of their agenda.


12.6 SGML for the print-disabled -- Tom Wesley

Dr Tom Wesley (TW) has become closely involved in various
international efforts to improve information access for the
print-disabled.  Although he is particularly concerned with
readers who are visually impaired, he is also aware that "print-
disabled" is a catch-all term that includes people who are
dyslexic and those who find it difficult to use traditional paper
documents.  TW told how he came across SGML almost by accident,
when investigating the ways in which structured electronic texts
will enable visually impaired readers to use documents in ways
that have previously only been available to sighted readers.  For
example, sighted readers are used to being able to scan whole
documents -- or crucial parts such as tables of contents and
indexes;  they can also take advantage of multi-column texts,
tint-boxes that highlight key information, and explanatory
diagrams, and they can follow cross-references.  Much of this
information,
and the benefits to be gained from certain forms of presentation,
are lost when an existing text is re-keyed into a braille
edition.  Moreover, the page numbers in the braille edition bear
little or no relation to the original printed text, which makes
it difficult for a blind scholar to discuss a text in detail with
sighted colleagues, or for a blind student to turn to the same
page in his/her textbook as the other students in the class.
There are also tremendous difficulties to be overcome when
rekeying special text such as maths, chemical formulae, complex
tables etc. for a braille edition -- not least because many
countries have developed their own national schemes for
transcribing maths, which prevents a maths textbook produced in
American braille from being used in a school in the UK!
Maths is an international language, but unfortunately the same
cannot be said for its braille transcription.

TW is hoping that the use of structural markup, such as SGML and
ODA, will offer a route to the automatic transcription of
conventional texts.  He is particularly interested in the work of
groups such as the AAP and EWS, which are seeking to standardize
the markup of a broad range of texts.  TW is keen to raise the
awareness of the issues of print disability amongst those
currently involved in designing markup systems for texts.


12.7 French Chapter, SGML Users' Group -- Michel Biezunski

This group was still in the process of setting itself up.
Although there are only about thirty members at present, they
expect to have nearly one hundred.  A programme of meetings and
events was being planned.


12.8 SGML Forum of New York, SGML Users' Group -- Joe Davidson

The SGML Forum of New York is effectively a Chapter of the SGML
Users' Group operating under a regional, rather than a national
name.  Joe Davidson listed some of their recent events, and
outlined their plan to possibly establish their own electronic
bulletin board, managed by a local university.


12.9 SGML SIGhyper, SGML Users' Group -- Steve Newcomb

Steve Newcomb (SN) is Chair of SGML SIGhyper, the SGML Users'
Group Special Interest Group on Hypertext and Multimedia.  HyTime
was approved as an International Standard only two weeks prior to
International Markup `92.  It is the only International Standard
for multimedia that is currently available.  SN noted with
approval that HyTime is perhaps the only International Standard
which is actually ahead of existing technology!  SGML SIGhyper
aims to provide "one-stop shopping" for those seeking information
on HyTime, wanting sample documents etc., and supports two
anonymous ftp sites at Florida State University and the
University of Oslo.  SGML SIGhyper does not have meetings, but is
intending to
produce a regular newsletter (of which the first issue is already
available), and keeps in touch with most of its members through
electronic mail (email).

SGML SIGhyper has also provided some support for the work of the
Davenport Group  -- a collection of major hardware and software
manufacturers and vendors who wish to collaborate on producing
on-line documentation (cf. UNIX "man" pages).  They have had
considerable success encouraging the Davenport Group to use many
of the features outlined in the HyTime standard.

Members of SGML SIGhyper are now turning their attention to work
on another prospective International Standard, SMDL (Standard
Music Description Language), which is currently still at "draft"
status.  In some ways this is ironic, since HyTime originally
grew out of work that was being undertaken as part of the
development of SMDL!


12.10 UK Chapter, SGML Users' Group -- Nigel Bray

Nigel Bray (the acting secretary for the UK Chapter) outlined
recent activities to revive one of the oldest Chapters in the
SGML Users' Group.  During the late 1980s, activities of the UK
Chapter were synonymous with those of the international SGML
Users' Group, but initial enthusiasm for the Chapter had tailed
off -- apparently because of the slower than anticipated uptake
of SGML following the release of ISO 8879 in 1986.  At the first
meeting of the revived Chapter, there were about 90 attendees
from a variety of backgrounds (publishing, academic, software
houses etc.).  A meeting was scheduled for 6-8 weeks' time, which
would allow for elections for various posts on the committee, and
serve as the first AGM.


13. 1992 AGM SGML Users' Group -- Chair: Pam Gennusa (President,
SGML Users' Group)

A full account of this meeting will probably appear in a
forthcoming edition of the SGML Users' Group Newsletter, so I
will not cover it in great detail here.  Various formal reports
were made, and for further details I would direct readers to the
Newsletter.

The setting up of some sort of foundation or trust was discussed,
to make good use of the excess monies arising from SGML Users'
Group (SGMLUG) members' fees.  It was suggested that SGMLUG
members could apply to the foundation/trust to cover the cost of
organizing workshops, sending key people to crucial meetings (of
the ISO, AAP, EWS etc.), funding conference speakers such as
academic researchers, and so on.  The SGMLUG will investigate
this.

Possible restructuring of the SGMLUG was discussed.  For example,
the SGMLUG secretariat -- based at Swindon in the UK, with Pam
Gennusa -- could become even more secretarially-/support-
oriented than it is at present.  The exact relationship between
the various national Chapters and local groups, the Special
Interest Groups (SIGs), and the SGMLUG secretariat is still
unclear.  Pam Gennusa said she would be interested to receive any
comments or opinions on the nature of these inter-relationships,
or if/how things could be better organized.

Brian Travers asked representatives from all the Chapters, groups
and SIGs to forward announcements of meetings to him, for
inclusion in the journal <TAG>.  This would help to ensure that
many people (even those who are not members of SGMLUG) would be
aware of what events are taking place.

The next meeting of SGMLUG is scheduled to coincide with the
SGML'92 Conference, which will be taking place in Danvers (MA)
towards the end of October.


PROGRAMME (Sunday 13th May)

This was the first day of the joint conference, when
International Markup `92 would coincide with the first
Documentation Europe `92.  I was only scheduled to attend for the
first day of the second conference.


14. Keynote address -- Ken Thompson (Commission of the European
Communities)

Ken Thompson (KT) began by outlining the implications of the
changes arising from the introduction of the single European
market in 1992.   There will be free movement not only of goods,
services and people, but also of information.  Information is the
key to making the single European market work; for example, tax
information must flow freely between states in order to encourage
the free movement of goods and services.  Information must be
able to get to the right place, at the right time.  Open Systems
technology would seem to be the key to entering this paradise of
information sharing, but people seem strangely reluctant to take
it up!

Manufacturers want to move goods and services.  For many, moving
the information about goods and services is simply a side-line   
-- even if it is one required by European law!  Such people need
expert advice, good software, and the like to make the free
movement of information possible and easy to implement.

With the assistance of the Commission of the European Communities
(CEC), the various Public Procurers have been able to produce a
European Procurement Handbook for Open Systems (EPHOS).  This
gives advice not about learning the standards or the technology
required for the movement of information, but rather about how to
specify the correct conformance to those standards when procuring
the desired information-handling functionality.

The work of EPHOS is still only just beginning.  The next phase
will provide advice on the use of SGML and ODA, and it will also
link into other EC initiatives such as the TEDIS programme's work
on Electronic Data Interchange (EDI) and the guidance on
information publication originating from the Open Information
Interchange (OII) project (which is part of the IMPACT
programme).  Use of EPHOS is not limited to the Public Procurers,
and work is underway to ensure its usefulness to smaller private
companies, universities etc.  The first EPHOS handbook will be
available in all the EC languages and distributed throughout the
EC nations; it is a practical handbook that helps people to
conform to legislation -- it is not designed to give the
legislation itself.

The timely establishment of a common information interface
between all the sources of information and the information
distributors is a key issue.  SGML will clearly play a major role
because of its ability to allow the authors of information to
make the structures of their information explicit -- thereby
allowing it to be easily re-distributed in a variety of forms,
such as paper, on-line, CD-ROM, etc.

KT suggested that in general, SGML DTDs are becoming too complex,
and that this is delaying the uptake of SGML.  He said that he
would like to see simplified, more accessible forms of SGML, such
as more standardized DTDs.


15. "Technical Trends Affecting Decisions about Documentation
Generation" -- Andrew Tribute (Consultant, Seybold Limited)

Andrew Tribute (AT) began by asking the question "Should
(electronic) documents be static?".  This question in turn
raises a number of other issues to do with document design.  For
example: should documents only contain fixed data, or should
they be dynamic to meet readers' needs?  Should documents only be
read passively, or should they be interrogated, listened to, or
watched?  Should documents assist users to find the information
appropriate to their tasks?

AT then turned his attention to document formats.  Paper is
expensive and time dependent.  Documentation distributed as soft
copy (i.e. viewed on-line) has some advantages over paper,
particularly if the reader has access to a massive selection of
documents and does not require the application that was used to
create the document in order to read it.  Other varieties of
document such as multimedia and email also have advantages over
traditional paper documents.  CD-ROM offers a format that can
accommodate almost all other data types, and it is becoming a
low-cost means of mass document distribution.

AT then discussed the way technical trends have affected the
types of data held in documents.  Originally, the only data in
electronic documents was text, but more recently it has come to
include monochrome and colour photographs, vector images, sound,
animation, and video.  He described how Kodak's Photo-CD
technology will provide a cheap, mass market means of storing
high quality colour image data on CD-ROM.  Standards such as JPEG
and MPEG will facilitate the storage of still and moving colour
images.  AT anticipated that multimedia technology will capture
an increasing share of the market for information publishing, as
the general public starts to demand more multimedia products.

AT observed that if documents are going to be electronic, then
users will need to have the correct tools to be able to read and
interact with them.  Moreover, he stated that the ability to
read/interact with a document should not require readers to have
access to the same software or hardware as was used to create the
document.  It must also be possible to create universally
readable documents from a variety of separate applications.  AT
briefly discussed Adobe's "Carousel" product, which allows
documents saved in Interchange PostScript to be read, annotated
etc. (but NOT edited); this appeared to be a first step towards
using editable PostScript as an interchange format.  AT also
mentioned Interleaf's "WorldView" and "WorldView Press", which
allow documents produced in many different formats to be
indexed, hyperlinked and book-marked, and then made available in
finished
format on a wide variety of computers; users can read, annotate,
print, perform information retrieval and follow multimedia
document links.

In his closing remarks, AT agreed with the previous speaker that
SGML needs to be made more accessible.  He noted with regret that
many suppliers to the mass publication market -- companies such
as MicroSoft, Apple, Aldus, etc. -- were not represented at
either of the conferences.  He restated his belief that colour
will play an increasingly important role in the next generation
of publishing products to be released in the mainstream/mass
market.


16. "Providing a strategy for the practical implementation of
document management systems' -- Martin Pearson and Graham Bassett
(Rank Xerox Limited)

Martin Pearson and Graham Bassett (MP&GB) looked at the
implementation of document management systems from a strongly
business-oriented point-of-view.  They saw documents as being a
vital part of any business, forming the link between
organizations and their customers and, after staff, as the single
most important corporate resource.

MP&GB made a comparison between a typical technical publication,
and more general forms of "communication".  A technical
publication typically represents high value, little investment in
technology, is created by skilled workers (often in a vacuum),
and has low visibility.  By contrast, most "communication" has
low inherent value, represents a massive investment in
technology, is created by most people (and often duplicated), and
has high visibility.  They also briefly looked at the issues
surrounding the management of technical publications, noting such
points as: standards such as SGML are often hard to grasp; some
information has "critical value" for the company; document
management is not a boardroom problem (and so is invisible); the
recession inhibits investment in new processes; current "cut and
paste" document-processing technology is inappropriate for
supporting integrated document databases.

MP&GB then looked at some case studies.  A cycle time and cost
reduction exercise in financial document administration, carried
out internally at Rank Xerox, using their TQM [Total Quality
Management ??] approach, had resulted in the cycle time being
reduced from 112 days to 1 day, 4 hours and 43 minutes for new
products!  A UK financial institution had asked Rank Xerox to
streamline their process of producing documentation to support
the personal financial services that they offered.  Rank Xerox's
three-month study had resulted in a 25% reduction in production
costs with no investment in technology.  They had also made
seventy-five recommendations for the future, of which seventy
were to do with process improvement (representing a potential
saving of three million pounds), and only five to do with the
technology required to support these processes.

Summarizing Rank Xerox's experiences, MP&GB noted that vision is
essential to drive the document strategy.  Whilst millions are
spent on technology, very little is spent on people and
developing their skills.  Shortage of skills is the primary
factor limiting the successful introduction of new technologies.
It is essential to achieve the correct balance between focussing
on standards/technology and the key business processes driving
document production.



17. "Technical information as an integrated part of a product or
a function" -- Goran Ramfors, (Marketing Director, Telub Inforum
AB)

Goran Ramfors (GR) suggested that those who realized the value of
efficient information handling, recognized the seriousness of the
current situation, and were prepared to take action were those
most likely to crack open what he called "today's biggest piggy
bank".

GR said that at the product level it is vital to integrate
hardware, software, and "docware".  Docware adapts information to
the users' needs and presuppositions; its aim is to make technical
information minimal, comprehensible, effective, accessible and
easily updateable.  GR cited two studies, the first showing how
ambiguous, out-dated or unavailable documentation caused 50% of
all unplanned breakdowns in a sample of 24,000 high-tech systems
in the US; another study had shown that McDonnell Douglas
engineers wasted 70% of their time either searching for
information or correcting mistakes arising from the use of
incorrect or out-dated information.

At the development and production level, GR pointed out that most
recent activities have concentrated on improving external
efficiencies, whereas studies suggest that about 70% of all
internal technical information should be re-useable within the
company.  In fact, only about 2% of the available technical
information is re-used in this way -- which has obvious financial
implications.  Nowadays, there is a trend towards improving
internal efficiency to reduce duplication of work, re-use
existing information, and shorten production lead-times. This
approach has produced great cost savings, but the crucial factor
to its success is the creation of an infrastructure for total,
integrated, document handling.

GR said that there is a need to integrate other, non-paper media
into the document-/information- handling process.  Technical
developments in the multi-media field are moving rapidly, and our
everyday lives are increasingly influenced by the use of such
things as optical disks for storage, computers and screens as
information conveyors, hyperfunctions and freetext searching for
effective access, graphical user interfaces, sound, and both
interactive and animated video.  GR claimed that with multimedia
technology we can present information in a more logical and
structured format, which often makes it easier for the end-user
to find and transform just the information needed to perform an
activity.  Moreover, there is also the problem of having to cope
with an ever-increasing volume of information.  For example, in
1994, a company that wants to service the entire range of BMW
cars will require access to 500 A4 binders;  even if the company
can physically accommodate these files,  technicians will
probably find it difficult or time-consuming to find all the
information that they want.  Handling the same volume of
information digitally is the only sensible option.

Before commissioning a multimedia solution to any information
management problem, it is important to consider the end-user's
working environment and conditions.  The user must be able to
both work efficiently with the medium and want to do so -- this
necessarily includes structuring the information database in such
a way that it makes it easy for the end-user to retrieve and use
the information s/he needs.  The solution must also be adapted to
the end-user's skill and experience, to avoid delivering more
information than is necessary.  Every multimedia solution must be
based on good information solutions.  GR asserted that it is the
combination of skilled technicians, the right multimedia tools
and information experts, that is the key to achieving such
factors as faster troubleshooting, shorter downtimes and repair
times, efficient information retrieval and updating, happier
clients and more motivated staff.  Achieving these will give a
business lower costs and a competitive edge.

Savings to be made from increased efficiency in information
handling arise from a strategy based firmly on standards, and
should include the following aims: to re-use current information;
to secure the communication between information islands in the
organization; to reduce time and costs for production, handling
and distribution; to increase the information quality by adapting
it to suit the users; to strengthen the resource capital when
people leave the organization; to secure investments for the
future.  As an example, GR discussed the US Department of
Defence's (DoD) CALS (Computer-aided Acquisition and Logistics
Support) initiative.  He claimed that CALS will inevitably have
an influence on our choice of products, systems solutions and
standards (since many of the major vendors will be trying to
achieve CALS compliance).  GR said that standards will be
critical for those

        * working in an industry that has decided to
          follow the CALS strategy

        * who have a requirement to exchange information 
          (both internally and externally)

        * who continuously have to verify their database

        * who are working with large volumes and/or have 
          a high updating frequency

        * who want to secure their investment for the future

GR reminded his audience that the emerging standards are only
solving the problems of information management, not those of
information quality!

To carry through an integration strategy for technical
information within any organization is not easy and not free of
costs and effort; the key to success lies with top management. If
their interest cannot be captured, and they cannot be made to
understand the importance of taking strategic decisions, the risk
of bad information management, increased costs and decreased
competitive strength is obvious.


18. Documentation as support product -- Chair: Nick Arnold (OECD)

Nick Arnold opened this panel session by saying that many of
those attending the conference recognized the need and role for
documentation as a support product.  The purpose of the session
would be to hear some case studies, and consider the best way to
go about achieving this.  Fundamental questions would be raised,
such as "What are the processes and procedures that must be put
in place to ensure that documentation is synchronized with a
manufactured product?" and  "What technology is needed to achieve
this goal?".


18.1 "Synchronization of documentation and training for
telecommunications products" -- Douglas N. Cross (Member of
Technical Staff, Technical Support -- Switching Group, Bell
Communications Research Inc. (Bellcore))

Douglas Cross (DC) began by stating that it is important that a
supplier document each product from the conceptual phase, through
the design and manufacturing phases, to the time that the product
is placed in operation by a customer  -- and then on an ongoing
basis, so that changes in product configuration, functionality,
and maintenance continue to be adequately addressed in product
documentation and training.  That is, changes in the product
should trigger continuous synchronization of changes in related
documentation and training.

Bell Communications Research Inc. (Bellcore) produces certain
Generic Requirements documents which are used by Bellcore's
Client Companies (BCCs) to promote the creation, provision, and
maintenance of high-quality, standardized, and synchronized
documentation and training for their use.  Generic Requirements
documents start off as Technical Advisories (TAs) which undergo a
formal process of comment and review before becoming Technical
References (TRs). TRs address the generic needs of a typical BCC
in areas such as product functionalities, technical capabilities,
features and services, and product support (documentation and
training).

Bellcore has two Generic Requirements documents that are designed
to ensure that documentation and training remain synchronized
with the products, and that they meet appropriate standards and
quality controls.  The response of suppliers to these
requirements has been overwhelmingly positive;  having the
requirements enables suppliers to know what is expected of them
(and this encourages them to take an active interest in, and
comment on, TAs).  Suppliers also appear more prepared to commit
major resources (money and people) to improving their
documentation and training to meet Bellcore's Generic
Requirements; many recognize that good documentation and training
helps sell their products.  Bellcore is actively involved in
several national standards committees, in order to encourage the
adoption of sound documentation and training practices.

BCCs are very concerned about the usability of documentation,
whether on paper or as Electronic Document Delivery (EDD).  If
documentation is sufficiently usable, BCCs may be able to reduce
their training requirements, and make cost savings as a result.
Whilst BCC network maintenance documentation users need to store,
access, retrieve and search documentation quickly and easily,
they prefer to leave the technicalities of creating and using
SGML documents to the document producers.

BCCs place great importance on linking documentation and training
support to the service life cycle of products and services they
purchase.  They expect suppliers to provide direct documentation
and training support, or to assist in obtaining these from a
third party, or to enable the BCCs to provide these themselves.

DC then turned to the issue of "bridging cultural policies", by
which he meant not just handling the differences in international
cultures, but also those aspects of corporate- and user/provider-
cultures as well.  On the international front, DC reminded the
audience that the way things are done in the supplier's country
may not meet the needs of a customer in another country.
Suppliers should be receptive to customers' needs, regardless of
the countries involved; other countries might have their own
Generic Requirements documents, and suppliers should take notice
of any international and/or inter-industry user groups where
documentation experiences are shared.  At the corporate level,
suppliers have different approaches to meeting the BCCs' needs
for documentation and training.  DC emphasized the importance of
liaison with suppliers, and the need to ensure that those who
will be responsible for maintaining the documentation are
involved at an early stage in negotiations to supply initial
documentation and training.  DC suggested that users, providers,
and those involved with producing Requirements and Standards
should be brought together to achieve a number of goals.  For
example, to promote Electronic Document Delivery, to bring down
the costs of producing and purchasing documentation, to address
documentation needs that may be unique to each country/industry,
and to ensure that everyone's needs are met.  Bridging cultural
policies is good for all concerned: suppliers gain by greater
acceptance of their products (i.e. increased sales); customers
gain greater operating efficiencies and cost savings; suppliers,
customers, and requirements/standards groups gain by having
requirements and standards for documentation and training that
are more widely acceptable.

DC concluded by raising a number of questions that suppliers and
customers could usefully ask themselves.  Suppliers should
consider whether their documentation and training groups
synchronize with each other and with other groups to ensure that
documentation and training reflect the actual product
configuration and the latest product developments.  Suppliers
should ensure that they have procedures in place to provide
systematic development and review by the appropriate experts to
ensure the technical accuracy, usability and timeliness of high-
quality, standardized, and synchronized product documentation and
training.  Suppliers should also ask if their documentation and
training groups consult customers to ensure that documentation
and training meet the requirements, standards and needs of
customers.

Customers should check that there is synchronization between
their users of documentation and training, their documentation
production/procurement/distribution and training groups, and
supplier documentation and training groups, to ensure that they
obtain high-quality, standardized and synchronized documentation
and training on a timely basis.  Customers should ensure that
user feedback loops are in place to correct and improve existing
documentation and training, and to provide any additional
documentation and training that might be required.  Customers
should also check that review procedures are in place to ensure
that internally provided and externally provided documentation
and training are relevant to and usable on the job(s) for which
they are intended.


18.2 "Aerospace technical publications in the 90's on a
multinational nacelles project" -- Paul Martin (Technical
Publications Coordinator -- Customer Support, Nacelles Systems
Division, Short Brothers PLC)

Paul Martin (PM) gave a brief introduction to Short Brothers PLC;
they are an aerospace company based in Northern Ireland.  Short
is currently involved in the International Aero-Engine Project,
which is a collaboration of five major companies based in
different European nations.  Information on the engine being
developed by the project runs to more than 10,000 pages, and it
all has to be easily transferable between the different
companies, and maintained for the next twenty-five years.

Shorts provides a suite of manuals, each conforming to the ATA100
specification.  Although each manual has a very different format,
the same information may appear in several manuals
simultaneously!  Describing Short's document development and
production cycle, PM said that most documents are produced on
paper; draft copies are sent to Rolls-Royce (who re-key them for
their own on-line documentation system), whilst camera-ready
paper documents are sent to Fiat in Italy, and Rohr Industries in
Germany.  In theory at least, both the camera-ready and on-line
versions of the documentation should contain exactly the same
information.

Looking to the future, PM said that Short Brothers expect that
the transfer of information in digital form (on disk, tape, or
optical disk) will become a requirement.  He anticipated that CD-
ROM would soon replace all their paper documentation, and that
SGML will be used to prepare all the manuals.  Short Brothers are
developing their own electronic publishing capability, and also a
computer network to encourage and support the free flow of
information.  However, they still have some work to do on
deciding how to best control the information flow, especially
prior to the information being put into digital form.  Since
international collaborations will be part of the future, other
companies will soon be confronted with the types of problems that
Short Brothers and their project partners are experiencing now.
PM concluded by suggesting that successful documentation control
will only be achieved by people working to a thorough system   
-- irrespective of the introduction of ever more sophisticated
technology.


18.3 "A publisher using databases: feelings and experiences"
-- Joaquin Suarez Prado (Prepress Director, Librairie Larousse)

Joaquin Suarez (JS) briefly outlined the requirements of Librairie
Larousse, and their solution for producing documents from an
[SGML] database.  Staff at Librairie Larousse use
BASISplus (from Information Dimensions) for database management,
WriterStation (from Datalogics) for editing, and DLPager (also
from Datalogics) for page composition.  JS suggested that perhaps
their most demanding requirement was the need to keep a record of
all the editorial changes made to every document between
publication dates.

JS summarized the criticisms Librairie Larousse had of their
current system.  They felt that it had taken an unacceptable
amount of time to get the system up and running, and their only
consolation was that this long lead-time appeared to be
inevitable.  JS was slightly depressed to learn that editors at
Librairie Larousse favoured those features of the new system that
were familiar to them from the old;  they seemed almost reluctant
to take advantage of the opportunities offered by the new system.
Librairie Larousse considered that the performance of their
database had not been what they had anticipated or would have
liked.  Using an SGML-based approach had added to the complexity
of writing a dictionary entry.  JS also felt that SGML markup was
"heavy".

However, there were also a number of benefits gained from
implementing the new system.  Since its introduction, editorial
capacity had trebled, and Librairie Larousse were now producing
twice as many dictionaries and encyclopaedias as before.  The
dictionaries they are now producing are more reliable, more
accurate, and contain more information than was possible before.
The new system ensures that the information is being made
available to the maximum number of people.  Librairie Larousse are
now looking ahead to see what other advantages can be gained from
extending their use of SGML.

18.4 "U.S. WEST's approach to object oriented information
management" -- Paul J Herbert and Diane H A Kaferly (U.S. WEST
Communications)

Paul Herbert and Diane Kaferly (PH&DK) described U.S. WEST's
approach to treating information as a product.  Information is
independent of its platform, display or medium;  information
about a U.S. WEST product provides additional value (accessible
via screens, manuals, and training materials).  Information about
their customers, products, network and competitors was critical
to U.S. WEST even before the advent of computers.  The
information products that U.S. WEST produce must be designed to
cross business and geographical barriers.

PH&DK defined the "classic paradigm" that had confronted U.S.
WEST when considering its approach to information management;
they needed to work through three distinct phases -- first define
their problem clearly, then devise an appropriate solution, and
lastly to implement this solution.  In order to define their
problem, U.S. WEST set up a consortium of clients and providers
who were able to use a common semantics, work together at the
appropriate levels of detail and rigor, and focus their attention
on the problem concerned -- that is, to define the objects and
their relationships needed to build an object oriented
information management system.

In their approach to managing information, U.S. WEST were keen to
adopt a single source/multiple use model of information handling.
They wanted a system in which a single source record could be
distributed to many users in forms that were appropriate to
their needs.  U.S. WEST found that managing information through a
hierarchy of associations linking information sources etc. worked
successfully.  They also found that when introducing changes in
the way they handled information, it was important from a
management point of view to have devised solutions that could be
linked to particular problems.

Having talked briefly about the types of information stored in
U.S. WEST's document management system,  PH&DK concluded by
mapping their approach onto the "classic paradigm" to which they
had referred earlier.  In order to define the problem, they had
established a consortium of interested parties to establish clear
goals for the application.  Their solution had been built upon
the adoption of standards such as SGML.  The final implementation
had relied upon the adoption of a well-designed object oriented
database management system.



18.5 "Keeping track of law changes" -- Marc Woltering (Wolters
Kluwer Law Publishers, Deventer)

Marc Woltering (MW) outlined Wolters Kluwer's role in publishing
the Dutch statutes.  At present, there are around 8000 statute
laws in force, all of which must be published and made available
to the public.  Any changes, new laws, or decisions to repeal old
laws are published in journals that report rulings made by the
Government and the courts.  Once a statute has been published,
Wolters Kluwer are only required to ensure that appropriate
updates are issued.  This task has been greatly simplified with
the advent of loose-leaf publishing techniques, which has meant
that Wolters Kluwer have been able to meet their obligations
simply by issuing updates of any affected pages.

Wolters Kluwer have already implemented the decision to use SGML
to mark up the text of the statutes, and to facilitate document
management using sophisticated database techniques.  MW said that
they had taken advantage of all the usual benefits of using SGML
-- such as having a neutral system of markup, which is easy to
process and allows for user-defined structures, and so on.
However, they had also encountered the typical problems of using
SGML -- such as its text-only basis, and difficulties when
processing complex tables.

MW said that there are major problems associated with the area
of law publishing that are difficult to solve using any technique
-- and which he believed were not well resolved by using SGML,
either.  In particular he cited the problem of handling alternate
versions of the same document, maintaining comments within
clauses, and ensuring the validity of cross-references between
laws.
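
On the cross-reference problem in particular, it is perhaps worth
adding (my gloss, not MW's) why SGML offers so little direct
help: ID/IDREF links are only validated within a single document,
so a reference from one statute to another cannot be checked by
the parser.  A sketch, with hypothetical names:

    <!ELEMENT lawref - O EMPTY >
    <!ATTLIST lawref
              statute CDATA #REQUIRED -- a different document:
                                         the parser cannot
                                         validate this target --
              clause  CDATA #IMPLIED >

Keeping such references valid therefore remains a database
problem rather than a parsing problem.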


19. SUMMARY

The sheer number (and commercial weight) of attendees indicated
that SGML is developing well.  The fact that people were at
different stages of implementing SGML, and were doing so with
very different short- and long-term perspectives, suggested that
SGML has a bright future ahead.  The SGML user community is not
about to become a stagnant pool of a few international
organizations and government agencies.  The emergence of HyTime
and the amount of work going into developing SGML-related
standards such as DSSSL, implies that SGML is uniquely well-
placed to become the basis of future information handling
technologies.

=================================================================
For further details of any of the speakers or presentations,
please contact the conference organizers at:

Graphic Communications Association
100 Daingerfield Road, 4th Fl.
Alexandria, VA 22314-2888
United States

Phone: (703)519-8157          Fax:(703)548-2867
=================================================================
You are free to distribute this material in any form, provided
that you acknowledge the source and provide details of how to
contact The SGML Project.  None of the remarks in this report
should necessarily be taken as an accurate reflection of the
speakers' opinions, or in any way representative of their
employers' policies.  Before citing from this report, please
confirm that the original speaker has no objections and has given
permission.
=================================================================
Michael Popham
SGML Project - Computing Development Officer
Computer Unit - Laver Building
North Park Road, University of Exeter
Exeter EX4 4QE, United Kingdom

Email: sgml@exeter.ac.uk      M.G.Popham@exeter.ac.uk (INTERNET)
Phone: +44 392 263946        Fax: +44 392 211630
=================================================================