The following report was obtained from the Exeter SGML Project FTP server as Report No. 9, in UNIX "tar" and "compress" (.Z) format. It is unchanged here except for the conversion of SGML markup characters into entity references, in support of HTML.
THE SGML PROJECT

SGML/R12 CONFERENCE REPORT: INTERNATIONAL MARKUP `92

World Trade Center, Amsterdam, The Netherlands
10-13th May, 1992

Issued by Michael G Popham
Computing Development Officer
The SGML Project

29 September 1992
_________________________________________________________________

BACKGROUND

This was the tenth annual International Conference to be organized by the Graphic Communications Association (GCA). In his opening address, Norman Scharpf (President, GCA) said that this was the largest attendance in the history of the conference, with around 150 people present. Scharpf attributed the high attendance mostly to increased interest in SGML, but also to the GCA's decision to arrange for the first in a series of "Documentation Europe" conferences to run `back-to-back' with International Markup.

List of Sessions Attended

1. "Up-to-speed with SGML" -- Chair: Pam Gennusa (Consultancy Director of Database Publishing Systems Limited; President - The International SGML Users' Group)
2. "SGML: Changes today for tomorrow's requirements" -- Dr. Charles F. Goldfarb (Senior Systems Analyst, IBM Research Division)
3. "SGML and databases" -- Chair: Francois Chahuneau (Director, Berger-Levrault/Advanced Information Systems)
   3.1 "SGML and databases: Implementation techniques, access methods and performance issues"
   3.2 "Relational database applications with heterogeneous SGML documents" -- Tibor Tscheke (President, STEP Sturtz Electronic Publishing GmbH)
4. "HyTime" -- Chair: Steve Newcomb (President, TechnoTeacher)
   4.1 "HyTime workshop" -- Steve Newcomb
   4.2 "Space and Time in HyTime" -- Michel Biezunski (Consultant, Moderato, Paris)
5. AAP Math/Tables Update Committee
6. "SGML: an ISO standard; an ISO tool" -- Anders Berglund (Senior Adviser, Electronic Publishing, International Organization for Standardization)
7. "SGML -- a patently good solution" -- Terry Rowlay (Directorate General, European Patent Office)
8. "Encoding the English poetry full-text database -- applying SGML to thirteen centuries of verse" -- Stephen Pocock (Senior Projects Editor, Chadwyck-Healey Ltd)
9. "Is SGML Bad News to your Editorial Staff" -- Koen Mulder (Wolters Kluwer Law Publishers, Deventer)
10. "SGML in the Software Development Environment" -- Shaun Bagley (General Manager, Exoterica Inc.)
11. "Implementing SGML at Ericsson Telecom: two perspectives" -- Peter Dybeck (Project Manager, Docware Development, Ericsson Telecom AB), Nick van Heist (Technical Consultant, TeXcel AS AB)
12. Reports from SGML Users' Group Chapters, Special Interest Groups (SIGs) and Affiliates -- Chair: Pam Gennusa (President, SGML Users' Group)
   12.1 The European Workgroup on SGML (EWS) -- Holger Wendt
   12.2 The SGML Project -- Michael Popham
   12.3 CALS Update -- Michael Maziarka
   12.4 Dutch Chapter, SGML Users' Group -- Jan Masden
   12.5 Norwegian Chapter, SGML Users' Group -- Jan Ordahl
   12.6 SGML for the print-disabled -- Tom Wesley
   12.7 French Chapter, SGML Users' Group -- Michel Biezunski
   12.8 SGML Forum of New York, SGML Users' Group -- Joe Davidson
   12.9 SGML SIGhyper, SGML Users' Group -- Steve Newcomb
   12.10 UK Chapter, SGML Users' Group -- Nigel Bray
13. 1992 AGM SGML Users' Group -- Chair: Pam Gennusa (President, SGML Users' Group)
14. Keynote Address -- Ken Thompson (Commission of the European Communities)
15. "Technical Trends Affecting Decisions about Documentation Generation" -- Andrew Tribute (Consultant, Seybold Limited)
"Providing a strategy for the practical implementation of document management systems" -- Martin Pearson and Graham Bassett (Rank Xerox Limited) 17. "Technical information as an integrated part of a product or a function" -- Goran Ramfors, (Marketing Director, Telub Inforum AB) 18. Documentation as support product -- Chair: Nick Arnold (OECD) 18.1 "Synchronization of documentation and training for telecommunications products" -- Douglas N. Cross (Member of Technical Staff, Technical Support Switching Group, Bell Communications Research Inc.(Bellcore)) 18.2 "Aerospace technical publications in the 90's on a multinational nacelles project" -- Paul Martin (Technical Publications Coordinator Customer Support, Nacelles Systems Division, Short Brothers PLC) 18.3 "A publisher using databases: feelings and experiences" -- Joaquin Suraez Prado (Prepress Director, Librarie Larousse) 18.4 "U.S. WEST's approach to object oriented information management" -- Paul J Herbert and Diane H A Kaferly (U.S. WEST Communications) 18.5 "Keeping track of law changes" -- Marc Woltering (Wolters Kluwer Law Publishers, Deventer) 19. Summary Note: I have attempted to report on the presentations that I attended to the best of my ability, although nothing I have written should necessarily be taken to represent the opinions of the speakers or attributed to them. Any mistakes are mine, and I hope both readers and those being reported will make allowances and accept my apologies. Any remarks or comments that are entirely my own are enclosed in square brackets where they occur during the body of a report on a particular presentation. PROGRAMME (Sunday 10th May) 1. "Up-to-speed with SGML" -- Chair: Pam Gennusa (Consultancy Directory of Database Publishing Systems Limited; President - The International SGML Users' Group) This was an informal session to encourage new and experienced users of SGML to meet and discuss with the leading authorities in the SGML field. The panel included Dr. Charles Goldfarb (Editor ISO 8879:SGML, author "The SGML Handbook"), Sharon Adler (Editor ISO/DIS 10179:DSSSL), Anders Berglund (Editor ISO/TR 9573), Martin Bryan (Author "An Author's Guide to SGML") and Eric Van Herwijnen (Author "Practical SGML"). PROGRAMME (Monday 11th May) 2. "SGML: Changes today for tomorrow's requirements" -- Dr. Charles F. Goldfarb (Senior Systems Analyst, IBM Research Division). Dr. Charles Goldfarb (hereafter CG) discussed the past, present and future of SGML. As part of the "past", he mentioned that DSSSL (Document Style Semantics and Specification Language) has passed its ballot, but that the committee involved felt that recent technological changes merited making further changes to DSSSL and resubmitting it as a Draft International Standard (DIS) -- due for distribution later this year. CG also remarked that the name DSSSL will be changed. CG mentioned the ISO's work on ISO/TR 9573 -- a technical report on using SGML for publishing ISO documents. He noted that ISO has many special problems, such as having to publish in multiple languages, for multiple uses, on a wide range of subjects. CG also announced that HyTime has now been passed as an International Standard. Analysing the "present" SGML situation, CG stated that SGML is now being used very widely, in nearly all technical publishers, large government agencies, and so forth. However, he pointed out that SGML faces political problems because it is under- represented by supporters on the ISO standard committees. 
The result is that important decisions are being taken in ignorance of SGML. CG encouraged people to get involved with their national standards committees. CG noted that now the ARCSGML parser materials are available through the International SGML Users' Group, there is no excuse for people not starting to use SGML! He also remarked on the existence of useful and active facilities, such as the comp.text.sgml newsgroup.

Looking towards the SGML "future", CG reminded attendees that running concurrently with the conference was a meeting of the ISO SGML special working group dealing with the five year review of ISO 8879 (the SGML standard). He said they would welcome any comments from those present.

CG then spoke about HyTime, and the way it extends the abilities of SGML to deal with compound documents, into the areas of hypertext and multimedia. He then made the surprising assertion that everyone is already familiar with hypertext and multimedia -- and supported this claim with a slide taken from the 11th-century Winchester Bible (showing a page of illuminated manuscript). CG asserted that the information structures are similar to those that might be found in a multimedia document, with a relationship linking the text and the graphics. CG followed up with an example of a modern newspaper, which he compared to a hypertext document -- in the sense that it offers the reader numerous points of access (i.e. different articles, specialist pages, table of contents etc.).

CG suggested that perhaps one of the most notable features of electronic hypertext/multimedia documents is the "new" emphasis on time (e.g. having animated graphics requires using time-based concepts and mechanisms). However, CG countered his own suggestion by subsequently showing a slide of an 11th-century musical manuscript; he argued that the information in the document was essentially time-based, with the text and notation indicating the relative duration of words, the pitch of notes, and so forth. He then showed a much later musical manuscript, which used a different system of notation to capture the same concepts of time and pitch. CG reminded his audience that in music, all values for duration and timing [and pitch?] are given relative to one another; for example, the notation indicates not how long a particular type of note should last in seconds, only that the duration of this type of note is twice as long as a note of another type. (CG compared this relationship to the way in which, in an SGML document, the elements of the logical structure are all identified relative to one another.)

CG concluded that although the hypermedia model is very sophisticated, it actually contains no concepts that are really new; the problems arise from deciding how to implement these concepts technologically. CG particularly wanted to emphasize that the owners of information are the users, not the people who develop, understand or control the technology that delivers the information. He also stressed that any information should not be tied to the performance of the technology involved. For example, your information should not be closely tied into (the limitations of) CD-ROM, because the technology might change but you would still want your information to be accessible.

Using a series of slides, CG discussed some of the concepts behind HyTime. He outlined the notion of an event schedule in a HyTime "finite co-ordinate space", defining it as an ordered list of events which may be described in terms of virtual units.
Virtual units are then carefully mapped onto real units for purposes of presentation. CG identified the two major classes of facilities supported by HyTime as those dealing with locations and linking, and those concerned with scheduling and presentation. He gave an example of a truck driver training system in which the underlying information/document structure is coded using a HyTime-based approach, but this may be presented to the trainee in a variety of different forms, e.g. a combination of text and simple graphics, or computer animation, or perhaps as interactive video.

CG said that it was important for people to appreciate the value of the SGML concepts of "elements" and "entities". When writing complex DTDs, designers must use entities to contain groups of elements; this is not only good practice, but it becomes vital when documents have to be exchanged between hypertext systems. CG stated that hypertext system designers must appreciate such concepts, and also those to do with the notions of virtual- and real-time. CG put up this diagram:

   Document ---> SGML   <---------> HyTime <----> Application <---> User
                 parser             engine
                    ^                  ^
                    |                  |
                    +----> Entity <----+
                          Manager

CG ended his presentation by suggesting that, hopefully, we will move into time-based document processing having learnt the lessons of printing and handling traditional documents on computer -- i.e. the lessons learnt from using SGML.

PROGRAMME - note. On the first day, it was intended that the conference should split into two separate "tracks". One track, "Making the SGML Decision", would essentially have a management focus and would look at the following areas (taken from the conference literature): "..Why do information processing and distribution organizations choose to use SGML? What are the business and cultural reasons? What processes do they follow to arrive at such a decision?" Speakers from several large companies considered these questions. In the afternoon, the track took a more practical turn, and looked at issues of DTD design, documentation and testing, and there was also a presentation on DSSSL.

The second track spent the morning looking at issues relating to the use of SGML and databases, namely (ibid): "..What are the considerations when creating software for an SGML implementation, specifically SGML used with databases? What are the tradeoffs for performance? What are the major design options? This workshop will provide a forum for implementors to hear and discuss the latest research in this discipline." In the afternoon, this track focussed on HyTime, providing an introductory workshop and discussion/news on the latest developments.

3. "SGML and databases" -- Chair: Francois Chahuneau (Director, Berger-Levrault/Advanced Information Systems)

[Whether due to the subject matter, or Mr. Chahuneau's reputation (or a combination of both!), this session was exceptionally well-attended. I would guess that about two-thirds of the delegates were present at this session, which obliged the organizers to hastily re-organize the allocation of rooms.]

3.1 "SGML and databases: Implementation techniques, access methods and performance issues"

Francois Chahuneau (hereafter FC) opened the session by presenting his own paper. Direct extracts from this paper are given in quotation marks; however, no permission has been sought and I hope the author will not sue!
FC began by remarking that: "It has always been said that using SGML to build structured documents was the best path towards optimal use of database technology to manipulate information stored in these documents.

"However, this conviction has been interpreted in many different ways. One can distinguish at least four kinds of applications:

* STORING SGML documents in databases as atomic objects, with minimal extractions of information from the SGML structure to serve as "indexing fields";
* REPRESENTING SGML documents in databases (or turning SGML documents into databases), by partial or full mapping of the SGML structures to database structures;
* GENERATING SGML documents out of non-document databases, as a special form of report generation;
* LINKING SGML documents to databases, to create so-called "active documents" (this approach is especially popular in the technical documentation field).

"[In his presentation, FC restricted himself]..to the problem of REPRESENTING SGML documents in databases in efficient ways, so that parts of the documents can be independently accessed (searched, retrieved and possibly modified). This approach is most useful in the case of large documents (such as Aircraft Maintenance Manuals or legal textbooks), or large collections of small homogeneous documents (such as dictionaries, collections of forms or data sheets, etc.)

"Even with this restricted scope in mind, two additional independent criteria must be considered to understand the variety of existing implementation approaches:

1. Is the database meant to accept instances of a single well-known SGML DTD (or its minor variants), or instances of multiple arbitrary DTDs?
2. Is the database to be used for consultation purposes only (static database) or for information update (dynamic database)?"

FC then went on to compare database systems based on specific DTDs and those based on generic DTDs. FC remarked that with a specific DTD there is a strong temptation to map the DTD into the conceptual schema for a database. He suggested that it would be difficult to characterise the performance of such systems, as each would be so closely dependent upon the original DTD used, and they would also be individually optimized to improve their performance. FC suggested that such systems are inherently inflexible and over-specialized, and the approach should be rejected if similar or better performance could be achieved using systems based on generic DTDs.

FC suggested that generic SGML database systems could be built, on the basis that "It is possible to consider an arbitrary SGML document instance as a TREE OF TYPED NODES decorated with attribute values... Mapping this tree abstraction into database structures (possibly with some representation of the DTD itself) is the main idea behind generic SGML systems". The rest of FC's presentation was based on the adoption of this approach.

FC then compared and contrasted dynamic (editorial) databases with static (consultation) systems. It is possible to optimize static databases for information search and retrieval. Dynamic databases consequently appear to be slower and larger than their static counterparts, because they cannot rely on such optimizations.
For example, when coping with SGML fragments "..static databases tend to keep the SGML sequence unaltered in the database, whereas dynamic databases scatter original document content all around to allow independent updates of SGML elements: the reconstruction of a sequential SGML fragment [in response to a query] requires much more work".

FC reviewed some of the generic SGML database systems currently available. As examples of static database systems, he looked at "SGML/Search" from AIS, and "DynaText" from Electronic Book Technologies; as dynamic systems, he considered "BASIS PLUS docXform" from Information Dimensions (IDI) and "SGML-DB" from AIS.

"SGML/Search is a static database system for SGML documents (or document collections) based on Open Text's PAT text search engine. It is described as an SGML object server engine, which can be accessed either through a powerful, DSSSL-inspired query language or through a C-callable API....An indexing module, which includes an SGML parser, takes an SGML document with its DTD as source data.....The database itself comprises the enriched SGML file (which never needs to be accessed directly) and associated full-text and structure indexes....As opposed to SQL, the SGML/Search query language allows:

* direct expression of the element nesting relationships at any depth,
* natural combination of primitives in a functional programming style,
* separation between SET DEFINITION (how many elements of this type have such a property) and DATA EXTRACTION (send me the third member of this set)...

"The SGML/Search query language is entirely set-based and does not allow NAVIGATION in the SGML document."

FC said that "DynaText was initially designed as an 'electronic book publishing system', and is rich in navigation and hypertext-oriented features." It includes a query language based on similar principles to that used in SGML/Search, and "..accessible from the Systems Integrator Toolkit (SIT)". Thus "..the DynaText system could be used to build general purpose SGML object servers. As opposed to SGML/Search, the SIT offers a full set of navigational primitives".

FC stated that IDI will soon be releasing extensions to BASIS PLUS, known internally as "docXform", which are a set of tools and methods that "..include a general approach to mapping large SGML documents to the BASIS PLUS sectioned document model". FC remarked that "IDI's method curiously mixes generic and specific approaches", also noting that "..the method uses a distinction between 'contextual content' and 'contextual criteria', which is reminiscent of the traditional distinction in text retrieval systems between 'text' and 'structured data'. The reason for this is not quite clear, but is probably motivated mostly by performance concerns; it is however somewhat in contradiction with the unifying approach of the SGML language to describing document structures". FC had no information on the performance of IDI's prototype implementations.

FC described SGML-DB as "..a technology developed by B.L/AIS to decompose large SGML documents (or document collections) in a database, so that concurrent editing of SGML fragments is possible." He added that "..SGML-DB allows partial rollback of any part of the document to any point in the past (up to the last garbage collection). In SGML-DB, the 'tree of typed objects'..is generalized into a multi-temporal tree."
At present, this tree is subsequently mapped into a relational database, but AIS are considering mapping into a fully object-oriented system, such as that offered by "O2". Currently, the query language used in SGML-DB is simpler than that offered within SGML/Search, but AIS are working to improve its performance. "SGML-DB allows full or partial decomposition of the SGML structure in the database: some elements of the DTD can be declared as 'terminal', so that they will not be decomposed when found in the instance but stored as text strings with tags embedded".

FC then presented the results of some bench tests that AIS had performed with the packages available to them (i.e. excluding "docXform"). They had used two test databases, a 15Mb legal text database (comprising 301,491 SGML elements) and a 51Mb aircraft maintenance manual database (comprising 1,606,000 SGML elements).

                                           SGML/Search   DynaText   SGML-DB
   LEGAL (15Mb)
     Time to load/index                      11 min.      112 min.   90 min.
     Expansion factor                        1.9          1.8        4
     Query search time                       < 1 sec.     < 1 sec.   3 sec.
     Time to extract a 100Kbyte fragment     < 1 sec.     -          9 sec.

   MANUAL (51Mb)
     Time to load/index                      74 min.      476 min.   340 min.
     Expansion factor                        1.95         1.8        5
     Query 1 search time                     < 2 sec.     < 2 sec.   5 sec.
     Query 2 search time                     < 2 sec.     < 2 sec.   8 sec.
     Time to extract a 100Kbyte fragment     < 1 sec.     -          11 sec.

SGML-DB took up more space (and took longer to load than SGML/Search) because the creation of its dynamic indexes takes up more space and time than the creation of static indexes. SGML-DB took longer to return results from queries and to extract fragments because the texts had been stored using SGML-DB's "maximal decomposition" option, which meant that each piece of found text had to be rebuilt from SGML-DB's tables. FC commented that "Larger granularity significantly improves performance, but also 'hides' some SGML structures which cannot be searched without [using] the full-text option". He also noted that "..for DynaText, the notion of 'time to extract' an SGML fragment is meaningless, because the browser directly reads information from the database in binary form without transforming data to SGML format. Formatted fragment display is instantaneous in this case".

FC's conclusions were as follows: "Solutions begin to appear which allow close integration between database technology and SGML concepts. Through their life cycle, large SGML documents will exist in two isomorphic states: as sequential tagged files for exchange purposes, and as database structures for sophisticated processing purposes.

"Compared to the sequential form, the database form may provide many additional facilities such as direct access, navigation and concurrent update. Such facilities are needed for implementing new applications such as hypertext, but also to renew traditional applications such as typesetting. In particular, evolving from FOSI-style semantics to DSSSL-style semantics will require direct access to the document abstract tree, which implies database mapping as soon as large documents are considered."

Contact: Francois Chahuneau, B.L/A.I.S., 34 Av. du Roule, 92200 Neuilly, France.
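[To picture FC's "tree of typed nodes" abstraction, here is a sketch of my own -- not an extract from FC's paper; the element names and the node table are invented for illustration:

   <manual>
   <chapter id="c1">
   <title>Fuel system</title>
   <para>Drain the tank before removing the pump.</para>
   </chapter>
   </manual>

   <!-- Viewed as a tree of typed nodes, a generic SGML database
        might store one record per element, for example:
          node 1  type=manual   parent=-   rank=1
          node 2  type=chapter  parent=1   rank=1   id="c1"
          node 3  type=title    parent=2   rank=1
          node 4  type=para     parent=2   rank=2
        with the text content attached to nodes 3 and 4. -->
]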
3.2 "Relational database applications with heterogeneous SGML documents" -- Tibor Tscheke (President, STEP Sturtz Electronic Publishing GmbH)

Tibor Tscheke (hereafter TT) outlined some of the problems and requirements underlying the theme of his talk. Very large documents often have a deep structure (100 elements or more), but much of the SGML markup is redundant in terms of database queries (e.g. it is of little or no relevance that a word might be marked as an "emphasized phrase"). Linking a database into the life cycle of a document can be problematic; for example, there might be different versions of the DTD, or large documents may only be revised in part. Many heterogeneous documents may be held in the same database, but database users will not necessarily need or want to know about the associated DTDs. Whatever the complexity of the database, users will want access and performance that is optimized to meet their requirements.

TT suggested that it may be possible to define a "document class" for collections of documents with 'similar' structures [i.e. similar DTDs]. However, he noted that although documents may share a common structure, they may use different tag names and attribute names to refer to the same thing (i.e. <chp> = <kapitel> = <Chapter>). Also, a DTD may go through several versions, in which, say, element declarations may be altered -- giving the 'same' element a different generic identifier or content model.

TT then proposed that for each DTD, a "DTDspec" could be written which would associate a document instance (DI) with a particular DTD, and specify the normalization of data types and generic identifiers, "selectable information units", "selectors", and "switches". "Selectable information units" are defined in a specification which identifies which elements will become selectable, and which piece of information will become the reference/key to that selectable unit (e.g. the value assigned to the attribute chapno within a tag such as <Chapter chapno=3>); there should only be one constructed key per selectable unit. A "selector" is the element (content), or attribute (value), by which a general/relative surrounding element becomes selectable; many selectors may point to a selectable unit. TT said there would also need to be specifications for "switches", external and internal references, linkable user-defined procedures, and so forth. There would also need to be one DCspec ("Document Class specification") for each set of similar DTDs.

TT outlined some of the procedures which would be required in the environment that he envisaged. For every new class of documents, it would be necessary to specify a new DCspec. For each new DTD that fell within that DCspec, it would be necessary to specify a DTDspec. Each new document instance (DI) that was to be stored in the database would have to be passed through a tool that used the DI, the associated DTDspec and DCspec as input.

TT said that an approach based on the formal specification of document classes etc. would facilitate dynamic query interfaces (using windows, buttons etc.) for database access. Global (database wide) queries would have ready access to the information units available in different document classes; it would also be possible to query information units that only occur in a particular document class. TT said that he was aiming for a situation in which the database application would need to have no knowledge of document structures.

Discussion: There was time for limited discussion at the end of the session, during which the following points were raised.

- TT said that his company were currently looking at HyTime's concept of "architectural forms" to see if this would resolve some of the problems of documents having DTDs that are similar in terms of logical structure, but use different names for generic identifiers (a sketch of the idea follows below).

- SGML-B was mentioned as a standard currently under development which should provide direct access to SGML documents. This would avoid many of the problems arising from having to map SGML documents into database systems so that they can be searched efficiently.

- FC doubted that SFQL will offer many additional benefits. He suggested that it was too close to SQL, which had not been designed with the idea of accessing full-text documents in mind.
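[As a rough sketch of the architectural-forms idea (entirely my own illustration -- the attribute name and element declarations are invented, and HyTime's actual mechanism is richer than this): two DTDs can declare differently named elements which identify themselves as instances of one common form through a fixed attribute, so that an application keyed to the form name can process either instance without knowing the local generic identifiers.

   <!-- Fragment of a German-named DTD -->
   <!ELEMENT kapitel - - (titel, absatz+)>
   <!ATTLIST kapitel ArcForm NAME #FIXED chapter>

   <!-- Fragment of an English-named DTD -->
   <!ELEMENT Chapter - - (Title, Para+)>
   <!ATTLIST Chapter ArcForm NAME #FIXED chapter>
]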
4. "HyTime" -- Chair: Steve Newcomb (President, TechnoTeacher)

This session focussed entirely on HyTime (the Hypermedia and Time-Based Structuring Language). After an introduction to HyTime, the various business advantages and implementation issues were considered. The session closed with a paper comparing HyTime's space-time model and the space-time model used in physics.

4.1 "HyTime workshop" -- Steve Newcomb

Steve Newcomb (hereafter SN) said that HyTime had recently completed the final stages required for its acceptance as a full ISO/IEC standard. He recommended that all interested parties should approach their national standards bodies for copies of the standard. SN outlined the activities of SGML SIGhyper -- the special interest group of the International SGML Users' Group that concerns itself with hypertext and multimedia. He stated that anyone interested in joining SGML SIGhyper should contact either SN himself, or go through the International SGML Users' Group. SN spoke briefly about "HyMinder", a HyTime-conformant engine which is currently under development and should be in beta-testing by July.

SN then used a series of slides produced by Mary Laplante (for a presentation at TechDoc Winter `92) to outline the business advantages to be gained from adopting SGML. First, he considered why the development of something like HyTime was both desirable and inevitable. There was too much traditional information being published on paper, and it was also too complex, volatile and vital. SGML solved some of the problems by allowing documents to be stored and interchanged electronically, and facilitating their delivery on-line. However, SGML did not address the problem of interchanging documents written with different DTDs, and a new generation of on-line, interactive, multimedia documents is beginning to appear. HyTime offers "a set of internationally agreed conventions for interchanging arbitrarily complex online documents, that is neutral with respect to ... all multimedia base technologies, all other proprietary and nonproprietary technologies, and all user interaction metaphors".

SN briefly compared and contrasted the main concepts of traditional markup, SGML, and HyTime. He then discussed why HyTime is "hot", noting the following points: heterogeneous computing environments are the norm, but people want to be able to exchange information between environments/applications easily; most people recognize the advantages of adopting a recognized standard; there is a strong interest in, and demand for, on-line and interactive documents; HyTime's inherently object-oriented design makes it attractive and even trendy.

SN listed the benefits of using HyTime: data can be automatically ported to a variety of platforms; data represented in HyTime can survive technological evolution; producers and users of information can acquire the hardware and software that is most appropriate for their needs; HyTime data remain available for unforeseen future uses; the publishing cycle is shortened because there is no need for data translation.
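[To give a flavour of the scheduling facilities described in sessions 2 and 4.1, here is a much-simplified sketch of my own -- not taken from SN's slides, and with the extent markup elided. The elements identify themselves as HyTime architectural forms through the attribute shown; a rendition specification would then map the virtual units of the schedule onto real units (seconds, points, etc.):

   <fcs HyTime="fcs">          <!-- a finite coordinate space -->
   <evsched HyTime="evsched">  <!-- an ordered schedule of events -->
   <event HyTime="event">welcome.audio</event>
   <event HyTime="event">Title caption</event>
   <!-- each event would also carry its position and duration,
        expressed in virtual units along the axes of the space -->
   </evsched>
   </fcs>
]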
SN identified a number of groups who he believed would need to know about HyTime. These included software and hardware vendors, technical documentation professionals, in-house publishers, commercial publishers, authors (including educators, musicians, and games programmers), and consumers. He then gave some brief application scenarios.

SN next described some "real-world" HyTime applications. He discussed how HyTime would facilitate the CALS requirement for the electronic review of documents, and how it could support the content data model for the revisable databases required to produce interactive electronic technical manuals (IETMs). SN described the work of the Davenport Group to develop a HyTime-like meta-DTD to enable easier production, combination and sharing of on-line documentation. The rest of SN's presentation dealt with the prerequisites for the spread of HyTime and how to find more information about the standard, and looked at the potential for CD-ROM and other forms of electronic publishing.

4.2 "Space and Time in HyTime" -- Michel Biezunski (Consultant, Moderato, Paris)

The full title for this presentation was "HyTime: Comparison between the space-time model and the space-time model used in physics". Although Dr Biezunski began his presentation by saying that it was not going to be overly scientific or technical in nature, he was speaking from the point-of-view of a man who holds a PhD in Physics! Therefore, I will not attempt to summarize his presentation here, but suggest that interested readers contact Dr Biezunski (through the GCA) and ask for copies of his paper and transparencies.

5. AAP Math/Tables Update Committee

This meeting was chaired by Paul Grosso (ArborText), who fulfils this role on a voluntary basis; the rest of the "Committee" is composed of parties interested in the development of the AAP DTDs, and membership is open to all. The previous meeting had been held at TechDoc Winter `92, and the purpose of this gathering was to build upon earlier work and decisions, focussing primarily on math.

Paul Grosso reported that at the previous meeting, they had decided to take the DTD in ISO 9573 (part 7) as a starting point, and to revise the AAP Math DTD in light of this. They had also decided to look closely at the efforts of the European Workgroup on SGML (EWS) in relation to math. Anders Berglund (ISO) said that ISO 9573 (part 7) is supposed to deal with math and chemistry. Currently, the DTD looks the same as in ISO 9573:1988, but he proposed that the committee should aim to develop a base level math DTD which could be combined with a set of possible extensions to make it more suitable for the needs of disparate groups.

Eric van Herwijnen (CERN) said that he had already done a comparison of the existing versions of the AAP DTD and ISO 9573, to see how they handled math. He felt that it should be possible to produce a single DTD to cover the structured elements of math -- however, when he had discussed this with the people working on the Euromath Project they had resisted his suggestion. Euromath opinion was that it was impractical (if not impossible) to build a math DTD based on structure, and a presentation-oriented approach was the only method likely to succeed. Eric noted that this would mean that in practice there would be three math DTDs in circulation -- AAP, ISO 9573, and Euromath -- which would be bound to lead to confusion and dissatisfaction.
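[The structure-versus-presentation distinction is easiest to see in markup terms. The fragments below are my own illustration -- neither is quoted from the AAP, ISO 9573 or Euromath DTDs:

   <!-- Structure-oriented: the markup records what the object is -->
   <fraction><num>1</num><den>2</den></fraction>

   <!-- Presentation-oriented: the markup records how it is to be set,
        here as one expression stacked over another -->
   <stack><top>1</top><bottom>2</bottom></stack>

A structured encoding supports processing beyond typesetting (searching, computation, translation for special-needs readers), but demands that authors or their tools supply the extra information.]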
Taking up Anders Berglund's earlier remarks, Paul Grosso suggested that perhaps a base math DTD could be presentation-oriented, with the DTD extensions developed to suit the requirements of other types of processing. Klaus Harbo [?] (Euromath) said that their investigations had shown that capturing the semantics of math in a single DTD would be too difficult, if only because the subject is developing so rapidly. This had been the main justification behind their decision to adopt a presentation-oriented approach when writing their DTD. Even if it were possible, fully marking up the semantics of math would necessitate putting in too much markup (in terms of the demands placed on authors) -- and/or the semantics of the math produced by a given group might be "incorrect" in the opinion of the DTD designers (although perfectly acceptable to the group members themselves). Some participants commented that "too much" markup need not necessarily be a problem if it was automatically inserted into the text by authoring tools, and kept concealed from authors who did not wish to see it.

Jamie Wallace (Mathematical Braille Project) commented that the notes circulated by the ISO and AAP to accompany their DTDs were written largely on the premise that authors would be supported by sophisticated authoring tools (i.e. they will not have to put in much of the markup themselves). However, Jamie's particular concern was that many authoring tools seem to encourage a visually-based, presentation-oriented approach to markup, which not only makes them even less accessible to the visually impaired and print-disabled, but makes the automatic translation of math texts for groups with special needs much more complex. Paul Grosso said that some of this could be attributed to the different input and output forms represented by the tools concerned -- and, say, the spaces that are inserted into a visual presentation-oriented view of a text need not necessarily be stored in the same tool's internal representation of that text.

Tom Wesley (Mathematical Braille Project) told the committee about recent U.S. legislation requiring that all educational texts should be available to the blind. He noted the implications that this would have for the publishers of math text books, and the handling of math in electronic form. It was agreed that the next meeting should be held in conjunction with this summer's TechDoc conference [August 25-28th ?]. Prior to that meeting, Eric van Herwijnen and Anders Berglund said that they would hope to have some draft DTDs available for circulation and comment.

PROGRAMME (Tuesday 12th May)

6. "SGML: an ISO standard; an ISO tool" -- Anders Berglund (Senior Adviser, Electronic Publishing, International Organization for Standardization)

Anders Berglund (hereafter AB) described how the ISO operates, and the considerations that lay behind their decision to replace their traditional typesetting system. The ISO's main requirements were for a system that enabled fast, in-house production of documents with automated generation of indexes and tables of contents, a minimum amount of re-keying, and maximum support for the re-use of information. A system based on SGML seemed to meet all these requirements.

AB then discussed the design decisions behind the ISO's DTD. They needed a set of elements rich enough to permit production of current International Standards as hard-copy.
that a "Scope" section was included in every Standard), and also permitted the easy generation of "boiler plate" text. Where possible, the designers tried to use element names that reflected their information content, with the intention that this would simplify the subsequent production of secondary publications, database applications, and so forth. AB said that they were still exploring the best ways to handle precise links between related documents, and the tagging of tables to capture their logical content rather than their presentation form. AB described the existing methodology for producing International Standards, and the workload and throughput that this involves. He said that the probable future for the ISO would be an SGML smart data entry system for compositors, WYSIWYG formatting on workstations, and network access permitting electronic communication with project editors and secretariats. The ISO expects to gain by avoiding the current system of manual paste- up, and by automating the generation of numbering and referencing during the revision of documents. However, the ISO expects to see their major gains coming when documents are submitted with SGML markup according to the ISO's DTD -- since this will greatly speed up the production of secondary publications, and allow for re-use of information in multimedia publications etc. AB spoke of the requirements and goals for the tools which will be used by Project Editors and Leaders. He also talked about the increasing extent to which paper-based products will be supplemented by electronic products -- enabling the searching of on-line Standards documents, the production of hypertexts, and the creation of a full-text document database. AB called for cooperation and coordination amongst participating member bodies, to support the efforts of the ISO to produce a DTD and adopt SGML. He concluded with some remarks on granting external network access to the electronic texts of International Standards, and other forms in which electronic products might be distributed (e.g. CD-ROM, on-line access to computers in Geneva or distributed servers). 7. "SGML -- a patently good solution" -- Terry Rowlay (Directorate General, European Patent Office). Terry Rowlay (hereafter TR) briefly described the organization and function of the European Patent Office(EPO). The EPO has two major roles: (i) the searching, examination and granting or European Patents, and (ii) the subsequent dissemination of patent information. Two documents are fundamental to the first role -- the original Patent Application filed by an inventor (known as the "A-publication") and the granted Patent Specification document (know as the "B-publication"). The EPO currently publish a great deal of their information on CD-ROM. The size of an average A-publication is about thirty pages, and each Patent Application has to be compared against a search file of over 60 million documents (hence the need to automate the process!) The EPO deals with over 50,000 Patent Applications in a year, each of which requires additional documentation to support its passage through the approval procedure. In total, the EPO produces 88.1 million pages each year. In the early 1980's, realizing the problems that lay ahead and the vital need to automate the process of handling A- and B- publications, the EPO set up the DATIMTEX Project (DATa, IMages, TEXt). 
Current practice meant that the quality of the A-publication was totally dependent on the quality of the original application submitted by the inventor (which was highly variable). Although the quality of the subsequent B-publication was much higher, the A-publication was actually the document most widely used in the patent world. The EPO wanted the DATIMTEX Project to devise a means of producing a high-quality A-publication which was also available as an electronic document, complete with all the bibliographic and search report data that was required.

Two contractors (Rank Xerox U.K., and Jouve [France]) are responsible for capturing the text, images and data contained in the Patent Applications, and putting them into machine-readable, marked up form. TR described the procedures used by Rank Xerox, who have attempted to automate the processes of capture and markup wherever possible; manual intervention is only required to mark up irregular document components and complex tables, and for the inclusion of bit mapped images. The captured text is delivered to the EPO on magnetic tape, accompanied by an image tape containing all the associated embedded images and drawings not captured as text. (The image tape also contains page images of the final A-publication, which the EPO can re-use in its CD-ROM publications.)

SGML appealed to the EPO for the DATIMTEX Project because it gave them a machine-independent way to mark up the structure and content of captured texts, in a manner that facilitated the re-use of information. The World Intellectual Property Organization (WIPO) -- an umbrella organization for the world's patent offices -- adopted the EPO's SGML implementation, tag set and DTD in its own standard (WIPO ST.32).

The EPO still faces a number of problems. The documents they publish are highly technical, with a variety of tables, mathematical and chemical formulae, and special characters; the EPO uses a base character set of over 400 characters, but many inventors like to create their own! This problem has been overcome by additional tags which identify standard enhancements to existing base characters ("floating accents"), and common constructs used to combine base characters in new ways ("character fractions").

Tables constitute a high proportion of the text of patent applications, so the EPO had to devise satisfactory ways of handling them. When DATIMTEX began, studies on table markup using SGML were scarce; the approach devised by the EPO is based upon the markup of complex tables suggested by the Association of American Publishers (AAP), and allows the contractors to mark up 80% of all the tables they encounter. The EPO is now participating in efforts to develop more sophisticated approaches to table markup. Mathematics did not constitute a high proportion of patent applications, so the EPO has elected to adopt the tag set devised by the ISO (in preference to the AAP's). Capturing bibliographic information has been resolved by creating a set of SGML tags based on the WIPO International Identity (INID) codes. The results of [patent?] Search Reports are now also captured using a specially devised set of tags. However, the EPO has yet to find a satisfactory way of handling the markup of chemical formulae, or a means of producing a mixed mode display that shows the marked text with the associated graphics in-line. Overall, TR's feelings about SGML were very positive, and he felt that the EPO had made the right decision.
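[The "floating accent" and "character fraction" devices are easiest to picture in markup. The tag names below are my own invention -- the EPO's actual tags in WIPO ST.32 may well differ:

   <!-- Floating accent: a standard enhancement applied to a base
        character, here an "x" carrying a macron (bar) -->
   <fa acc="macron">x</fa>

   <!-- Character fraction: two base characters combined vertically
        into a new construct -->
   <cf><over>a</over><under>b</under></cf>
]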
"Encoding the English poetry full-text database -- applying SGML to thirteen centuries of verse" -- Stephen Pocock (Senior Projects Editor, Chadwyck-Healey Ltd). Chadwyck-Healey are a small, dynamic publishing company, about twenty years old. Their main market is libraries and academia, and they specialize in data publishing (and are consequently having to produce more material on CD-ROM). Stephen Pocock (hereafter SP) said that Chadwyck-Healey's decision to use SGML to code the poetry for their CD-ROM was fundamentally a question of economics. Using SGML meant that the data would never be obsolete -- which was not only a selling point to Chadwyck-Healey's clients (especially libraries), but justified the efforts of setting up the original database. SP said that their encoding scheme had had to satisfy two basic criteria: practicality and utility. They had to "..devise a method of analysing and recording the structures of poetry that could be applied consistently across the canon by intelligent editorial staff with appropriate training and guidance". The more detailed their encoding scheme, the more work would have to be done on marking up and interpreting the text; too much complexity would increase the costs at every stage in the production cycle. Their intention was to mark up four thousand volumes in three years -- which meant that the markup should not be too demanding to key, but it had to capture enough information to support searching and display of the data. The main bulk of the keying has been contracted to outside agencies. The utility of the marked up data was determined by the requirements of the user groups. Chadwyck-Healey took the decision to provide markup that met the needs of the generalist rather than the specialist, but which also did not preclude the subsequent incorporation of additional, specialist tags. They consulted the first edition of the guidelines (TEI-P1) produced by the Text Encoding Initiative (TEI), and also spoke to Lou Burnard (of Oxford University Computing Service, and one of the TEI co-ordinators). SP noted that most texts can be viewed as multiple heirarchies, and this is especially obvious with poetry -- which can be read simultaneously in terms of both metrical and narrative heirarchies. Particular types of poetry, such as dramatic verse, have additional levels of reading. The TEI solution to marking up multiple heirarchies in poetry involves the use of CONCUR, but Chadwyck-Healey felt that this would be difficult for them to implement. Instead, they have opted for a system that uses SGML attributes to define various types of poetry (for example, <poem type=prologue>). However, whilst it is clear that poetry operates on many levels, Chadwyck-Healey have not been able to identify a structural element below the level of the (metrical) line that regularly occurs in their data. In the process of marking up their data, Chadwyck-Healey have encountered a number of non-standard characters which they have had to encode as entity references -- although they have not yet decided if or how they should be displayed on screen. They have also deferred any decision on the encoding of `graphical' or `shape' poems, where the physical layout of the text on the original printed page is apparently intended to relate to the poem's content. SP noted that Chadwyck-Healey's decision not to include twentieth century poetry, (fortunately!) meant that they did not have to deal with the unusual typographical layouts of some more recent works. 9. 
"Is SGML Bad News to your Editorial Staff" -- Koen Mulder (Wolters Kluwer Law Publishers, Deventer) In his presentation, Koen Mulder (KM) examined how the introduction of SGML would change the traditional publishing process -- which KM suggested was product oriented (and tailored to meet the demands of the market). Changes in management (mergers etc.), and changes in the market (e.g. demand for new media), have caused changes in the management of publishing. Now there is a greater need to have more central information, more reusable information, and more information interchange; all these require a neutral information structure, and SGML is the obvious answer. KM said that when implementing SGML, it is impossible to foresee all the consequences, and so there is sure to be a certain amount of confusion. Although he could offer no patent solution for dealing with this confusion, KM suggested that managers should pay particular attention to the physcology of their organization, and to the adoption of good procedures. Any changes to existing routines are often perceived as a threat, so KM suggested that new procedures should be introduced step by step, always accompanied by education and training that focuses on users' applications and working methods. Changes in working methods should also be carefully introduced, particularly with regard to such issues as the separation of content and format, who decides on the structures to be encoded, who actually does the encoding, and who has to deal with file translation. Many people will also change their function, as new jobs and tasks are introduced; the traditional publisher will become a manipulator of information (rather than an information broker), text composition will become a more automated process, and an integrated application will require greater cooperation and open-mindedness from people. KM also raised the issue of information management which, following the introduction of SGML, becomes a much less tangible process. For example, how secure is information that is stored centrally but intended for easy re-use and interchange? Who will physically manage the information files? Then there is also the question of what to do with existing information which might be held in files that are inconsistent, format-oriented, and intended for different typesetting systems or applications. If this existing information is to be incorporated into the new system, how is this additional workload going to be dealt with (and paid for!)? KM also identified some questions to be considered when dealing with external suppliers -- such as should their role only be that of information input (and markup), or can/should they take some of the responsibility for information management? Also, assuming that a publisher has produced information that has been marked up with SGML, how many typesetters are actually capable of working with that information? KM concluded by suggesting that SGML is not a threat to editorial staff but a challenge to the whole organization. 10. "SGML in the Software Development Environment" -- Shaun Bagley (General Manager, Exoterica Inc.) Shaun Bagley (SB) set out to describe how Exoterica Inc. are using SGML as part of their internal software development environment. He stated that "SGML is a BNF-like language which allows its users to specify the structure of a language. However, unlike BNF, the user does not have to worry about the syntax of the language. That is completely managed by SGML". 
SB described SGML's function as a meta-language, saying "SGML is a language for describing arbitrary LL1 languages. The syntax and the grammar are designed from scratch by the user. In this sense SGML is a `BNF' for text-based languages."

SB proposed four "generations" of markup language. The first was characterized by typesetting codes and procedural markup, and the second by macros and generic markup. The third "generation" of markup languages included WYSIWYG, hypermedia and generalized markup -- whilst the fourth included "advanced language[s] designed for precise expression of particular problem[s]". SB remarked that whereas in the third generation, documents could be understood without access to their markup language, in the fourth generation the markup is an essential part of the document (which cannot be understood without it).

SB suggested that one of the major problems in software engineering related to the integration of programs and their documentation. Programs start out under-documented, perhaps because their design is incomplete, or because much of the detail remains in the engineers' heads. Programs also lose "synch" with their documentation, largely because whilst the programs themselves are maintained their documentation is not -- and once documentation has got out of synch, it becomes very costly to resynchronize it with the program code. SB said that there is a gap in the processing model because, for example, C compilers do not interface with the desktop publishing systems used to produce documentation, and the desktop publishing systems cannot easily be linked to the work of the compilers. He suggested that much of what is now perceived as coding is in fact documentation, e.g. data structures, interfaces, and control logic.

SB identified his key considerations for a good software engineering environment, and the benefits of using a fourth generation "advanced language", namely: "The software can be structured from the point of view of what it does, rather than how it is built...Design documentation (the what) and Program documentation (the how) talk in common terms without duplication...a finer level of granularity [is encouraged] through a description of what a system does....[an advanced language] allow[s] assignment of real names to objects early on in the development cycle".

When Exoterica decided to develop a new parser (XGML Kernel), they chose to do so using their own advanced language to abstract the design and development procedure. They developed the Exoterica Coding Language (ECL) as a precise and concise language intended for a specific purpose, and thus inappropriate for other uses. ECL was an application of SGML and OmniMark (Exoterica's SGML translation tool), and it was used to describe the following: names of procedures, data types and non-local variables; the form in which procedure arguments and types are defined; the form in which data types and data structures are defined; the form in which variable and constant names and values are defined; the format of comments.

SB concluded by outlining the business advantages to be gained from developing advanced languages such as ECL. Concurrency of documentation and program code is maintained. Communication is carried out in terms that fit the problem, not the solution. Risks are reduced, because both developer and client can be clear when sign-off targets have been reached. Products can be developed with greater speed. Savings can be made that cannot be achieved by traditional development methods.
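[ECL itself was not shown, so the following is purely conjectural -- a sketch of my own suggesting how an SGML-based coding language of the kind SB describes might mark up one of the items he listed (procedure names, arguments and types); every name here is invented:

   <procdef name="push">
   <arg name="stk" type="stack">
   <arg name="val" type="int">
   <purpose>Add VAL to the top of STK.</purpose>
   <body>... implementation ...</body>
   </procdef>

From such an instance, an OmniMark-style translator could generate both compilable source code and formatted documentation, keeping the two in synch.]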
"Implementing SGML at Ericsson Telecom: two perspectives" -- Peter Dybeck (Project Manager, Docware Development, Ericsson Telecom AB), Nick van Heist (Technical Consultant, TeXcel AS AB) This presentation gave both the client's and the developer's perspective on implementing SGML at Ericsson Telecom. Peter Dybeck, who was to give Ericsson's (i.e. the client's) half of the presentation was unfortunately unable to be present, so his place was taken by his colleague Helena Antback[?](HA). In February 1991, Ericsson decided to use SGML for text creation and interchange. They set up two nine month projects to investigate the definition of the DTDs and the file conversions that they would require. They started by designing one DTD for test purposes, but eventually decided to produce five DTDs to cover all their document types (i.e. "one major procedural document type, two source document types, and two auxiliary document types"). File conversions fell into two main types: the conversion of old procedural documents from non-structured forms into Ericsson's own SGML DTDs, and the conversion of source documents produced using the Ericsson Document Markup Language (EMDL) into the same target DTDs. HA admitted that they had expected the conversion of old documents to be difficult -- but it turned out to be even more difficult than they had anticipated. She added that they had expected the conversion of EMDL documents to be quite straightforward -- but it became extremely difficult because of the large number of documents involved, and the highly variable quality of the EDML markup. HA then summarized Ericsson's experiences. Technical writers must take part in designing the DTD, FOSI and working environment from the beginning. Introducing a SGML and a new desktop publishing tool affects the whole organization. The conversion of non-SGML documents into highly structured DTDs is very difficult and requires manual clean-up. An accurate FOSI is difficult to create. The entire working environment, the DTDs and the FOSIs require extensive testing by experienced staff. She added that they had not found any short-cuts on the way to a successful SGML implementation (trial and error, flexibility and hard work are the only ways to achieve this). Also, knowledge of computer systems, the SGML standard, and preferably the FOSI standard are necessary in order to implement SGML. HA concluded by saying that Ericsson felt that SGML requires a good deal of work in the initial stages but is very rewarding in a long-term perspective; it creates brilliant possibilities for handling textual information in a worldwide company. Nick van Heist (NvH) began the second half of the presentation, by outlining the project requirements given to the consultants (TeXcel). More than one hundred writers had to be moved from a WYSIWYG publishing tool to an SGML authoring system; these writers knew little about SGML and were happy with their existing system. TeXcel were being asked to provide an introduction to SGML, develop an SGML application, and transfer knowledge about the system's development and maintenance all within a short timeframe. NvH noted that the following design goals: to migrate to a new system whilst maintaining productivity; to make it easier to handle structured data; to automate an environment for shared and generated data; to use internationally accepted standards (e.g. SGML, FOSI, and PostScript). NvH had identified three keys to the successful completion of TeXcel's task -- all stemming from Ericsson's management. 
Management were fully committed to the project and were prepared to be flexible because of the new technology that was being introduced. Management were also prepared to allow access to end users. TeXcel had originally relied on project managers as intermediaries through to the writers, but as direct contact with the writers evolved, they found it possible to get more accurate information about user requirements, and early testing of and feedback on the application. NvH said that interaction with the writers had shortened development time, produced a system that was better suited to the writers' environment, and encouraged the writers to feel more involved with the system (leading to its easier acceptance).

TeXcel's overall approach had been to analyze both the easy and the difficult aspects of the existing system, and then to concentrate on making the difficult aspects easy in the new SGML system. When TeXcel began to train users for the new system, their initial approach focused in general terms on SGML and the authoring system; real applications were not part of the training, and writers were not made aware of the reasons behind the decision to move to SGML (which led to additional reluctance to change). As the project evolved, TeXcel found that using real applications made the training more meaningful, as it gave the writers the specifics that they needed to relate to. Informing the writers of the reasons for using SGML motivated them to learn.

NvH summarized what TeXcel had learnt from their experiences with Ericsson in four points: (i) involve end-users early in the development of the application and the system, (ii) shorten the chain of communication from application developer to end-user, (iii) make training specific to the application, (iv) help end-users to understand the benefits of SGML and why they need to use it.

12. Reports from SGML Users' Groups Chapters, Special Interest Groups (SIGs) and Affiliates -- Chair: Pam Gennusa (President, SGML Users' Group)

This was an open session in which people were encouraged to give short summaries of their work with SGML, or the activities of their local SGML User Group or particular SIG. Many people spoke, and I apologize in advance to anyone I may have overlooked.

12.1 The European Workgroup on SGML (EWS) -- Holger Wendt

Holger Wendt (HW) described the rationale behind the setting up of the EWS, and the results of their work. They had begun by producing the MAJOUR Header DTD, which had been made available for public comment at International Markup `91 in Lugano. Much of the early work had been achieved by a collaboration of Dutch and German companies involved in publishing and typesetting. Following Markup `91 the EWS had become increasingly multi-national, and those involved had gone on to work on producing a Body and Backmatter DTD which could be integrated with the Header. HW predicted that a complete DTD would be available for comment by the summer. The EWS had followed the example of groups such as the AAP and TEI, and established sub-committees to look at problem areas such as the markup of complex tables and equations, document conversion, and DTD documentation and maintenance. HW reported that some of the publishers involved in the EWS have already begun to use the Header DTD.

12.2 The SGML Project -- Michael Popham

I reported on the work of the Project since its inception in May 1991.
Set up to explore, encourage and support SGML use throughout the UK's academic and research communities, The SGML Project is now halfway through its allotted lifespan of two years. The Project collects and disseminates information on SGML and its use in a variety of ways: by reviewing and evaluating the available products, reports and documentation (although this does not involve any formal testing); by offering a free programme of lectures, seminars and workshops; and by offering support and consultation (on a limited basis) to users of SGML.

We are in the process of building an archive of SGML (and related) materials -- such as parsers, translators, DTDs, entity sets, reports, etc. -- and making these available to anyone on the UK's academic network (JANET); the host machine is accessible via anonymous ftp over INTERNET, and The SGML Project is keen to foster links with the world-wide SGML community. For people who have difficulties with their network connections, we will also distribute material on disk, tape, or paper, although we must give preference to our target communities, and we may have to recoup the costs of supplying to other users. The SGML Project is also creating a database of users (both within and without the UK), and this is proving useful when evaluating software, books etc., because it gives an impression of what users want to do with SGML, what hardware they have available, what users are currently working on, and so forth.

As hoped, awareness of The SGML Project within the UK is growing fast. Many people are coming to SGML (and thence the Project) by diverse routes, from across a tremendous range of subject areas and backgrounds. Most people who encounter SGML seem to become quickly aware of its inherent advantages -- however, many of these same people are put off actually using SGML because of the initial problems involved in its implementation. Most people seem to want to take advantage of SGML without having to get too involved in learning about SGML itself. They are seeking software that does what they want at a price they can afford. Many want to see working examples of SGML-based work being done in their particular area of interest before they are prepared to make a full commitment to SGML. Many potential users also want to get hold of examples of good DTDs, sample marked-up documents etc., so that they can begin to really grasp for themselves what SGML is all about.

The SGML Project's terms of reference are exceptionally broad. The UK's academic and research community consists of more than one hundred institutions -- which represents a potential user base of over 400,000 people! Happily, usage of SGML (and related standards such as HyTime and SMDL) is developing fast, although this makes it hard to stay abreast of the latest developments, such as work on DSSSL, SGML-B, or the activities of the Davenport Group; ISO 8879 is also undergoing its five-year review. There is clearly a great deal of interest in SGML, and a growing requirement for the sort of information and advice service offered by The SGML Project -- however, initial funding will be exhausted by May 1993, and unless additional sources can be found it is likely that the Project will be wound down.

12.3 CALS Update -- Michael Maziarka

Michael Maziarka (MM) outlined the previous goals for CALS, as given in MIL-M-28001: it was paper-based, offered a number of DTDs (e.g.
MIL-M-38784B, a general purpose DTD, and 11 other military specification DTDs), detailed a baseline tag set from which CALS DTDs must be built, and gave output specification[s]. The problems with the approach adopted in MIL-M-28001 were that it did not take into account the movement towards the use of electronic displays, that there has been a proliferation of DTDs, that the size of the specification is increasing beyond reason, and that DTD modifications require too long a process.

MM said the goals of MIL-M-28001 would be revised to offer a guide for developing SGML applications (catering for both paper and electronic display, and detailing use of the CALS SGML Library and CALS SGML Registry), an SGML Toolkit (for handling architectural forms, providing electronic review capabilities, and supporting delivery of partial documents), and output specification[s]. MIL-M-28001B, which is currently under review, offers most of these new goals, and includes 800 additional elements that have been added to the baseline tag set, 14 U.S. Air Force DTDs and 2 U.S. Navy DTDs.

MM discussed the problems of submitting a partial document (as opposed to an entire document). A partial document might consist of an interim deliverable or a change package, where the transmitted material contains only the document hierarchy and element attributes which indicate inserted, deleted or changed data. Taking advantage of SGML's CONREF feature, a "stub" attribute has been added to %bodyatt; so that when an element's stub attribute is set equal to "stub" the element is EMPTY.

MM then spoke about the CALS approach to handling the electronic submission of comments. A DTD fragment for delivering review comments in an SGML syntax has been developed, such that comments may be submitted either separately from a document or within it. The comments are related to the original text by reference to element IDs. For every comment it is possible to identify the comment source, provide a unique ID for future reference, record comment priority, classification and category, and also comment disposition [?].
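By way of illustration, the following sketch shows how these two mechanisms can work together; the element names and attribute values are hypothetical, invented for this report, and are not the actual MIL-M-28001B declarations:

   <!-- "stub" is declared with #CONREF: whenever the attribute is
        specified in an instance, the element is treated as EMPTY -->
   <!ELEMENT para    - O (#PCDATA) >
   <!ATTLIST para    id    ID    #IMPLIED
                     stub  CDATA #CONREF >

   <!ELEMENT comment - - (#PCDATA) >
   <!ATTLIST comment refid    IDREF #REQUIRED  -- ID of the text at issue --
                     source   CDATA #IMPLIED
                     priority CDATA #IMPLIED
                     disp     CDATA #IMPLIED >

   <!-- Partial delivery: the hierarchy is transmitted, content withheld -->
   <para id="p42" stub="stub">

   <!-- A review comment, related to the original text by its element ID -->
   <comment refid="p42" source="USAF" priority="routine" disp="open">
   The torque values in this paragraph disagree with the parts table.
   </comment>

Because a parser knows that an element with a specified CONREF attribute is empty, a change package can carry just the document skeleton, with full content supplied only for those elements actually being inserted or changed.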
The CALS SGML Registry (CSR) has been set up to provide the authority and process for reviewing and approving DTDs, elements, and their use. The CSR will also establish the CALS SGML Library (CSL), which will create a common repository for SGML objects, including DTDs, elements, attributes (not values), entities, output specification elements, FOSIs, and fragments. The SGML Registrar will carry out a number of tasks, including the following: standardizing SGML tagging, providing the authority to determine CALS-compliance and review emerging DTDs (for conformance to MIL-M-28001 and the CSL, not general soundness), and providing a common site and management that is independent of any of the armed services. The CSL provides on-line access to approved DTDs, elements, attributes etc., and demonstrates guidelines for applying SGML to CALS applications. It will also facilitate the development of new DTDs and help to avoid any duplication of effort.

The registration of SGML objects at the CSL will involve a number of issues. For example, naming conventions will require the use of consistent, short, meaningful names, and careful terminology; aliases must be used to ensure a one-to-one relation between objects and concepts; a decision must be taken on whether to use structure or content tagging; and similar content/structures across the services should be named and described generically (whereas unique content/structures should be named specifically).

MM concluded that MIL-M-28001 will become an SGML Toolkit which will support partial documents, electronic review, and paper and electronic display applications. The CALS SGML Library and Registry will provide a guide for developing new applications, and a central repository for information on SGML applications. Much of the work taking place under the CALS initiative should prove useful in other, non-military, contexts.

12.4 Dutch Chapter, SGML Users' Group -- Jan Masden

Jan described the work of one of the oldest and most active Chapters of the international SGML Users' Group. They have frequent and well-attended meetings, and have become closely involved with the work of the European Workgroup on SGML (EWS).

12.5 Norwegian Chapter, SGML Users' Group -- Jan Ordahl

This Chapter had recently celebrated its first birthday (its appearance having been announced at last year's SGML Users' Group AGM). Like the Dutch, they seem to be very active and well-organized, with frequent meetings forming part of their agenda.

12.6 SGML for the print-disabled -- Tom Wesley

Dr Tom Wesley (TW) has become closely involved in various international efforts to improve information access for the print-disabled. Although he is particularly concerned with readers who are visually impaired, he is also aware that "print-disabled" is a catch-all term that includes people who are dyslexic and those who find it difficult to use traditional paper documents.

TW told how he came across SGML almost by accident, when investigating the ways in which structured electronic texts will enable visually impaired readers to use documents in ways that have previously only been available to sighted readers. For example, sighted readers are used to being able to scan whole documents -- or crucial parts such as tables of contents and indexes; they can also take advantage of multi-column texts, tint-boxes that highlight key information, and explanatory diagrams, follow cross-references, and so on. Much of this information, and the benefits to be gained from certain forms of presentation, are lost when an existing text is re-keyed into a braille edition. Moreover, the page numbers in the braille edition bear little or no relation to the original printed text, which makes it difficult for a blind scholar to discuss a text in detail with sighted colleagues, or for a blind student to turn to the same page in his/her textbook as the other students in the class. There are also tremendous difficulties to be overcome when rekeying special text such as maths, chemical formulae, complex tables etc. for a braille edition -- not least because many countries have developed their own national schemes for transcribing maths, which prevents a maths textbook produced in American braille from being used in a school in the UK! Maths is an international language, but unfortunately the same cannot be said for its braille transcription. TW is hoping that the use of structural markup, such as SGML and ODA, will offer a route to the automatic transcription of conventional texts; a sketch of the idea is given below.
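As a purely illustrative sketch (these element names are invented for this report, and are not drawn from any DTD that TW mentioned), a single structured source might serve both editions:

   <!-- One structured source; two renderings -->
   <section id="s4">
   <title>Newton's Laws</title>
   <p>The second law is <formula notation="tex">F = ma</formula>;
   measured values are given in <xref refid="s4-tab1">.</p>
   </section>

A print formatter lays this out visually, while a braille production system could transcribe the formula element by rule into the national maths braille code, and resolve the xref against the braille edition's own page numbering -- precisely the synchronization between editions that is lost when a text is simply re-keyed.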
He is particularly interested in the work of groups such as the AAP and EWS, which are seeking to standardize the markup of a broad range of texts. TW is keen to raise awareness of the issues of print disability amongst those currently involved in designing markup systems for texts.

12.7 French Chapter, SGML Users' Group -- Michel Biezunski

This group was still in the process of setting itself up. Although there are only about thirty members at present, they expect to have nearly one hundred. A programme of meetings and events was being planned.

12.8 SGML Forum of New York, SGML Users' Group -- Joe Davidson

The SGML Forum of New York is effectively a Chapter of the SGML Users' Group operating under a regional, rather than a national, name. Joe Davidson listed some of their recent events, and outlined their plan to possibly establish their own electronic bulletin board, managed by a local university.

12.9 SGML SIGhyper, SGML Users' Group -- Steve Newcomb

Steve Newcomb (SN) is Chair of SGML SIGhyper, the SGML Users' Group Special Interest Group on Hypertext and Multimedia. HyTime was approved as an International Standard only two weeks prior to International Markup `92. It is the only International Standard for multimedia that is currently available. SN noted with approval that HyTime is perhaps the only International Standard which is actually ahead of existing technology! SGML SIGhyper aims to provide "one-stop shopping" for those seeking information on HyTime, wanting sample documents etc., and supports two anonymous ftp sites, at Florida State University and the University of Oslo. SGML SIGhyper does not have meetings, but is intending to produce a regular newsletter (of which the first issue is already available), and keeps in touch with most of its members through electronic mail (email).

SGML SIGhyper has also provided some support for the work of the Davenport Group -- a collection of major hardware and software manufacturers and vendors who wish to collaborate on producing on-line documentation (cf. UNIX "man" pages). They have had considerable success encouraging the Davenport Group to use many of the features outlined in the HyTime standard. Members of SGML SIGhyper are now turning their attention to work on another prospective International Standard, SMDL (Standard Music Description Language), which is currently still at "draft" status. In some ways this is ironic, since HyTime originally grew out of work that was being undertaken as part of the development of SMDL!

12.10 UK Chapter, SGML Users' Group -- Nigel Bray

Nigel Bray (the acting secretary for the UK Chapter) outlined recent activities to revive one of the oldest Chapters in the SGML Users' Group. During the late 1980s, activities of the UK Chapter were synonymous with those of the international SGML Users' Group, but initial enthusiasm for the Chapter had tailed off -- apparently because of the slower-than-anticipated uptake of SGML following the release of ISO 8879 in 1986. At the first meeting of the revived Chapter, there were about 90 attendees from a variety of backgrounds (publishing, academic, software houses etc.). A meeting was scheduled for 6-8 weeks' time, which would allow for elections for various posts on the committee, and serve as the first AGM.

13. 1992 AGM SGML Users' Group -- Chair: Pam Gennusa (President, SGML Users' Group)

A full account of this meeting will probably appear in a forthcoming edition of the SGML Users' Group Newsletter, so I will not cover it in great detail here.
Various formal reports were made, and for further details I would direct readers to the Newsletter.

The setting up of some sort of foundation or trust was discussed, to make good use of the excess monies arising from SGML Users' Group (SGMLUG) members' fees. It was suggested that SGMLUG members could apply to the foundation/trust to cover the cost of organizing workshops, sending key people to crucial meetings (of the ISO, AAP, EWS etc.), funding conference speakers such as academic researchers, and so on. The SGMLUG will investigate this.

Possible restructuring of the SGMLUG was discussed. For example, the SGMLUG secretariat -- based at Swindon in the UK, with Pam Gennusa -- could become even more secretarially-/support-oriented than it is at present. The exact relationship between the various national Chapters and local groups, the Special Interest Groups (SIGs), and the SGMLUG secretariat is still unclear. Pam Gennusa said she would be interested to receive any comments or opinions on the nature of these inter-relationships, or on if/how things could be better organized.

Brian Travers asked representatives from all the Chapters, groups and SIGs to forward announcements of meetings to him, for inclusion in the journal <TAG>. This would help to ensure that many people (even those who are not members of SGMLUG) would be aware of what events are taking place.

The next meeting of SGMLUG is scheduled to coincide with the SGML'92 Conference, which will be taking place in Danvers (MA) towards the end of October.

PROGRAMME (Wednesday 13th May)

This was the first day of the joint conference, when International Markup `92 would coincide with the first Documentation Europe `92. I was only scheduled to attend for the first day of the second conference.

14. Keynote address -- Ken Thompson (Commission of the European Communities)

Ken Thompson (KT) began by outlining the implications of the changes arising from the introduction of the single European market in 1992. There will be free movement not only of goods, services and people, but also of information. Information is the key to making the single European market work; for example, tax information must flow freely between states in order to encourage the free movement of goods and services. Information must be able to get to the right place, at the right time. Open Systems technology would seem to be the key to entering this paradise of information sharing, but people seem strangely reluctant to take it up!

Manufacturers want to move goods and services. For many, moving the information about goods and services is simply a side-line -- even if it is one required by European law! Such people need expert advice, good software, and the like to make the free movement of information possible and easy to implement. With the assistance of the Commission of the European Communities (CEC), the various Public Procurers have been able to produce a European Procurement Handbook for Open Systems (EPHOS). This gives advice not about the learning, the standards, or the technology required for the movement of information, but rather about how to specify the correct conformance to the standards in procuring the desired information-handling functionality. The work of EPHOS is still only just beginning.
The next phase will provide advice on the use of SGML and ODA, and it will also link into other EC initiatives, such as the TEDIS programme's work on Electronic Data Interchange (EDI) and the guidance on information publication originating from the Open Information Interchange (OII) project (which is part of the IMPACT programme). Use of EPHOS is not limited to the Public Procurers, and work is underway to ensure its usefulness to smaller private companies, universities etc. The first EPHOS handbook will be available in all the EC languages and distributed throughout the EC nations; it is a practical handbook that helps people to conform to legislation -- it is not designed to give the legislation itself.

The timely establishment of a common information interface between all the sources of information and the information distributors is a key issue. SGML will clearly play a major role because of its ability to allow the authors of information to make the structures of their information explicit -- thereby allowing it to be easily re-distributed in a variety of forms, such as paper, on-line, CD-ROM, etc. KT suggested that, in general, SGML DTDs are becoming too complex, and that this is delaying the uptake of SGML. He said that he would like to see simplified, more accessible forms of SGML, such as more standardized DTDs.

15. "Technical Trends Affecting Decisions about Documentation Generation" -- Andrew Tribute (Consultant, Seybold Limited)

Andrew Tribute (AT) began by asking the question "Should (electronic) documents be static?". This question in turn raises a number of other issues to do with document design: for example, whether documents should only contain fixed data, or whether they should be dynamic to meet readers' needs; whether documents should only be passively read, or whether they should be interrogated/listened to/watched; and whether documents should assist users to find the information appropriate to their tasks.

AT then turned his attention to document formats. Paper is expensive and time dependent. Documentation distributed as soft copy (i.e. viewed on-line) has some advantages over paper, particularly if the reader has access to a massive selection of documents and does not require the application that was used to create the document in order to read it. Other varieties of document, such as multimedia and email, also have advantages over traditional paper documents. CD-ROM offers a format that can accommodate almost all other data types, and is becoming a low-cost means of mass document distribution.

AT then discussed the way technical trends have affected the types of data held in documents. Originally, the only data in electronic documents was text, but more recently it has come to include monochrome and colour photographs, vector images, sound, animation, and video. He described how Kodak's Photo-CD technology will provide a cheap, mass-market means of storing high-quality colour image data on CD-ROM. Standards such as JPEG and MPEG will facilitate the storage of still and moving colour images. AT anticipated that multimedia technology will capture an increasing share of the market for information publishing, as the general public starts to demand more multimedia products.

AT observed that if documents are going to be electronic, then users will need to have the correct tools to be able to read and interact with them. Moreover, he stated that the ability to read/interact with a document should not require readers to have access to the same software or hardware as was used to create the document.
It must also be possible to create universally readable documents from a variety of separate applications. AT briefly discussed Adobe's "Carousel" product, which allows documents saved in Interchange PostScript to be read, annotated etc. (but NOT edited); this appeared to be a first step towards using editable PostScript as an interchange format. AT also mentioned Interleaf's "WorldView" and "WorldView Press", which allow documents produced in many different formats to be indexed, hyperlinked and book-marked, and then made available in finished format on a wide variety of computers; users can read, annotate, print, perform information retrieval, and follow multimedia document links.

In his closing remarks, AT agreed with the previous speaker that SGML needs to be made more accessible. He noted with regret that many suppliers to the mass publication market -- companies such as Microsoft, Apple, Aldus, etc. -- were not represented at either of the conferences. He restated his belief that colour will play an increasingly important role in the next generation of publishing products to be released in the mainstream/mass market.

16. "Providing a strategy for the practical implementation of document management systems" -- Martin Pearson and Graham Bassett (Rank Xerox Limited)

Martin Pearson and Graham Bassett (MP&GB) looked at the implementation of document management systems from a strongly business-oriented point of view. They saw documents as being a vital part of any business, forming the link between organizations and their customers and, after staff, the single most important corporate resource.

MP&GB made a comparison between a typical technical publication and more general forms of "communication". A technical publication typically represents high value, little investment in technology, is created by skilled workers (often in a vacuum), and has low visibility. By contrast, most "communication" has low inherent value, represents a massive investment in technology, is created by most people (and often duplicated), and has high visibility. They also briefly looked at the issues surrounding the management of technical publications, noting such points as: standards such as SGML are often hard to grasp; some information has "critical value" for the company; document management is not a boardroom problem (and so is invisible); the recession inhibits investment in new processes; and current "cut and paste" document-processing technology is inappropriate for supporting integrated document databases.

MP&GB then looked at some case studies. A cycle time and cost reduction exercise in financial document administration, carried out internally at Rank Xerox using their TQM [Total Quality Management ??] approach, had resulted in the cycle time being reduced from 112 days to 1 day, 4 hours and 43 minutes for new products! A UK financial institution had asked Rank Xerox to streamline their process of producing documentation to support the personal financial services that they offered. Rank Xerox's three-month study had resulted in a 25% reduction in production costs with no investment in technology. They had also made seventy-five recommendations for the future, of which seventy were to do with process improvement (representing a potential saving of three million pounds), and only five to do with the technology required to support these processes.

Summarizing Rank Xerox's experiences, MP&GB noted that vision is essential to drive the document strategy.
Whilst millions are spent on technology, very little is spent on people and developing their skills. Shortage of skills is the primary factor limiting the successful introduction of new technologies. It is essential to achieve the correct balance between focusing on standards/technology and the key business processes driving document production.

17. "Technical information as an integrated part of a product or a function" -- Goran Ramfors (Marketing Director, Telub Inforum AB)

Goran Ramfors (GR) suggested that those who realize the value of efficient information handling, recognize the seriousness of the current situation, and are prepared to take action, are those most likely to crack open what he called "today's biggest piggy bank".

GR said that at the product level it is vital to integrate hardware, software, and "docware". Docware adapts information to the users' needs and presuppositions; its aim is to make technical information minimal, comprehensible, effective, accessible and easily updateable. GR cited two studies: the first showed how ambiguous, out-dated or unavailable documentation caused 50% of all unplanned breakdowns in a sample of 24,000 high-tech systems in the US; another study had shown that McDonnell Douglas engineers wasted 70% of their time either searching for information or correcting mistakes arising from the use of incorrect or out-dated information.

At the development and production level, GR pointed out that most recent activities have concentrated on improving external efficiencies, whereas studies suggest that about 70% of all internal technical information should be re-useable within the company. In fact, only about 2% of the available technical information is re-used in this way -- which has obvious financial implications. Nowadays, there is a trend towards improving internal efficiency to reduce duplication of work, re-use existing information, and shorten production lead-times. This approach has produced great cost savings, but the crucial factor in its success is the creation of an infrastructure for total, integrated, document handling.

GR said that there is a need to integrate other, non-paper media into the document-/information-handling process. Technical developments in the multimedia field are moving rapidly, and our everyday lives are increasingly influenced by the use of such things as optical disks for storage, computers and screens as information conveyors, hyperfunctions and free-text searching for effective access, graphical user interfaces, sound, and both interactive and animated video. GR claimed that with multimedia technology we can present information in a more logical and structured format, which often makes it easier for the end-user to find and transform just the information needed to perform an activity.

Moreover, there is also the problem of having to cope with an ever-increasing volume of information. For example, in 1994 a company that wants to service the entire range of BMW cars will require access to 500 A4 binders; even if the company can physically accommodate these files, technicians will probably find it difficult or time-consuming to find all the information that they want. Handling the same volume of information digitally is the only sensible option.

Before commissioning a multimedia solution to any information management problem, it is important to consider the end-user's working environment and conditions.
The user must be able both to work efficiently with the medium and to want to do so -- this necessarily includes structuring the information database in such a way that it is easy for the end-user to retrieve and use the information s/he needs. The solution must also be adapted to the end-user's skill and experience, to avoid delivering more information than is necessary. Every multimedia solution must be based on good information solutions.

GR asserted that it is the combination of skilled technicians, the right multimedia tools and information experts that is the key to achieving such factors as faster troubleshooting, shorter downtimes and repair times, efficient information retrieval and updating, happier clients and more motivated staff. Achieving these will give a business lower costs and a competitive edge. Savings to be made from increased efficiency in information handling arise from a strategy based firmly on standards, and should include the following aims: to re-use current information; to secure the communication between information islands in the organization; to reduce time and costs for production, handling and distribution; to increase the information quality by adapting it to suit the users; to strengthen the resource capital when people leave the organization; and to secure investments for the future.

As an example, GR discussed the US Department of Defense's (DoD) CALS (Computer-aided Acquisition and Logistics Support) initiative. He claimed that CALS will inevitably have an influence on our choice of products, systems solutions and standards (since many of the major vendors will be trying to achieve CALS compliance). GR said that standards will be critical for those

* working in an industry that has decided to follow the CALS strategy
* who have a requirement to exchange information (both internally and externally)
* who continuously have to verify their database
* who are working with large volumes and/or have a high updating frequency
* who want to secure their investment for the future

GR reminded his audience that the emerging standards are only solving the problems of information management, not those of information quality! Carrying through an integration strategy for technical information within any organization is not easy, and not free of costs and effort; the key to success lies with top management. If their interest cannot be captured, and they cannot be made to understand the importance of taking strategic decisions, the risk of bad information management, increased costs and decreased competitive strength is obvious.

18. Documentation as a support product -- Chair: Nick Arnold (OECD)

Nick Arnold opened this panel session by saying that many of those attending the conference recognized the need and role for documentation as a support product. The purpose of the session would be to hear some case studies, and to consider the best way to go about achieving this. Fundamental questions would be raised, such as "What are the processes and procedures that must be put in place to ensure that documentation is synchronized with a manufactured product?" and "What technology is needed to achieve this goal?".

18.1 "Synchronization of documentation and training for telecommunications products" -- Douglas N. Cross (Member of Technical Staff, Technical Support Switching Group, Bell Communications Research Inc.
(Bellcore))

Douglas Cross (DC) began by stating that it is important that a supplier document each product from the conceptual phase, through the design and manufacturing phases, to the time that the product is placed in operation by a customer -- and then on an ongoing basis, so that changes in product configuration, functionality, and maintenance continue to be adequately addressed in product documentation and training. That is, changes in the product should trigger continuous synchronization of changes in related documentation and training.

Bell Communications Research Inc. (Bellcore) produces certain Generic Requirements documents, which are used by Bellcore's Client Companies (BCCs) to promote the creation, provision, and maintenance of high-quality, standardized, and synchronized documentation and training for their use. Generic Requirements documents start off as Technical Advisories (TAs), which undergo a formal process of comment and review before becoming Technical References (TRs). TRs address the generic needs of a typical BCC in areas such as product functionalities, technical capabilities, features and services, and product support (documentation and training).

Bellcore has two Generic Requirements documents that are designed to ensure that documentation and training remain synchronized with the products, and that they meet appropriate standards and quality controls. The response of suppliers to these requirements has been overwhelmingly positive; having the requirements enables suppliers to know what is expected of them (and this encourages them to take an active interest in, and comment on, TAs). Suppliers also appear more prepared to commit major resources (money and people) to improving their documentation and training to meet Bellcore's Generic Requirements; many recognize that good documentation and training help sell their products. Bellcore is actively involved in several national standards committees, in order to encourage the adoption of sound documentation and training practices.

BCCs are very concerned about the usability of documentation, whether on paper or as Electronic Document Delivery (EDD). If documentation is sufficiently usable, BCCs may be able to reduce their training requirements, and make cost savings as a result. Whilst BCC network maintenance documentation users need to store, access, retrieve and search documentation quickly and easily, they prefer to leave the technicalities of creating and using SGML documents to the document producers. BCCs place great importance on linking documentation and training support to the service life cycle of the products and services they purchase. They expect suppliers to provide direct documentation and training support, or to assist in obtaining these from a third party, or to enable the BCCs to provide these themselves.

DC then turned to the issue of "bridging cultural policies", by which he meant not just handling the differences in international cultures, but also those aspects of corporate and user/provider cultures as well. On the international front, DC reminded the audience that the way things are done in the supplier's country may not meet the needs of a customer in another country. Suppliers should be receptive to customers' needs, regardless of the countries involved; other countries might have their own Generic Requirements documents, and suppliers should take notice of any international and/or inter-industry user groups where documentation experiences are shared.
At the corporate level, suppliers have different approaches to meeting BCCs' needs for documentation and training. DC emphasized the importance of liaison with suppliers, and the need to ensure that those who will be responsible for maintaining the documentation are involved at an early stage in negotiations to supply initial documentation and training. DC suggested that users, providers, and those involved with producing Requirements and Standards should be brought together to achieve a number of goals: for example, to promote Electronic Document Delivery, to bring down the costs of producing and purchasing documentation, to address documentation needs that may be unique to each country/industry, and to ensure that everyone's needs are met. Bridging cultural policies is good for all concerned: suppliers gain by greater acceptance of their products (i.e. increased sales); customers gain greater operating efficiencies and cost savings; and suppliers, customers, and requirements/standards groups all gain by having requirements and standards for documentation and training that are more widely acceptable.

DC concluded by raising a number of questions that suppliers and customers could usefully ask themselves. Suppliers should consider whether their documentation and training groups synchronize with each other, and with other groups, to ensure that documentation and training reflect the actual product configuration and the latest product developments. Suppliers should ensure that they have procedures in place to provide systematic development and review by the appropriate experts, to ensure the technical accuracy, usability and timeliness of high-quality, standardized, and synchronized product documentation and training. Suppliers should also ask whether their documentation and training groups consult customers to ensure that documentation and training meet the requirements, standards and needs of customers.

Customers should check that there is synchronization between their users of documentation and training, their documentation production/procurement/distribution and training groups, and supplier documentation and training groups, to ensure that they obtain high-quality, standardized and synchronized documentation and training on a timely basis. Customers should ensure that user feedback loops are in place to correct and improve existing documentation and training, and to provide any additional documentation and training that might be required. Customers should also check that review procedures are in place to ensure that internally provided and externally provided documentation and training are relevant to, and usable on, the job(s) for which they are intended.

18.2 "Aerospace technical publications in the 90's on a multinational nacelles project" -- Paul Martin (Technical Publications Coordinator -- Customer Support, Nacelles Systems Division, Short Brothers PLC)

Paul Martin (PM) gave a brief introduction to Short Brothers PLC, an aerospace company based in Northern Ireland. Shorts is currently involved in the International Aero-Engine Project, which is a collaboration of five major companies based in different European nations. Information on the engine being developed by the project runs to more than 10,000 pages, and it all has to be easily transferable between the different companies, and maintained for the next twenty-five years. Shorts provides a suite of manuals, each conforming to the ATA100 specification.
Although each manual has a very different format, the same information may appear in several manuals simultaneously! Describing Shorts' document development and production cycle, PM said that most documents are produced on paper; draft copies are sent to Rolls Royce (who re-key them for their own on-line documentation system), whilst camera-ready paper documents are sent to Fiat in Italy, and Rohr Industries in Germany. In theory at least, both the camera-ready and on-line versions of the documentation should contain exactly the same information.

Looking to the future, PM said that Short Brothers expect that the transfer of information in digital form (on disk, tape, or optical disk) will become a requirement. He anticipated that CD-ROM would soon replace all their paper documentation, and that SGML will be used to prepare all the manuals. Short Brothers are developing their own electronic publishing capability, and also a computer network to encourage and support the free flow of information. However, they still have some work to do on deciding how best to control the information flow, especially prior to the information being put into digital form. Since international collaborations will be part of the future, other companies will soon be confronted with the types of problems that Short Brothers and their project partners are experiencing now. PM concluded by suggesting that successful documentation control will only be achieved by people working to a thorough system -- irrespective of the introduction of ever more sophisticated technology.

18.3 "A publisher using databases: feelings and experiences" -- Joaquin Suarez Prado (Prepress Director, Librairie Larousse)

Joaquin Suarez (JS) briefly outlined the requirements of Librairie Larousse, and their solution for producing documents from an [SGML] database. Staff at Librairie Larousse use BASISplus (from Information Dimensions) for database management, WriterStation (from Datalogics) for editing, and DLPager (also from Datalogics) for page composition. JS suggested that perhaps their most demanding requirement was the need to keep a record of all the editorial changes made to every document between publication dates.

JS summarized the criticisms Librairie Larousse had of their current system. They felt that it had taken an unacceptable amount of time to get the system up and running, and their only consolation was that this long lead-time appeared to be inevitable. JS was slightly depressed to learn that editors at Librairie Larousse favoured those features of the new system that were familiar to them from the old; they seemed almost reluctant to take advantage of the opportunities offered by the new system. Librairie Larousse considered that the performance of their database had not been what they had anticipated or would have liked. Using an SGML-based approach had added to the complexity of writing a dictionary entry, and JS also felt that SGML markup was "heavy".

However, there were also a number of benefits gained from implementing the new system. Since its introduction, editorial capacity had trebled, and Librairie Larousse were now producing twice as many dictionaries and encyclopaedias as before. The dictionaries they are now producing are more reliable, more accurate, and contain more information than was possible before. The new system ensures that the information is being made available to the maximum number of people.
Librairie Larousse are now looking ahead to see what other advantages can be gained from extending their use of SGML.

18.4 "U.S. WEST's approach to object oriented information management" -- Paul J Herbert and Diane H A Kaferly (U.S. WEST Communications)

Paul Herbert and Diane Kaferly (PH&DK) described U.S. WEST's approach to treating information as a product. Information is independent of its platform, display or medium; information about a U.S. WEST product provides additional value (accessible via screens, manuals, and training materials). Information about their customers, products, network and competitors was critical to U.S. WEST even before the advent of computers. The information products that U.S. WEST produce must be designed to cross business and geographical barriers.

PH&DK defined the "classic paradigm" that had confronted U.S. WEST when considering its approach to information management; they needed to work through three distinct phases -- first define their problem clearly, then devise an appropriate solution, and lastly implement this solution. In order to define their problem, U.S. WEST set up a consortium of clients and providers who were able to use a common semantic, work together at the appropriate levels of detail and rigor, and focus their attention on the problem concerned -- that is, to define the objects and their relationships needed to build an object oriented information management system.

In their approach to managing information, U.S. WEST were keen to adopt a single-source/multiple-use model of information handling. They wanted a system in which a single source record could be distributed to many users in forms that were appropriate to their needs. U.S. WEST found that managing information through a hierarchy of associations linking information sources etc. worked successfully. They also found that when introducing changes in the way they handled information, it was important from a management point of view to have devised solutions that could be linked to particular problems.

Having talked briefly about the types of information stored in U.S. WEST's document management system, PH&DK concluded by mapping their approach onto the "classic paradigm" to which they had referred earlier. In order to define the problem, they had established a consortium of interested parties to establish clear goals for the application. Their solution had been built upon the adoption of standards such as SGML. The final implementation had relied upon the adoption of a well-designed object oriented database management system.

18.5 "Keeping track of law changes" -- Marc Woltering (Wolters Kluwer Law Publishers, Deventer)

Marc Woltering (MW) outlined Wolters Kluwer's role in publishing the Dutch statutes. At present, there are around 8,000 statute laws in force, all of which must be published and made available to the public. Any changes, new laws, or decisions to repeal old laws are published in journals that report rulings made by the Government and the courts. Once a statute has been published, Wolters Kluwer are only required to ensure that appropriate updates are issued. This task has been greatly simplified with the advent of loose-leaf publishing techniques, which has meant that Wolters Kluwer have been able to meet their obligations simply by issuing updates of any affected pages. Wolters Kluwer have already implemented the decision to use SGML to mark up the text of the statutes, and to facilitate document management using sophisticated database techniques.
MW said that they had taken advantage of all the usual benefits of using SGML -- such as having a neutral system of markup, which is easy to process and allows for user-defined structures, and so on. However, they had also encountered the typical problems of using SGML -- such as its text-only basis, and difficulties when processing complex tables. MW said that there are major problems associated with the area of law publishing that are difficult to solve using any technique -- and which he believed were not well resolved by using SGML, either. In particular, he cited the problems of handling alternate versions of the same document, maintaining comments within clauses, and ensuring the validity of cross-references between laws.

19. SUMMARY

The sheer number (and commercial weight) of attendees indicated that SGML is developing well. The fact that people were at different stages of implementing SGML, and were doing so with very different short- and long-term perspectives, suggested that SGML has a bright future ahead. The SGML user community is not about to become a stagnant pool of a few international organizations and government agencies. The emergence of HyTime, and the amount of work going into developing SGML-related standards such as DSSSL, implies that SGML is uniquely well-placed to become the basis of future information handling technologies.

=================================================================

For further details of any of the speakers or presentations, please contact the conference organizers at:

   Graphic Communications Association
   100 Daingerfield Road, 4th Fl.
   Alexandria, VA 22314-2888
   United States
   Phone: (703) 519-8157
   Fax: (703) 548-2867

=================================================================

You are free to distribute this material in any form, provided that you acknowledge the source and provide details of how to contact The SGML Project. None of the remarks in this report should necessarily be taken as an accurate reflection of the speakers' opinions, or in any way representative of their employers' policies. Before citing from this report, please confirm that the original speaker has no objections and has given permission.

=================================================================

   Michael Popham
   SGML Project - Computing Development Officer
   Computer Unit - Laver Building
   North Park Road, University of Exeter
   Exeter EX4 4QE, United Kingdom
   Email: sgml@exeter.ac.uk
          M.G.Popham@exeter.ac.uk (INTERNET)
   Phone: +44 392 263946
   Fax: +44 392 211630

=================================================================