A typesetter’s tale on SGML

Steven Van den Bergh, Lauwrie Stevens
Fotek Grafische Bedrijven
Entrepotstraat 3


Abstract

This is a tale of a typesetter who has taken the SGML route for producing pages of text for its customers. Fotek Grafische Bedrijven is situated in Sint Niklaas, Belgium. A sister company has been set up in Hitchin, near London, in the United Kingdom. Another sister company is in the process of being set up near Namur. We are typesetters; some of you may regard us as dinosaurs or something from the dark ages... We actually get paid for putting pages on paper! Isn’t that a quaint, old-fashioned idea! The fact is, of course, that many of the organizations and companies that you, the reader, work for also derive their income directly or indirectly from producing final output, be it on paper or in some other medium. We have chosen to relate our experiences of SGML: from our early contacts with the standard, through our examination of available tools and software (and the problems encountered) and the questions we asked ourselves, to our reasons for finally deciding to develop our own solution, aimed specifically at what we will call the “back-end” (side) of the SGML environment.

Let’s start our typesetter’s tale with some background...

Traditional typesetters have been applying GML ever since Linotype’s CORA code came into being. By using groups and combinations of typesetting commands, usually called formats or macros, they struggled, with or without the aid of the publisher, to get manuscripts to conform to a structure so that the same formatting commands could be applied to several manuscripts. The purpose was to avoid rewriting these formats every time and to write or modify as few formatting commands as possible. A format or macro was usually identified by a character, followed by a number, enclosed in brackets of a certain kind depending on the typesetting system one was using. The purpose of a format or macro number was to apply a pre-defined appearance to the text following it. This was, and still is, done by assigning a number of typesetting mnemonics to the format. These formats are usually kept in a library.
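For readers who have never worked with such formats, the idea can be sketched in a few lines of Python. This is only an illustration, not the code of any real typesetting system: the call syntax `[f1]`, the library contents and the mnemonic names are all invented for the example.

```python
import re

# Hypothetical format library: each format number maps to a list of
# made-up typesetting mnemonics (not the codes of any real system).
FORMAT_LIBRARY = {
    1: ["font=TimesBold", "size=14pt", "lead=16pt"],   # e.g. a chapter heading
    2: ["font=TimesRoman", "size=10pt", "lead=12pt"],  # e.g. body text
}

# A format call: a character, followed by a number, enclosed in brackets.
# The letter "f" and the square brackets are assumptions for this sketch.
FORMAT_CALL = re.compile(r"\[f(\d+)\]")


def expand_formats(text, library=FORMAT_LIBRARY):
    """Replace each format call by the mnemonics stored in the library."""
    def replace(match):
        number = int(match.group(1))
        if number not in library:
            # Unknown format numbers are left untouched for the operator.
            return match.group(0)
        return "".join("<%s>" % mnemonic for mnemonic in library[number])
    return FORMAT_CALL.sub(replace, text)


if __name__ == "__main__":
    print(expand_formats("[f1]A typesetter's tale[f2]Let's start..."))
```

The manuscript only ever carries the short call; changing the appearance of every heading means changing one library entry, which is why the same formats could be reused across several manuscripts.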

When DTP came along, there was a revolution in the world of typesetting. Old values were replaced by new values. Prices were cut. Structures, through formats, were not a way of life any more. Archiving and documenting the structure of a document was seen as a waste of time and no longer required. Publishers started playing with their own Desktop Publishing systems and departments. Then came word processing, at first serving the secretary in place of the typewriter. Then, as the function of the secretary changed from personal secretary to departmental secretary, and personal text and number crunching became a must, PCs together with word processing became more widespread. As management in general started to use PCs and the normal software found on them, they also started to fully realize the new possibilities, as did the authors of manuscripts.

As word processing programs were used for more and more difficult tasks, tasks for which they were never intended or designed, they became more sophisticated in order to cope with these demands. Thanks to the evolution of word processing and Desktop Publishing software, and the newly found freedom to apply sophisticated layout (which had to compensate for the user’s lack of expertise or the limits of the software used), one was able to produce pages of some sort, but at the right price. Thus we see the re-entrance of formats, macros, templates, tags or whatever name you wish to give them.

And yet, this is nothing new. We have been doing it for the last 25 years, the difference being that publishers and authors have discovered it too, through other means. The question is, do they want to do it? ... NO, they don’t! Because (they discovered) it is not cost effective in the long run. The use of these programs changed the hierarchy between publishers and authors, not its purpose. As a result of all this, many traditional typesetters lost their foothold or their interest in the typesetting of manuscripts and redirected their expertise to other markets. They were reluctantly replaced by DTP houses and word processing individuals or departments, which could do the job faster and at a lower cost. But what has been lost on the way, and have we really done the right thing?

We have lost structure, archiving and documentation. We have driven the enterprise from our market. And yet, while there are still some of us left, it is time to realize that the traditional typesetter, without being aware of it, is the person best placed to bring the use of SGML into practice.

How we got going ...

We could stand here before you and discuss FOSIs and DSSSL, but we won’t. Let’s talk practical instead! As a technically aware typesetting organization we have been aware of CALS and SGML for some time, particularly in our specialized environment of producing technical, scientific and legal journals and publications. These documents have always followed a structured format and are ripe for multi-media output or Internet availability. And from what you have just heard, you will appreciate that mark-up is nothing new to us. Our early considerations were of a fundamental nature...

SGML - Would it catch on?

Clearly it would, certainly in our chosen environment (journals and technical documents).

Would it affect us?

Yes, obviously, given that the marketplace we wished to address is the market of technical books and journals.

Did we therefore want or need to get involved as an organization and if so, at what stage?

This question was fundamental to our future. Without doubt SGML was coming to journals. This presented a crossroads to FOTEK. Did we risk seeing our client base eroded over a period of time as various titles and publications went over to SGML, and should we hope to compensate in some other market area? Or, alternatively, were we to grasp the thorny issue of SGML? If so, what would be the benefits, if any? What were our clients’ needs (now and in the future)? Would we require new equipment and/or software? Did we have the resources and capability? AND ... What would it cost?

All of these questions had to be addressed, both independently of each other and in combination. In reality we had no choice! If we had correctly anticipated the growing acceptance and use of SGML as a standard, then the only real decision for us was whether we would take an active or a passive role in this environment. A passive role meant simply reacting to the requirements, needs and requests of our clients and the marketplace: a comparatively simple process of minimizing the required in-house level of SGML knowledge and understanding, buying available tools and software as necessary, and generally becoming simply a processor. To take this course was to misunderstand what was happening within our sector of the industry. The whole nature of our business is undergoing change: we are becoming document managers, and we must recognize the implications of multi-media, the real meaning of document management, and the need for system independence. To take the other route, an active role, would mean a total commitment throughout FOTEK.

The decision to follow this active route was perceived as an opportunity. Rather than simply offering ‘typesetting’ of SGML documents, we would be able to advise and direct clients on the use of SGML. Many of our customers, like us, were in the early planning stages. Together we could build up our expertise, working more as partners than in our respective traditional roles.

One of our first tasks was to gather knowledge in two primary areas:

The first area required us to look closely at the available manpower: what the qualifications should be, what the new hierarchy of experience and knowledge would look like, and who should serve whom. We then had to decide who had to follow which courses on the various aspects of SGML. The second area was related to the first. Here we were concerned with finding and evaluating SGML systems on the basis of our limited knowledge, which was a challenge at the time. But the real challenge proved to be finding systems and applications which did the job well and fulfilled our requirements as typesetters and producers of typeset pages, pages which still had to be of the same typographic quality as traditionally typeset pages.

What were we looking for?

Ideally, a single software package allowing us to typeset in the traditional way, because this was and still is our source of income, and at the same time to typeset one or more SGML instances using the original SGML mark-up WITHOUT CHANGE or compromise, because this is our future. Additionally, it should parse and validate the file, allow us to edit, add to or correct the file and its SGML mark-up, finalize the instance by parsing it with the parser of the customer’s choice, and output final pages.

This does not sound simple and in practice is even less so. The SGML standard is concerned with structure, NOT appearance, and this can represent a conflict. Any of you who have, at some time, felt the need to have your SGML files (instances) typeset, will be aware of the difficulties in doing this whilst retaining the true integrity of the file.

Most commercially available solutions that do not compromise on the flexibility of typesetting require substitution, suppression or replacement of existing SGML mark-up. The substitutions may take place in the background or in some more obvious manner. Either way, the fact remains that the true SGML file is changed (compromised), and if the instance is corrected or edited at this stage, (SGML) mark-up must be entered and the final file saved in SGML format for archiving. This presents a high risk of error.

We have examined and used a number of solutions over a period of time, but none met our requirements. We therefore decided to look in our own direction. Our key requirement was basic: ‘Keep It Simple’. We wanted a system which would help the user and remove the need for extensive training for each and every keyboard operator in our organization. Since all our operators are used to working with marked-up documents, surely SGML could be presented as an extension of this.

We have been using a typesetting program, 3B2 (produced in the UK by Advent Publishing), for the last 4 years. It uses a system of mark-up specifically designed to be SGML (CALS) compatible, but it had no means of parsing. This did mean, however, that we could import TRUE SGML files without any changes to the mark-up whatsoever. Tags which were recognized would be interpreted, and those which were not would remain in the file, identified as potentially recognizable mark-up. This seemed a good starting point. All the software needed now was the ability to parse a file and to offer an interactive error system, capable not only of reporting the error location but also of placing the user at that position in the document and offering specific help in the context of the DTD at that point. This we achieved. We now had the basis of a very good SGML production and validation tool.
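The two behaviours described above (keeping unrecognized tags in the file untouched, and reporting an error together with its position) can be illustrated with a toy Python sketch. It is emphatically not how 3B2 works internally, and it is not an SGML parser: it reads no DTD and ignores SGML features such as tag omission. The table `KNOWN_TAGS` and both function names are invented for this example.

```python
import re

# Hypothetical table of element names the formatter knows how to style.
KNOWN_TAGS = {"title": "heading style", "para": "body style"}

# Matches start tags and end tags; declarations such as <!DOCTYPE ...> are skipped.
TAG = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9.-]*)[^>]*>")


def classify_tags(instance, known=KNOWN_TAGS):
    """Sort tag names into recognized and unrecognized; the file is not modified."""
    recognized, unrecognized = [], []
    for match in TAG.finditer(instance):
        name = match.group(2).lower()
        (recognized if name in known else unrecognized).append(name)
    return recognized, unrecognized


def first_nesting_error(instance):
    """Return (line, column, message) for the first mismatched end tag, or None.

    Simplification: only strict nesting is checked, and elements still open
    at the end of the instance are not reported.
    """
    stack = []
    for match in TAG.finditer(instance):
        is_end_tag = match.group(1) == "/"
        name = match.group(2).lower()
        if not is_end_tag:
            stack.append(name)
            continue
        if not stack or stack[-1] != name:
            start = match.start()
            line = instance.count("\n", 0, start) + 1
            column = start - instance.rfind("\n", 0, start)  # 1-based
            expected = "</%s>" % stack[-1] if stack else "no end tag"
            return line, column, "unexpected </%s>, expected %s" % (name, expected)
        stack.pop()
    return None


if __name__ == "__main__":
    sample = "<title>A tale</title>\n<para>Some text</title>\n"
    print(classify_tags(sample))
    print(first_nesting_error(sample))
```

The line and column returned by `first_nesting_error` are what an editor would need in order to place the user at the error position; offering help "in the context of the DTD" would additionally require the content model of the open element, which this sketch does not attempt.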

3B2’SGML is now a fully integrated system in terms of Text and Typography, Table, Math, Graphics, Forms Design, Page Make-up and now SGML. Its powerful text editor, featuring all the standard tools of a full-blown editing product, combined with one of the most advanced text formatting engines, can produce pages of data, fully validated according to your own DTD. Built into each 3B2’SGML program are the Application builder, the page formatting engine, the SGML parser and the output formatting device. Working together, these elements produce not only the final typeset pages but, at the same time, a fully validated SGML file. There is only one data file within 3B2’SGML, and this is your SGML file with your DTD, thus ensuring that all editing changes appear both in the produced page and in the final SGML file. With 3B2’SGML, you see not only your finished page on screen but also the corresponding SGML file at the same time. It is possible to fully automate 3B2’SGML so that it may interact with one or more databases, produce pages and validate the file in accordance with preset routines.

This represents our present position. Work is currently going on to develop 3B2’SGML further into a fully fledged SGML editor. Obviously, Fotek Grafische Bedrijven continues its search for tools to produce its work more efficiently and to be ready for other areas and opportunities in text handling as they present themselves.