[Mirrored from: http://www.sgmlbelux.be/96/hermans.htm]

QUESTOR
Publishing social law to different media

Paul Hermans
Pro Text
Interleuvenlaan 62
B-3001 Leuven

E-mail: Paul_Hermans@protext.be

Abstract

A discussion of practical experiences in using SGML and HyTime for publishing to different media (hardcopy, browser, HTML) using software such as Synex ViewPort and FrameMaker+SGML.


Keywords: SGML, HyTime, ViewPort

The project's context

The company

SD is a social secretariat which provides services and advice on social law to 12,000 Belgian companies. One of its divisions organizes courses in social law. It was this course material that had to be published in an efficient and future-oriented way.

The requirements

The course material had to be published to different media:

Additional requirements were:

The product Questor

Demonstration of the product as it is now.

The making of Questor

Conceptually

It was our deeply felt conviction that the existing course material couldn't be used as it was. It was written in a sequential way, for the traditional book medium. This doesn't work online.

"Don't dump paper online" (William Horton).

What is needed for online use are independent, self-contained information units, i.e. units which each give a clear answer to a user's specific question. So we convinced the writers at SD to rewrite the complete course material using such an underlying organization of topics, modules, information units, and micro documents.

After this had been done, we were confident that we now had good source material for an online product. But was it still possible to generate a good old-fashioned book from the same source?

A book needs different information (elements) from an online medium, and vice versa. We solved this by including in our source file all the elements we thought necessary in either version. When going to a specific medium (paper or online) we could then strip out the elements not needed there. This meant that the two versions would differ not only in layout but also in content.

Writing of the DTD

Number of DTDs

We started with three DTDs:

The input DTD was enhanced with ICADD attributes for generating braille, voice synthesis, ... but those were rapidly removed, for two reasons:

Result: in a moment of frustration, we dropped the whole ICADD idea.

As you can guess - since a DTD continues to be improved - the maintenance of three separate DTDs quickly became a nightmare. The next phase was the integration of the three DTDs into one single DTD, using marked sections, which resulted in an unbelievably elegant construction. (See Notes.)
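
As an illustration of the idea only (this is not the actual Questor DTD), the sketch below models how a single DTD source with marked sections can yield a different effective DTD per variant. The switch names editing, printing, and browsing follow the three uses mentioned in this paper; the element declarations and the helper names resolve_marked_sections and effective_dtd are assumptions, and only non-nested marked sections are handled.

```python
import re

# Matches a non-nested SGML marked section whose status keyword comes from a
# parameter entity reference, e.g. <![ %printing; [ ... ]]>
MARKED_SECTION = re.compile(r"<!\[\s*%([\w.-]+);\s*\[(.*?)\]\]>", re.DOTALL)


def resolve_marked_sections(dtd_text, switches):
    """Keep the body of sections switched to INCLUDE, drop IGNORE sections."""
    def replace(match):
        name, body = match.group(1), match.group(2)
        status = switches[name]
        if status not in ("INCLUDE", "IGNORE"):
            raise ValueError(f"unexpected status for %{name};: {status}")
        return body.strip() if status == "INCLUDE" else ""

    resolved = MARKED_SECTION.sub(replace, dtd_text)
    # Drop the blank lines left behind by ignored sections.
    return "\n".join(line for line in resolved.splitlines() if line.strip())


# Hypothetical single-source DTD fragment (element names are invented,
# except doccont, which this paper mentions).
SINGLE_DTD = """\
<!ELEMENT topic    - - (ltitle, stitle, block+)>
<![ %editing; [
<!ELEMENT block    - - (label, (para | doccont)+)>
]]>
<![ %printing; [
<!ELEMENT block    - - (label, para+)>
]]>
<![ %browsing; [
<!ELEMENT block    - - (label, (para | doccont)+)>
<!ELEMENT doccont  - O EMPTY>
]]>
"""


def effective_dtd(variant):
    """Switch exactly one variant to INCLUDE and all others to IGNORE."""
    switches = {name: "IGNORE" for name in ("editing", "printing", "browsing")}
    switches[variant] = "INCLUDE"
    return resolve_marked_sections(SINGLE_DTD, switches)


if __name__ == "__main__":
    print(effective_dtd("printing"))
```

In a real SGML system the parser itself does this resolution; the switches are simply parameter entities declared as INCLUDE or IGNORE before the DTD is read.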

Modularity of content

We use three kinds of information units:

Each topic, indicated as such by a fixed attribute, is used in navigating to the next/previous screen (topic) in the digital version based on Synex ViewPort.
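
A minimal sketch of how such a next/previous sequence could be derived from the source, assuming XML as a stand-in for SGML and a hypothetical fixed attribute unit="topic". The element names, IDs, and titles are invented for illustration, and the sketch says nothing about how ViewPort itself implements navigation.

```python
import xml.etree.ElementTree as ET

# Hypothetical course fragment; only elements flagged unit="topic" are screens.
SAMPLE = """
<course>
  <module id="m1">
    <infounit id="t1" unit="topic"><stitle>First topic</stitle></infounit>
    <infounit id="t2" unit="topic"><stitle>Second topic</stitle></infounit>
  </module>
  <module id="m2">
    <infounit id="t3" unit="topic"><stitle>Third topic</stitle></infounit>
  </module>
</course>
"""


def topic_sequence(root):
    """Return the IDs of all topic elements in document order."""
    return [el.get("id") for el in root.iter() if el.get("unit") == "topic"]


def neighbours(sequence, topic_id):
    """Return (previous, next) topic IDs, using None at either end."""
    pos = sequence.index(topic_id)
    prev_id = sequence[pos - 1] if pos > 0 else None
    next_id = sequence[pos + 1] if pos + 1 < len(sequence) else None
    return prev_id, next_id


if __name__ == "__main__":
    seq = topic_sequence(ET.fromstring(SAMPLE))
    print(neighbours(seq, "t2"))
```

Because the sequence is computed from document order, navigation crosses module boundaries without any hand-written links.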

Different topic content

Meta information

Every topic has:

Long title
Used only in the text-view of the digital version.
Short title
Used in the TOC (navigator) of the digital version and in the printed version.
Keyword
The same keyword can be assigned to one or more topics, giving a one-to-many relationship.

Keywords are not used in the hardcopy version. They are used to generate the keyword-list used in the Windows Help-like topic search function.

Indexentry
This describes a one-to-one relationship. Each indexentry goes to only one topic.

Indexes are used in both the digital and the printed version.

Target
Provides a meaningful name for the destination of one-to-many hypertext links (idrefs). This is more efficient than repeating the name in an attribute on every link.
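
The difference between the keyword list (one-to-many) and the index (one-to-one) described above can be sketched as two lookup tables. The topic records, the field names, and the helpers build_keyword_list and build_index are all hypothetical; the real data lives in the SGML source.

```python
from collections import defaultdict

# Hypothetical topic metadata, as it might be extracted from the source.
TOPICS = [
    {"id": "t1", "keywords": ["dismissal", "notice"], "index": ["notice period"]},
    {"id": "t2", "keywords": ["dismissal"], "index": ["severance pay"]},
]


def build_keyword_list(topics):
    """One-to-many: each keyword maps to every topic that carries it."""
    keyword_to_topics = defaultdict(list)
    for topic in topics:
        for keyword in topic["keywords"]:
            keyword_to_topics[keyword].append(topic["id"])
    return dict(keyword_to_topics)


def build_index(topics):
    """One-to-one: each index entry must lead to exactly one topic."""
    entry_to_topic = {}
    for topic in topics:
        for entry in topic["index"]:
            if entry in entry_to_topic:
                raise ValueError(f"index entry {entry!r} used by more than one topic")
            entry_to_topic[entry] = topic["id"]
    return entry_to_topic


if __name__ == "__main__":
    print(build_keyword_list(TOPICS))
    print(build_index(TOPICS))
```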
Content

We found out that we needed to have an additional mechanism for stripping away elements in the content itself.

We have, for example, the element doccont (document container), which contains (a reference to) a "digital paper" version (e.g. in Common Ground format) of contracts and other official papers. This element is clearly aimed at the digital version. In our OmniMark script for going to hardcopy we specified that all doccont elements be suppressed. This worked fine as long as the doccont element had siblings inside a block. If the doccont was the one and only element inside a block, only the label of the block was printed. In other words, you had a title without content.

The solution was to add an attribute to certain elements indicating the target medium of those elements.
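
A minimal sketch of filtering on such an attribute, assuming XML as a stand-in for SGML. The attribute name medium, its values online and print, and every element name except doccont are assumptions; the real conversion was done in OmniMark, not Python.

```python
import xml.etree.ElementTree as ET

# Hypothetical topic: block b1 exists only for the online version, so the
# attribute sits on the block itself and its label disappears with it.
SAMPLE = """
<topic id="t1">
  <block id="b1" medium="online">
    <label>Model contract</label>
    <doccont href="contract"/>
  </block>
  <block id="b2">
    <label>Explanation</label>
    <para>Text for both media.</para>
    <doccont medium="online" href="form"/>
  </block>
</topic>
"""


def strip_for_medium(parent, medium):
    """Remove, in place, every descendant targeted at a different medium."""
    for child in list(parent):  # copy, because we remove while iterating
        target = child.get("medium")
        if target is not None and target != medium:
            parent.remove(child)
        else:
            strip_for_medium(child, medium)


if __name__ == "__main__":
    root = ET.fromstring(SAMPLE)
    strip_for_medium(root, "print")
    print(ET.tostring(root, encoding="unicode"))
```

Elements without the attribute are treated as valid for every medium, so only the exceptions need to be marked.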

Changes

The requirement that changes in content compared to the previous version were to be indicated was met by:

Links

Navigational links

Most of the links in the SGML are hardcoded: clinks in HyTime-speak, based on unique identifiers (IDs). I would like to stress however once again the importance of adhering to a modular writing technique. Since every topic is about only one subject, an answer to one user's specific question, you can always link to a complete topic, except for ... three of those clinks, which link to a location ladder for referring to a table.

The linking between the index, which is a separate publication, and the book itself is done using TEI extended pointers. In our experience those links are more concise to edit than the rather elaborate HyTime syntax.

The user's own notes and bookmarks

In the Synex ViewPort environment, this is implemented with independent linkfiles (webs) which use location ladders based on the HyTime constructs nameloc, treeloc, and dataloc.

Treelocs and datalocs are good instruments when you have to deal with static, non-changing data. They quickly become unmanageable, however, when the structure of your data changes frequently, which is the case for Questor. In the best case the user's note still appears inside the topic, albeit in the wrong place. In the worst case the user receives a "bad anchor location" message. We suspect that this problem will only become more frequent with each update.
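
The fragility described above can be reproduced with a simplified model, assuming XML as a stand-in for SGML. Here resolve_by_position follows child positions down the tree in the spirit of a treeloc, and resolve_by_id looks up an ID in the spirit of a nameloc. Both helpers, the sample documents, and the reuse of the error text are illustrative assumptions; the actual HyTime addressing syntax is richer than this.

```python
import xml.etree.ElementTree as ET


def resolve_by_position(root, path):
    """Follow 1-based child positions down from the root (treeloc-like)."""
    node = root
    for position in path:
        children = list(node)
        if position < 1 or position > len(children):
            raise LookupError("bad anchor location")
        node = children[position - 1]
    return node


def resolve_by_id(root, ident):
    """Find the element carrying the given ID (nameloc-like)."""
    for element in root.iter():
        if element.get("id") == ident:
            return element
    raise LookupError("bad anchor location")


# The same topic in three hypothetical releases.
OLD = "<topic><para id='p1'>Rule</para><para id='p2'>Exception</para></topic>"
GROWN = ("<topic><para id='p0'>New intro</para><para id='p1'>Rule</para>"
         "<para id='p2'>Exception</para></topic>")
SHRUNK = "<topic><para id='p2'>Exception</para></topic>"

if __name__ == "__main__":
    note_path = [2]  # a user's note anchored on the second paragraph of OLD
    for name, source in (("old", OLD), ("grown", GROWN), ("shrunk", SHRUNK)):
        root = ET.fromstring(source)
        try:
            by_position = resolve_by_position(root, note_path).text
        except LookupError as error:
            by_position = f"error: {error}"
        print(name, "| by position:", by_position,
              "| by id:", resolve_by_id(root, "p2").text)
```

After a paragraph is inserted, the positional address silently lands on a different paragraph; after one is removed, it fails outright. The ID-based address keeps pointing at the same element in all three releases.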

The change web

For this purpose the nameloc, treeloc, and dataloc mechanism is perfect.

Software used during the project

Editing

For SGML editing, including TEI extended pointers: Author/Editor from SoftQuad.

For editing independent HyTime webs: a combination of Panorama Pro from SoftQuad and Windows Notepad.

Conversion

For the conversions from the DTD for editing to the DTDs for printing and browsing: OmniMark from OmniMark Technologies (formerly Exoterica).

Additional scripts have been written for:

Browsing

Browsing of the digital version is done in a special Questor viewer based on the Synex ViewPort technology, which is, in our opinion and experience, a very good product. The Synex people are also very responsive and competent to work with.

The only minor points which need to be mentioned are:

Printing

We used FrameMaker+SGML as the formatting engine, which was still very new at that time. Note that we didn't do any SGML editing in FrameMaker+SGML; we used it only as a batch formatting engine.

This meant developing an EDD (Element Definition Document), Frame's format for describing structure and formatting rules. Writing an EDD is itself guided by an EDD (very well done). In addition, we had to add a few read rules to map SGML elements to special Frame objects such as tables and graphics.

Our experiences:

Future plans

Technical

We are searching for better mechanisms for information reuse at a finer level of granularity. We have already implemented the HyTime conloc attribute in the DTD for this purpose, but it has not been used until now. We are also evaluating whether SGML repositories can be of any help.

Product

At present the content of Questor consists of general social law. But every sector (e.g. the automotive industry) has its own rules and its own exceptions to the general rules. Because of market demand we need to supply this sector-specific information.

One possibility is to use marked sections, but since Belgium has more than one hundred sectors, each with, most of the time, completely different rules for workers and employees, such a solution would hardly be manageable.

We now plan to produce separate publications linked to each other by independent HyTime links. We hope to avoid the dynamic content problem by including in our DTD a special anchor element with a required ID, so that we only have to use nameloc.

Conclusion

We believe that we have built a product which couldn't have been built without using SGML and HyTime.

SGML helped us to separate content strictly from formatting. The advantage of this approach is that you can use the formatting rules which are best adapted to the characteristics of the final output medium. SGML also helped us to generate the content best suited to the final output medium. Thanks to HyTime we have the incredible power of independent linking, so we can offer precisely those links that are targeted towards a specific audience.

On the other hand, we are now confronted with severe problems. Those problems are mainly a result of the use of HyTime. HyTime's syntax is a disaster, editing independent HyTime links with the current SGML editing tools is a masochistic activity, and managing those links afterwards quickly becomes a nightmare.


Notes

Acknowledgments for help during this integration go to Dan Connolly and Joe English.