[This local archive copy mirrored from: http://www2.rpa.net/~harmison/IEEE2.html; see the canonical version of the document.]

Creating Electronic Documents that Interact with Diagnostic Software for On-Site Service

Abstract: Xerox Corporation has developed an integrated electronic documentation system for field technicians to use at customer sites for diagnosis and repair of equipment. The electronic documentation system lets users access large technical documents which include text, graphics, video, and sound. Documents interact with the equipment being serviced to provide the reader with information relevant to the current situation. This article discusses how the Standard Generalized Markup Language (SGML) was used, user interface requirements and how they are addressed in our system, multimedia implementation techniques, integration strategies with other service tools, and the technologies employed in the system.

Introduction

Many electronic document projects focus on reducing the cost of production by eliminating printing costs and reducing distribution costs [1]. Such projects achieve these results by delivering publications on CD-ROM and over on-line media such as intranets or the Internet [2]. Other projects use full-text search tools to provide the reader with improved access to the publication’s content. At Xerox, we too wanted to reduce costs and improve access to our publications’ content, but our primary objective was to improve the productivity and quality of the actual service we offer our customers.

Xerox decided our service engineers would use a relatively powerful (and therefore expensive) laptop computer on the customer’s site to diagnose and repair machines and to manage their work. To get the most from this investment, we developed a complete electronic documentation system, integrated with other laptop tools, such as call management, diagnostic tools, parts management, and expert systems, to provide the readers the information they need, when they need it. What stands out about our effort is the extent to which we integrated these other tools with our documentation.

In creating this system, we faced many of the same challenges other teams have encountered: pulling together documents created by different groups, with different formats; creating a simple user interface; and ironing out difficulties with graphics and multimedia. But there were several unusual aspects to our project, driven by our need to provide technicians with applicable information, at a customer site:

We used the Standard Generalized Markup Language (SGML) [3] to maintain the content of our electronic documents, and a stylesheet-based approach to presentation to easily control formatting, chunking, and the relationship between the document and the user interface at the moment of delivery.
We maintained a single window user interface to display text, tables of contents, graphics, and video rather than using pop-up windows which separate the text from the graphics and add complexity to the user interface.
The history list, bookmarks, and notes all mark the text, graphic, and video positions (including graphic zoom).
Text on graphics is managed as part of the SGML document structure, enabling users to search through the text that appears to be part of the graphics, and to make hypertext jumps from text to graphic, or graphic to text. This approach also improves graphic readability by using scaleable fonts to give the best text quality possible at all zoom levels.
Documents can contain interactive forms, much like the HTML forms seen on many Web pages. When users fill out these forms, the document reveals or hides conditional sections, thus tailoring the document to the particular situation in which users find themselves.
The electronic documents are integrated with other laptop applications to provide live information within the document about the current situation, and to filter out information which is not relevant to the particular problem being addressed by the technician.

The remainder of this article describes the project background, overviews the system we developed, takes a detailed look at the decisions involved in four key areas of our project, and discusses the technologies employed in our project.

Background

Project Requirements

In 1994, Xerox Corporation began to look at specific ways in which equipping our field service force with laptop computers would improve productivity and quality without increasing costs. We had a number of technical initiatives underway at that time, including expert systems, remote diagnostics, call management, administrative support, and electronic documentation. While all of these tools are still being used today, electronic documentation has been one of the most successful, both in terms of money saved and user appreciation.

When we set out to create an electronic documentation system for our Customer Service Engineers (CSEs), we faced the same problems most publishing groups face: converting our existing documentation, designing documents for electronic delivery, and implementing the myriad details associated with a new delivery medium. We also required that our solution work with different styles of documents, since service manuals are developed at several locations within Xerox; that it support video and sound seamlessly so training style information could be included in our reference manuals; that the user interface be accessible to people with little or no computer experience with only a few hours training; that the user interface work well on the small screen of a laptop computer; and finally (and perhaps most interestingly), that the electronic documentation integrate with other software packages running on the service laptop, such as parts management, call handling, and machine diagnostics.

Description of Documents

A typical Xerox service manual is between 1,000 and 10,000 printed pages. On average, there is more than one illustration per page. Our documents are divided into standard sections on diagnostics, repairs and adjustments, parts lists, general (common) procedures, wiring diagrams, plus other sections which vary from book to book. We traditionally spend a great deal of time and money creating documents which are highly formatted and well paginated. Several proprietary systems have been used to create service manuals over the last 15 years including the Star (and its successor, GlobalView) and a custom-developed system called Datacom.

While general standards for publication have existed for years, Xerox alternately has unified and distributed the function of technical publishing within the company. This has led to distinct communities of practice, each with its own customers and interpretations of the standards. These differences are not bad, in their own right. Many have resulted from genuinely different customer requirements over the years (printers versus copiers, the United States versus Europe). These differences did mean, however, that we were not dealing with either a technically or politically homogeneous group of documents or publishers.

In addition to reference manuals, the publishing groups within Xerox also produce training materials. Training is an expensive part of running a service organization, and we saw the widespread availability of laptop computers as an opportunity to help reduce training costs. We targeted technique-sensitive activities (such as removing a component which requires several turns and twists) as a starting place for providing training at the point of need.

User Interface

The field-service force employed by Xerox is highly experienced and knowledgeable; in fact, CSEs commonly solve problems without ever consulting their documentation. Yet when we started this project, few CSEs had much personal computer experience. Those who had used computers on the job most likely had used Xerox proprietary systems such as the Star, which did not give them experience directly applicable to using a laptop computer running Microsoft Windows.

We wanted to create a documentation system which did not require more than a few hours of training. This meant keeping the user interface relatively simple to learn and to use. But two factors made that goal difficult to achieve: First, electronic documentation is just one of several applications used by the CSEs so we could not assume total control of the laptop environment. Second, feedback from previous electronic documentation projects indicated that efficient use of screen space was critical to user acceptance (owing chiefly to the large number of illustrations and the reader’s desire to skim the page for information).

Integration

Electronic documents by themselves have some serious drawbacks over printed material, such as the limited size and resolution of computer screens, and the need to have a computer up and running to read the document (no small challenge when you don’t have power and your batteries are dead). Our early attempts at electronic documentation had shown us that integrating tools on the laptop was a key to getting CSEs to use the machine. Integration of information could offset some of the inherent drawbacks of electronic documents. For example, having an electronic parts list which displays how many parts the CSE has on hand is clearly something a paper parts list can’t offer. We also had long-term goals to make the laptop an integral part of the diagnostic and repair process. For this approach to work, the electronic documents needed to be linked with these tools. The strategy which has developed over time is to have the document be the central resource from which other tools are accessed.

Overview of the System

We called our electronic documentation system the Xerox OmniBrowser. The OmniBrowser was developed in-house to meet the specific requirements described earlier. The OmniBrowser (or just browser) is a Microsoft Windows application, written in Visual C++. It provides Standard Generalized Markup Language (SGML) text viewing capability, raster graphics viewing, and video and sound support. The browser has plug-in modules which can communicate with the copier or printer which is being serviced. It also has a customizable user interface to meet various levels of user experience. The browser can also be integrated into other tools, meaning that it is possible to use it as a component in another software system.

The browser is based upon SGML because SGML allows us to completely control the content of our documents from creation right through to display on the user’s screen. The ability to create our own unique structures within SGML documents allows us to make documents respond to the user’s run-time environment. For instance, we store information in the SGML document about how sections of the document relate to a machine’s configuration. The system then displays only those parts of the document which are relevant at the time of use. The system can determine the configuration being serviced through either manual (user input) or automatic (direct connection) means.

Another advantage of using SGML is that each of the publishing organizations retains control over its own documents while presenting them through a common tool. As mentioned above, technical publishing is currently decentralized within Xerox. Each group serves a slightly different set of engineering and manufacturing customers, but the CSEs get documents from all the groups. Thus, each group develops documents independently but feeds those documents into a common delivery system. By using SGML, each publisher can create document type definitions (DTDs) and display rules which the OmniBrowser accepts. This has led us to adopt a principle of DTD independence which is discussed in the next section.

Document Structure and the Viewing System

Since the OmniBrowser needed to work with documents coded in different DTDs, the design needed to address how to format documents in different DTDs, how to specify the interaction between the document and the software, and how to deal with the unanticipated requirements of future documents.

Formatting

The problem of how to format documents written in different DTDs was the easiest to solve. By using an SGML display system which supported the concept of stylesheets, each DTD developer could design a set of styles which instructed the browser about how to format the document. There are a number of SGML tools which use this technique; we selected Electronic Book Technologies’ DynaText System’s Integrator Tool Kit.

Stylesheets also have the advantage of making it easy to change formatting at the last minute. In systems where documents are preprocessed into the tool’s formatting language, a format change often requires a lengthy reprocessing of the documents. Stylesheet-based systems can make such changes in seconds. This capability was especially helpful during our early book production efforts as we worked out bugs in on-screen formatting, linking, and document chunking.

The disadvantage of stylesheet systems is that it is difficult to make formatting exceptions. You might, for example, need to format the copyright statement in just a certain way. It sometimes seems pointless to develop a structure and stylesheet for a single element when you could format it more quickly manually. This disadvantage begins to disappear, though, when you start reusing your structures and styles on many publications. At that point, all the up-front work has been completed, and you begin to spread your development costs over more and more publications.

Interaction Between the Document and the Software

In electronic document viewing systems, there is a boundary between the viewing system and the document being viewed. Part of this boundary relates to formatting, as discussed above, but there is a more subtle boundary--that in which document content is represented in navigation interface elements such as windows, title bars, tables of contents, and pull-down menus. For example, most World Wide Web (WWW) browsers place the contents of the <title> element in the main browser window’s title bar when a page is displayed. They use this title as the menu item in the bookmark menu as well. The boundary for Web browsers, then, is hard coded to always link the <title> element to the title bar. This assumes both a known DTD (namely HTML) and a known desired behavior (putting the title in the title bar). The OmniBrowser could not safely make such assumptions since we didn’t know (or want to know) what DTDs might be used for our documents.

To solve the problem, we extended the concept of stylesheets to include style elements which controlled the user interface. Extending the stylesheet mechanism meant creating new functions within the stylesheet language which could be called upon when needed. Two examples help illustrate this technique:

Example 1: Chunking

Our documents consist of a set of procedures. The procedures are organized for easy access according to the fault code. Fault codes do not come in any particular order, so one procedure usually has nothing to do with the one before or after it in the document. Worse yet, two adjacent procedures may deal with very similar faults and look nearly identical but differ in important ways. This makes it important to keep the user from accidentally scrolling beyond the boundaries of the procedure currently being read. To do this, the browser displays only one part, or chunk, of the document at a time. Some of these chunks are procedures, others are parts lists, still others are groups of circuit diagrams.

In some systems, this type of chunking is accomplished through the physical division of the document into files, or through inserting control codes to indicate where chunks begin and end. In the OmniBrowser, chunking is accomplished using stylesheets. The stylesheets specify which elements, in which contexts, should be considered chunks. These chunks also serve as the lowest level elements in the table of contents. To change the chunking of the document, the publisher need only change these styles, not restructure the source document or reprocess the source data. This flexibility let us experiment with chunking styles to find the best one (most chunks were obvious from the outset, but some, such as the introductory material, or the pages and pages of block schematic diagrams, were less obvious). By showing different styles of chunking to our users, we were able to select one which organized the data well but didn’t force too many levels of nesting in the table of contents navigator.
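The idea of stylesheet-driven chunking can be sketched in a few lines. The following is only an illustration in Python, not the stylesheet language the OmniBrowser actually uses; the element names (manual, section, procedure, parts-list), the rule table, and the function find_chunks are all hypothetical. It assumes that a chunk rule is simply a pairing of an element with the context (parent) in which it appears.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; the element names are invented for illustration.
DOC = """
<manual>
  <section title="Repairs">
    <procedure title="REP 1.1"><para>Remove the cover.</para></procedure>
    <procedure title="REP 1.2"><para>Replace the roller.</para></procedure>
  </section>
  <section title="Parts">
    <parts-list title="PL 1.1"><item>Roller</item></parts-list>
  </section>
</manual>
"""

# Stand-in for the stylesheet: (element, parent) pairs that count as chunks.
# Changing the chunking means editing this table, not the document.
CHUNK_RULES = {("procedure", "section"), ("parts-list", "section")}


def find_chunks(root, rules):
    """Return the titles of the elements that the rules designate as chunks."""
    chunks = []

    def walk(elem, parent_tag):
        if (elem.tag, parent_tag) in rules:
            chunks.append(elem.get("title"))
            return  # a chunk is the lowest level in the table of contents
        for child in elem:
            walk(child, elem.tag)

    walk(root, None)
    return chunks


root = ET.fromstring(DOC)
print(find_chunks(root, CHUNK_RULES))
# Re-chunk at the section level by swapping the rule table only.
print(find_chunks(root, {("section", "manual")}))
```

The second call shows the point made above: the same source document yields a different set of chunks when only the rules change.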

Example 2: Electronic footers and history lists

One of the functions of a well designed footer or header in a print document is to provide context to the current page. Sometimes this is the chapter title, sometimes a range of topics or words. The need for context setting is even more critical in electronic documentation than in print because the viewing area is smaller and the reader is generally less familiar with the medium. We provided an electronic footer in the status bar of the browser’s main window. This footer always reflects the name of the current procedure and its section name.

To enable our footers to display this context information, we had the styles for procedures call a stylesheet function which, in turn, communicates to the browser what the current procedure name is. In this way, any element in the document which wants to place itself in the footer can do so. Repair procedures have a simple title, for instance, while diagnostic procedures have a compound heading block which includes fault code information.

The footer needn’t be the title of an entire chunk either. Suppose a procedure is quite long (several printed pages) and has several sub-procedures within it. The entire procedure might be presented as one chunk because it is followed from beginning to end. However, it is helpful to have the footer not only show the procedure title but also the sub-procedure heading. This can be accomplished by having sub-procedure elements pass both their own text, as well as the text of their parent element’s title, to the browser.
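A rough sketch of this footer mechanism follows. It is written in Python for illustration only; the Browser class, the footer_style function, and the element names (procedure, subproc, title) are hypothetical stand-ins for the stylesheet function and DTD elements described above, and the " / " separator is an arbitrary choice.

```python
import xml.etree.ElementTree as ET

# Hypothetical procedure with two sub-procedures.
DOC = """
<procedure>
  <title>REP 9.3 Fuser Roller</title>
  <subproc><title>Removal</title><para>Pull up and back.</para></subproc>
  <subproc><title>Replacement</title><para>Reverse the steps.</para></subproc>
</procedure>
"""


class Browser:
    """Stand-in for the viewer; it only remembers the footer text."""

    def __init__(self):
        self.footer = ""

    def set_footer(self, *parts):
        # Join whatever title fragments the style passes in.
        self.footer = " / ".join(p for p in parts if p)


def title_of(elem):
    """Return the text of the element's <title> child, or an empty string."""
    node = elem.find("title")
    return node.text if node is not None else ""


def footer_style(elem, parent, browser):
    """Style hook: called when elem becomes the current reading position."""
    if elem.tag == "procedure":
        browser.set_footer(title_of(elem))
    elif elem.tag == "subproc" and parent is not None:
        # A sub-procedure passes its parent's title as well as its own.
        browser.set_footer(title_of(parent), title_of(elem))


root = ET.fromstring(DOC)
browser = Browser()
footer_style(root, None, browser)
print(browser.footer)
footer_style(root.findall("subproc")[1], root, browser)
print(browser.footer)
```

Because the browser only receives strings, it needs no knowledge of which elements produced them, which is what keeps it independent of the DTD.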

We also use the footer information to enter meaningful text onto the list of where the user has been in the document (history list) and in a menu item for searching within the current procedure.

The Unanticipated

Even if we hadn’t had to deal with different publishing groups developing their own DTDs, we would still have striven for DTD independence for the simple reason that we knew we would continue to change and improve the way we represent and present documents. We also knew that each document might have unique requirements for the OmniBrowser, such as new formatting styles or special links to other systems.

As we have become more SGML literate, we have refined our DTDs and developed new DTDs for different kinds of documents. We are free to do this without impacting the browser software because the browser does not assume anything about the DTD.

User Interface

To accommodate the bulk of our users, who had little or no experience with Windows interfaces, our design for the user interface had to be simple and to maximize the use of the screen space. These two requirements were surprisingly interrelated. We elected to make the interface have all of its basic functionality visible at all times, and to reduce the amount of user ‘management’ of the application.

At the top of the browser’s main window is a standard menu bar and a tool bar. It is through the tool bar that users initially access the functions of the system, since it is readily available and visible (unlike menus which must be pulled down before commands can be chosen). At the bottom of the window is a status bar and footer bar. The status bar provides hints for using the tool; the footer bar provides context as to which part of the document the CSE is viewing. Once users become familiar with the functions of the system, they can opt to remove the tool bar, or make it smaller, which makes more screen space available for viewing the document. More space can be recovered by hiding the footer and status bars.

The browser’s main window is divided into two areas (panes). The left-hand pane displays a table of contents which users can navigate by clicking elements to expand or to collapse the hierarchy. The right-hand pane displays the text of the current procedure. As users navigate in the table of contents on the left, the text on the right remains fixed. Once users locate the procedure they are looking for, they click the procedure’s title in the table of contents in the left pane, and the text view on the right changes to present the new procedure. Since the text view doesn’t change until a new procedure has been selected, users may rummage around in the table of contents, looking for something, without losing their place.

As the table of contents hierarchy is expanded, the table of contents is constrained to display only the current section of the document and its immediate subsections. In this way, users cannot scroll beyond the bounds of the current section and get lost. We compared this approach with that of having the whole table of contents accessible at all times and we found the restricted view preferable. There are times when an unrestricted view is desired, so the ability to switch between the two styles is a potential enhancement to the system.
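The constrained view can be pictured with a small sketch. This is a minimal Python illustration, not the OmniBrowser's implementation; the TocNode class, the visible_entries function, and the section and fault titles are invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class TocNode:
    """One entry in the table of contents hierarchy."""
    title: str
    children: list = field(default_factory=list)


def visible_entries(current):
    """Constrained view: the current section plus its immediate subsections."""
    return [current.title] + [child.title for child in current.children]


# Hypothetical table of contents.
toc = TocNode("Service Manual", [
    TocNode("Diagnostics", [TocNode("Fault 10-100"), TocNode("Fault 10-105")]),
    TocNode("Repairs", [TocNode("REP 1.1")]),
])

print(visible_entries(toc))              # top level: sections only
print(visible_entries(toc.children[0]))  # after expanding "Diagnostics"
```

An unrestricted view would instead walk the whole tree, so switching between the two styles amounts to choosing which traversal feeds the pane.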

When users display a graphic or video (and controls), the window remains divided into two panes, but now the right pane will display the graphic and the left will display the text. Within the graphic pane, users can zoom and pan (scroll) the graphic. Links from and to the graphic are supported (see Graphics, Video, Sound, and Forms Support, below). We selected the split-pane technique over using a pop-up window for several reasons. First, the text and graphic work together in our documents so the graphic should be presented beside the text, not as an off-shoot from it (as might be implied by a pop-up). Second, the moving and sizing controls, and the modal (or non-modal) behavior of pop-ups can confuse new users. Third, pop-up windows (at least in MS-Windows 3.11) have rather large title areas which consume space better used for content.

The main window may be divided horizontally or vertically for a graphic, depending upon the aspect ratio of the illustration. Users can adjust the division of the main window by moving the dividing bar. The stylesheets control the initial division, so large format graphics initially display in a large pane, while smaller graphics leave more room for text.

Graphics, Video, Sound, and Forms Support

Beyond placing text, graphics, and video within a common user interface, integrating multimedia into our system also included allowing graphics and video to work with the notes, bookmarks, and history (backup) functions of the browser. We also extended basic graphics support to include SGML-based text overlays, and we created basic forms support.

History, Annotations, and Multimedia

As with most viewers, our browser keeps a history trail showing where users have navigated and allowing users to back up through this trail. We allow users to view the trail and to back up to any particular point along the way. When we integrated graphics and video, we included their unique properties in the history list. For example, the history list records how a graphic was zoomed and positioned at each step in the history. In this way, links to graphics make more sense than if they were simply remembered as "opening the graphic."

We gave bookmarks and notes the ability to store this same information about graphics and videos. A bookmark not only marks a text location, but it can mark which graphic (or movie) was displayed and at what zoom level and location.
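One way to picture a history entry or bookmark that carries this extra information is sketched below. It is a Python illustration under our own assumptions: the ViewState and History names, the field layout, and the file name fuser.tif are hypothetical and do not describe the browser's internal data structures.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ViewState:
    """What a history entry or bookmark needs to restore the screen."""
    chunk: str                     # text location, e.g. a procedure title
    graphic: Optional[str] = None  # graphic or video displayed, if any
    zoom: float = 1.0              # zoom level of the graphic
    pan: tuple = (0, 0)            # scroll position within the graphic


class History:
    """Trail of view states that the user can inspect and back up through."""

    def __init__(self):
        self._trail = []

    def visit(self, state):
        self._trail.append(state)

    def trail(self):
        return list(self._trail)

    def back_to(self, index):
        """Back up to any point on the trail, discarding later entries."""
        del self._trail[index + 1:]
        return self._trail[index]


history = History()
history.visit(ViewState("REP 9.3"))
history.visit(ViewState("REP 9.3", graphic="fuser.tif", zoom=2.0, pan=(120, 80)))
history.visit(ViewState("PL 4.1"))
restored = history.back_to(1)
print(restored.graphic, restored.zoom, restored.pan)
```

A bookmark or note could store the same ViewState record, which is why all three features can restore the graphic exactly as it was displayed.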

Text Overlays of Graphics

Given the high graphic content of our documents, it is not surprising that we spent a great deal of effort on our graphics viewing facilities. Perhaps the most interesting part of our graphics solutions was how we carried forward an old practice of our legacy publishing systems.

About twelve years ago, during the development of a custom publishing system, someone accidentally typeset some text on top of a graphic. This accident got people to thinking about the advantages of doing this on purpose: The text could be edited by the writer at the last minute, rather than going back to the illustrator; the text could be translated to a foreign language without touching the base illustration; and the same base illustration could be reused several places in the document with different text on it. For all these reasons, we have been managing text as a layer on top of graphics ever since.

As we looked at how we were going to display graphics on screen, it seemed like all the same advantages applied, plus some new ones: We could search on the text layer of the graphic (which we couldn’t do if it was in the graphic bitmap); we could use the content of the text layer to establish hyper-links between graphics and text; and we could improve the readability of the text on graphics by drawing it at the best possible quality as the graphic was zoomed in and out.

The Text Layer

We define the text layer of the drawing with a hot-spot structure in our SGML DTD. The <hot-spot> element contains all the information necessary to draw the text layer of the illustration and to link the areas on the illustration with the main text of the document [4]. This includes the position of the hot-spot on the graphic, the size of the spot, and the color of the spot. Each hot-spot contains marked up SGML text which appears within the bounds of the hot-spot. The text might be a paragraph or two, or it might be a cross reference element. Here are a few examples:

Example 1: Text on the graphic

<hot-spot x="1in" y="1in" w="1.5in" h=".5in">

Remove the roller by pulling up and back.

</hot-spot>

Here, the text within the element is simply drawn on top of the graphic as it is displayed. As the graphic is zoomed in and out, the text is scaled accordingly so that it stays the same size relative to the underlying graphic. As the user zooms in, the smoothness of the text improves since the text is being drawn using system fonts, rather than being enlarged from the graphic’s bit map.
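The scaling described here reduces to simple arithmetic. The sketch below, in Python, assumes that hot-spot lengths are always given in inches measured from the graphic's origin and that the display resolution is known; the function names and the 96 dots-per-inch figure are our own choices for illustration, not values taken from the browser.

```python
def inches(value):
    """Parse a length such as '1.5in' or '.5in' into a float (inches)."""
    if not value.endswith("in"):
        raise ValueError(f"unsupported unit in {value!r}")
    return float(value[:-2])


def hot_spot_rect(attrs, dpi, zoom):
    """Return (x, y, w, h) in screen pixels for the given zoom factor."""
    scale = dpi * zoom
    return tuple(round(inches(attrs[k]) * scale) for k in ("x", "y", "w", "h"))


# Attributes from Example 1.
spot = {"x": "1in", "y": "1in", "w": "1.5in", "h": ".5in"}
print(hot_spot_rect(spot, dpi=96, zoom=1.0))  # (96, 96, 144, 48)
print(hot_spot_rect(spot, dpi=96, zoom=2.0))  # (192, 192, 288, 96)
```

The font size would be multiplied by the same factor and the text redrawn with a scalable font at the new size, so it stays sharp instead of being stretched like the bitmap underneath.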

Example 2: Graphic to text link

<hot-spot x="1in" y="3in" w=".25in" h=".25in">

<link><xref target-id=ADJ_9.3>ADJ 9.3</link>

</hot-spot>

In this example, the text "ADJ 9.3" will be drawn on the graphic. The user’s mouse pointer will change to a hand icon when it is over this area of the illustration. If the user clicks the mouse on this area, the software executes whatever behavior is associated with the <link> element in this context by the stylesheet (for example, displaying the text of procedure Adjustment 9.3). In this way, even graphical hot-spots benefit from the flexibility of stylesheets.

You might note the redundancy of having the <xref> element within the <link> element. This comes about because we are not using the SGML display engine to draw text on top of the graphic. This would slow down graphic rendering too much. Instead, we draw the text using operating system graphics calls. This has the unfortunate effect of not enabling the use of stylesheets for text on graphics. All the text within a hot-spot element is drawn, no more, no less. This is why the <link> element needs to have the redundant text inside. This is an example of the type of trade-off we needed to make to get the system done.

Example 3: Text to graphic link

<hot-spot x="1in" y="5in" w=".25in" h=".25in" >

1

</hot-spot>

This hot-spot is taken from a parts list illustration. The number "1" is a locator number which matches a number "1" in the associated list of parts. When users click on this hot-spot, the stylesheet for the element containing the locator number is executed. The normal behavior for that element would be to do nothing when clicked, but in this case, the stylesheet can determine that the element is in a <hot-spot> in a parts list illustration. In this case, it executes a link which displays the parts list with part number "1" highlighted.

Conversely, each part within the list of parts also is a hyper-link which searches the associated illustration’s <hot-spot> elements for one with a matching locator number. Then the link can cause the illustration to zoom in and highlight the hot-spot’s bounding box. This makes it quite easy to locate a part on the drawing.

Creation, storage, and management of this data is very easy. The ‘link’ is maintained purely through the SGML context. This is in contrast with our experiences with systems where links needed to be re-established between graphics and text whenever either changed.
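The list-to-illustration lookup can be sketched as a search over the hot-spots for matching text. The Python below is only an illustration: the illustration wrapper element and the find_locator function are hypothetical, and the fragment is parsed with an XML parser simply because this small example happens to be well-formed XML, not because the browser works that way.

```python
import xml.etree.ElementTree as ET

# Hypothetical parts list illustration with two locator numbers.
ILLUSTRATION = """
<illustration>
  <hot-spot x="1in" y="5in" w=".25in" h=".25in">1</hot-spot>
  <hot-spot x="2in" y="3in" w=".25in" h=".25in">2</hot-spot>
</illustration>
"""


def find_locator(illustration, locator):
    """Return the attributes of the hot-spot whose text equals locator."""
    for spot in illustration.iter("hot-spot"):
        if (spot.text or "").strip() == locator:
            return dict(spot.attrib)
    return None  # no hot-spot carries this locator number


root = ET.fromstring(ILLUSTRATION)
print(find_locator(root, "2"))
```

The returned position and size give the bounding box to zoom to and highlight. No link table is stored anywhere; the match is recomputed from the document content each time, which is why edits to the text or the illustration do not break it.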

Lines

After the initial release of the browser, we extended the hot-spot concept to apply not only to text on a drawing but to the leader lines as well, so we could completely reuse illustrations, and link not just to the text in the text layer, but to the actual area on the graphic described by the text (the software can follow the leader line).

Carrying on this legacy has caused us to develop custom software not only for viewing documents but for editing and printing them as well. We think this added expense is worth the benefits in translatability, reuse and quality of the graphics.

Eventually, we will move to a completely vector-based solution for graphics, eliminating the need for much of this approach. In a good vector-based solution, each element of a drawing could be treated as an object, including the text. Several ‘layers’ of text and lines could then be managed separately. We also should be able to locate information on the drawing (e.g., a part) at run-time. This would make it even easier to maintain the link between document text and illustrations. Currently, we have not identified a cost-effective vector solution which can run on our laptop platform.

Forms

The Xerox products we document often have a number of configuration options. A traditional light-lens copier might have a stapler, a stacker or sorter, a number of different paper trays, and different speeds (copies/minute). A digital networked device will have an even larger number of possible configurations including different network interfaces (TCP/IP, AppleTalk, Novell) and several printer description languages (PostScript, HPGL, Interpress).

To deal with this multiplicity of configurations, we could write procedures with lots of "If this, then that" kind of statements, but such procedures are difficult to read and to write. Instead, we usually create variant procedures which differ only in small ways. The variant procedures are easier to read because the reader doesn’t need to filter as much, but they are still costly to write. In addition, maintaining variant procedures is costly and error prone, especially during product development when change is frequent.

Electronic documentation has given us the ability to address both the reader’s and the writer’s problems with variant procedures. By using markup in our SGML documents, the writer can indicate which parts of a procedure are appropriate for which options. If there is a paragraph or sentence which applies only to machines with AppleTalk, then the writer indicates this in the SGML document [5, 6]. As the CSE reads such a procedure, only those parts of the procedure which are appropriate for the configuration being serviced are shown.

After the writer has marked the conditional parts of the procedure, the browser must be able to selectively display or hide those parts. This requires the browser to evaluate conditional logic (if/then statements). Such evaluation, in turn, requires the browser to know the configuration being serviced (the browser can’t decide whether to show the AppleTalk part or not if it doesn’t know if AppleTalk is being used). There are two methods through which the browser gets this information: from the user through on-screen forms, and directly from the machine being serviced (discussed later, under Integration).

The SGML-viewing technology we used did not have any facilities for displaying forms, but, with a combination of three simple functions and stylesheets, we were able to create a forms system which has worked quite well. This forms system supports three common input elements: check boxes, radio buttons, and text fields. All three elements can be created using any SGML editor, and the linkage between the form and the portion of the document the form controls is simple and direct.

An example will help illustrate the entire process of creating a form, linking parts of the document to the form, and reading the resulting configurable document:

Suppose we have a printer with three network interfaces: Novell Netware, AppleTalk, and TCP/IP. As the author is writing the installation procedure for the printer, she finds that in several places she is writing sentences such as "If the printer is using AppleTalk, do this.... If the printer is using TCP/IP, do this...., etc." So she decides to create a set of check boxes at the beginning of the procedure with which the CSE can indicate which networks are being installed. She inserts the following SGML elements:

<check-box var="AppleTalk">AppleTalk Network Installed</check-box>

<check-box var="Netware">Novell Netware Network Installed</check-box>

<check-box var="TCPIP">TCP/IP Network Installed</check-box>

In the text of the procedure, she now surrounds each conditional part of the procedure with an SGML element and sets the attributes of the element to indicate which condition will control its display:

<configure var="AppleTalk">From the Apple menu, select "Chooser." Click on the printer icon. Make sure the new printer’s name appears in the list.</configure>

<configure var="Netware">....Novell stuff here....</configure>

<configure var="TCPIP">Use the ping command to verify that the printer is properly connected to the network. </configure>

When this procedure is being used in the field, the CSE will first see the check boxes at the beginning of the procedure. He will select which network options are installed. The "var" attribute names the variable which is controlled by the check box. For the AppleTalk check box, the browser will record in memory that a variable "AppleTalk" is set to true if the box is checked and false if not. The same applies to the other two options.

As the CSE reads down the procedure, he will see all text which is not configured at all (i.e. not inside any <configure> element), and that text which is inside a <configure> element whose associated variable is set to true in memory. So, if the CSE checks Novell and TCP/IP, he will see the Novell and TCP/IP sections but not the AppleTalk section.
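The display rule described above can be sketched in a few lines of Python. This is only an illustration, not the browser's actual implementation: it assumes the fragment can be treated as well-formed XML, and the wrapping <procedure> element, the <para> element, and the visible_text function are hypothetical names introduced for the example.

```python
import xml.etree.ElementTree as ET

# Fragment from the example above, wrapped in a hypothetical <procedure>
# element, with a hypothetical unconditional <para> added for illustration.
PROCEDURE = """
<procedure>
<check-box var="AppleTalk">AppleTalk Network Installed</check-box>
<check-box var="Netware">Novell Netware Network Installed</check-box>
<check-box var="TCPIP">TCP/IP Network Installed</check-box>
<para>Connect the network cable to the printer.</para>
<configure var="AppleTalk">From the Apple menu, select "Chooser."</configure>
<configure var="Netware">....Novell stuff here....</configure>
<configure var="TCPIP">Use the ping command to verify the connection.</configure>
</procedure>
"""


def visible_text(source, variables):
    """Return the body text a reader would see for the given variable state."""
    shown = []
    for element in ET.fromstring(source.strip()):
        if element.tag == "check-box":
            continue  # form controls are not body text
        if element.tag == "configure" and not variables.get(element.get("var"), False):
            continue  # hide sections whose controlling variable is not true
        shown.append(element.text.strip())
    return shown


# The CSE checks Novell and TCP/IP but not AppleTalk.
state = {"AppleTalk": False, "Netware": True, "TCPIP": True}
for line in visible_text(PROCEDURE, state):
    print(line)
```

With this state, the unconditional paragraph and the Novell and TCP/IP sections are listed, and the AppleTalk section is omitted.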

Implementing this system turned out to be simple and useful. The first step was to create a facility for storing run-time data within the browser. This allows the browser to remember the state of a given check box by remembering the value of the associated variable, and to evaluate conditional expressions based upon these values. This store of information is accessed through two stylesheet functions we created, GetVar and SetVar. The forms element (check-box, in our example) uses SetVar to set the value of the variable to true or false, and uses GetVar to determine if the check box should be checked or not. The configure element uses GetVar to determine if the contents of the configure element should be visible or not. It was also necessary to implement a Redraw function which is invoked whenever the user changes a form element; otherwise, the element would not be updated until the user changed the view and caused a redraw.
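The interplay of the three functions might look roughly like the following Python sketch. The real GetVar, SetVar, and Redraw are stylesheet functions inside the browser; the class and method names here (VariableStore, CheckBox, get_var, set_var) are assumptions made for illustration, and treating an unset variable as false is likewise an assumption.

```python
class VariableStore:
    """Run-time data store shared by form elements and configure elements."""

    def __init__(self, redraw):
        self._values = {}
        self._redraw = redraw  # callback standing in for the Redraw function

    def get_var(self, name):
        # Assumption: a variable that was never set reads as False.
        return self._values.get(name, False)

    def set_var(self, name, value):
        self._values[name] = value
        self._redraw()  # refresh the view so dependent elements update at once


class CheckBox:
    """Form element whose state lives entirely in the variable store."""

    def __init__(self, store, var):
        self.store = store
        self.var = var

    def is_checked(self):
        return self.store.get_var(self.var)

    def toggle(self):
        self.store.set_var(self.var, not self.is_checked())


redraw_calls = []
store = VariableStore(redraw=lambda: redraw_calls.append("redraw"))
box = CheckBox(store, "AppleTalk")
box.toggle()  # the user checks the AppleTalk box
print(box.is_checked(), len(redraw_calls))
```

Because the check box and the configure element both read the same named variable, neither needs a direct reference to the other.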

Thus, by adding three simple functions, we were able to implement forms in a system which did not support forms directly. This also created a system for authoring configurable documents which required very little from the writer. Notice that the writer did not need to write conditional logic statements, or even maintain an explicit link between the form element and the document element it controlled. This concept was extended easily to create radio buttons and text fields once these basic facilities were available.

Integration with Other Tools

As mentioned earlier, Xerox has deployed many applications on our service engineers’ laptops. These include call management for dispatching service engineers on-line, parts inventory management, diagnostic tools, and expert systems. There are many possible interactions between the tools. This section looks at two styles of interaction and how they were implemented.

The first style of integration is to have the browser retrieve information from other tools and display that information within the electronic documentation. Including this information in the document makes using the information a natural part of reading the document. The user does not need to operate another software package to get the data, and the data will always be included correctly in the document, thus eliminating transcription errors.

The second style of integration involves allowing other software tools to use the browser’s services to display documentation. By doing this, a specialized tool (such as a network diagnostic tool) can perform its core function (diagnosing network faults) without having to include also all the information on how to repair the fault, adjust the components, or identify replacement parts. These general functions can continue to be performed by the electronic document. The result is less duplication of information and fewer interfaces for accessing it.

Client/Server Integration

There are places in the documentation where it is useful to display information from other laptop tools. Integrating with other tools in this manner puts the browser in the role of a client and the other tool in the role of a server.

For example, when the service engineer consults a parts list, the browser queries the parts inventory system on the laptop and displays the quantity on hand alongside each part, so the engineer can quickly decide whether the parts are available.

The parts inventory system did not have any application programmer’s interface (API) when we started development, so another development group created one by developing a parts server which directly accessed the parts database. This server solved the problem of getting data from the inventory system.
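As a rough illustration of the browser acting as a client of the parts server, consider the Python sketch below. The interface of the actual parts server is not described here, so the PartsServer class, its quantity_on_hand method, and the part numbers are all hypothetical stand-ins backed by an in-memory dictionary.

```python
class PartsServer:
    """Hypothetical stand-in for the server in front of the parts database."""

    def __init__(self, inventory):
        self._inventory = inventory

    def quantity_on_hand(self, part_number):
        # Assumption: parts missing from the inventory count as zero on hand.
        return self._inventory.get(part_number, 0)


def annotate_parts_list(parts, server):
    """Pair each (number, description) entry with its quantity on hand."""
    return [
        f"{number}  {description}  (on hand: {server.quantity_on_hand(number)})"
        for number, description in parts
    ]


# Invented part numbers and descriptions, for illustration only.
server = PartsServer({"P-1001": 3})
parts_list = [("P-1001", "Fuser roll"), ("P-2002", "Feed belt")]
for row in annotate_parts_list(parts_list, server):
    print(row)
```

Keeping the lookup behind a small server interface means the document display code never touches the inventory database directly.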

Another example of client/server integration is the way the browser enables documents to access data from the machine being serviced. Many Xerox products include computers that store data about the current and past state of the machine. This data is useful in helping diagnose or repair a machine. Where the paper document might have said "Check the copy count since last service," the electronic document can display "The copy count since last service is 58,745." This makes using machine data very easy for the CSEs, reducing both the time required to use the machine data tool and the need to remember when to use it. It also eliminates errors in using the tool and transcribing the data.

To integrate this information into the document, we created an SGML entity to represent each piece of machine data and a small plug-in module to the browser which knows how to access the data on the machine. The stylesheet mechanism associates the SGML with the plug-in module. All the software unique to accessing data on the machine is isolated in the plug-in, thus eliminating the need to change the browser itself. By using different plug-in modules, the browser can access different types of machines.
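The idea of resolving a machine-data entity through a plug-in can be sketched as follows. This is a simplified model, not the stylesheet mechanism itself: the entity name copy-count, the FakeMachinePlugin class, and the expand_machine_data function are hypothetical, and the plug-in returns canned values rather than talking to a machine.

```python
import re


class FakeMachinePlugin:
    """Hypothetical plug-in; a real one would query the machine being serviced."""

    def __init__(self, data):
        self._data = data

    def read(self, item):
        # Raises KeyError for data items this plug-in does not know about.
        return self._data[item]


# Matches entity references of the form &name;
ENTITY_REFERENCE = re.compile(r"&([A-Za-z][A-Za-z0-9.-]*);")


def expand_machine_data(text, plugin):
    """Replace each entity reference with the value supplied by the plug-in."""

    def replace(match):
        value = plugin.read(match.group(1))
        # Integers are shown with thousands separators, as in the example text.
        return f"{value:,}" if isinstance(value, int) else str(value)

    return ENTITY_REFERENCE.sub(replace, text)


plugin = FakeMachinePlugin({"copy-count": 58745})
sentence = "The copy count since last service is &copy-count;."
print(expand_machine_data(sentence, plugin))
```

Swapping in a different plug-in object changes how the data is fetched without touching the expansion code, which mirrors the isolation described above.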

Server/Client Integration

The browser can act as an information server, providing access both to data in the document and to display services. The most common use of this facility is to allow an application to display part of the electronic documentation within its own window. To provide this service, we gave the browser a remote control API which includes the ability to manipulate the user interface. Applications which take advantage of this option can avoid having to develop their own document display capabilities and populate their own document databases.

A client application can configure the browser to have all, some, or none of its normal user interface elements. At one end of the spectrum is the client which simply wishes to point the browser at a section of the document. This can be done with a few API commands. At the other end is the application which wants to embed the documentation within its own windows. In this situation, the client turns off all the browser’s user interface elements (menus, buttons, title bars, etc.) and puts the browser’s main window within one of its own windows. Interaction between the two applications is still handled at a high level through the remote control interface, but the appearance is one of having the documentation as a part of the client application.
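The two ends of this spectrum can be pictured with the Python sketch below. The commands of the actual remote control API are not listed in this article, so the command names (goto, hide, reparent), the BrowserRemote class, and the window and section identifiers are purely hypothetical; the proxy only records commands instead of sending them to a browser.

```python
from dataclasses import dataclass, field


@dataclass
class BrowserRemote:
    """Hypothetical remote-control proxy that records commands for inspection."""

    sent: list = field(default_factory=list)

    def send(self, command, *args):
        self.sent.append((command, *args))


def point_at_section(remote, section_id):
    """Simplest client: just direct the browser to a section of the document."""
    remote.send("goto", section_id)


def embed_in_client(remote, parent_window, section_id):
    """Embedding client: strip the browser's own interface, then adopt its window."""
    for element in ("menus", "buttons", "title-bar"):
        remote.send("hide", element)
    remote.send("reparent", parent_window)
    remote.send("goto", section_id)


remote = BrowserRemote()
embed_in_client(remote, "client-window-1", "section-4.2")
for command in remote.sent:
    print(command)
```

In both cases the client works only through high-level commands and never through the browser's internals.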

This approach to providing the browser’s display services to other applications is not ideal for technical reasons. It requires the client to do some kludgey programming to control the browser’s main window, and because the browser and the client application run as separate processes, there are some subtle problems in controlling which application has the focus. We would have preferred to provide a more standard library-type API, but this was precluded by both technical and funding limitations.

Technologies

Developing the browser primarily involved integrating commercial products. We wrote very little low-level code ourselves, opting to incorporate existing products for all the major elements of the viewer. The time and resources required to develop the application were reduced significantly compared to ground-up development. The first release of the program was produced in eight months by one full-time and one part-time programmer, and one part-time SGML expert. The tools we used are described below.

Description of Target Laptop

The laptop computer deployed to the field during 1995 and 1996 was an 8 MB, 66 MHz 486-based machine with an internal double-speed CD-ROM drive. The operating system was Microsoft Windows 3.1. Including an internal CD-ROM drive was an expensive option at the time, but it contributed significantly to user acceptance of the total package.

DynaText from Inso Corporation (formerly Electronic Book Technologies)

DynaText forms the backbone of the application. It provides SGML display and database facilities, including full-text and structured searching of the document. At the time we made our choice, DynaText’s chief advantages over other tools were a well-developed API accessible from C or C++, speed (DynaText’s SGML database is very fast), formatting power, and a powerful stylesheet language.

Its major weaknesses have turned out to be its reliance on a single, compiled SGML database which limits our ability to do incremental updates, and the fact that its library is not re-entrant (meaning that only one application can use the library at a time).

Raster Images

We originally released the browser using DynaText’s built-in raster viewer, but its performance soon proved inadequate. We then licensed ViewDirector from TMS, Inc. ViewDirector is very fast, memory efficient, and stable, and it has a good API. Its main weakness is the somewhat limited number of graphics formats it supports, which was not a problem for us because we control the format used for all our graphics.

Video and Sound

To play video clips and sound, we used the multimedia support built into Microsoft Windows. It works well and costs nothing to use. It will also stay up to date with current trends in formats and capabilities.

Conclusion

Our electronic documentation project has successfully met its major design goals and is functioning well in the field.

The OmniBrowser is currently used to view documents developed by three different organizations within Xerox, using different DTDs. All these documents present a common interface to the field technicians because we map their structural differences onto a common user interface with stylesheets. Our policy of separating the document from the viewer has enabled us to expand the application of the OmniBrowser to non-service documents, such as installation manuals.

The user interface has been easy to learn and use, although we continue to receive valuable criticism from our users and collaborators. Though we tried to provide only essential functions, there are still users who have not explored or used all the options available. We still find users who don’t know that they can make notes or zoom a graphic. Perhaps the most important thing we have learned about the user interface is how indistinguishable it is from the document itself. Our users simply don’t see the document and the viewer as separate entities. The implication for us as developers is that we must put as much energy into document engineering as software engineering.

Graphic performance has been acceptable to the users, although they would clearly prefer interactive circuit diagrams which allowed them to follow wires, and some have asked for three-dimensional models. The linking between text and graphics has been cited as very useful, and is one of the most requested enhancements to documents (users want even more links). The video capability has not been used extensively, primarily because our first documents were for existing products and video development was not funded for these.

Integration between electronic documentation and other laptop tools has worked well. It is being used in new products and we are retrofitting existing documentation with the capability to use machine data to control the document’s content. We have issued an installation manual which uses a form to collect configuration information, then tailors the document to match the configuration specified.

Electronic documentation still suffers from not having the high resolution of paper, so studying large-format circuit diagrams on screen is difficult. CSEs have requested that these portions of the document remain available in printed form.

Xerox’s electronic documentation initiative is an ongoing activity. We have now published electronic documents covering the majority of products we service and have deployed laptop computers to a sizable portion of our field organization. The CSEs have been very positive about the impact of these tools on their work environment. While we do not have conclusive data as to the impact of the laptops on CSE productivity, we do have strong anecdotal evidence that the CSEs are enjoying not carrying around paper manuals, that CD-ROMs are cheaper to duplicate and ship than paper documents, and that electronic documentation, by virtue of being lighter than paper, is more likely to be used than paper documentation.

References:

1. Travis, Brian E., Waldt, Dale C., "SGML Implementation Guide: A Blueprint for SGML Migration," Springer, 1995. pp. 372.

2. Gellerman, Robin, "Intranets: Get the Most out of your SGML Source," SGML 96 Conference Proceedings, Graphic Communication Association, 1996.

3. Goldfarb, Charles F., "The SGML Handbook," Clarendon Press, Oxford, 1990.

4. ISO/IEC 10744:1992, "Information Technology -- Hypermedia/Time-based Structuring Language (HyTime)," Ed. 1 125p., 1995.

5. Travis, pp. 365-368.

6. Maziarka, Michael, "Representing Information Applicability Using SGML Constructs: Marked Sections or Element/Attribute Representations?" SGML 96 Conference Proceedings, Graphic Communication Association, 1996.

Bio: Mark Harmison is a Technical Specialist/Project Manager at Xerox Corporation in Webster, New York. He graduated from Purdue University with a B.S. in Mathematics and a minor in Education. Between college and his current position, he has worked for NCR Corporation, General Electric Consulting, and as an independent consultant. He lives with his wife and two children in Rochester, New York.