[Mirrored from: http://www.acm.org/pubs/toc/Abstracts/cacm/32209.html]

Citation page for article published by ACM, the First Society in Computing. © ACM, Inc.

Markup systems and the future of scholarly text processing

Coombs, James H.
Renear, Allen H.
DeRose, Steven J.

Communications of the ACM
Vol. 30, No. 11 (Nov. 1987), pp. 933-947


General Terms

DESIGN, DOCUMENTATION, HUMAN FACTORS, LANGUAGES, STANDARDIZATION

Categories and Subject Descriptors

I.7.2 : Computing Methodologies, TEXT PROCESSING, Document Preparation
I.7.1 : Computing Methodologies, TEXT PROCESSING, Text Editing
K.6.4 : Computing Milieux, MANAGEMENT OF COMPUTING AND INFORMATION SYSTEMS, System Management
K.1 : Computing Milieux, THE COMPUTER INDUSTRY, Standards
K.2 : Computing Milieux, HISTORY OF COMPUTING, Software


Review

Chris Hallgren

This paper considers the academic text in relation to text processing. In some ways, the authors' view strikes a pose promised by the myth of truly portable generalized markup languages that loomed on the horizon in the 1960s. They press their case for descriptive markup again and again, as academic research and writing finds its way into many papers, publications, and venues, each with a different format.

Descriptive markup identifies the parts of a document without saying how they are to be handled in terms of spacing, arrangement, punctuation, or typography. The problem with a descriptive markup language is that it is tied to a compiler that can read the descriptive tokens in the text and perform the necessary processing. If the document is used only on the author's computer, this poses no difficulty; if it is shared in machine-readable form, getting the document to come out correctly takes a good deal of fussing with programs.
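To make the distinction concrete (the commands and tag names below are illustrative composites, not taken verbatim from the paper): procedural markup embeds formatter-specific instructions in the text, while descriptive markup only names the role of each passage and leaves its appearance to whatever system processes it.

    Procedural markup (formatting commands interleaved with the text):

        .sk 1
        .in +5
        When in the course of human events ...
        .in -5
        .sk 1

    Descriptive markup (the same passage labeled by its role):

        <long-quotation>
        When in the course of human events ...
        </long-quotation>

With procedural markup the author must know, and repeat, the right formatting commands at every quotation; with descriptive markup the decision about how a long quotation is set is made once, in the processing system.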

From their particular vantage point, the authors broadly survey the methods used to prepare and format text documents. They correctly identify problems of maintainability and portability both in the more primitive text-processing systems (simple punctuation tokens or commands) and in the more sophisticated ones (desktop publishing format profiles). Huge documents can require weeks of re-marking when the interpreter or system is changed. Also, for specialized text requirements such as formal dissertations or academic documents, descriptive tokens do have some advantages in handling the picayune differences in how document parts are treated in different contexts.
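A brief sketch of that re-marking point (the rule notation here is invented for illustration): because a descriptively marked document records only element types, moving it to a new formatter or house style means rewriting one table of formatting rules rather than re-editing every occurrence in the text.

    Document source (identical on both systems):

        <section-title>Markup systems</section-title>
        <long-quotation> ... </long-quotation>

    Old system's formatting rules:

        section-title  -> centered, boldface, two blank lines before
        long-quotation -> indent 10, single spaced

    New system's formatting rules:

        section-title  -> flush left, capitals, one blank line before
        long-quotation -> indent 5, reduced type size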

However, the authors neglect several aspects of the wide field of text processing, to the advantage of their argument. Most systems (Microsoft Word being one) have a page preview function, so that simple text changes can be checked on the screen without committing the user to them. Also, desktop publishing has greatly enhanced the ease with which an author can change fonts, styles, and other aspects of a document. Some of the more sophisticated systems, such as Interleaf, can employ a token system to define the parts of a document while controlling styles dynamically, in real time, within the same WYSIWYG environment. Any markup language that is not WYSIWYG must lead two lives: marked-up source and formatted output. Moving between the two is often laborious, not to mention wasteful of storage and printed drafts.

The paper's strong points are a good summary of academic text requirements and of the less sophisticated mainframe methods of producing publishable copy. The authors also supply a good history of text-processing methods up to the mid-1970s.