Abstract for "Markup Reconsidered" (Raymond/Tompa/Wood)


We describe some of the implications of markup for document management systems. Markup's properties are inherited from text, since it is embedded in text. These properties are most advantageous when document structure is reducible to substrings of characters, and when the update characteristics of the structure are similar to the update characteristics of the text. We describe situations in which these characteristics are disadvantageous. Markup is not a data model, but one of several possible techniques for representing structure. For this reason it should not be the foundation of document management systems.

Postscript (by rcc)

The reference to this technical report may appear slightly frivolous in the context of this larger document, but it is not meant to be critical of the authors' serious comment on the limitations of markup. It is intended as a reminder that SGML has had significant intelligent detractors from the beginning (with published and unpublished criticisms), and that SGML pressed into service for bad uses will not credit the Standard or help users. For several years, Darrell Raymond and I have had a friendly running conversation about the (ultimate) usefulness of (SGML) markup -- for various alleged purposes. I think he has some good points, particularly with respect to the observation, which I have felt painfully, that "text" is susceptible to multiple simultaneous hierarchical descriptions. In practice (viz., current software) SGML has no clean, straightforward means of dealing with this problem: there must be one privileged hierarchy. I think Darrell's observations from the database perspective, in spirit, deserve to be pondered and kept in mind. -- Robin Cover