The OASIS Cover Pages: The Online Resource for Markup Language Technologies
Last modified: July 11, 1997
SGML Bibliography 1994

Copyright (c) Robin Cover [1986] 1994 - 1997. Last modified July 11, 1997.

This document http://xml.coverpages.org/sgmlbib0.html is part of the SGML Web Page. It has been superseded. Support for development and maintenance of the SGML Web Page is provided in part by SoftQuad, Inc. and by the Summer Institute of Linguistics, to whom gratitude is acknowledged.


SGML Bibliography


NOTE: This document is no longer up-to-date, and some of its links are probably broken. The current SGML bibliography is available from a top-level index to a series of smaller files. Please use this later source for more authoritative information -- not the document which follows.

Composition and editorial copyright (c) Robin Cover 1994. Last updated December 22, 1994.



Introduction

The following bibliography on SGML represents a small subset of bibliographic data on SGML I have collected since 1986, and have published in various formats. References to the earlier collections are given in the main SGML Page. The present listing provides references for the most essential publications on SGML, and representative titles from the larger corpus of secondary literature. I hope to be able to provide a subject index for these references and for the larger database of bibliographic materials, and perhaps database access. The delay is occasioned by a search for a good strategy to generate HTML from a real SGML knowledgebase, and for a means of chunking the information into manageable units for delivery on the Internet. Meantime, since the file is searchable in any good WWW browser, we hope it will be found useful.

Several Popular Resources

General and Introductory

The SGML Standard and The Handbook

Other SGML Handbooks

HyTime Books

SGML Resources on CD-ROM


References and Abstracts


ACH/ACL/ALLC (Association for Computers and the Humanities, Association for Computational Linguistics, Association for Literary and Linguistic Computing). Guidelines for Electronic Text Encoding and Interchange (TEI P3). Edited by C. M. Sperberg-McQueen and Lou Burnard. [Chicago/Oxford]: ACH/ALLC/ACL, [April 8] 1994. 2 volumes. xxvi + 1290 pages. See ordering information for the 2 volumes here.

For an excellent general introduction to SGML, see Chapter 2 of the Guidelines (pages 13-36): "A Gentle Introduction to SGML." Chapter 2 supplies a broad introduction to SGML, but the remainder of the two volumes will be of interest to anyone planning to implement SGML for analysis of literary and linguistic data. For online hypertext versions of Chapter 2, see overview section. The SGML introduction chapter (2) is also available along with the other chapters via anonymous-FTP from various sources on the Internet where the TEI P3 documents are archived. For example: the SGML Project at Exeter ftp://info.ex.ac.uk/tei/p3/doc/p3sg.doc, or ftp://ftp-tei.uic.edu/pub/tei/doc/p3sg.doc, or from the SGML Repository ftp://ftp.ifi.uio.no/pub/SGML/TEI/P3SG.DOC. Using mail-based access, send a message to listserv@uicvm.uic.edu with the message line: get P3SG DOC; for the listing, send the message line: index tei-l; for the entire set of P3 files: get P3ALL $PACKAGE.


Adler, Sharon C. "The birth of a standard (SGML)." Journal of the American Society for Information Science 43/8 (September 1992) 556-558. ISSN: 0002-8231. (4) references. Author affiliation: IBM Corporation, Boulder, CO. Abstract: The Standard Generalized Markup Language (SGML) was adopted as an international standard for data description, data modeling, and interchange in October 1986. This article explores the evolution of the standard following its technical completion and leading to widespread market acceptance.


Adler, Sharon C. "DSSSL- Document Style Semantics and Specification Language." <TAG> 1/8 (January 1989) 8-9. An overview of the standard by the editor of DSSSL. For brief description of the goals of DSSSL, see the entry below on this Draft International Standard (ISO/IEC DIS 10179).


Ahearn, Hally. "SGML and the New Yorker Magazine." Technical Communication: Journal of the Society for Technical Communication 40/2 (Second Quarter, May 1993) 226-229. ISSN: 0049-3155. Author affiliation: Oster & Associates, Inc. [SGML case history; need abstract]


Alschuler, Liora. "Special Section: Standard Generalized Markup Language. Introduction." Technical Communication: Journal of the Society for Technical Communication 40/2 (Second Quarter, May 1993) 208-290, and 40/3 (Third Quarter, August 1993) 376-378. ISSN: 0049-3155. Author affiliation: Miles-Samuelson, Inc. These two issues of Technical Communication have eight (8) articles on SGML. See: [xrefs, not complete yet].


Amsler, Robert A.; Tompa, Frank W. "An SGML-Based Standard for English Monolingual Dictionaries." In Fourth Annual Conference of the UW Centre for the New Oxford English Dictionary: Information in Text. Proceedings of the Conference. Conference held in Waterloo, Ontario, Canada, 26-28 October 1988. Pages 61-79. Waterloo, Ontario: University of Waterloo, 1988. The 'Dictionary Encoding Initiative' referenced is loosely affiliated with the international Text Encoding Initiative; both projects seek to employ SGML. For SGML used in dictionary markup, see also Tompa below. Several of the Waterloo Annual Conference volumes contain articles germane to descriptively-tagged and SGML-tagged text. For further details on the Waterloo Centre, see Gonnet below.


Angerstein, Paula. "An Introduction to the Document Style Semantics and Specification Language (DSSSL): A Description of the DSSSL Standard and on its Status." CALS Journal (Spring 1993) 67-72. [This article is now (November 1994) partially out-of-date with respect to details in the current DSSSL draft, but it supplies a useful overview.]


Association of American Publishers. Author's Guide to Electronic Manuscript Preparation and Markup. 2nd edition, November 1987. Reprinted 1989. ISBN: 1-55653-086-2. Available from EPSIG.


Association of American Publishers. The Markup of Mathematical Formulas. 2nd edition, 1987. Reprinted 1989. ISBN: 1-55653-083-8. Available from EPSIG.


Association of American Publishers. The Markup of Tabular Material. 2nd edition, 1987. Reprinted 1989. ISBN: 1-55653-085-4. Available from EPSIG.


Association of American Publishers. Reference Manual on Electronic Manuscript Preparation and Markup. 2nd edition, November 1987. Reprinted 1989. ISBN: 1-55653-084-6. Available from EPSIG.


Ballanti, Anna; Cork, Deborah; Dam, Lex van; Jonghe, Jurgen de; Herwijnen, Eric van; Nijdam, Marco; Samarin, Alexandre; Shave, Tony. "Text Processing at CERN. Part 1: Overview." SGML Users' Group Bulletin 3/2 (1988) 39-54.


Barnard, David T.; Fraser, Cheryl A.; Logan, George M. "Generalized Markup for Literary Texts." Literary and Linguistic Computing 3/1 (1988) 26-31. Abstract: Encoding literary texts for analysis, electronic transmission, or publication requires the marking of various substantive, structural and formal features. The development of a standard comprehensive markup language for these purposes is a desideratum. This paper offers a set of requirements for such a language, reviews related work, and describes a newly-created standard based on the Standard Generalized Markup Language.


Barnard, David T.; Hayter, Ron; Karababa, Maria; Logan, George M.; McFadden, John. "SGML Based Markup for Literary Texts: Two Problems and Some Solutions." Computers and the Humanities 22/4 (1988) 265-276. ISSN: 0010-4817. (Revision of Technical Report 204, Queen's University Department of Computing and Information Science, 1988, ISSN 0836-0227). Abstract: There is wide agreement on the need for a markup standard for encoding literary texts. The Standard Generalized Markup Language (SGML) seems to provide the best basis for such a standard. But two problems inhibit the acceptance of SGML for this purpose. (1) Computer-assisted textual studies often require the maintenance of multiple views of a document's structure but SGML is not designed to accommodate such views. (2) An SGML-based standard would appear to entail the keyboarding of more markup than researchers are accustomed to, or are likely to accept. We discuss five ways of reducing the burden of markup. We conclude that the problem of maintaining multiple views can be surmounted, though with some difficulty, and that the markup required for an SGML-based standard can be reduced to a level comparable to that of other markup schemes currently in use.


Barnard, David T.; Macleod, Ian A. "Maestro Working Paper 0: An Archive of Structured Texts." Technical Report 89-262. Department of Computing and Information Science, Queen's University at Kingston, Kingston, Ontario, Canada. November 14, 1989. 10 pages. Abstract: We describe a research project to create a text archive system known as Management Environment for Structured Text Retrieval Online (Maestro). The system combines traditional text retrieval capabilities with structural queries based on a hierarchic representation of documents, and browsing based on non-hierarchic links within a single document or among a set of documents.


Barron, David. "Why Use SGML?" Electronic Publishing: Origination, Dissemination and Design (EPOdd) 2/1 (April 1989) 3-24. CODEN: EPODEU; ISSN 0894-3982. Abstract: The Standard Generalised Markup Language (SGML) is a recently-adopted International Standard (ISO 8879). The paper presents some background material on markup systems, gives a brief account of SGML, and attempts to clarify the precise nature and purpose of SGML, which are widely misunderstood. It then goes on to explore the reasons why SGML should (or should not) be used in preference to older-established systems. A summary of the article is also printed in "Why Use SGML," SGML Users' Group Newsletter 13 (August 1989) 10.


Beach, Richard J. Setting Tables and Illustrations with Style. PhD dissertation, University of Waterloo. [Published as] Technical Report CS-85-45. Department of Computer Science, University of Waterloo, Waterloo, Ontario. May 1985. Also available under the same title as: Technical Report CSL-85-3. Palo Alto, CA: Xerox Palo Alto Research Center [PARC], 1985.


Berglund, Anders. "SGML -- What is It?" In Proceedings of SEAS Anniversary Meeting 1985: User Friendly Computing (23-27 September 1985, Zurich), 1: 187-194. Nijmegen: SHARE Eur. Assoc, 1985. Abstract: SGML is intended to be the Standard Generic Markup Language providing the framework for marking up a document in a way that should be processable by products from different vendors and for different output devices.


Bingham, Harvey W. SGML Syntax Summary. Cambridge, MA: Interleaf, 2-June-1988. 46 pages. The document [now mostly superseded by other reference tools] supplies cross-reference information which is not given or optimally accessible in the ISO 8879 standard itself. The syntax summary covers the primary ISO document (8879), Amendment 1 (Fall 1987) and Amendment 1, Corrections (May 1988). Copies of the syntax summary were mailed to subscribers of <TAG> with issue 1/4 (1988). Updates are (were?) available from Interleaf.


Böhm, Klemens; Aberer, Karl; Hüser, Christoph. "Extending the Scope of Document Handling: The Design of an OODBMS Application Framework for SGML Document Storage." Technical paper. 17 pages. GMD-IPSI, 1994. (Email contact: kboehm@darmstadt.gmd.de) [needs abstract]


Brooks, Kenneth P. "Lilac: A Two-View Document Editor." IEEE Computer 24/6 (June 1991) 7-19. ISSN: 0018-9162. Author affiliation: Digital Equipment Corporation. Published summary: "By offering both WYSIWYG editing and language-based document description side by side, the Lilac document preparation system gives users the best of both worlds." For details, see the author's dissertation.

Abstract: A description is given of Lilac, an experimental document preparation system designed to provide the best of both the WYSIWYG (what you see is what you get) and the document compiler approaches. Lilac does this by offering both WYSIWYG editing and language-based document description as two views side by side on the screen. The page view is a WYSIWYG editor showing a close approximation to the printed output. The source view shows a program-like description of the document in a special-purpose language. This language supports subroutines, variables, and conditional execution, and is designed to encourage the use of subroutines to embody structure. Both views are editable, but Lilac is designed with the expectation that most editing will occur in the page view.


Brooks, Kenneth P. A Two-view Document Editor with User-definable Document Structure. Systems Research Center Technical Report, 33. Palo Alto, CA: Digital Equipment Corporation, November 1, 1988. vi + 193 pages. The report, with minor changes, represents the author's PhD dissertation presented to the Department of Computer Science, Stanford University, May 1988. Abstract: "Lilac is an experimental document preparation system which combines the best features of batch-style document formatters and WYSIWYG editors. To do this, it offers the user two views of the document: a WYSIWYG view and a formatter-like source view. Changes in either view are rapidly propagated to the other. This report describes both the user interface and design and the implementation mechanisms used to build Lilac." See also the description of Lilac in IEEE Computer.


Brown, Allen L., Jr.; Wakayama, Toshiro; Blair, Howard A. "A Reconstruction of Context-Dependent Document Processing in SGML." Pages 1-25 in EP [Electronic Publishing] 92: Proceedings of Electronic Publishing, 1992 [International Conference on Electronic Publishing, Document Manipulation, and Typography. Swiss Federal Institute of Technology, Lausanne, Switzerland. April 7-10, 1992. Sponsored by the Swiss Federal Institute of Technology and the Swiss National Science Foundation]. Edited by Christine Vanoirbeek and Giovanni Coray. The Cambridge Series on Electronic Publishing. Cambridge: Cambridge University Press, 1992. ISBN: 0-521-43277-4.

Abstract: SGML achieves a certain degree of context-dependent document processing through attributes and linking. These mechanisms are insufficient in several respects. To address these shortcomings we propose augmenting SGML's !LINK and !ATTLIST constructs with two new mechanisms, coordination and (rule-based) attribution. These mechanisms can be used to specify the result of context-dependent processing in a uniform fashion while considerably increasing SGML's expressive power. We illustrate this enhanced power by sketching a specification of (the result of) document layout that can be encoded in SGML augmented with coordination and attribution.


Brüggemann-Klein, Anne. Compiler-Construction Tools and Techniques for SGML Parsers: Difficulties and Solutions. Technical Report. Freiburg: Institut für Informatik, Universität Freiburg, May 29, 1994. To appear in Electronic Publishing: Origination, Dissemination and Design. For a Postscript version, try FTP from Freiburg. Author's address: Anne Brueggemann-Klein, Institut fuer Informatik, Universitaet Freiburg, Rheinstrasse 10-12, 79104 Freiburg, Germany, email: brueggem@informatik.uni-freiburg.de.

Abstract: The Standard Generalized Markup Language (SGML) is used to represent documents in an application-independent manner. In a recent paper, Nordin et al. analyze concisely which properties of the SGML language are hindering its more widespread use and acceptance. In particular, they identify a number of features in the SGML standard that make it difficult to apply commonly used implementation tools and techniques to build an SGML parser. One feature, however, or rather one combination of two features, escapes their notice. Unambiguity and the & operator were both intended to make SGML document grammars easier to read by humans. It is questionable, though, whether this goal is really achieved. At least, the combination of unambiguity and the & operator raises unforeseen problems in validating the grammars and in parsing the documents by machines. I am describing these problems here in detail. On the basis of this analysis, the standards committees that are currently revising the standard can make an informed decision on the future of the two features.


Brüggemann-Klein, Anne. Formal Models in Document Processing. Habilitationsschrift, vorgelegt zur Erlangung der venia legendi für Informatik an der Mathematischen Fakultät der Albert-Ludwigs-Universität zu Freiburg i.Br. Freiburg, 1993. 110 pages, bibliography, index. For a Postscript version, try FTP from Freiburg. Author's address: Anne Brueggemann-Klein, Institut für Informatik, Universitaet Freiburg, Rheinstrasse 10-12, 79104 Freiburg, Germany, email: brueggem@informatik.uni-freiburg.de.

Summary: Part I of the dissertation (pages 1-64) treats the formal properties of documents, and particularly marked-up documents, in terms of unambiguous expressions and unambiguous languages. Part II of the dissertation is concerned with design specifications for layout of a structured document as implemented in Designer: Chapter 4 discusses "The constituents of design specifications," and Chapter 5 discusses "The design specification language." The presentation focuses upon aspects of document, formatter, and style-sheet models that are applicable within the general context of logically marked-up documents. The approach to document design taken in Designer "emulates, in the context of electronic publishing, the separation of concerns between authoring, editing, designing, and typesetting that is well-established in the traditional publishing industry." (p. 4)


Brüggemann-Klein, Anne; Wood, Derick. Deterministic Regular Languages. Bericht 38. Universität Freiburg, Institut für Informatik. Oktober 1991. 18 pages.

Abstract: The ISO Standard for Standard Generalized Markup Language (SGML) provides a syntactic meta-language for the definition of textual markup systems. In the standard the right hand sides of productions are called content models and they are based on regular expressions. The allowable regular expressions are those that are "unambiguous" as defined by the standard. Unfortunately, the standard's use of the term "unambiguous" does not correspond to the two well known notions, since not all regular languages are denoted by "unambiguous" expressions. Furthermore, the standard's definition of "unambiguous" is somewhat vague. Therefore, we provide a precise definition of "unambiguous expressions" and rename them deterministic regular expressions to avoid any confusion. A regular expression E is deterministic if the canonical ε-free finite automaton M_E recognizing L(E) is deterministic. A regular language is deterministic if there is a deterministic expression that denotes it. We give a Kleene-like theorem for deterministic regular languages and we characterize them in terms of the structural properties of the minimal deterministic automata recognizing them. The latter result enables us to decide if a given regular expression denotes a deterministic regular language and, if so, to construct an equivalent deterministic expression.
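
[Illustration] The definition in the abstract above can be made concrete with a short sketch. It is not the authors' algorithm or code; it is a minimal Python illustration that assumes the "canonical ε-free finite automaton" is the position (Glushkov) automaton, in which every occurrence of a symbol in the expression becomes a state. The tuple-based expression format and the names `glushkov` and `is_deterministic` are hypothetical, chosen only for this example, and SGML-specific connectors such as `&` and `+` are not handled.

```python
from collections import defaultdict

# Hypothetical mini-AST for regular expressions (not SGML syntax):
#   ("sym", "a")      a single symbol occurrence
#   ("cat", l, r)     l followed by r
#   ("alt", l, r)     l or r
#   ("star", e)       zero or more repetitions of e
#   ("opt", e)        zero or one occurrence of e


def glushkov(expr):
    """Return (first, follow, symbol_of) for the position automaton of expr."""
    symbol_of = {}             # position -> symbol at that position
    follow = defaultdict(set)  # position -> positions that may come next

    def walk(e):
        """Return (nullable, first positions, last positions) of e."""
        kind = e[0]
        if kind == "sym":
            pos = len(symbol_of)
            symbol_of[pos] = e[1]
            return False, {pos}, {pos}
        if kind in ("cat", "alt"):
            n1, f1, l1 = walk(e[1])
            n2, f2, l2 = walk(e[2])
            if kind == "alt":
                return n1 or n2, f1 | f2, l1 | l2
            # Concatenation: anything ending the left part may be
            # followed by anything starting the right part.
            for p in l1:
                follow[p] |= f2
            first = (f1 | f2) if n1 else f1
            last = (l1 | l2) if n2 else l2
            return n1 and n2, first, last
        if kind in ("star", "opt"):
            _, f, l = walk(e[1])
            if kind == "star":
                # Repetition: the end of the body may loop back to its start.
                for p in l:
                    follow[p] |= f
            return True, f, l
        raise ValueError(f"unknown node kind: {kind!r}")

    _, first, _ = walk(expr)
    return first, follow, symbol_of


def is_deterministic(expr):
    """True if no state of the position automaton has two successors
    (or two initial positions) labelled with the same symbol."""
    first, follow, symbol_of = glushkov(expr)

    def clash(positions):
        symbols = [symbol_of[p] for p in positions]
        return len(symbols) != len(set(symbols))

    if clash(first):
        return False
    return not any(clash(successors) for successors in follow.values())


a, b, c = ("sym", "a"), ("sym", "b"), ("sym", "c")

# (a, b) | (a, c): two different occurrences of "a" can start a match.
ambiguous = ("alt", ("cat", a, b), ("cat", a, c))
# a, (b | c): same language, but only one occurrence of "a".
factored = ("cat", a, ("alt", b, c))

print(is_deterministic(ambiguous))  # False
print(is_deterministic(factored))   # True
```

Under these assumptions, `(a|b)*a` is also reported as non-deterministic, while `b*a(b*a)*`, which denotes the same language, passes the check. This mirrors the distinction the abstract draws between a deterministic expression and a deterministic language.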


Brüggemann-Klein, Anne; Wood, Derick. "Deterministic Regular Languages." Pages 173-184 in STACS '92: Proceedings of the 9th Annual Symposium on Theoretical Aspects of Computer Science [Cachan, France, 13-15 February 1992]. Edited by A. Finkel and M. Jantzen. Lecture Notes in Computer Science, 577. Berlin: Springer Verlag, 1992. ISBN: 3-540-55210-3. For a Postscript version, try FTP from Freiburg. Author's address: Anne Brueggemann-Klein, Institut fuer Informatik, Universitaet Freiburg, Rheinstrasse 10-12, 79104 Freiburg, Germany, email: brueggem@informatik.uni-freiburg.de. Abstract: (see the TR version).


Brüggemann-Klein, Anne; Wood, Derick. Electronic Style Sheets. Technical Report [UWO] 350. March 2, 1993. Department of Computer Science, University of Western Ontario, London, Ontario. 12 pages, bibliography. Supported under NSERC and ITRC grants of Derick Wood. The paper is available via FTP to UWO; ftp://ftp.csd.uwo.ca/pub/csd-technical-reports/350/. Authors' addresses: Anne Brueggemann-Klein, Institut fuer Informatik, Universitaet Freiburg, Rheinstrasse 10-12, D-7800 Freiburg, Germany, email: brueggemann@informatik.uni-freiburg.de; Derick Wood, Department of Computer Science, University of Western Ontario, London, Ontario N6A 5B7, Canada. Email: dwood@csd.uwo.ca.

Abstract: Document processing systems must provide formatted versions of documents, where the specification of formats is the task of the document designer. To match the stylistic quality expected in the traditional publishing process, electronic style sheets need to support the design mechanisms that have evolved over the centuries. The designer's craft should not depend on the formatter; in particular it should not involve programming the formatter. We propose four basic mechanisms called transcription types that are sufficient to express a wide range of layouts. Building on these four transcription types, we have defined a layout specification language, Designer, that is declarative and formatter-independent.


Brüggemann-Klein, Anne. "Regular Expressions into Finite Automata." Theoretical Computer Science 120 (1993) 197-213. See shorter article in Latin '92 and the related technical report.


Brüggemann-Klein, Anne. "Regular Expressions into Finite Automata." Pages 97-98 in Latin '92, edited by I. Simon. Berlin: Springer Verlag, 1992. Lecture Notes in Computer Science, 583. See full article in Theoretical Computer Science and the related technical report.


Brüggemann-Klein, Anne. Regular Expressions into Finite Automata. Bericht 33. Universität Freiburg, Institut für Informatik. Juli 1991. 22 pages. Abstract: It is a well-established fact that each regular expression can be transformed into a non-deterministic automaton (NFA) with or without ε-transitions, and all authors seem to provide their own variant of the construction. Of these, Berry and Sethi [BS86] have shown that the construction of an ε-free NFA due to Glushkov [Glu61] is a natural representation of the regular expression, because it can be described in terms of the Brzozowski derivatives [Brz64] of the expression. Moreover, the Glushkov construction also plays a significant role in the document processing area: The SGML standard [ISO86], now widely adopted by publishing houses and government agencies for the syntactic specification of textual markup systems, uses deterministic regular expressions, i.e., expressions whose Glushkov automaton is deterministic, as a description language for document types. In this paper, we first show that the Glushkov automaton can be constructed in time quadratic in the size of the expression, and that this is worst case optimal. For deterministic expressions, our algorithm has even linear run time. This improves on the cubic time methods suggested in the literature [BEGO71, ASU86, BS86]. A major step of the algorithm consists in bringing the expression into what we call star normal form. This concept is also useful for characterizing the relationship between two types of unambiguity that have been studied in the literature. Namely, we show that, modulo a technical condition, an expression is strongly unambiguous [SS88] if and only if it is weakly unambiguous [BEGO71] and in star normal form. This leads to our third result, a quadratic time decision algorithm for weak unambiguity, that improves on the bi-quadratic method introduced by Book et al. [BEGO71]. (A version of this TR is also to appear in the conference proceedings of Latin '92.)


Brüggemann-Klein, Anne. "Unambiguity of Extended Regular Expressions in SGML Document Grammars." Pages 73-84 in Algorithms -- ESA '93: Proceedings of the First Annual European Symposium [(Bad Honnef, Germany. September 30 - October 2, 1993)], edited by Th. Lengauer. [Series title:] Lecture notes in computer science, 726. Berlin: Springer Verlag, 1993. (9) references. ISSN: 0302-9743. ISBN: 3540572732. [needs abstract]


Brüggemann-Klein, Anne; Wood, Derick. Unambiguous regular expressions and SGML document grammars. Technical Report # 337. November 12, 1992. Department of Computer Science, University of Western Ontario, London, Ontario. 21 pages, bibliography. ISBN: 0771414544. The paper is available via FTP to UWO; ftp://ftp.csd.uwo.ca/pub/csd-technical-reports/337/. Authors' address: Anne Brueggemann-Klein, Institut fuer Informatik, Universitaet Freiburg, Rheinstrasse 10-12, D-7800 Freiburg, Germany; Derick Wood, Department of Computer Science, University of Western Ontario, London, Ontario N6A 5B7, Canada.

Abstract: The ISO standard for the Standard Generalized Markup Language (SGML) provides a syntactic meta-language for the definition of textual markup systems. In the standard, the right-hand sides of productions are based on regular expressions; although only expressions that denote words unambiguously are allowed. In general, the fact that a word is denoted by an expression is witnessed by a sequence of occurrences of symbols in the expression that matches the word. In an unambiguous expression as defined by Book, Even, Greibach, and Ott, each word has at most one witness. But the SGML standard also requires that a witness can be computed incrementally from the word with a one-symbol lookahead; we call such expressions 1-unambiguous. A regular language is 1-unambiguous if it is denoted by some 1-unambiguous expression. We give a Kleene theorem for 1-unambiguous languages and characterize them in terms of structural properties of the minimal deterministic automata that recognize them. This result enables us to decide whether a given regular expression denotes a 1-unambiguous language; if it does, then we can construct an equivalent 1-unambiguous expression in worst-case optimal time.


Brüggemann-Klein, Anne; Wood, Derick. The validation of SGML content models. Technical Report # 355. March 21, 1993. Department of Computer Science, University of Western Ontario, London, Ontario. 15 pages, 13 references. ISBN: 0771415028. To appear in Mathematical and Computer Modelling. The paper is available via FTP to UWO; ftp://ftp.csd.uwo.ca/pub/csd-technical-reports/355/. Or try FTP from Freiburg. Authors' address: Anne Brueggemann-Klein, Institut fuer Informatik, Universitaet Freiburg, Rheinstrasse 10-12, D-7800 Freiburg, Germany; Derick Wood, Department of Computer Science, University of Western Ontario, London, Ontario N6A 5B7, Canada.

Abstract: The Standard Generalized Markup Language (SGML) is an ISO standard that provides a syntactic meta-language for the definition of textual markup systems, which are used to indicate the structure of documents so that they can be electronically typeset, searched, and communicated. We address only one problem raised by the standard, namely: In SGML, the right-hand sides of context-free productions are regular expressions, called content models, that are restricted to be what the standard calls "unambiguous," but what is more appropriately called deterministic. We solve the problem of how to define determinism precisely, how to recognize deterministic regular expressions efficiently, and how to recognize deterministic regular languages. Any SGML parser must check that a given document grammar conforms to the standard; that is, it must validate it. Hence, our results are an important step in the clarification of the standard and in the efficient implementation of an SGML parser for SGML document grammars.


Brüggemann-Klein, Anne; Wood, Derick. On the Expressive Power of SGML Document Grammars. "In preparation", [reference 1991].


Brüggemann-Klein, Anne; Wood, Derick. Parser Generators for Document Grammars. "Submitted for publication," 1991.


Bryan, Martin. "Creating Informative Document Models." SGML Users' Group Newsletter 20 (September 1991) 12-17.


Bryan, Martin. SGML: An Author's Guide to the Standard Generalized Markup Language. Wokingham/Reading/New York: Addison-Wesley, 1988. ISBN: 0-201-17535-5 (pbk); LC CALL NO: QA76.73.S44 B79 1988. 380 pages. A highly detailed manual explaining and illustrating features of ISO 8879. According to the publisher, the book: (1) shows how to analyse the inherent structure of a document; (2) illustrates a wide variety of markup tags; (3) shows how to design your own tag set; (4) is copiously illustrated with practical examples; (5) covers the full range of SGML features. Technical and non-technical authors, publishers, typesetters and users of desktop publishing systems will find this book a valuable tutorial on the use of SGML and a comprehensive reference to the standard. It assumes no prior knowledge of computing or typography on the part of its readers. See further description in a publisher's blurb.


Bryan, Martin. "A TeX User's Guide to ISO's Document Style Semantics and Specification Language." TUGboat 14/3 (1993) [Proceedings of the 1993 Annual Meeting] 223-226. [needs abstract; partially out-of-date w/ current DSSSL draft]


Bullard, Len (editor), with Eric L. Jorgensen (CDNSWC Code 192, Project Director) and other members of the MID development team. Metafile for Interactive Documents (MID); A Draft Specification for the Encoding of Interactive Documents. Bethesda, MD: Carderock Division, Naval Surface Warfare Center, November 1994. 119 pages. The document is available in several formats via anonymous FTP: to NavySGML, or via HTTP connection to the NAVY DTD/FOSI Repository. See the full text of the announcement for other details.

Summary: "This draft of the MID Specification has been prepared for purposes of review and comment by the general National and International technical community interested in standards for Interactive Electronic Documents which require a mechanism (i.e., script) for controlling the presentation of text, graphics, and other multimedia information developed for electronic display. It is written as an application of ISO 8879 SGML and utilizes portions of the ISO 10744 HYTIME extensions to SGML. While it was initiated by the Navy for purposes of developing a run-time standard for DoD Interactive Electronic Technical Manuals (IETMs), the MID Standard has been intentionally developed to be suitable for inclusion in an International-Level standard and to be applicable to generic scripted interactive documents of any nature and for any application. The Navy point of contact is Eric Jorgensen, CDNSWC Code 182, email: jorgense@oasys.dt.navy.mil."


Burnard, Lou. "What is SGML and How Does it Help?" Pp. 65-79 in Modelling Historical Data: Towards a Standard for Encoding and Exchanging Machine-Readable Texts, edited by Daniel Greenstein. Halbgraue Reihe zur Historischen Fachinformatik, Serie A, Historische Quellenkunden (edited by Manfred Thaller). Band 11. Published for the Max-Planck-Institut für Geschichte, by Scripta Mercaturae Verlag (St. Katharinen), 1991. iv + 223 pages. ISBN: 3-928134-45-0. See other volume information under the editor, Daniel Greenstein, below. A revised copy of Burnard's article in tagged electronic format is available from the UICVM (TEI-L) LISTSERVer (listserv@uicvm on BITNET) as EDW25 DOC, October 1, 1991. Send a command to the LISTSERVer: get edw25 doc tei-l. The document is/was available in text format from the OTA FTP server. Or read an HTML copy (dated November 1994) mirrored on the server.


Burnard, Lou; Sperberg-McQueen, C. M. "Encoding for Interchange: An Introduction to the TEI." Draft version, November 21, 1994. 36 pages. Various versions of this document are or have been available. It has carried the filename TEIU5 in a number of incarnations, but apparently began as 'TEI ED W21'. Look on the OTA FTP server or environs for the most recent version, but if nothing obvious is there, try the WWW server for a copy dated November 21, 1994.

Abstract: The purpose of this document is to provide a brief introduction to the recommendations of the Text Encoding Initiative (TEI). It shows how these recommendations may be used to encode a wide variety of commonly encountered textual features, in such a way as to maximize the usability of electronic transcriptions and to facilitate their interchange among scholars using different computer systems. This tutorial discusses the basic principles of encoding texts, and describes most of the TEI "core" tag set and most of the elements defined in the TEI "base tag set for prose". It does not address other more specialized tag sets. However, the elements and attributes described here should be adequate for the encoding of a wide variety of different kinds of material to a reasonable degree of detail. Some basic knowledge of SGML is assumed.


Cave, Francis. "Information Handling Techniques for the Office: Untangling the Standards Web." Pp. 100-110 in Information Handling Techniques for the Office: Full Text Rules OA (Office Automation)? Proceedings of the Institute of Information Scientists Text Retrieval '86 Conference (London 1986). Edited by Susan Hills. London: Taylor Graham, 1987. ISBN 0-947568-33-6. Abstract: Standards are assuming a significant role in the fields of publishing and office automation, with the introduction of some significant techniques for describing documents in electronic form. The author discusses the standards making process. Significant standards include ISO 8879 standard generalized markup language (SGML) and ISO DIS/8613 office document architecture (ODA) and interchange format. He also mentions ISO DIS/9059 SGML document interchange format (SDIF) and other standards related to SGML. He discusses the origins of SGML and then looks at some of its features, describing it as a document markup metalanguage. He discusses implementation of SGML in existing systems and future systems and its limitations.


Chamberlin, Donald Dean; Goldfarb, Charles F. "Graphic Applications of the Standard Generalized Markup Language (SGML)." Computers and Graphics 11/4 (1987) 343-358. ISSN: 0097-8493. Abstract: The Standard Generalized Markup Language (SGML) is a language for representing document structure. This paper discusses ways in which the SGML language might be used to represent graphic as well as textual contents of a document. By using SGML markup for both graphics and text, a document processing application can achieve a more uniform treatment and tighter coupling between these two types of materials.


Chamberlin, Donald Dean; Hasselmeier, Helmut F.; Paris, Dieter P. "Defining Document Styles for WYSIWYG Processing." Pages 121-137 in Document Manipulation and Typography. Proceedings of the International Conference on Electronic Publishing, Document Manipulation and Typography [Nice (France) April 20-22 1988]. Edited by J. C. van Vliet. Cambridge Series on Electronic Publishing. Cambridge: Cambridge University Press, 1988. ISBN 0-521-36294-6.

Abstract: "Recent years have shown two distinct but converging trends in document processing: the trend toward direct manipulation, or "WYSIWYG" systems, and the trend toward high-level generic markup. The Quill project at IBM Research is an attempt to combine the flexibility and ease of use of a WYSIWYG interface with the formatting power of the international standard SGML markup language. The Quill system will present a WYSIWYG user interface but will format documents under control of an external Document Design that specifies the degree of user control over document appearance. Quill includes a tool called the Designer's Workbench that enables a Document Designer to specify the syntax and semantics of a given type of document. Each element in the document type is defined by a "look" consisting of a property sheet and an optional semantic routine. The semantic routines are written in a high-level programming language and can call a set of system-provided utility functions that are designed, according to rules described in this paper, to be suitable for WYSIWYG processing." See also on Quill: Wolfsthal.


Christophides, Vassilis; Abiteboul, Serge; Cluet, Sophie; Scholl, Michel. "From Structured Documents to Novel Query Facilities." 12 pages, 28 references. INRIA, 1994. (Email contact: Vassilis.Christophides@inria.fr) [needs abstract, and xref to published version] Apparently published as pages 313-324 in SIGMOD '94. Draft version available in PostScript [UNIX compressed] via FTP from INRIA.


Christophides, Vassilis; Rizk, A. "Querying Structured Documents with Hypertext Links Using OODBMS." 13 pages. Technical document. INRIA/Euroclid, September, 1994. Submitted for publication to ACM [ECHT '94]. Draft version available in PostScript [UNIX compressed] via FTP from INRIA. [needs abstract]


Clark, James. "DSSSL Lite Specification Preliminary Draft." Network file: available via WWW client. 1994/11/24. Approximately 10 pages. Author's address: jjc@jclark.com.


Clark, James. "DSSSL Slides." Set of about 45 slides on DSSSL used in a presentation to the Norwegian SGML Users' Group, November 23, 1994. Network resource accessible via WWW client. Author's address: jjc@jclark.com.


Coombs, James H.; Renear, Allen H.; DeRose, Steven J. "Markup Systems and the Future of Scholarly Text Processing." Communications of the Association for Computing Machinery 30/11 (1987) 933-947. ISSN: 0001-0782. Cf. response in CACM 31/7 (July 1988) 810-811, cited with its authors. This seminal article is now reprinted in The Digital Word: Text-Based Computing in the Humanities, eds. George P. Landow and Paul Delany (Cambridge/London: MIT Press, 1993) 85-118. Note in the same Digital Word volume a follow-up article: Steven J. DeRose, "Markup Systems in the Present" (pages 119-135).

Abstract: The authors argue that many word processing systems distract authors from their tasks of research and composition, toward concern with typographic and other tasks. Emphasis on "WYSIWYG", while helpful for display, has ignored a more fundamental concern: representing document structure. Four main types of markup are analyzed: Punctuational (spaces, punctuation,...), presentational (layout, font choice,...), procedural (formatting commands), and descriptive (mnemonic labels for document elements). Only some ancient manuscripts have no markup. Any form of markup can be formatted for display, but descriptive markup is privileged because it reflects the underlying structure. ISO SGML is a descriptive markup standard, but most benefits are available even before a standard is widely accepted. A descriptively marked-up document is not tied to formatting or printing capabilities. It is maintainable, for the typographic realization of any type of element can be changed in a single operation, with guaranteed consistency. It can be understood even with no markup formatting software: compare "<blockquote>" to ".sk 3 a; .in +10 -10; .ls 0; .cp 2". It is relatively portable across views, applications and systems. Descriptive markup also minimizes cognitive demands: the author need only recall (or recognize in a menu) a mnemonic for the desired element, rather than also deciding how it is currently to appear, and recalling how to obtain that appearance. Most of this extra work is thrown away before final copy; descriptive markup allows authors to focus on authorship. (abstract supplied by Steve DeRose)
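The maintainability claim in this abstract (the typographic realization of an element type can be changed in a single operation) can be illustrated with a minimal sketch. It is not taken from the article: the sample document, the `render` function, and the two stylesheets are hypothetical names introduced here, and a real SGML system would work from parsed markup rather than a Python list.

```python
# Hypothetical descriptively marked-up document: each item names WHAT the
# text is (its element type), not HOW it should look.
DOCUMENT = [
    ("para", "Markup describes what a thing is."),
    ("blockquote", "Only some ancient manuscripts have no markup."),
    ("para", "Formatting is decided later."),
    ("blockquote", "Descriptive markup is privileged."),
]


def render(document, stylesheet):
    """Format every element by looking up its type in the stylesheet."""
    return "\n".join(stylesheet[kind](text) for kind, text in document)


# One formatting rule per element type (an assumed, illustrative stylesheet).
plain_style = {
    "para": lambda text: text,
    "blockquote": lambda text: "    " + text,
}

# Changing a single rule restyles every blockquote consistently;
# the document itself is untouched.
quoted_style = dict(plain_style, blockquote=lambda text: "> " + text)

if __name__ == "__main__":
    print(render(DOCUMENT, plain_style))
    print()
    print(render(DOCUMENT, quoted_style))
```

With procedural markup, by contrast, the same change would mean editing the formatting commands at each occurrence.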


Cover, Robin. "SGML: Annotated Bibliography and List of Resources." <TAG> 5/3 (March 1992) 4, [1-12]; 5/4 (April 1992) 4, [13-24]; 5/5 (May 1992) 4, [25-36]. A three-part article covering SGML bibliography and resources in 10 major categories, including SGML software available on the Internet.


Cover, Robin. "SGML Bibliography. [Appendix C.]" In Perspectives on Electronic Publishing: Standards, Solutions, and More [by Sandy Ressler]. Pages 285-320. Englewood Cliffs, NJ: PTR Prentice Hall, 1993. ISBN 0-13-287491-1.


Cover, Robin; Duncan, Nicholas; Barnard, David. "The Progress of SGML (Standard Generalized Markup Language): Extracts from a Comprehensive Bibliography." Literary and Linguistic Computing 6/3 (1991) 197-209. ISSN: 0268-1145. The article includes introductory essay sections delineating the fundamental conceptions of SGML, its broad application, and the advantages it brings to academia, industry and government sectors. Abstract: SGML (Standard Generalized Markup Language) is used to describe structured documents and related digital information in a machine and application independent way. Markup is added to a text or SGML database to describe its features rather than to specify processing instructions to be carried out. We elaborate on the need for SGML and for a bibliographic guide to SGML documents and resources. We further describe the development of a printed and electronic bibliography covering SGML and related issues in text processing. An extract of the most significant entries from the comprehensive bibliography is included.


Cover, Robin; Duncan, Nicholas; Barnard, David. Bibliography on SGML (Standard Generalized Markup Language) and Related Issues. Technical Report 91-299. Queen's University, Kingston, Ontario. February, 1991. ISSN 0836-0227. 312 pages. A revised print version of a bibliographic and information database (compiled by Robin Cover), structured in SGML-database and formatted with SGML ->> BibTeX utilities developed at Queen's University by Nick Duncan and David Barnard. For print copies, contact: (1) Department of Computing and Information Science; Queen's University; Kingston, Ontario, Canada K7L 3N6; TEL: (613) 545-6056; Email (Internet): heather@qucis.queensu.ca, or (2) the Graphic Communications Association. The printed version of the database contains a "Short Bibliography" of 67 essential references, and a fuller "Main Bibliography" with 1403 citations (many with abstracts). The second major section is an SGML Directory for some 117 SGML-supporting groups in academia, government, or industry: each entry supplies addresses, descriptions of software products or SGML services, and references. The Table of Contents may be seen here.


Cowan, D. D.; Mackie, E. W.; Pianosi, G. M.; Smit, G de V. "Rita -- an Editor and User Interface for Manipulating Structured Documents." Electronic Publishing: Origination, Dissemination and Design (EPOdd) 4/3 (September 1991) 125-150. ISSN: 0894-3982. Received 14-March-1991, Revised 16-November-1991. 50 references. Authors' affiliation: [Cowan, Mackie, Pianosi] Department of Computer Science and Computer Systems Group, University of Waterloo, Waterloo, Ontario, Canada; [Smit] Department of Computer Science, University of Cape Town.

Abstract: "Structured documents such as those developed for SGML, GML, or LaTeX usually contain a combination of text and tags. Since various types of documents require tags with different placement, the creator of a document must learn and retain a large amount of knowledge. Rita consists of an editor and user interface which are controlled by a grammar or description of a document type and its tags, and which guide the user in preparing a document, thus avoiding the problems of tags being used or placed incorrectly. The user interface contains a display which is almost WYSIWYG so that the appearance of the document can be examined while it is being prepared. This paper describes Rita, its user interface and some of its internal structure and algorithms, and relates anecdotal user experiences. Comparisons are also made with other commercial and experimental systems."


Cruz, Gil C.; Judd, Thomas J. "The Role of a Descriptive Markup Language in the Creation of Interactive Multimedia Documents for Customized Electronic Delivery." Pages 249-262 in Electronic Publishing '90: Proceedings of the International Conference on Electronic Publishing, Document Manipulation and Typography (Gaithersburg, Maryland, September 1990). The Cambridge Series on Electronic Publishing. Cambridge: Cambridge University Press, 1990.


Davidson, W. J. "SGML Authoring Tools for Technical Communication." Technical Communication: Journal of the Society for Technical Communication 40/3 (Third Quarter, August 1993) 403-409. ISSN: 0049-3155. Author affiliation: SoftQuad, Inc. [needs abstract]


de la Beaujardière, Jean-Marie. "Well-Established Document Interchange Formats." Pp. 83-94 in Document Manipulation and Typography. Proceedings of the International Conference on Electronic Publishing, Document Manipulation and Typography (Nice (France) April 20-22 1988). Edited by J. C. van Vliet. Cambridge Series on Electronic Publishing. Cambridge: Cambridge University Press, 1988. ISBN 0-521-36294-6. Abstract: A number of standards exist to facilitate the interchange of electronic documents between different computer systems. Though they each have their place and have proved very useful, none has reached the level of generality and completeness required to ensure an exchange of information without loss, especially in round-trip interchange. This paper takes a look at four well-known interchange formats (DIF, DCA, SGML, ODA) and compares the ways they handle the same sample document.


DeRose, Steven J. "Markup Systems in the Present." Pages 119-135 in The Digital Word: Text-Based Computing in the Humanities, edited by George P. Landow and Paul Delany. Cambridge/London: MIT Press, 1993. [need abstract]


DeRose, Steven J., and David G. Durand. Making Hypermedia Work: A User's Guide to HyTime. Boston/Dordrecht/London: Kluwer Academic Publishers, 1994. ISBN: 0-7923-9432-1. HyTime is an ISO standard (ISO/IEC 10744) and is an application of SGML for hypermedia and other time-based documents. See the bibliography entry for the HyTime standard. A very readable introduction to SGML is provided in Chapter 3 of Making Hypermedia Work (pages 35-61), "Overview of SGML." Review of the book: Dale Waldt, "Is it high time for HyTime?" <TAG> 7/9 (September 1994) 1, 9-10. [reviewed also Harry Gaylord?] See the online file with the Foreword, Preface, and Table of Contents here. It is also available via FTP from world.std.com.


DeRose, Steven J.; Durand, David G.; Mylonas, Elli; Renear, Allen H. "What is Text, Really?" Journal of Computing in Higher Education 1/2 (Winter 1990) 3-26. ISSN: 1042-1726. Abstract: "The way in which text is represented on a computer affects the kinds of uses to which it can be put by its creator and by subsequent users. The electronic document model currently in use is impoverished and restrictive. The authors agree that text is best represented as an ordered hierarchy of content object(s) (OHCO), because that is what text really is. This model conforms with emerging standards such as SGML and contains within it advantages for the writer, publisher, and researcher. The authors then describe how the hierarchical model can allow future use and reuse of the document as a database, hypertext or network."
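The "ordered hierarchy of content objects" (OHCO) model named in this abstract can be sketched as a small tree structure. This is an illustrative assumption, not the authors' formalism: the `Element` class, the helper functions, and the sample document below are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Element:
    """A content object: a named node whose children are kept in order."""
    name: str
    children: List[Union["Element", str]] = field(default_factory=list)


def text_of(node: Union[Element, str]) -> str:
    """Recover the running text by walking the hierarchy in document order."""
    if isinstance(node, str):
        return node
    return "".join(text_of(child) for child in node.children)


def find_all(node: Union[Element, str], name: str) -> List[Element]:
    """Collect every content object of a given type, in document order."""
    if isinstance(node, str):
        return []
    found = [node] if node.name == name else []
    for child in node.children:
        found.extend(find_all(child, name))
    return found


# Hypothetical sample document: a chapter with a title and two paragraphs.
chapter = Element("chapter", [
    Element("title", ["What is Text?"]),
    Element("para", ["Text is ", Element("emph", ["ordered"]), "."]),
    Element("para", ["Text is hierarchical."]),
])

if __name__ == "__main__":
    print(text_of(chapter))
    print([text_of(p) for p in find_all(chapter, "para")])
```

Because every object has a type and a position, the same tree can be read linearly as text or queried like a database, which is the kind of reuse the abstract describes.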


Donovan, Truly. An Introduction to SGML-based Publishing. Charles F. Goldfarb Series On Open Information Management. Englewood Cliffs, NJ: PTR Prentice Hall, [forthcoming Fall] 1995. ISBN: 0-13-216243-1. Author's address: truly@lunemere.com.

Summary: This is a technical introduction to SGML. It is aimed at those people who need a conceptual understanding of SGML and its applications in sufficient detail to begin working effectively with SGML tools and technologies, to make judgments about introducing the technology into their environment, and to direct the activities of others who are engaged in design and implementation. After reading this book, programmers will be prepared to work through the complexities of SGML as presented in Goldfarb's classic The SGML Handbook. [publisher's pre-publication description]


Duke, John K. "Slow Revolution: The Electronic AACR2." Library Resources and Technical Services 38/2 (1994) 190-194. 5 references. Author affiliation: Assistant Director, Network and Technical Services, Virginia Commonwealth University, Richmond, Virginia. [Describes the process of digitizing the AACR2 (Anglo-American Cataloging Rules, 2nd edition) and structuring it with SGML encoding. The project illustrates some of the difficulties in writing/using a DTD for a complex existing document that both utilizes and specifies particular print formatting (punctuation, spacing) as a means of encoding information. (rcc)]

Abstract: Attempts to encode AACR2 thus far have met with only moderate success. Three principal alternatives exist for converting AACR2: (1) create a simple ASCII file, (2) create a MARC format for AACR, and (3) use the Standard Generalized Markup Language (SGML). It was decided by the AACR2 principals to use SGML to construct the machine-readable version. It was further decided that the file would not be issued to end users as a finished product, but rather as a source file to other developers who would be responsible for adding display and search software and, if desired, for integrating other cataloging-related products with it. In producing AACR2-e, we have tried to remain true to several fundamental principles: (1) preserve the integrity of the text; (2) support the revision process without laborious rekeying and proofing; (3) reflect the printed text accurately as regards physical layout; (4) produce future printed versions; (5) have a database structure for online versions with connections to related products; and (6) preserve sequences of rules, yet also permit linkages among related rules. An inordinate amount of time has been spent on developing the Document Type Definition, and difficulty has been encountered in the conversion of the file to SGML. The potential to make catalogers think in new ways about how to structure the cataloging code for more efficient use is the greatest contribution that the electronic AACR2 will make.


Ellison, Paul A. "SGML and Related Information Standards." Pp. 17-28 (1-12) in Document Exchange: The Use of SGML in the UK Academic and Research Community. Workshop Proceedings 5-7 March 1990 (see Mumford below). Abstract: "This paper explains the position of four ISO 'standards' (only one agreed standard, one draft standard and two draft proposals) in the area of text and office information processing. Those standards are SGML (Standard Generalized Markup Language), the 'Fonts' standard (Font Architecture and Interchange Format), DSSSL (Document Style Semantics and Specification Language), and SPDL (Standard Page Description Language). . . In addition, the paper relates these standards to ODA (Office Document Architecture) and places SGML and ODA in their own contexts."


Elsevier Science, B.V. Documentation of the Elsevier Science Article DTD, Version 1.1.0. By N.A.F.M. Poppelier, H. van der Togt, and F.K. Veldmeijer. Amsterdam: Elsevier Science, April 25, 1994. 75 pages. Available by FTP from Elsevier. See further in the main Elsevier entry.


Ensign, Chet. Managing Information in a World of Change: SGML in Action. Charles F. Goldfarb Series On Open Information Management. Englewood Cliffs, NJ: PTR Prentice Hall, [forthcoming Fall] 1995.

Summary: This is a series of detailed case studies of companies that have successfully deployed SGML to solve common information problems like document assembly lines that won't work because the pieces won't fit together, or products that can't get to market because an army of writers can't keep up with the changes, or support costs that are going through the roof because the published information is too complicated for the buyers to read. The goal of this book is to provide technology decision makers with the information they need to answer the question: "Why can't we just buy everyone a copy of Word for Windows?" The book answers this question, not with rational arguments, but with stories straight from the companies that have already been there.


Ensign, Chet. "SGML by Evolution." Technical Communication: Journal of the Society for Technical Communication 40/3 (Third Quarter, August 1993) 387-393. ISSN: 0049-3155. Author affiliation: Information Builders, Inc. [SGML case study; need abstract]


Exoterica Corporation. The Compleat SGML. CD-ROM. August, 1993. Price: 95.00 dollars US. This hypertext tool for Microsoft Windows (using an Asymetrix Toolkit) links the full online text of the ISO8879:1986 SGML standard with 2348 SGML test documents. Along with this stack there is a set of extremely large files (roughly 10MB each) provided for testing purposes. They are fairly complex documents with large DTD's. The SGML documents are created in accordance with ANSI's Conformance Testing for SGML Systems 5 standard and serve to provide detailed illustration of the points being made in the standard. The tool also includes annotations that clarify some of the more esoteric areas of the standard. It contains the standard in hypertext form, along with Exoterica annotations and links to a viewable version of Exoterica's ISO8879 Conformance Test Suite. The tests are helpful as examples of the effect of particular clauses. The Compleat SGML is not intended to be a learning tool for SGML beginners, but those studying the standard should find it a very useful complement to any paper version.


Exoterica Corporation. Content Model Algebra. Technical Paper ECM11-1091. Release 2. Ottawa: Exoterica Corporation, 6-June-1991. v + 18 pages. "Anyone desiring to have a full understanding of SGML content models and theory behind the construction of text markup languages should have some familiarity with the basic concepts of Automata Theory and Set Theory on which content models are based... This report [is] an outline of some of the concepts of Automata Theory and Set Theory relevant to creating text markup languages." [from the paper's Introduction] The document is available (free) from: Exoterica Corporation; 1545 Carling Avenue, Suite 404; Ottawa, Ontario; CANADA K1Z 8P9; TEL: +1 (613) 722-1700; TEL: +1 (800) 565-9465 [US and Canada]; FAX: +1 (613) 722-5706. Email: info@exoterica.com


Exoterica Corporation. Exoterica Complex Tables. Technical Report EUM09-0291-2. Release 2.0. Ottawa: Exoterica Corporation, February 18, 1991. iv + 28 pages. The document is available (free) from: Exoterica Corporation; 1545 Carling Avenue, Suite 404; Ottawa, Ontario; CANADA K1Z 8P9; TEL: +1 (613) 722-1700; TEL: +1 (800) 565-9465 [US and Canada]; FAX: +1 (613) 722-5706. Email: info@exoterica.com


Exoterica Corporation. Record Boundary Processing in SGML. Technical Paper ETR13-1092. Release 2. Ottawa: Exoterica Corporation, 22-October-1992. iv + 7 pages. The document is available (free) from: Exoterica Corporation; 1545 Carling Avenue, Suite 404; Ottawa, Ontario; CANADA K1Z 8P9; TEL: +1 (613) 722-1700; TEL: +1 (800) 565-9465 [US and Canada]; FAX: +1 (613) 722-5706. Email: info@exoterica.com


Exoterica Corporation. Understanding the SGML Declaration. Release 2.2. Ottawa: Exoterica Corporation, October 23, 1992. Technical Report ECM03-1092. iv + 40 pages. This (valuable!) booklet explains the parts and syntax of an SGML declaration, and will reliably guide a novice user through the task of modifying the concrete syntax (for an SGML application which permits such modification). It includes a helpful section on "Character Sets in the SGML Declaration," and a full "Template of the SGML Declaration" for use as a 'quick reference' when writing or modifying a declaration. The document is available (free) from: Exoterica Corporation; 1545 Carling Avenue, Suite 404; Ottawa, Ontario; CANADA K1Z 8P9; TEL: +1 (613) 722-1700; TEL: +1 (800) 565-9465 [US and Canada]; FAX: +1 (613) 722-5706. Email: info@exoterica.com. See the SGML Declaration main entry for other help.


Fahmy, Eanass; Barnard, David T. "Adding Hypertext Links to an Archive of Documents." Canadian Journal of Information Science 15/3 (September 1990) 26-41. Abstract: Texts are characterized by various types of linkages, within themselves and with other documents, which may be either explicit or implicit. When texts are available in machine-readable form, the ability to trace linkages should become much easier, and more complex tracing of linkages should be possible. Hypertext is an electronic document paradigm whose distinguishing feature is machine support for the building and tracing of intra- and inter-document links; a document is viewed as a collection of nodes connected by directed links. A limitation of many hypertext systems is that all links must be created explicitly by the user. This is impractical in many situations, and it is unnecessary if the link structure is inherent in the documents themselves. The work described in our paper is motivated by the perceived need to extend the hypertext paradigm so that links can be derived from a collection of documents. We explore how a rich set of links connecting documents in a text archive can be programmatically generated, and present a set of link types that are useful, specifiable and computable. The documents in the archive are encoded using the Standard Generalized Markup Language, which views a document as a hierarchical organization of document elements. The archive, therefore, consists of a forest of document trees.


Fawcett, Heather J. Adopting SGML: The Implications for Writers. Technical Report OED-89-03. Waterloo, Ontario: University of Waterloo Centre for the New Oxford English Dictionary, 1989. Abstract: SGML, the Standard Generalized Markup Language, separates the content of a document from its format. SGML documents contain tags that describe what a text component is rather than how it should be formatted. The absence of device-dependent formatting codes means documents can be transferred across systems and formatted in various ways. The presence of tags allows for selective searching, editing and viewing of the text. However, determining what text components should be tagged can be difficult since text can be classified in various ways, depending on how the document will be used.


Fawcett, Heather. "The New Oxford English Dictionary Project." Technical Communication: Journal of the Society for Technical Communication 40/3 (Third Quarter, August 1993) 379-382. ISSN: 0049-3155. Author affiliation: Information Design Solutions.


Flynn, Peter. "TeX and SGML: A Recipe for Disaster?" TUGboat 14/3 (1993) [Proceedings of the 1993 Annual Meeting] 227-230. 6 references. The text is/was online at Curia. Abstract: The relationship between TeX and SGML (Standard Generalized Markup Language, ISO 8879) has always been uneasy, with adherents to one system or the other displaying symptoms reminiscent of the religious wars popular between devotees of TeX and other word processors. SGML and TeX can in fact coexist successfully, provided features of one system are not expected of the other. This paper presents a pilot program to test one method of achieving such a cohabitation. Author affiliation: University College, Cork, Ireland; email: pflynn@curia.ucc.ie.


Fought, John; Wesler, Marcia; Davenport, Heather; Van Ess-Dykema, Carol. "Extending SGML Concurrent Structures: Toward Computer-Readable Meta-Dictionaries." Literary and Linguistic Computing 8/1 (1993) 33-38. 8 references. ISSN: 0268-1145. Authors' affiliation: [Fought, Wesler, Davenport] University of Pennsylvania; [Van Ess-Dykema] US Department of Defense. Correspondence address: jjohn@apollo.lab upenn.edu [John Fought, Director, Language Analysis Center, University of Pennsylvania].

Abstract: We propose the use of SGML 'concurrent structures' to create and tag the structure of an idealized or virtual document to be mapped onto the tagged structures from actual print dictionaries. The idealized structure is to be defined by a simplified document type definition [SGML DTD]; the elements of actual print dictionary entries will be rearranged to fit into the resulting template. We use a system of index numbers to link the elements of the generalized entries with their sources in the entries of the actual documents. We illustrate this technique by using it to merge elements from a number of different dictionaries into a generalized entry structure.


Gaspart, Jean-Pierre. "Use of the SGML Parser at the Office for Official Publications of the European Communities (OPOCE)." SGML Users' Group Bulletin 2/1 (1987) 29-36.


Gaylord, Harry E. "Character Representation." Groningen: Vakgroep Alfa Informatica, Rijksuniversiteit Groningen, 1994. 19 pages. Draft version of a TEI-related paper [June 24, 1994], to be published in revised format in CHUM. Author address: galiard@let.rug.nl. Available via FTP from ftp.let.rug.nl, or as a mirror copy on the server. The Netherlands FTP site also stores a copy of the paper in TeX format.


Gennusa, Pamela L. "Advantages of an SGML Implementation for Management of an Electronic Text Database." SGML Users' Group Bulletin 2/2 (1987) 73-86. "This paper focuses on the use of the Standard Generalized Markup Language (SGML) as a tool that goes beyond a neutral markup language. It investigates the possibility of using specific SGML features and conventions to help suppliers to manage and manipulate their technical data as a text database. The Standard Generalized Markup Language, SGML, is a meta-language that can be tailored to an application and that is valid regardless of the processing methods used. The language can be used to describe a text stream at varying levels of complexity, based on the needs of the application. Although SGML can be used in the financial, legal, and office publishing fields, its flexibility makes it of particular value to technical publishing." (Based upon a paper presented at MarkUp '86, Luxembourg.)


Gennusa, Pamela L. Negotiating for Electronic Documentation: Acquisition, Delivery, & Systems. Charles F. Goldfarb Series On Open Information Management. ISBN: 0-13-190372-1. Englewood Cliffs, NJ: PTR Prentice Hall, [forthcoming Fall] 1995.

Summary: SGML is rapidly emerging as the required format for information across many major industries. All contractors and subcontractors to the US Department of Defense and all contractors and subcontractors to the major airline and automobile manufacturers must deliver documentation electronically in SGML. Over the years, these companies have evolved elaborate procedures and conventions for the delivery and procurement of documentation on paper. Gennusa describes the new systems, procedures, contractual instruments, and other conventions that the delivery and procurement of information in electronic form calls for. [publisher's pre-publication description]


Gilmore, Elizabeth. "Introducing Today's SGML." Technical Communication: Journal of the Society for Technical Communication 40/2 (Second Quarter, May 1993) 210-218. ISSN: 0049-3155. Author affiliation: Passage Systems.


Glushko, Robert J.; Kershner, Ken. "Silicon Graphics' IRIS InSight: An SGML Success Story." Technical Communication: Journal of the Society for Technical Communication 40/3 (Third Quarter, August 1993) 394-402. ISSN: 0049-3155. Authors' affiliation: [Glushko] Passage Systems, Inc.; [Kershner] Silicon Graphics, Inc. [SGML case study; need abstract]


Goldfarb, Charles F. "A Generalized Approach to Document Markup." Proceedings of the ACM SIGPLAN SIGOA Symposium on Text Manipulation. = SIGPLAN Notices 16/6 (1981) 68-73. Conference proceedings containing this paper also available as SIGOA Newsletter 2/1-2 (Spring/Summer 1981).


Goldfarb, Charles F. The SGML Handbook. Edited and with a foreword by Yuri Rubinsky. Oxford: Oxford University Press, 1990. ISBN: 0-19-853737-1. 688 pages. This volume contains the full annotated text of ISO 8879 (with amendments), authored by IBM Senior Systems Analyst and acknowledged "father of SGML," Charles Goldfarb. The book was itself produced from SGML input using a DTD which is a variation of the "ISO.general" sample DTD included in the annexes to ISO 8879. The SGML Handbook includes: (1) the up-to-date amended full text of ISO 8879, extensively annotated, cross-referenced, and indexed (2) a detailed structured overview of SGML, covering every concept (3) additional tutorial and reference material (4) a unique "push-button access system" that provides paper hypertext links between the standard, annotations, overview, and tutorials. See a detailed Table of Contents listing for further description.


Gonnet, Gaston. "Examples of PAT Applied to the Oxford English Dictionary." Technical Report OED-87-02. University of Waterloo Centre for the New Oxford English Dictionary. July, 1987. PAT and associated text processing tools are built around descriptively-marked text, even if not specifically SGML text. Compare also "PAT, GOEDEL, LECTOR and More: Text-dominated Database Software," pp. 83-84 in: Tools for Humanists, 1989. A Guidebook to the Software and Hardware Fair Held in Conjunction with the Dynamic Text 6-9 June 1989 Toronto. Toronto, Ontario: Centre for Computing in the Humanities, 1989. The article describes several software tools developed at the Waterloo Centre, including TRUC (an editor for SGML or SGML-style tagged text). TRUC supports multiple views of a tagged document, based upon use of style-sheets.

The University of Waterloo has pioneered several important research efforts in the study of machine-readable lexical databases, machine transduction and generation of descriptively marked-up electronic texts (SGML-style markup). The Centre has also developed software to search, interactively display and format text structured with descriptive markup. These tools were developed for the New Oxford English Dictionary Project with the long range goal of application to other texts. A Newsletter is issued by the Centre describing ongoing research, publications, software enhancements, work of visiting scholars, conferences and other events. Persons interested in the Centre's research and publications may write for a current document list (e.g., especially the several publications and technical reports by Darrell R. Raymond, Donna L. Berg, Gaston H. Gonnet, Timothy J. Benbow, Heather J. Fawcett, Rick Kazman, Frank Wm. Tompa, George V. J. Townsend). See Gonnet, Raymond and Tompa in this bibliography. Address: Electronic Text Research; Centre for the New Oxford English Dictionary; Davis Centre; University of Waterloo; Waterloo, Ontario; Canada N2L 3G1 TEL: (1 519) 885-1211 extension 6183; Email (Internet): newoed@waterloo.edu.

The PAT and LECTOR tools are now supported commercially by Open Text Systems, Inc., a spin-off company working closely with the University of Waterloo Centre for the New Oxford English Dictionary and Text Research. Open Text Systems was "established to market, develop and customize the text management software created at the (NOED) Centre." The company began operations in December, 1989, and supports the Transduction Toolkit, PAT (text search system), GOEDEL (database management system), LECTOR (text display system) and TRUC software developed at the University of Waterloo Centre. The supported software was designed around and tested on one of the largest and most complex lexical databases, the Oxford English Dictionary, Second Edition. For further description, see (1) Steve Higgins, "Open Text Adds Automatic Indexing to Document Management Software," PC Week 7/32 (August 13, 1990) 38; (2) UW Centre for the NOED Newsletter 22 (December, 1989) 1-2; (3) Dale Waldt, "OpenText Search and Retrieval Tools," <TAG> 5/1 (January 1991) 9. The group may be reached at: Open Text Systems, Inc., Unit 622, Waterloo Town Square, Waterloo, Ontario, CANADA N2J 1P2; Tel: (519) 746-8288; FAX: (519) 746-3255; Email (Internet): tbray@watsol.waterloo.edu (Tim Bray).


Gonnet, Gaston; Tompa, Frank W. "Mind Your Grammar: A New Approach to Modelling Text." Technical Report OED-87-01. University of Waterloo Centre for the New Oxford English Dictionary. February, 1987. (Cf. Gaston H. Gonnet and Frank Wm. Tompa, "Mind Your Grammar: A New Approach to Modeling Text," pp. 339-346 in the Proceedings of the 13th International Conference on Very Large Data Bases (VLDB87), Brighton, England (Sept. 1-4, 1987)).

Abstract: Beginning to create the New Oxford English Dictionary database has resulted in the realization that databases for reference texts are unlike those for conventional enterprises. While the traditional approaches to database design and development are sound, the particular techniques used for commercial databases have been repeatedly found to be inappropriate for text-dominated databases, such as the New OED. In the same way that the relational model was developed based on experiences gained from earlier database approaches, the grammar-based model presented here builds on the traditional foundations of computer science, and particularly database theory and practice. This new model uses grammars as schemas and "parsed strings" as instances. Operators on the parsed strings are defined, resulting in a "p-string algebra" that can be used for manipulation and view definition. The model is representation-independent and the operators are non-navigational, so that efficient implementations may be developed for unknown future hardware and operating systems. Several approaches to storage structures and efficient processing algorithms for representative hardware configurations have been investigated.
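The central idea of the abstract, a grammar serving as the schema and a "parsed string" serving as the instance, can be illustrated with a small sketch. The toy grammar, the class name PString, and the operator names conforms and select are hypothetical illustrations and are not taken from the report; the sketch does not reproduce the authors' p-string algebra, only the general shape of a non-navigational operator over parsed text.

```python
from __future__ import annotations
from dataclasses import dataclass, field


@dataclass
class PString:
    """A parsed string: a grammar symbol, the text it derives, and its sub-parses."""
    symbol: str
    text: str
    children: list[PString] = field(default_factory=list)


# Hypothetical toy grammar used as the schema:
# each symbol maps to the child symbols it may contain.
GRAMMAR = {
    "entry": ["headword", "sense"],
    "sense": ["definition", "quotation"],
    "headword": [],
    "definition": [],
    "quotation": [],
}


def conforms(p: PString, grammar: dict[str, list[str]] = GRAMMAR) -> bool:
    """Check that an instance respects the schema (the grammar)."""
    allowed = grammar.get(p.symbol)
    if allowed is None:
        return False
    return all(c.symbol in allowed and conforms(c, grammar) for c in p.children)


def select(p: PString, symbol: str) -> list[PString]:
    """Return every sub-p-string labelled `symbol`, in document order.

    The caller names what is wanted, not the path to reach it,
    which is the sense of 'non-navigational' assumed here.
    """
    found = [p] if p.symbol == symbol else []
    for child in p.children:
        found.extend(select(child, symbol))
    return found


# A made-up dictionary entry used only as sample data.
entry = PString("entry", "markup n. tags added to a text", [
    PString("headword", "markup"),
    PString("sense", "tags added to a text", [
        PString("definition", "tags added to a text"),
        PString("quotation", "1981 descriptive markup"),
    ]),
])

quotations = [q.text for q in select(entry, "quotation")]
```

Because select works on symbols rather than storage positions, the same query would remain valid if the underlying representation of the text changed.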


Goossens, Michel; Herwijnen, Eric van. "Scientific Text Processing." 68 pages. Accepted (January 1992) for publication in International Journal of Modern Physics. C, Physics and Computers. ISSN: 0129-1831. Abstract: Aspects of text processing important for the scientific community are discussed, and an overview of currently available software is presented. Progress on standardization efforts in the area of document exchange (SGML), document formatting (DSSSL), document presentation (SPDL), fonts (ISO 9541) and character codes (Unicode and ISO 10646) is described. An elementary particle naming scheme for use with LATEX and SGML is proposed. LATEX, PostScript, SGML and desk-top publishing allow electronic submission of articles to publishers, and printing on demand. Advantages of standardization are illustrated by the description of a system which can exchange documents between different word processors and automatically extract bibliographic data for a library database. See also (provisionally): Michel Goossens et Eric van Herwijnen, "Introduction à SGML, DSSSL et SPDL," Cahiers GUTenberg 12 (décembre 1991) 37-70.


Gordon, Thomas. The QWERTZ SGML Document Types: Version 1.1 Reference Manual. Arbeitspapiere der GMD, Nr. 588. ISSN: 0723-0508. Sankt Augustin: Gesellschaft für Mathematik und Datenverarbeitung mbH, [November] 1991. 59 pages.


Graf, John M. "Ambiguity in the Instance." <TAG> 7 (1988) 6-9. Observes that "a document created under the exact rules of a valid DTD may very well be invalid when passed through an instance parser." See the response of John McFadden and Sam Wilmott, "Ambiguity in the Instance: An Analysis" in <TAG> 9 (March/April 1989) 3-5.


Graphic Communications Association. The SGML Source Guide. The Graphic Communications Association's Guide to Standard Generalized Markup Language (SGML) Systems, Software, Service, Consultants, Seminars and Resources. Edited by Marion Elledge. Graphic Communications Association, February, 1991. [and later editions] 6" x 8". 105 pages. ISBN: 0-93505-13-2. Several SGML-related standards documents distributed by GCA are listed and annotated in this Guide. Listings of SGML suppliers are in alphabetical order and provide information on the type of business, name and description of products or services, and prices. The Guide is issued on a subscription basis in looseleaf format; updates are issued quarterly or as information is accumulated.


Greenstein, Daniel I. (editor). Modelling Historical Data: Towards a Standard for Encoding and Exchanging Machine-Readable Texts. Halbgraue Reihe zur Historischen Fachinformatik, Serie A, Historische Quellenkunden (edited by Manfred Thaller). Band 11. Published for the Max-Planck-Institut für Geschichte, by Scripta Mercaturae Verlag (St. Katharinen), 1991. iv + 223 pages. ISBN: 3-928134-45-0. A collection of fourteen essays on various aspects of conceptual modelling and development of standardized encoding methods for representing knowledge in historical texts. The contributions are by Manfred Thaller, Lou Burnard, Daniel I. Greenstein, Hannes D. Galter, Ingo H. Kropač, Donald A. Spaeth, Hans Jørgen Marker, Thomas Werner, Jan Oldervoll, and Kevin Schurer. The essays reflect interaction with and critique of encoding methods which emerged from the TEI phase I efforts as documented in TEI-P1; see the TEI entry and its pointers to the UICVM LISTSERVer where early TEI research documents are archived.


Grootenhuis, Jan. "Disambiguation of SGML Document Models." <TAG> 12 (December 1989) 11-12. Contact: Jan Grootenhuis, c/o CIRCE, Kralenbeek 1873, 1104 KJ Amsterdam, THE NETHERLANDS; tel: +31 20 998966.


Gross, Mark. "Getting Your Data into SGML." Technical Communication: Journal of the Society for Technical Communication 40/2 (Second Quarter, May 1993) 219-225. ISSN: 0049-3155. Author affiliation: Data Conversion Laboratory.


Guittet, Christian (ed.) FORMEX: formalisation de l'échange de publications électronique = Formalised Exchange of Electronic Publications. Luxembourg: Office des Publications officielles des Communautés européennes, New Technologies -- Project Management Department, 1985. Copyright ECSC, EEC and EAEC. ISBN: 92-825-5399-X. The volume contains an introduction to SGML and implementation of the standard for electronic interchange of CEC and OPOCE documents. FORMEX unified two different approaches to text interchange: (1) Common Communication Format (CCF), PGI-84/WS/4, Paris: UNESCO, 1984, itself based upon ISO 2709 Format for Bibliographic Information Interchange on Magnetic Tapes, and (2) ISO/DIS 8879-1985 (SGML). Extended sections of the book are reprinted in an appendix to Ann M. Western's report "SGML in Europe -- Autumn 1985," SGML Users' Group Bulletin 1/1 (1986) 25-57. See also Christian Guittet, "FORMEX: une mise en pratique des normes internationales," SGML Users' Group Bulletin 1/2 (1986) 95-101 and "FORMEX Special Interest Group," SGML Users' Group Newsletter 11 (January 1989) 3.


Guittet, Christian. "Appendix -- Introduction to SGML. Extract from FORMEX. Published by the EEC Office of Official Publications." SGML Users' Group Bulletin 1/1 (1986) 26-57.


Hajagos, Lani. "Documents and SGML." UNIX Review 11/3 (1994) 39-41. Author affiliation: Frame Technology.


Hansen, B. S. "A function-based formatting model." Electronic Publishing: Origination, Dissemination and Design 3/1 (February 1990) 3-28. (22) references. ISSN: 0894-3982. Author affiliation: Department of Computer Science, Technical University of Denmark, Lyngby, Denmark. Abstract: Concerns a document processing model accounting for aspects of an activity which is usually called formatting. The core of the model, an experimental formatting language called FFL, is the central topic. FFL is a purely functional language in the style of FP and the applicative part of APL. Sequences, characters, and so-called boxes constitute the data types and among the built-in primitives are functions for aligning/spacing, breaking etc. Emphasis is put on presenting the language and exemplifying its use. Also considered are issues in type checking of formatting function definitions and techniques for doing incremental formatting with FFL formatting functions. FFL is currently being implemented by the BENEDICK project group led by the author.


Hayter, Ron. "Comments on 'On Improving SGML'." Technical Bulletin 4. Software Exoterica Corporation, 1988. Hayter argues that Kaelbling's "improvements" to SGML (see reference to Kaelbling) are based upon a misunderstanding of the intent of the standard. Kaelbling's original draft known to Hayter was apparently 16-March-1988; Kaelbling's revised draft of 18-October-1988 responds to Hayter's comments.


Heath, Jim; Welsch, Larry. "Difficulties in Parsing SGML." In Proceedings of the ACM Conference on Document Processing Systems, Santa Fe (5-9 December 1988). Pages 71-77. New York: Association for Computing Machinery, 1988. See similarly, by the same authors, "Difficulties in Parsing: Suggestions to Improve SGML," <TAG> 10 (July 1989) 8-10.

Abstract: A frequently cited problem with the Standard Generalized Markup Language (SGML) is that applications using the standard have been slow in arriving. Part of this delay is because of the instability of the standard and part because of constructs of the language that are functionally redundant and/or add unnecessary complexity to both machine and human processing. This paper is based on our experience implementing an SGML parser using commonly available tools for building programming language translators. It describes the problems we encountered and suggests modifications to SGML to eliminate those problems. The modified language can be implemented using well tested tools and will be more stable and more amenable to both computer and human processing while maintaining all of the fundamental strengths of SGML.


Herwijnen, Eric van. Practical SGML. 2nd edition. Boston/Dordrecht/London: Kluwer Academic Publishers, 1994. ISBN: 0-7923-9434-8. xx + 288 pages. [First edition: Dordrecht/Hingham, MA: Wolters Kluwer Academic Publishers, 1990. 200 pages. ISBN: 0-7923-0635-X.] Reviews of the first edition: (1) by Carol Van Ess-Dykema in Computational Linguistics 17/1 (March 1991) 110-116, (2) by Deborah A. Lapeyre in <TAG> 16 (October 1990) 12-14, and (3) by Nico Poppelier in TUGboat 13/2 (July 1992) 184-185. See an excerpt from the first edition Preface for an overview by the author. Reviews of the second edition: (1) by Nico Poppelier, (2) by Harry Gaylord: see "Review: Eric van Herwijnen, Practical SGML Second Edition", also mirrored as a copy on the server. An online text file with Foreword, Preface, and Table of Contents for the second edition of the book is available here, or via FTP from world.std.com.

An electronic version of Practical SGML was used to produce an online tutorial of SGML. The SGML tutorial is now published by Electronic Book Technologies, and is readable under a runtime version of the DynaText SGML browser. Contact Electronic Book Technologies (EBT) or the GCA regarding this SGML Tutorial: Graphic Communications Association, 100 Daingerfield Rd., Alexandria, VA 22314. Tel 1.703.519.8160, and FAX 1.703.548.2867.


Hockey, Susan. "Developing Access to Electronic Texts in the Humanities." Computers in Libraries 13/3 (February 1993) 41-43. [Published] Abstract: The Center for Electronic Texts in the Humanities (CETH) was established in 1991 by Rutgers and Princeton Universities to provide a national focus for those who are involved in the creation, dissemination, and use of electronic texts and resources in the humanities. These resources may be literary works, historical documents, manuscripts, papyri, inscriptions, transcriptions of spoken texts, or dictionaries, and they may be written in any natural language. Electronic texts become much more useful when additional information, such as author, title, chapter, or features such as quotations and proper names are marked in some way. There are at least thirty different methods of encoding such features, but a new common format developed by the Text Encoding Initiative (TEI SGML) is emerging. A further issue to be addressed is that many existing texts also suffer from inadequate documentation and unclear copyright situations.


Hockey, Susan. "Text Encoding Initiative and SGML." In Seminar on Cataloging Digital Documents (1994: University of Virginia Library and Library of Congress). Proceedings of the Seminar on Cataloging Digital Documents, October 12-14, 1994. Computer file. University of Virginia Library, Charlottesville, and the Library of Congress. October 13, 1994. The document is available via the LC WWW server, or as a copy mirrored on the Server. See a summary of the LC seminar by Sarah E. Thomas (Director of Cataloging). Internet access to the proceedings volume's Table of Contents is in the document http://lcweb.loc.gov/catdir/semdigdocs/seminar.html.


Hockey, Susan; Walker, Donald. "Developing Effective Resources for Research on Texts: Collecting Texts, Tagging Texts, Cataloging Texts, Using Texts, and Putting Texts in Context." Literary and Linguistic Computing 8/4 (1993) 235-242. [Describes TEI-SGML.] Abstract: Although the value of corpus-based research has been recognized since the compilation of the Brown and LOB corpora in the 1960s, the overall picture today is still one of access to texts provided in many different ways, some of which are ad hoc and dependent upon individuals. Attention has thus turned to the need for reusable corpora and the establishment of procedures to guarantee that reusability. In the longer term we see the library as the place that will maintain and provide access to electronic texts and corpora, as it already does for print and other archival media. The Text Encoding Initiative's guidelines will play an important role in standardizing corpus-access procedures, in particular the TEI's proposal for an electronic text file header which will ensure that adequate information is available about the text and will provide the link with the library catalogue. We see a further need for detailed studies of the 'uses and users' of electronic texts and for research to establish a sounder methodology for the compilation of corpora.


Holstege, Mary Anne. "Marking and the Design of Notations." PhD Dissertation. Technical Report No. STAN-CS-89-1270. Department of Computer Science, Stanford University. June, 1989. 239 pages.

Abstract: A document is a written artifact. It is a designed linguistic structure rendered in a relatively stable visual medium. As a linguistic structure it has two aspects: the message itself, constructed by a writer to convey some ideas, and the notation used to express those ideas. Therefore, to understand documents, one must understand notations and how their design works in a visual medium to present a writer's message. Creating a well-motivated language for describing notations provides insight into their workings and into the construction of documents written in them. It provides writers with a logical and uniform view of their documents, serves as a conceptual tool for the rational design and analysis of notations, and becomes the basis for the creation of an integrated suite of tools for document production. Such a language for describing documents and their notations defines a document model.

This dissertation sets out a general document model. It weaves together ideas from three strands: grammar-driven structure editing, text-formatting, and linguistics. Two linguistic principles guide the overall structure. First, a document is a text plus a grammar, so the visible form of a document can be changed either by altering the text or by changing the grammar rules. Second, visible marks are not random. They exist for the express purpose of providing clues to the logical structure of the document. Editing a grammar can produce principled changes in the visible form of a document in a way that editing simple visual markup does not. Combining this observation with conventional wisdom from the other two strands leads to the division of the grammar into three parts. Abstract syntax governs logical structure, abstract geometry governs visual structure (layout), and concrete syntax mediates between them, specifying how logical elements are to be marked visually. The model uses an extension of the operator-phylum model for the abstract syntax, a generalization of boxes-and-glue for the abstract geometry, and a functional description based in part on linguistic marking theory for the concrete syntax.


Huitfeldt, Claus. "Multi-Dimensional Texts in a One-Dimensional Medium." Pages 142-161 in Wittgenstein and Contemporary Theories of Language: Papers read at the French-Norwegian Wittgenstein Seminar in Skjolden, 23-26 May 1992. Edited by Paul Henry and Arild Utaker. Working Papers from the Wittgenstein Archives at the University of Bergen, No 5. Bergen: University of Bergen, 1992. Abstract: "This paper discusses one of the tools which may be used for representing texts in machine-readable form, i.e., encoding systems or markup languages. This discussion is at the same time a report on current tendencies in the field. An attempt is made at reconstructing some of the main conceptions of text lying behind these tendencies. It is argued that although the conceptions of texts and text structures inherent in these tendencies seem to be misguided, nevertheless text encoding is a fruitful approach to the study of texts. Finally, some conclusions are drawn concerning the relevance of this discussion to themes in text linguistics."


Interleaf, Inc. The SGML Guide. Corporate document M73071-001. Waltham, MA: Interleaf, 1994. vi + 83 pages. [no personal author given] Free: call 1-800-955-5323; or contact via surface mail: Interleaf, Inc., Prospect Place, 9 Hillside Avenue, Waltham, MA 02154.


International Organization for Standardization (ISO). ISO 639:1988 (E/F). Code for the Representation of Names of Languages. First edition, 1988-04-01. Reference number is ISO 639:1988 (E/F). iii + 17 pages. Geneva: International Organization for Standardization, 1988. [revision and addition of part 2 (alphabetic 3-character codes) is underway] See provisionally (a) the primary data from the 1988 standard as given here from Keld, or (b) a different compilation of the ISO 639:1988 language codes, or (c) the comparable MARC 3-character language codes, from about 1991.

ISO 639:1988 is a technical revision of ISO 639:1967, prepared by Technical Committee ISO/TC 37. The two-character language codes of ISO 639 are relevant to SGML encoding in two respects. First, the SGML standard (ISO 8879) itself specifies that declaration of 'public text language' should be given using the language code(s) from ISO 639; see ISO 8879-1986(E) page 36, section 10.2.2.3. Second, the WSD (Writing System Declaration) implemented in the Text Encoding Initiative uses the two-character language code of ISO 639 (as amended) as a 'language.code' attribute of the 'nat.language' declaration, specifying the language in which the WSD is written.
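The 'public text language' field mentioned above is the last required component of an SGML formal public identifier. The following is a minimal sketch, assuming the usual owner//class description//language layout and a two-letter upper-case language code; the function name parse_fpi and the returned field names are hypothetical, and the optional display version and the "unavailable text" indicator are deliberately simplified.

```python
import re


def parse_fpi(fpi: str) -> dict:
    """Split a formal public identifier into its main fields.

    Simplifying assumptions (not a full ISO 8879 implementation):
    - the owner is either prefixed by '+//' (registered) or '-//'
      (unregistered), or is an ISO owner with no prefix;
    - the 'unavailable text' indicator is not supported and is rejected;
    - the final field is treated as a language code, which is not
      true for character set public text (class CHARSET).
    """
    parts = fpi.split("//")
    if parts[0] in ("+", "-"):
        prefix, owner, rest = parts[0], parts[1], parts[2:]
    else:
        prefix, owner, rest = "", parts[0], parts[1:]
    if len(rest) not in (2, 3):
        raise ValueError(f"unsupported identifier layout: {fpi!r}")

    text_class, _, description = rest[0].partition(" ")
    language = rest[1]
    if not re.fullmatch(r"[A-Z]{2}", language):
        raise ValueError(f"not a two-letter language code: {language!r}")

    return {
        "registered": prefix == "+",
        "owner": owner,
        "class": text_class,
        "description": description,
        "language": language,
        "display_version": rest[2] if len(rest) == 3 else None,
    }


# Two widely published identifiers, used here only as sample input.
html = parse_fpi("-//W3C//DTD HTML 4.01//EN")
latin1 = parse_fpi("ISO 8879:1986//ENTITIES Added Latin 1//EN")
```

In both sample identifiers the language field is EN, the ISO 639 two-character code for English written in upper case.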

ISO 639 contains much other information about the use of language symbols, registration of new symbols, etc. The language codes of ISO 639 are said to be "devised primarily for use in terminology, lexicography and linguistics, but they may be used for any application requiring the expression of languages in coded form." The registration authority for ISO 639 is given as Infoterm, Österreichisches Normungsinstitut (ON), Postfach 130, A-1021 Vienna, AUSTRIA.

The two-character language codes of ISO 639 are recognized as being inadequate for use as SGML language attributes when tagging text, viz., for use as global 'lang' attributes attached to any element to identify the language of the text element or a language shift. In principle, there should be nothing wrong with tagging language using SGML elements rather than attributes, if the encoder has principled reasons for not using attributes (e.g., indexing engines which read simple tags but not SGML attributes). But the two-character codes of ISO 639 are neither sufficiently mnemonic nor complete for the world's languages: whereas ISO 639 supplies codes for only about 136 languages, the Ethnologue published by the Summer Institute of Linguistics identifies over 6100 languages (see Ethnologue: Languages of the World, ed. Barbara Grimes. 11th edition. Dallas, TX: Summer Institute of Linguistics, 1988). A revision of ISO 639 completed in late 1990 supplies 3-character language codes (following MARC 3-character language codes in part), based upon the code sequence of the American National Standard (ANSI Z39.53). This draft will be circulated for worldwide review in 1991/92. See below under ISO CD 639/2:1991 (CD part 2). [entry needs update]


International Organization for Standardization (ISO). ISO CD 639/2:1991. Code for the Representation of Names of Languages: alpha-3 Code. iii + 52 leaves. ISO CD 639/2:12/16/91 culminates more than three years of intense collaboration between the representatives of ISO TC 37/SC2 (Layout of Vocabularies) and ISO TC46/SC4 (Computer Applications in Information and Documentation). It preserves the principal features of ISO 639-1 (the existing alpha-2 list) while articulating a code that meets the needs of librarians, managers of bibliographic services, and information specialists. The document is out for DIS ballot until April 15, 1992; it is anticipated that executive action will be taken on the DIS following the meeting of ISO TC 46 in London, May 18-22, 1992. Since the list of 3-character language codes is considered to be an open list, the ISO Council has designated a registration authority for 639 part 2. Proposals for allocating new language symbols should be directed to this authority: the Library of Congress, c/o Collection Services, Washington, DC 20540. See the list of language codes from a 1992 draft version.

Abstract: "This part of ISO 639 provides 3-character alphabetic symbols for the (re)presentation of names of languages. The symbols were devised primarily for libraries, information services, and publishers to use to indicate language in the exchange of information, especially in computerized systems. These symbols have been widely used in the library community, however, they may be used for any application requiring the expression of language in coded form, including use by terminologists and lexicographers. The list is considered to be an open list. This part of ISO 639 also includes guidance on the creation of language symbols and on their use in some of these applications. Languages designed exclusively for machine use, such as computer programming languages, are not included in this code list." [status? file? entry needs update. See the ANSI/NISO standard.]


International Organization for Standardization (ISO). ISO 8879:1986. Information Processing -- Text and Office Systems -- Standard Generalized Markup Language (SGML). International Organization for Standardization. Ref. No. ISO 8879:1986 (E). Geneva/New York, 1986. A subset of SGML became a US FIPS (Federal Information Processing Standard) in 1988. The British Standards Institution adopted SGML as a national standard (BS 6868) in 1987, and in 1989 SGML was adopted by the CEN/CENELEC Standards Committees as a European standard, #28879. Australia has dual numbered versions of ISO 8879 SGML and ISO 9069 SDIF (AS 3514 - SGML 1987; AS 3649 - 1990 SDIF). [needs update to mention results of 5-year review, and revision process]


International Organization for Standardization (ISO). ISO 8879:1986 / A1:1988 (E). Information Processing -- Text and Office Systems -- Standard Generalized Markup Language (SGML), Amendment 1. Published 1988-07-01. Geneva: International Organization for Standardization, 1988. This amendment is incorporated into the text of Charles Goldfarb's SGML Handbook.


International Organization for Standardization (ISO). ISO 9069:1988. Information Processing -- SGML Support Facilities -- SGML Document Interchange Format (SDIF). 13 September 1988. Geneva/New York: International Organization for Standardization, 1988. Also available as The British Standard Guide to SGML Document Interchange Format (SDIF), BS 7138:1989 (ISO 9069:1988); see "Snippets," SGML Users' Group Newsletter 14 (October 1989) 12. [needs update]


International Organization for Standardization (ISO). ISO/IEC 9070:1991. Information Processing -- SGML Support Facilities -- Registration Procedures for Public Text Owner Identifiers. Second edition. 15 April 1991. The "public text" envisioned in this standard as applied to SGML might be DTDs (Document Type Definitions), or declaration subsets of DTDs, public entity sets, etc. Names include an owner name and an object identifier. Equivalent encodings for the names in ASN.1 and SGML may be supplied for interchange purposes. Note: "The intention of the amendment that has resulted in a 2nd edition is to extend 9070 beyond the simple boundaries of SGML only. It is now used by 9541 (and 10036) for the definition of 'structured names'. A New Work Item Proposal is being submitted to change the title and scope of 9070 to show its extended usefulness." (note from Paul Ellison, December 1991) [needs update]


International Organization for Standardization (ISO). ISO/IEC TR 9573:1988 (E). Information Processing -- SGML Support Facilities -- Techniques for Using Standard Generalized Markup Language (SGML). December 09, 1988. Anders Berglund, editor. vi + 124 pages.

A major revision of the TR underway (as of May 1990) will result in a new TR with (16) parts: (1) SGML Tutorial (2) Basic Techniques (3) Advanced Techniques (4) Using Short References for Identifying Markup (5) Using non-Latin Alphabets (6) Referencing and Synchronisation (7) Mathematics and Chemistry (8) Tables (9) Using SGML for Computer-to-Computer Interchange (10) Designing Applications for Database Interfacing (11) Application at ISO CS for International Standards and Technical Reports (12) Public Entity Sets for General and Publishing Symbols (13) Public Entity Sets for Mathematics and Science (14) Public Entity Sets for Latin Based Alphabets (15) Public Entity Sets for non-Latin Based Alphabets (16) Public Entity Sets for Ideograms (adapted from Ludo Van Vooren, "SGML Standards Committee Update: Activities of ISO SC 18 WG8," <TAG> 14 (May 1990) 11-12). See also Joan M. Smith in "More Liaison Statements to ISO," SGML Users' Group Newsletter 13 (August 1989) 6-7. A description of this ISO document is found in "Publication of Techniques for Using SGML," SGML Users' Group Newsletter 11 (January 1989) 3-4. Further update of parts 1-5 of TR 9573 will be delayed until the 5-year revision of SGML (ISO 8879) is completed. [needs update]


International Organization for Standardization (ISO). ISO/IEC 10036:1993 Information Technology -- Font Information Interchange -- Procedure for Registration of Glyph and Glyph Collection Identifiers. Geneva: International Organization for Standardization, 1993. [needs update]


International Organization for Standardization (ISO). ISO/IEC TR 10037:1991. Information Processing -- SGML and Text Entry Systems -- Guidelines for SGML Syntax-Directed Editing Systems. 15 March 1991. Geneva: International Organization for Standardization, 1991. The document supplies technical guidance for the development of context-sensitive SGML editors. See "Guidelines for Syntax-Directed Editing Systems," SGML Users' Group Newsletter 14 (October 1989) 3. [needs update]


International Organization for Standardization (ISO). ISO/IEC DIS 10179.2:1994. Information Technology - Text and Office Systems - Document Style Semantics and Specification Language (DSSSL). Voting on the current DIS began 1994-08-10 [and ends mid-December, I think]. Edited by Sharon Adler [and James Clark (?)] "ISO/IEC 10179 was prepared by Joint Technical Committee ISO/IEC JTC 1, Information Technology." viii + 142 pages.

"This International Standard defines the Document Style Semantics and Specification Language (DSSSL) used to specify formatting and other transformations of SGML-encoded documents. The initial focus of DSSSL is on formatting for both paper and electronic media, and on the conversion of SGML documents encoded according to different DTDs.

This International Standard has been structured to permit future sections to be added to this International Standard to cover the other areas of document processing and data management.

The main objective of the DSSSL Standard is to provide a specification language for expressing formatting and other document processing specifications in a formal and rigorous manner so that these specifications may be processed by a broad range of formatters, either natively or using a translation mechanism.

The DSSSL specification language will include tree transformation specifications and formatting specifications and other semantics to allow users to specify the types of formatting to be applied to various objects during composition and layout and pagination.

For formatting, a DSSSL-driven implementation can create a style sheet language that can be mapped into the DSSSL typographic characteristics and other composition and layout semantics.

In addition to the basic formatting semantics, DSSSL includes a language for writing a general transformation specification that provides the capability to transform documents from one SGML application into another.

DSSSL is designed to allow for specifications that apply to a class of documents. These specifications are applicable to all possible document instances in an SGML application as well as to a particular document instance.

The DSSSL specification language is declarative; it is not intended to be a complete programming language, although it contains constructs normally associated with such languages and provides a well-defined interface to a user-selected programming language, if such a capability is required. DSSSL specifications can be unambiguously parsed and interpreted among heterogenous systems. In addition, DSSSL specifications can be used by existing formatting systems through the use of "front-end" DSSSL processors and translators. DSSSL has no bias toward batch or WYSIWYG formatting systems and does not prescribe any predefined formatting algorithms.

The standardization of formatting semantics is provided in DSSSL through a set of basic structures known as flow objects and the associated set of formatting characteristics that are applied to these objects. DSSSL provides mechanisms for defining and extending the semantic constructs so that a DSSSL application designer can construct a DSSSL application in a manner that best reflects his application environment." [transcription from the Introduction (DIS 1994-08-10)]


International Organization for Standardization (ISO). ISO/IEC DIS 10180:1991. Information Processing -- Text Composition -- Standard Page Description Language (SPDL). Geneva: International Organization for Standardization, 1991. For a summary, see: (1) SGML Users' Group Newsletter 20 (September 1991) 17-18; (2) Peter J. Robinson and Stephen M. Strasen, "Standard Page Description Language," Computing Communications 12/2 (April 1989) 85-92; (3) "Text Composition Standards," SGML Users' Group Newsletter 15 (January 1990) 7-8. Note: ISO/IEC 10180 has now passed DIS ballot with no negative votes. The joint editors are expected to have the final text ready for publication during 1992 (so Paul Ellison, December 1991). [needs update]


International Organization for Standardization (ISO). ISO/IEC CD 10743:1991. Information Technology -- Standard Music Description Language (SMDL). April 1, 1991. SMDL "defines a language for the representation of music information, either alone, or in conjunction with text, graphics, or other information needed for publishing or business purposes." Multimedia time sequence information is also supported. SMDL is a HyTime application conforming to ISO/IEC DIS 10744 Hypermedia/Time-based Structuring Language (HyTime), and an SGML application conforming to Standard Generalized Markup Language (ISO 8879:1986). An earlier version was published by ANSI (American National Standards Institute), as ANSI X3V1.8M Journal of Development. ANSI Project X3.542-D. Standard Music Description Language (SMDL). X3V1.8M/SD-8. 60 pages. Sixth Draft. April 15, 1990. See a description of SMDL in: Steven R. Newcomb, "Standard Music Description Language Complies with Hypermedia Standard," IEEE Computer 24/7 (July 1991) 76-79. [needs update]


International Organization for Standardization (ISO). ISO/IEC 10744:1992. Information Technology -- Hypermedia/Time-based Structuring Language (HyTime). Edited by Charles F. Goldfarb (with assistance from Steven R. Newcomb). "HyTime is a standard neutral markup language for representing hypertext, multimedia, hypermedia, and time- and space-based documents in terms of their logical structure. Its purpose is to make hyperdocuments interoperable and maintainable over the long term. HyTime can be used to represent documents containing any combination of digital notations. HyTime is parsable as Standard Generalized Markup Language (ISO 8879:1986). HyTime provides standardized means of expressing (1) intra- and extra-document locations, and arbitrary links between them, (2) the scheduling of multimedia objects in 'finite coordinate spaces,' and (3) rendering instructions for arbitrarily projecting such objects onto other finite coordinate spaces, and other constructs." [taken from an abstract in CACM 34/11 (November 1991) 67-83.]

For further information on HyTime, see (1) the WWW SGML Page HyTime main entry, (2) the book by Steve DeRose and David Durand, (3) the book by Eliot Kimber, and (4) the CACM article by Steve Newcomb.


International Organization for Standardization (ISO). ISO 12083:1993(E) Information and documentation -- Electronic manuscript preparation and markup. [so titled in print copy distributed in mid-1994 by NISO/EPSIG] 96 pages. First edition, 1994-01-15. Genève: ISO, 1994 (c). Prepared by Technical Committee ISO/TC 46, Information and documentation, Subcommittee SC4, Computer applications in information and documentation. This "ISO" standard supersedes the 1988 (EPSIG/AAP) standard authorized by ANSI/NISO; see the bibliographic reference. The standard included three public DTDs (books, articles, serials) in "final" form and a provisional DTD for mathematics. The ISO 12083 DTDs [though not now in final form (November 1994)] are available on the Exeter SGML Project server and elsewhere; try: Exeter ftp://info.ex.ac.uk/ISO-12083/ or else ftp://actd.saic.com/pub/SGML/ISO-12083/.


International Organization for Standardization (ISO). ISO/IEC DIS 13673:1993 Information Technology -- Text and Office Systems -- Conformance Testing for Standard Generalized Markup Language (SGML) Systems. First edition. Voting was 1993-08-12 through 1994-02-12. [entry needs update]


Johnson, Jeff; Beach, Richard. "Styles in Document Editing Systems." IEEE Computer 21/1 (January 1988) 32-43. 16 references. Authors' affiliation: Xerox Corporation.


Joloboff, Vania. "Document Representation: Concepts and Standards." In Structured Documents. Edited by Jacques André, Richard Furuta, and Vincent Quint. Cambridge Series on Electronic Publishing. Pages 75-105. Cambridge: Cambridge University Press, 1989. This article examines the problem of document representation in computer systems for printing, editing or interchange among heterogeneous systems. After a discussion of the various possibilities for defining documentation representation formalisms, it considers a number of standard representations typical of their class: page description languages, SGML, Interscript, ODA. Several other articles in the volume are of direct or marginal relevance to SGML as a metalanguage for document-structuring.


Kaelbling, Michael J. "On Improving SGML." OSU-CIRSC-7/88-TR22. The Ohio State University, 1988. Draft 18-October-1988, "accepted for publication in Electronic Publishing: Origination, Dissemination and Design." Department of Computer and Information Science; The Ohio State University; 2036 Neil Avenue Mall; Columbus, OH 43210. Summary: Several improvements are suggested to the syntax of SGML, the recent international standard for the description of electronic document types. These improvements ease processing by existing tools, remove ambiguity cleanly, and increase human usability. They also indicate some guidelines that should be followed in the design and specification of computer-software standards. By following accepted computer-science conventions for the description of languages the design of a standard may be improved, and the subsequent implementation task simplified. Note: see the response of Ron Hayter, "Comments on 'On Improving SGML'," Technical Bulletin 4. Software Exoterica Corporation, 1988. Hayter argues that Kaelbling's "improvements" to SGML are based upon a misunderstanding of the intent of the standard. Kaelbling's original draft known to Hayter was apparently 16-March-1988; Kaelbling's revised draft of 18-October-1988 responds to Hayter's comments.


Kaelbling, Michael J. "On Improving SGML." Electronic Publishing: Origination, Dissemination and Design (EPODD) 3/2 (May 1990) 93-98. 14 references. Received 16-March-1988, Revised 18-May-1990. ISSN: 0894-3982. Author affiliation: Siemens AG, ZFE IS EA 11; Corporate Applied Computer Sciences; Otto-Hahn-Ring 6; 8000 Munich 83, FRG. Another version of the paper is found in OSU-CIRSC-7/88-TR22.

Abstract: "Several improvements are suggested to the syntax of SGML, the recent international standard for the description of electronic document types. These improvements ease processing by existing tools, remove ambiguity cleanly, and increase human usability. They also indicate some guidelines that should be followed in the design and specification of computer-software standards. By following accepted computer-science conventions for the description of languages the design of a standard may be improved, and the subsequent implementation task simplified."


Khatchadourian, Haroutioun; Modiano, Nicole; Heyer, Gerhard; Waldhör, Klemens. "Use and Importance of Standard[s] in Electronic Dictionaries: The Compilation Approach for Lexical Resources." Literary and Linguistic Computing 9/1 (1994) 55-64. 11 references. ISSN: 0268-1145. [Abstract needed] Authors discuss [esp. pages 60-61] the development and use of the 'MLEXd' SGML DTD within the MULTILEX project's efforts to standardize access to lexical data.


Kimber, W. Eliot. "HyTime and SGML: Understanding the HyTime HyQ Query Language." Version 1.1. Technical Report. IBM Corporation, August 2, 1993. 40 pages. This tutorial is available in compressed Postscript format from the Exeter SGML Project FTP server as Kimber-on-HyQ-1.1.ps.Z (note binary mode FTP transfer required), or in compressed text (ASCII) format FTP to SGML Project. Alternately, it is available in plain text (ASCII) format from the SGML Repository. Abstract: "This document is intended to provide a brief tutorial introduction to the HyQ language. It is assumed that you have a working knowledge of SGML and have a copy of the HyTime standard, ISO 10744 [Hypermedia/Time-based Structuring Language ISO/IEC 10744:1992], at hand, although it does not assume that you have more than a passing familiarity with HyTime." [from About this Document] Note: Eliot Kimber is authoring a full-length book on HyTime that will be published in 1995; see the bibliographic entry.


Kimber, W. Eliot. Practical Hypermedia: An Introduction to HyTime. Charles F. Goldfarb Series On Open Information Management. New York: Prentice-Hall Professional Technical Reference, [Spring/forthcoming] 1995. 250 pages (ca.) ISBN: 0-13-309899-0. Author affiliation: Passage Systems, Inc. (W. Eliot Kimber (kimber@passage.com); Systems Analyst and HyTime Consultant; Passage Systems, Inc., 2608 Pinewood Terrace; Austin TX 78757 (512)339-1400; 465 Fairchild Dr., Suite 201; Mountain View, CA 94043 (415) 390-0911.) [Provisionally, see Kimber's HyQ tutorial.]

Summary: HyTime is an ISO standard (ISO/IEC 10744) that is an extension to SGML. It is intended to support electronic documents which use hyperlinking and multi-media elements. In this book, Kimber focuses on the most practical aspects of the HyTime standard, explaining how to use HyTime to move information from the traditional print-based medium to hypermedia. [publisher's pre-publication description]


Laan, C. G. van der. "SGML (, TeX and . . .)". TUGboat 12/1 (March 1991) 90-104. 49 references. Author affiliation: Rekencentrum TUG, Groningen. [needs abstract]


Lavagnino, John. "Simultaneous Electronic and Paper Publication." TUGboat 12/3 [Proceedings of the 1991 Annual Meeting] (December 1991) 401-405. [Concludes that SGML is the "best choice" for creating a multiform text.] Author affiliation: Brandeis University.


Loeffen, Arjan. "Text Databases: A Survey of Text Models and Systems." Work paper, [no number]. 10 pages. Utrecht: University of Utrecht, [1994]. Available in Postscript format as "sigmod.uue" [encodes "sigmod.ps"] through anonymous FTP. Note that other valuable research papers on SGML from Arjan Loeffen are available from the same FTP server: see the subdirectory "models" and the subdirectory "sgml-model". Author's address: Arjan Loeffen, Faculty of Arts, University of Utrecht; Achter de Dom 22-24; 3512JP Utrecht; The Netherlands; ++31+30536417 (voice work); ++31+206656463 (voice home); ++31+30539221 (fax work); Email: Arjan.Loeffen@LET.RUU.NL.

Abstract: "Text models focus on the manipulation of textual data. They describe texts by their structure, operations on the texts, and constraints on both structure and operations. In this article common characteristics of machine readable texts in general are outlined. Subsequently, ten text models are introduced. They are described in terms of the datatypes that they support, and the operations defined by these datatypes. Finally, the models are compared." [The text models discussed include: TDM (relational model based upon non-first normal form), P-string model, PAT (University of Waterloo), TOMS ("textual object management system" - an indexing toolkit), the containment model, MdF ("Monads-dot-Features"), the Banyan system, Extended MAESTRO, Grif, and Multos.]


Macleod, Ian A. "Extending the command language interface to handle marked-up documents." Pages 192-196 in Information in the year 2000, from research to applications. ASIS '90. Proceedings of the 53rd Annual Meeting of the American Society for Information Science [Toronto, Ontario, Canada, 4-8 November 1990]. Vol 27. Edited by Diane Henderson. Medford, NJ, USA: American Society for Information Science, 1990. 14 references. ISBN: 0938734482. Author affiliation: Department of Computer and Information Science, Queen's University at Kingston, Kingston, Ontario, Canada.

Abstract: Two important international standards relating to text have emerged. One of these, SGML, describes a framework for descriptive markup. The other, and more recent, deals with a command language interface for full text retrieval. The two standards have been developed in isolation from one another and the command language can handle only the conventional view of text and not the relatively complex structures implicit in descriptive markup. It is shown how a relatively simple syntactic extension to the command language enables it to be applied to SGML databases. Some implementation issues are also discussed.


Macleod, Ian A. "A Query Language for Retrieving Information from Hierarchic Text Structures." Technical Report 89-263. Queen's University (Kingston, Ontario) Dept. of Computing and Information Science. August, 1989. 26 pages. Funding: Supported by the Natural Sciences and Engineering Research Council of Canada. Abstract: Descriptive markup languages provide a mechanism for specifying the structure of a document. The basic premise of the work described here is that structure is an important characteristic of a document and is something more than a layout specification. For this reason, it appears important that retrieval tools should be developed which can take advantage of structural knowledge. In this paper, a query language is described which provides such a capability. The underlying implementation strategy is also discussed.


Macleod, Ian A. "A Query Language for Retrieving Information from Hierarchic Text Structures." Computer Journal 34/3 (June 1991) 254-264. 24 references. ISSN: 0010-4620. Author affiliation: Department of Computer and Information Science, Queen's University at Kingston, Kingston, Ontario, Canada. See previous version in the Queen's technical report.

Abstract: "Descriptive markup languages provide a mechanism for specifying the structure of a document. The basic premise of the work described here is that structure is an important characteristic of a document and is something more than a layout specification. For this reason, it appears important that retrieval tools should be developed which can take advantage of structural knowledge. A query language is described which provides such a capability. The underlying implementation strategy is also discussed."


Macleod, Ian A. (guest editor). SGML Special Issue: Computer Standards & Interfaces. 1995. The journal Computer Standards & Interfaces is a North-Holland (Amsterdam) publication characterized as "The International Journal on the Development and Application of Standards for Computers, Data Communications and Interfaces." In mid-1995 it is to sponsor a special SGML issue, edited by Ian A. Macleod. The 'Call for Papers' read as follows, in part: "SGML (the Standardized General Markup Language) is an international standard whose importance is rapidly growing. It is fair to say that the era of electronic text has finally arrived. A large number of potential text applications are seeking solutions, and there is significant industrial interest in the technologies being developed in the SGML context. In view of the high importance of SGML, Computer Standards and Interfaces is planning a special issue on this topic to be published in mid 1995. The goal is to collect papers incorporating important advances in the field. Topics of interest include, but are not limited to, the following: Novel applications of SGML; SGML databases and information retrieval; Languages for accessing and manipulating SGML structures; Hypertext/Hypermedia; Entity management; Visualisation and SGML; Related standards and SGML; Converting legacy databases to SGML; Tools for developing and using DTDs." See the full announcement for other publication details.


Macleod, Ian A. "Storage and Retrieval of Structured Documents." Information Processing and Management 26/2 (1990) 197-208. Author affiliation: Department of Computer and Information Science, Queen's University at Kingston, Kingston, Ontario, Canada.

Abstract: There have been a number of important related activities which suggest the need for a new model for text. ISO standards for document description have been recently developed. These standards view documents as hierarchical objects and it is likely that languages such as SGML will become widely used in the near future for document markup. As structured documents become available, so there will be a need to evolve tools to take advantage of structural knowledge. The goal of the work described here is to develop such tools. A conceptual model for bibliographic data has been designed. The model is known as Maestro (Management Environment for Structured Text Retrieval and Organization). It supports structured documents and provides a query language to retrieve and link information contained in these structures. In this paper an overview of Maestro is presented together with an outline of the basic implementation strategy.


Macleod, Ian A.; Barnard, David T.; Hamilton, D.; Levison, M. "SGML Documents and Nonlinear Text Retrieval." Pages 226-244 in Intelligent Text and Image Handling: Proceedings of a Conference on , RIAO '91 [Barcelona, Spain, 2-5 April, 1991]. Edited by André Lichnerowicz. Amsterdam/London/New York: Elsevier, 1991. 17 references. ISBN: 044489361X. [Conference organized by the Centre de Hautes Etudes Internationales d'Informatique Documentaire (CID), Center for the Advanced Study of Information Systems, Inc. (CASIS); sponsored by the Commission of the European Communities, Minist. Educ. Sci. Spain, Univ. Autonoma Barcelona; et al.]

Abstract: Standard generalized markup language (SGML) is an international standard for markup languages. Descriptive markup is a means whereby the logical structure of a document can be explicitly encoded. Such markup can subsequently be processed to provide an appropriate physical layout of the actual document content. Additionally, the logical structure provides the information necessary for highly context sensitive retrieval. The authors describe an SGML application which can process encoded documents into a format suitable for storage and retrieval by an appropriately powerful retrieval system. It is also possible to encode links between documents using SGML. One technique is that suggested by the Text Encoding Initiative (TEI), a co-operative international venture to promote guidelines for the encoding and interchange of machine-readable texts. The authors describe how such links can be processed to produce the equivalent structures in a document database.


Macleod, Ian A.; Nordin, Brent; Barnard, David T.; Hamilton, Doug. "A Framework for Developing SGML Applications." Pages 53-64 in EP [Electronic Publishing] 92: Proceedings of Electronic Publishing, 1992 [International Conference on Electronic Publishing, Document Manipulation, and Typography. Swiss Federal Institute of Technology, Lausanne, Switzerland. April 7-10, 1992. Sponsored by the Swiss Federal Institute of Technology and the Swiss National Science Foundation]. Edited by Christine Vanoirbeek and Giovanni Coray. The Cambridge Series on Electronic Publishing. Cambridge: Cambridge University Press, 1992. ISBN: 0-521-43277-4. Authors' affiliation: Queen's University, Kingston, Ontario. [need abstract]


Macleod, Ian; Reuber, A. R. "The Array Model: A Conceptual Modeling Approach to Document Retrieval." Journal of the American Society for Information Science 38/3 (1987) 162-170. Abstract: Recent research has sought to build document-retrieval systems on top of relational database management systems (DBMS) in order to increase the power of document retrieval. While the use of DBMS shows a more flexible approach to designing search strategies, the underlying representation of the information is inflexible and does not correspond to either the structure or the meaning of the real-world objects. This limitation can be overcome through the use of conceptual modelling techniques. The array model presented here is based on these techniques and has been designed specifically for application in document retrieval.


Maler, Eve; El Andaloussi, Jeanne. Developing SGML Document Type Definitions. [approximate title] Charles F. Goldfarb Series On Open Information Management. Englewood Cliffs, NJ: PTR Prentice Hall, forthcoming [Spring 1995 projected]. ISBN: 0-13-309881-8. (Author contact: maler@zk3.dec.com)

Summary: Every SGML document must conform to some specified Document Type Definition (DTD). Maler and El Andaloussi explain the basics of DTD design, then present a methodology and series of techniques to help information professionals design, implement and document DTDs. [publisher's pre-publication description]


Mamrak, Sandra A.; Barnes, Julie; Hong, H.; Joseph, C.; Kaelbling, Michael; Nicholas, Charles; O'Connell, Conleth; Share, M. "Descriptive Markup -- The Best Approach?" Communications of the Association for Computing Machinery 31/7 (July 1988) 810-811. Authors' affiliation: The Chameleon Research Group, Department of Computer and Information Science, Ohio State University. [Reply to the CACM article of Coombs/DeRose/Renear. The authors argue that a few of the claims are overstated, and that some of the difficulties in the use of descriptive markup (e.g., document portability) are trivialized.]


Mamrak, Sandra A.; Kaelbling, Michael J.; Nicholas, C.K.; Share, M. "Chameleon: A System for Solving the Data-Translation Problem." IEEE Transactions on Software Engineering 15/9 (September 1989) 1090-1108. ISSN: 0098-5589. Abstract: "There is a need for widespread exchange of electronic documents in domains as diverse as book publishing, automated offices, factories, and research laboratories. The variety of data representations, and the subsequent need for data translation, is a major obstacle to this exchange. This paper describes a comprehensive data translation system with the following characteristics: 1) it is derived from a formal model of the translation task; 2) it supports the building of translation tools; 3) it supports the use of translation tools; and 4) it is accessible to its targeted end-users. A software architecture to achieve the translation capability is fully implemented. Translators have been generated using the architecture, both by the original software developers and by industrial associates who have installed the architecture at their own sites." Contact: Sandra A. Mamrak, Department of Computer and Information Science, The Ohio State University, 2036 Neil Ave., Columbus, OH USA 43210-1277. Email (Internet): mamrak@cis.ohio-state.edu OR mamrak@oboe.cis.ohio-state.edu (Internet).


Marin-Navarro, José; Alevantis, Panagiotis E. "Alice in the Wonderland of SGML: Streamlining Text Entry in the CELEX Databases." The Electronic Library 9/3 (June 1991) 155-160. Abstract: This article describes the system used for the introduction of textual data into the CELEX full-text document databases. The solution implemented is based on the establishment of a text production database for the management and validation of texts before introducing them into the CELEX dissemination databases, and the management of structured documents described with the help of an SGML syntax. Note: CELEX (Communitatis Europææ LEX) is the computerized multi-lingual documentation system for European community law. Contact: Commission of the European Communities, Service EUROBASES, 200 rue de la Loi, B-1049 Brussels, BELGIUM; Tel: +32-235-00-01; FAX: +32-235-00-03. Another description of CELEX is Geoffrey Gudgion, "SGML Applications in the European Commission," EuroCALS Newsletter 4 (May, 1990) 17-20; the author discusses SGML applications relative to the CELEX database (European Commission Law), INFOTEX or INFO 92 (database of European tariffs) and electronic mail.


Matzen, Richard Walter. "A formal language model for detecting ambiguity in SGML." PhD Dissertation. Department of Computer Science, Oklahoma State University, 1993. xii, 144 pages. [need DA summary]


Matzen, Richard W.; George, K.M.; Hedrick, G.E. "A model for studying ambiguity in SGML element declarations." Pages 668-676 in Applied Computing: States of the Art and Practice - 1993. [Proceedings of the 1993 ACM/SIGAPP Symposium on Applied Computing = Proceedings of the 1993 (8th) ACM/SIGAPP Symposium on Applied Computing, Indianapolis, IN, USA, 14-16 February, 1993.] Edited by E. Deaton, K. M. George, H. Berghel, and G. Hedrick. New York, NY, USA: ACM, 1993. 14 references. Authors' affiliation: Oklahoma State University, Stillwater, OK, USA.

Abstract: The Standard Generalized Markup Language (SGML) is a meta-language system for document representation that was adopted as an ISO standard in 1986. In SGML, element declarations define the logical components (elements) of documents; a content model is the part of an element declaration that defines the content of the elements. SGML defines and prohibits "ambiguous content models" but does not show a method for detecting them. Model groups, the only required components of content models, are expressions similar to regular expressions. This paper defines ambiguous model groups and gives an algorithm for detecting them. When the optional components of element declarations are not considered, the algorithm detects ambiguous content models as defined by the standard. The algorithm is based on a construction of indexed nondeterministic finite automata (NFAs) in which each arc is bound to a particular occurrence of an element symbol in a model group.


McFadden, John R.; Wilmott, Sam. "Ambiguity in the Instance: An Analysis." <TAG> 9 (March/April 1989) 3-5. This article is a response to the article of John M. Graf (John Graf, "Ambiguity in the Instance," <TAG> 7 (1988) 6-9. Graf observed that "a document created under the exact rules of a valid DTD may very well be invalid when passed through an instance parser.") McFadden and Wilmott argue that conforming parsers and validating parsers must be differentiated. On this distinction, cf. further the article of McFadden and Wilmott, "The SGML Conformance Testing Initiative," <TAG> 9 (March/April 1989) 1-3. The writers conclude that Graf's examples represent a misunderstanding of the use of SGML parsers, and that tag minimization is "safe" with proper DTD design. For more on SGML's definitions of "ambiguous" and "unambiguous," see Brueggemann, Price, Price, Kaelbling, and Warmer (esp. pp. 80-83). More yet: William W. David, Jr., "OMITTAG Minimization," <TAG> 5/2 (February 1992) 4-5 (who notes "A little history may be of some help in understanding why the standard is the way it is. SGML was developed when desktop computers that had 64K were large. . .") and Jan Grootenhuis, "Disambiguation of SGML Document Models," <TAG> 12 (December 1989) 11-12.


McGaffey, Robert. "Automatic Tables Using SGML, C, and TeX." TUGboat 13/3 (October 1992) 291-294. [needs abstract]


Moline, Judi; Benigni, Dan; Baronas, Jean (eds). Proceedings of the Hypertext Standardization Workshop (January 16-18, 1990 National Institute of Standards and Technology, Gaithersburg, MD). NIST Special Publication 500-178, March, 1990. CODEN: NSPUE2. Several papers in this proceedings volume reference SGML, HyTime and SMDL as potentially valuable in creating hypertext/hypermedia standards. Reports from the workshop's Data Interchange Group and User Requirements Discussion Groups likewise identified SGML or SGML-like GIs as having probable priority in emerging standards formulations.


Mumford, Anne (editor). Document Exchange: The Use of SGML in the UK Academic and Research Community. Workshop Proceedings 5-7 March 1990. Advisory Group on Computer Graphics, 1990. This proceedings volume contains several important contributions on SGML (submitted by Anne Mumford, Paul Ellison, Martin Bryan, Angella Scheller, David Duce and Ruth Kidd, Tim Niblett, Lou Burnard, John Larmouth, Paul Bacsich and Paul Lefrere, Malcolm Clark, and Kathleen Crennell). The volume is available from the organizer: Anne M. Mumford, Computer Centre, Loughborough University, Loughborough LE11 3TU, UNITED KINGDOM; TEL: 44 509 222312; FAX: 44 392 211603. See a full list of contributors and presentation-titles in "Document Exchange in UK Universities," SGML Users' Group Newsletter 17 (August 1990) 10.


Murata, Makoto; Hayashi, Koichi. "Formatter Hierarchy for Structured Documents." Pages 77-94 in EP [Electronic Publishing] 92: Proceedings of Electronic Publishing, 1992 [International Conference on Electronic Publishing, Document Manipulation, and Typography. Swiss Federal Institute of Technology, Lausanne, Switzerland. April 7-10, 1992. Sponsored by the Swiss Federal Institute of Technology and the Swiss National Science Foundation]. Edited by Christine Vanoirbeek and Giovanni Coray. The Cambridge Series on Electronic Publishing. Cambridge: Cambridge University Press, 1992. ISBN: 0-521-43277-4.

Abstract: This paper describes a formatting model of structured documents. In this model, a document is formatted by a hierarchy of co-interacting formatters. Each formatter creates layout subtrees, by pouring logical streams into layout streams. This formatting model was originally proposed in Interscript. We extend it for tnt, and clearly illustrate the formatting algorithm. Finally, we propose some new techniques for incremental formatting, reduced formatting, and parallel formatting.


Naggum, Erik. "Answers to Frequently-Asked-Questions (FAQs) - for the UseNet Newsgroup comp.text.sgml." A draft version (Version 0.0, 1991-12-15) is available via Internet anonymous-FTP as ftp.ifi.uio.no:pub/SGML/FAQ.0.0. The latest version of the FAQ document may be fetched at any time from this public disk region, generously sponsored by The University of Oslo, Department of Informatics with oversight by Erik Naggum. The FAQ will also be found on servers which archive collections of FAQs. Suggestions for additional questions (or answers) to be included in the FAQ may be directed to the author: Erik Naggum; Naggum Software; Boks 1570, Vika; 0118 OSLO, NORWAY; Email: erik@naggum.no OR enag@ifi.uio.no on the Internet.


Naggum, Erik. "DSSSL." [Congratulatory note on DSSSL, i.e., ISO/IEC DIS 10179.2:1994. Information Technology - Text and Office Systems - Document Style Semantics and Specification Language (DSSSL)]; see the DIS full citation. Usenet Newsgroup comp.text.sgml, December 5, 1994. Author affiliation: SGML Repository, and Naggum Software (+47 2295 0313). The article will be found in the comp.text.sgml archives, as well as in the comp.text.sgml Digest (see digest entry) Volume 5, Issue 6 (1994-12-05). A copy of the article is also provided here. Summary: a positive affirmation of the work of James Clark, Sharon Adler, and other members of the DSSSL team.


National Information Standards Organization. American National Standard for Electronic Manuscript Preparation and Markup. (ANSI/NISO Z39.59-1988). Published for NISO (National Information Standards Organization) by Transaction Publishers (New Brunswick, NJ), 1991. xv +167 pages. ISBN: 0-88738-945-7. ISSN: 1041-5653. This AAP (Association of American Publishers) standard is an application of SGML. An earlier form of the document was Standard for Electronic Manuscript Preparation and Markup. (ANSI/NISO Z39.59-1988). 1987, 1988. ANSI Z39.59-1988 was promoted to ISO DIS in 1992, and was to be published in revised format as ISO 12083:1993 in late 1993. See now "ISO 12083". The AAP/EPSIG application is SGML-conforming, and provides a suggested tagset for authors and publishers. The standard is said to "represent the first industry wide application of SGML (Standard Generalized Markup Language, ISO 8879). The standard defines the format syntax of the application of SGML publication of books and journals. The standard achieves two goals. First, it establishes a standard way to identify and tag parts of an electronic manuscript so that computers can distinguish between these parts. Second, it provides a logical way to represent special characters, symbols, and tabulator material, using only the ASCII character set usually found on a standard keyboard." The standard is available for $75 (75 US dollars) from Transaction Publishers or from NISO: Transaction Publishers, Department NIS091, Rutgers--The State University, New Brunswick, NJ 08903, TEL: (1 908) 932-2280; FAX: (1 908) 932-3138; NISO is at National Information Standards Organization, P.O. BOX 1056, Bethesda, MD 20827, Tel: (1 301) 975-2814, FAX: (1 301) 869-8071; Email (Internet): niso@enh.nist.gov (or BITNET) niso@nbsenh. Discounts are available for purchase of multiple copies. Equally, the volume may be ordered from EPSIG.


National Information Standards Organization. Codes for the Representation of Languages for Information Interchange (ANSI/NISO Z39.53-1994). NISO, 1994. ISBN: [ ] Overview: "The National Information Standards Organization (NISO) has published a revised standard for language codes. Codes for the Representation of Languages for Information Interchange (ANSI/NISO Z39.53-1994) is used by libraries, information services, and publishers as the standard for designating languages in which documents or document handling records (such as order records or bibliographic records) have been created. The revised standard reflects a thorough review of the 1987 edition and includes many changes requested by users. Codes have been added for 28 languages or language groups previously not represented. Numerous minor changes also have been made to reflect current accepted usage in language names. The USMARC Code List for Languages is kept consistent with ANSI/NISO Z39.53 and will be revised to incorporate the changes in this new edition." [from a NISO-L news announcement; see the complete text for details.] [Note: need to clarify the relationship between this and ISO 639/2.]


Newcomb, Steven R.; Kipp, Neill A.; Newcomb, Victoria T. "The 'HyTime' Hypermedia/Time-based Document Structuring Language." Communications of the Association for Computing Machinery 34/11 (November 1991) 67-83. ISSN: 0001-0782. Abstract: HyTime, a proposed standard for digital communications, should enable authors of electronic documents to incorporate active references to other on-line documents regardless of their notations. HyTime, which stands for Hypermedia/Time-based Document Structuring Language, is built on the Standard Generalized Markup Language (SGML). SGML/HyTime enables all types of documents to package the 'information about information' using standard 'markup,' which provides information about the notations and structure of the document so that any application with an appropriate data importation facility can understand and interpret it. The documents' structured character will also make them useful for querying, access and version control, maintenance and nonsequential browsing.


Nicholas, Charles K.; Welsch, Lawrence A. On the Interchangeability of SGML and ODA. Technical Report, NISTIR 4681. ii + 19 pages. Gaithersburg, MD: U.S. Department of Commerce, National Institute of Standards and Technology (NIST), January 1992.


Nordin, Brent; Barnard, David T.; Macleod, Ian A. "A Review of the Standard Generalized Markup Language (SGML)." Computer Standards and Interfaces [Amsterdam, Netherlands: North-Holland] 15/1 (May 1993) 5-20. 33 references. ISSN: 0920-5489. Abstract: The international standard ISO 8879:1986 and its related material describes both a text markup scheme and an implementation of a text parser based on that markup scheme. By trying to clarify the relationship between the documents and an implementation, the authors show that optional SGML features properly belong to separate applications. The result suggests more general and powerful mechanisms which could be obtained.


O'Connell, Conleth S. Jr. Supporting the Development of Grammar Descriptions for Multiple Applications. OSU-CISRC-TR-7/90-TR20, Department of Computer and Information Science, The Ohio State University, July 1990. Abstract: In computer science, context-free grammars are used extensively to describe data sets such as manuscript types and programming languages. The data, or members, contained in a particular set represent instances of the grammar describing that set, for example, documents and programs. Determining the elements comprising instances is the task of content investigation. Imposing structure on these elements is the task of grammar development. Creating, editing, and manipulating instances of a grammar is the task of grammar instantiation. Grammar instantiation has received much attention with software systems such as programming environments and compound-document environments. Content investigation and grammar development have only recently been recognized as recurring complex tasks. They have received little attention because of their newly emerging significance. This work focuses on grammar development. Grammar development produces a grammar description in a particular notation that contains two types of information: a formal, context-free grammar and auxiliary information. Auxiliary information describes the application of the grammar description. For example, a grammar may describe the manuscript type ``article,'' but the auxiliary information may describe how to format the instances for layout, how to analyze the sentence structure, or how to exchange documents of that type. The separation of the general, context-free grammar from the application-specific, auxiliary information provides the power and flexibility to generalize problem classes associated with grammar development. The formalisms of context-free grammars motivate two such problem classes: syntactic properties and semantic properties. 
The analysis of the development of large grammars motivates two other problem classes: reusable grammars and multiple notations. A review of existing software systems reveals that a new, general-purpose, support environment was required for developing grammar descriptions. A prototype environment for developing grammar descriptions, DeveGram, has been designed and implemented. DeveGram controls and manages the four problem classes by capturing any context-free grammar, providing mechanisms for determining properties about a grammar, capturing auxiliary information, and generating automatically grammar descriptions in a testbed of different notations. DeveGram produces grammar descriptions for a testbed of software systems differing in syntax and purpose. The testbed presently consists of Yacc, SGML, MDL, MANDEN, and BNF. (Note: see more on the Chameleon project by Mamrak and Walter.)


Painter, J. Derek. "Marking up the Dictionary (The Oxford English Dictionary)." Information Media & Technology. 21/2 (March 1988) 72-74. CODEN: IMTEED. ISSN: 0306-2880. Abstract: The article describes the Oxford University Press's implementation of the Standard Generalized Markup Language (SGML). SGML provides a rigorous syntax for describing unambiguously the content and structure of any document in such a way that its presentation can be controlled by conversion to typographic codes and selective retrieval can be enabled by the application of search software. Clearly, because the language is generic, its use is independent of specific devices and it can be implemented universally, regardless of the make of front-end, host, printer and operating system. SGML has been adopted by the Oxford University Press in order to convert the Oxford English Dictionary into a lexical database.


Poppelier, Nico. "Pre-publication Review [of Eric van Herwijnen], Practical SGML, 2nd edition." TUGboat 15/1 (March 1994) 24-25. [See Poppelier's review of the first edition of Practical SGML in TUGboat 13/2 (July 1992) 184-185.] Author affiliation: Elsevier Science Publishers; email: n.poppelier@elsevier.nl.


Poppelier, N.A.F.M.; Herwijnen, Eric van; Rowley, C.A. "Standard DTDs and Scientific Publishing." EPSIG News 5/3 (September 1992) 10-19. The article was posted to the discussion group 'sgml-math', and is available in Postscript format [dated 7-August-1992] on the Elsevier FTP server. See further on Elsevier Science in the main Elsevier entry. [Abstract needed]


Price, Lynne A. "Graphic Representation of Content Models." <TAG> 10 (July 1989) 12-16. The article demonstrates the use of tree structures and (more extensively) FSAs to represent SGML content models. FSAs are useful in revealing ambiguity (seemingly equivalent models). The article is derived from the author's tutorial session at the ACM Conference on Document Processing Systems, Santa Fe, New Mexico (5-9 December 1988).


Price, Lynne A. "The Problem with Ambiguous Content Models." SGML Users' Group Bulletin 3/1 (1988) 25-26. Abstract: "IS 8879 defines an ambiguous content model -- one for which an element or character string occurring in a document instance can satisfy more than one primitive content token without look-ahead -- to be a nonreportable markup error. In other words, it is an error to use an ambiguous content model, but SGML software will not necessarily detect the condition. SGML could have been defined to give a unique interpretation to each possible content model. For example, an element or character string occurring in a document instance could have been interpreted as matching the leftmost possible primitive content token. Such a definition, however, would not have given consistently intuitive results. Several illustrations of this point are given below."
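
Illustration (not from Price's article): the definition quoted above can be sketched as a small check that asks whether, at any point in a content model, two different tokens with the same element name are eligible to match the next element. The sketch below is a simplified approximation, not the normative wording of ISO 8879; it handles only the sequence, alternation, and occurrence operators, and the tuple notation and the function name find_ambiguity are hypothetical choices made for this example.

```python
from itertools import count

def find_ambiguity(model):
    """Return (True, name) if two tokens with the same element name
    compete for the next element without look-ahead, else (False, None).

    A model is an element name (str) or a tuple:
      ("seq", m1, m2, ...)  ("or", m1, m2, ...)
      ("opt", m)  ("star", m)  ("plus", m)
    This notation is an assumption made for this sketch.
    """
    names = {}    # token position -> element name
    follow = {}   # token position -> positions that may match next
    ids = count()

    def walk(node):
        # Returns (nullable, first positions, last positions).
        if isinstance(node, str):
            p = next(ids)
            names[p] = node
            follow[p] = set()
            return False, {p}, {p}
        op, *args = node
        if op == "seq":
            nullable, first, last = True, set(), set()
            for child in args:
                c_null, c_first, c_last = walk(child)
                for p in last:
                    follow[p] |= c_first
                if nullable:
                    first |= c_first
                last = c_last | (last if c_null else set())
                nullable = nullable and c_null
            return nullable, first, last
        if op == "or":
            parts = [walk(child) for child in args]
            return (any(r[0] for r in parts),
                    set().union(*(r[1] for r in parts)),
                    set().union(*(r[2] for r in parts)))
        if op in ("opt", "star", "plus"):
            c_null, c_first, c_last = walk(args[0])
            if op != "opt":
                # Repetition: the end of the group may loop to its start.
                for p in c_last:
                    follow[p] |= c_first
            return (c_null or op != "plus"), set(c_first), set(c_last)
        raise ValueError(f"unknown operator: {op!r}")

    _, first, _ = walk(model)
    for candidates in [first, *follow.values()]:
        seen = set()
        for p in candidates:
            if names[p] in seen:
                return True, names[p]
            seen.add(names[p])
    return False, None

# ((a, b) | (a, c)): an 'a' could satisfy either token, so it is ambiguous.
print(find_ambiguity(("or", ("seq", "a", "b"), ("seq", "a", "c"))))
# (a, (b | c)) accepts the same documents and is not ambiguous.
print(find_ambiguity(("seq", "a", ("or", "b", "c"))))
```

The two sample models accept the same documents, which is the sense in which an ambiguous model can usually be rewritten into an unambiguous one.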


Price, Lynne A. "Using SGML and TeX for User Documentation." In TEXniques No. 7: Proceedings, TeX User's Group 1988 Annual Meeting (21-24 August 1988, Montreal). Pages 203-210. TeX User's Group, 1988. Abstract: The Standard Generalized Markup Language (SGML), defined in International Standard (ISO) 8879, is a notation for representing documents and making their inherent structure explicit. The open-ended list of SGML applications includes document interchange, formatting or typesetting, loading databases for information retrieval, stylistic or linguistic analysis, and computer-aided translation. The combination of SGML and TeX is a natural one. This paper reviews the philosophy of SGML and then describes a particular environment where SGML and TeX are used together, giving specific examples of how processing is shared between the SGML application and TeX macros.


Price, Lynne A.; Schneider, Joe. "Evolution of an SGML Parser Generator." In Proceedings of the ACM Conference on Document Processing Systems, Santa Fe, 5-9 December 1988. Pages 51-60. New York: Association for Computing Machinery, 1988. Abstract: The Standard Generalized Markup Language (SGML) is a notation for describing classes of structured documents and for coding documents belonging to described classes. An advantage of SGML and other grammar-based document representations is the ability to perform multiple applications on a single document source file. This paper describes the evolution of a software development tool for implementing such applications. It explains the original design as well as enhancements made during the system's first eighteen months. Although not statistically significant, data on the use of the enhanced features are presented. The experience described is relevant to other software engineering tools for text processing.


Price-Wilkin, John. "A Gateway Between the World-Wide Web and PAT: Exploiting SGML Through the Web." The Public-Access Computer Systems Review 5/7 (1994) 5-27. To obtain this document, use the following URL: gopher://info.lib.uh.edu:70/00/articles/e-journals/uhlibrary/pacsreview/v5/n7/pricewil.5n7. Or send the following e-mail message to listserv@uhupvm1.uh.edu: GET PRICEWIL PRV5N7 F=MAIL.

Abstract: The HyperText Markup Language (HTML) used by the World-Wide Web has limited markup and structure recognition capabilities. Only a small set of text characteristics can be represented, and few of these have any functional value beyond display capabilities. The HTML ANCHOR element supports hypertext links; however, it cannot retrieve components of a linked document, such as a single glossary entry from a collection of several thousand entries, without resorting to programs external to HTML and the Web server. In spite of these limitations, HTML and the Web are key technologies for libraries. The Standard Generalized Markup Language (SGML) is a full-featured, standard markup language. HTML is actually an SGML Document Type Definition. Ideally, it would be possible to retrieve text documents marked up with the richer SGML tag set via the World-Wide-Web. This technical paper discusses how the Web can be linked to the PAT system, Open Text's search engine that supports access to SGML-encoded documents. This Web-to-PAT Gateway utilizes the Web's Common Gateway Interface (CGI) capability and SGML-to-HTML filter programs. After briefly overviewing key technical concepts, the paper explains the operation of the Web-to-PAT Gateway, using several examples of how it is employed at the University of Virginia Libraries, including access to text files such as a Middle English collection, the Oxford English Dictionary, and the Text Encoding Initiative's Guidelines for Electronic Text Encoding and Interchange. [from the Introduction]


Raymond, Darrell R. "Flexible Text Display with Lector." IEEE Computer 25/8 (August 1992) 49-60. ISSN: 0018-9162. Author affiliation: Department of Computer Science, University of Waterloo. Published summary: "Lector provides flexible text interaction for X11 applications. It handles descriptively marked-up text and acts as a text previewer, database browser, code prettyprinter, or menu utility." Further note: The article supplies screen shots of Lector's display of the Oxford English Dictionary and the associated electronic stylesheet (style-specification file); both use SGML-style tagging. It also shows various styles for a csh man page, for prettyprinted C-code, and for text/graphic representation of pieces in a chess game. See further on the Lector application in Raymond's technical report.


Raymond, Darrell R. "Lector - An Interactive Formatter for Tagged Text." Technical Report OED-90-02. University of Waterloo Centre for the New Oxford English Dictionary and Text Research, August, 1990. 26 pages, 13 figures.

Abstract: Lector is an X.11 application that provides highly interactive text formatting. Unlike text previewers, Lector handles descriptively marked-up text, supports multiple styles, and interacts well with other programs, including other invocations of Lector. Appropriate selection of texts and styles enables Lector to act as a text previewer, database browser, code prettyprinter, menu utility, and iconic interface. Lector's implementation revolves around a set of tradeoffs involving efficiency, simplicity and generality. The result demonstrated the utility of generalized text display tools. Note: for further details on the Waterloo Centre, see Gonnet.


Raymond, Darrell R.; Tompa, Frank Wm.; Wood, Derick. "Markup Reconsidered." Department of Computer Science, Technical Report No. 356. The University of Western Ontario, 1993. ISBN: 0-7714-1504-4. 15 pages, 32 references. Also available as OED-93-01, UW Centre for the New Oxford English Dictionary, University of Waterloo (April 1993). A presentation under the same title was given at the First International Workshop on Principles of Document Processing, Washington, D.C. (October 21-23, 1992). An earlier version (unpublished) was written as "Markup Considered Harmful" and a related work was entitled [ ]. The UWO version of the paper is available via FTP to UWO; ftp://ftp.csd.uwo.ca/pub/csd-technical-reports/356/.

Abstract: We describe some of the implications of markup for document management systems. Markup's properties are inherited from text, since it is embedded in text. These properties are most advantageous when document structure is reducible to substrings of characters, and when the update characteristics of the structure are similar to the update characteristics of the text. We describe situations in which these characteristics are disadvantageous. Markup is not a data model, but one of several possible techniques for representing structure. For this reason it should not be the foundation of document management systems.


Reid, Brian K. "A High-Level Approach to Computer Document Formatting." Pages 24-31 in Conference Record of the Seventh Annual ACM Symposium on Principles of Programming Languages: Papers Presented at the Symposium, Las Vegas, Nevada, January 28-30, 1980. Sponsored by the Association for Computing Machinery, Special Interest Group on Automata and Computability Theory, Special Interest Group on Programming Languages. [Alternate main entry: ACM Symposium on Principles of Programming Languages (7th, 1980; Las Vegas, Nevada)]. New York/Baltimore: ACM, 1980. ISBN: 0897910117. vii, 261 pages. ACM order no. 549800. Author affiliation: Computer Science Department, Carnegie-Mellon University, Pittsburgh, PA.


Renear, Allen; Mylonas, Elli; Durand, David. "Refining our Notion of What Text Really Is: The Problem of Overlapping Hierarchies." Pages xx-xx [ca. 16 pages] in Research in Humanities Computing, edited by Nancy Ide and Susan Hockey. Oxford: Oxford University Press, 1993 [actual date? Draft version is dated January 6, 1993.]. 23 references. Authors' affiliation: Allen Renear, Brown University; Elli Mylonas, Harvard University; David Durand, Boston University. Note: An earlier version was presented at the annual joint meeting of the Association for Humanities Computing and the Association for Literary and Linguistic Computing, Oxford University, April 1992. An HTML version of the draft is available [get link from Allen].

Abstract: We examine the claim that 'text is an ordered hierarchy of content objects [OHCO]'; this thesis was affirmed by the authors, and others, in the late 1980s and has been associated with certain approaches to text processing and the encoding of literary texts. First we discuss the nature of this claim and its connection with the history of text processing and text encoding standardization projects such as SGML and the Text Encoding Initiative. We then describe how the experience of the text encoding community, as represented and codified in the TEI Guidelines, has raised difficulties for this thesis. Next we consider two progressively weaker versions of this thesis formulated in response to these difficulties. Ultimately we find that no version appears to be free from counterexample.

Although none of these formulations proves to be theoretically sound, they are nonetheless methodologically illuminating as each generalizes actual encoding practices, making explicit certain assumptions that, even though they have been fundamental to the working methodologies of most text encoding projects, have never been explicitly articulated, let alone explained or defended. The counterexamples to the different versions of the OHCO thesis also arise in actual encoding projects -- so although our focus is theoretical it is grounded in the methodology and problems of contemporary encoding practices. The problems discussed here have implications not only for text encoding and our understanding of the nature of textual communication, but raise very fundamental issues in the logic and methodology of the humanities.
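
Illustration (not from the paper): the kind of counterexample the abstract mentions can be sketched by treating two structural views of the same text as lists of character spans and looking for pairs that overlap without either containing the other. Such pairs cannot both be encoded as properly nested elements in one hierarchy. The span names, the offsets, and the function name crossing_pairs are all hypothetical, chosen only for this example.

```python
def crossing_pairs(spans_a, spans_b):
    """Return (name_a, name_b) pairs whose spans overlap without nesting.

    Each span is (name, start, end) with end exclusive; the offsets are
    character positions in one shared text.
    """
    crossings = []
    for name_a, start_a, end_a in spans_a:
        for name_b, start_b, end_b in spans_b:
            overlap = start_a < end_b and start_b < end_a
            a_in_b = start_b <= start_a and end_a <= end_b
            b_in_a = start_a <= start_b and end_b <= end_a
            if overlap and not (a_in_b or b_in_a):
                crossings.append((name_a, name_b))
    return crossings

# Hypothetical offsets: two verse lines, and a sentence that runs over
# the line break before a second, shorter sentence begins.
verse_lines = [("line1", 0, 20), ("line2", 20, 40)]
sentences = [("s1", 0, 30), ("s2", 30, 40)]

print(crossing_pairs(verse_lines, sentences))
```

Here the first sentence contains line1 and the second sentence sits inside line2, but the first sentence and line2 cross, so the metrical and the syntactic view cannot share a single tree.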


Reynolds, Louis R.; DeRose, Steven J. "Electronic Books: Hypertext Publishing Lets You Structure, Distribute, Retrieve and Annotate the Information You Need." Byte Magazine 17/6 (June 1992) 263-268. [In Byte special section "Managing Infoglut: How to Add Value to Your Data"] Authors' affiliation: Electronic Book Technologies.


Robinson, Brian; Wu, Gilbert. "Applications of SGML." <TAG> 19 (August 1991) 4-9. Note: Compare by the same authors "Applications of SGML," University Computing 14/2 (1992) 53-57 (ISSN: 0265-4385). Abstract: This paper does not dwell on the technicalities of the Standard Generalised Markup Language (SGML) but focuses on applications of SGML which are currently the subject of research and development contracts within the Information Technology Group or ERDC at the Hatfield Polytechnic. Detail is provided on one particular project undertaken for the Science and Engineering Research Council (SERC). Software has been developed which allows users to complete complex electronic forms on a standard Personal Computer in SGML format. The software is independent of the form structure which is defined in ASCII files using a powerful, compact purpose-designed language. Close control over all aspects of data capture including data integrity, virtual fields and online user help is supported. The completed forms are transmitted as electronic mail across a wide area network and processed automatically in a mainframe environment at SERC.


Robinson, Peter R. The Transcription of Primary Textual Sources Using SGML. Office for Humanities Communication Publications, No. 6. Oxford: Oxford University Computing Service, Office for Humanities Communication, 1994. vii+ 136 pages, references. ISBN: 1897791070. [needs abstract]


Rockley, Ann. "Ontario Hydro and SGML." Technical Communication: Journal of the Society for Technical Communication 40/3 (Third Quarter, August 1993) 383-386. ISSN: 0049-3155. Author affiliation: Information Design Solutions. [SGML case study]


Rubinsky, Yuri. "Standards for Hypertext Interchange." SGML Users' Group Newsletter 15 (January 1990) 14-15. For more on SGML applied to hypertext/hypermedia, see the standard and: (1) Yuri Rubinsky, "Standards for Hypertext Interchange Need Not Come out of Thin Air," <TAG> 11 (October 1989) 4-5; (2) Yuri Rubinsky, "Comments on an SGML Application for Hyper- and Multi-Media Interchange: Informal Report from the GCA Hypertext/Hypermedia Standards Forum," <TAG> 11 (October 1989) 5-6.


Sacks-Davis, Ron; Arnold-Moore, Timothy; Zobel, Justin. Database Systems for Structured Documents. Technical Report. Nara, Japan: International Symposium on Advanced Database Technologies and their Integration (ADTI'94), 1994. 13 pages, 33 references.


Scheller, Angela. "Document Standards: Availability and Products." Computer Networks and ISDN Systems 16/1-2 (September 1988) 138-142. CODEN: CNISE9. ISSN: 0169-7552. Abstract: With the growth in the spread of computer networks the demand by users for document interchange features is becoming increasingly apparent. The prerequisites for the realization of document interchange in a heterogeneous computer environment are internationally accepted standards for the description of documents. Already in early 1986, the Standard Generalized Markup Language SGML was published as an international standard for the structuring of documents. The publication of the Office Document Architecture ODA is expected in the course of 1988. The final text is already available. ODA was originally developed for the pure office environment, whereas the concept for SGML addressed the author/publisher environment. This fact is mirrored in the current pilot projects testing the standards: the manufacturers of office and word-processing systems mainly work with ODA, whereas in the technical scientific and publishing sectors SGML is often implemented. Users requiring an interface both to the office sector as well as to the publishing sector will therefore be confronted with the problems related to working with two different, only partially compatible standards.


Scheller, Angela. "Experience with SGML in the Real World: DAPHNE, a System Integrating Computer Graphics Metafiles into SGML Documents." In Document Exchange: The Use of SGML in the UK Academic and Research Community. Workshop Proceedings 5-7 March 1990, ed. Anne Mumford. Advisory Group on Computer Graphics, 1990. Abstract: DAPHNE is a document processing system implemented to support joint editing within the German Research Network DFN. It is based on two international standards in the area of document and graphics processing, the Standard Generalized Markup Language SGML and the Computer Graphics Metafile CGM. This paper presents the functionality offered by DAPHNE today as well as plans for future extensions. It also describes the experience gained with a distributed environment of commercial products for processing SGML documents in general and DAPHNE documents in particular.


Seaman, David M. "'A Library and Apparatus of Every Kind': The Electronic Text Center at the University of Virginia." Information Technology and Libraries 13/1 (March 1994) 15-19. 1 reference. Author affiliation: Coordinator of Electronic Texts, University of Virginia Library, Charlottesville, VA. Abstract: The Electronic Text Center at the University of Virginia combines an online archive of thousands of SGML-encoded electronic texts, all available through a single piece of search software, with a library-based center housing hardware and software suitable for the creation and analysis of text. Through ongoing training sessions and support of individual teaching and research projects, the Center is now building a diverse and expanding user community locally, and providing a potential model for similar enterprises at other institutions.


SGML Users' Group. "A Brief History of the Development of SGML." 3-June-1989. 2 pages. Available from the SGMLUG as a separate document, and in the SGML Users' Group Newsletter 14 (October 1989) 6-7, and (being free of copyright restrictions) elsewhere: (1) The SGML Handbook, cited here, Appendix A: pp. 567-570; (2) The SGML Source Guide, also cited; (3) Joan Smith's Book on SGML and Related Standards, Appendix 1.


Smith, Joan M. SGML and Related Standards. Document Description and Processing Languages. Ellis Horwood Series in Computers and their Applications. New York/London: Ellis Horwood, 1992. xviii + 152 pages. ISBN: 0-13-806506-3. The book supplies a valuable survey from the perspective of Joan Smith, who served as a leading SGML advocate in the UK for many years. Smith is an independent consultant, and founder of the International SGML Users' Group. See a publisher's description and the volume Table of Contents for an overview.


Smith, Joan M. SGML Products and Services. A document covering primarily CALS-SGML, produced by Joan Smith for the CALS in Europe SIG. Periodically updated. The cost is approximately 20 UK pounds. Contact: David Ardron, Secretary, CALS in Europe SIG; Ferranti Computer Systems Ltd.; Western Road, Bracknell, Berkshire RG12 1RA; UNITED KINGDOM; TEL: +44-344-483232.


Smith, Joan M. "The Standard Generalized Markup Language (SGML) for Humanities Publishing." Literary and Linguistic Computing 2/3 (1987) 171-175. ISSN: 0268-1145. Abstract: A new methodology, the core of which is generic coding, has been developed within the International Organization for Standardization (ISO). This is known as the Standard Generalized Markup Language (SGML). Using SGML, the elements of a document are marked up as to their role, be it a paragraph, an abstract, a note, or whatever; the style of presentation is a separate issue and is not addressed by SGML. These elements can form part of a data base, which can be updated at will. So there is the notion of data base publishing. The Standard Generalized Markup Language is presented as a tool for full-text data base publishing, where the options for output are open, an example being given as a marked up document. Its value for all aspects of humanities publishing is addressed: whether for scholarly papers intended for a journal, books, specialist publications, dictionaries, or biographies, indeed whatever is input to an electronic medium with the intention of being imaged subsequently in some form; whether alone, in part, or in combination with other text. SGML represents an advance in publishing methodology, taking advantage of developing technology. It can be exploited as such in an academic environment to give an added dimension to research publications.


Smith, Joan M. "Standard Generalized Markup Language and Related Standards." Computer Communications 12/2 (April 1989) 80-84. ISSN: 0140-3664. CODEN: COCOD7. Abstract: Projects developed by the International Organization for Standardization-International Electrotechnical Commission Joint Technical Committee 1-Subcommittee 18-Working Group 8 are described here, with the working group concentrating on the formulation of standards for text description and processing languages in the broader domain of text and office systems. Central to the work of WG 8 is ISO 8879 Standard Generalized Markup Language for the description of the information content of documents. Other standards and technical reports produced by the group support SGML in some way, either directly or indirectly. Their role in office publishing is described, and some information is given about office applications and the products that are available in the marketplace. Note: Joan Smith has contributed numerous articles covering (SGML) standards updates. E.g., see "Standards," Literary and Linguistic Computing 4/4 (1989) 294-296; "Standards," Literary and Linguistic Computing 4/1 (1989) 57-58; "Standards," Literary and Linguistic Computing 1/3 (1986) 191-192.


Smith, Joan M. The Standard Generalized Markup Language (SGML): Guidelines for Editors and Publishers. British National Bibliography Research Fund, 26. 1987. ISBN: 0-7123-3111-5. ISSN: 0264-2972. The abstract for Smith's "Authors" volume (see here) generally pertains to this document as well.


Smith, Joan M. The Standard Generalized Markup Language (SGML): Guidelines for Authors. British National Bibliography Research Fund, 27. 1987. ISBN: 0-7123-3112-3. ISSN: 0264-2972. Abstract: These guidelines are for authors of scholarly publications who wish to prepare documents for a publisher on existing text entry devices, word processors and personal computers, adding markup to the text in accordance with the Standard Generalized Markup Language (SGML). A simple approach is adopted, based on the concept of a starter set of tags. An explanation of SGML is given and why markup should be used, and advice provided on what is to be done if the author has a publisher, has not yet got a publisher, or is his or her own publisher. As far as the preparation of the document is concerned, there is advice on keying conventions, when not to use stylistic and formatting characteristics of the system, and conditions under which its features and facilities may be used. The starter set of tags is explained, and how to deal with lists, tables, and figures. Cross referencing is addressed and the preparation of an index -- all with examples. Information is given on how to extend the starter set and how to cope with text the author may not be able to mark up for any reason. How to deal with characters for printing, that cannot be imaged on the text entry device, is explained, also how to use abbreviations for lengthy character strings of a repetitive nature. For all other issues, the author is referred to the publisher, to the companion 'Guidelines for Editors and Publishers', and to the standard itself.


Smith, Joan M.; Stutely, Robert S. SGML: The Users' Guide to ISO 8879. Chichester/New York: Ellis Horwood/Halsted, 1988. 173 pages. ISBN: 0-7458-0221-4 (Ellis Horwood) and ISBN: 0-470-21126-1 (Halsted). LC CALL NO: QA76.73.S44 S44 1988. The book (1) supplies a list of some 200 syntax productions, in numerical and alphabetical sequence; (2) gives a combined abbreviation list; (3) includes highly useful subject indices to ISO 8879 and its annexes; (4) supplies graphic representations for the ISO 8879 character entities; (5) lists SGML keywords and reserved names. A more complete overview of the book may be found in the SGML Users' Group Newsletter 9 (August 1988) 9.


Smith, MacKenzie. "DynaText: An Electronic Publishing System." Computers and the Humanities 27/5-6 (1993-1994) 415-420. 10 references. [Review of Electronic Book Technologies' DynaText program for use in humanities computing.]


SoftQuad, Inc. The SGML Primer. SoftQuad's Quick Reference Guide to the Essentials of the Standard: The SGML Needed for Reading a DTD and Marked-up Documents and Discussing them Reasonably. Version 3.0. Toronto: SoftQuad Inc., December, 1991. Correction and revision of Version 2.0, May 1991. 36 pages. This Primer provides a highly readable and even enjoyable introduction to the essential concepts and features of SGML. It may be one of the best brief treatments of SGML you can find -- something you can lend to colleagues without fear of having them turned off by the unavoidable complexity of SGML. The book consciously attempts a popular presentation, using clever illustrations, some surprising examples (structured events in the world of cuisine art, recipe for a biblical mythology), and a bare minimum of technical language. It is available from SoftQuad Inc.; 56 Aberfoyle Crescent, Suite 810; Toronto, Ontario; Canada M8X 2W4; TEL: +1 (416) 239-4801; FAX: +1 (416) 239-7105.


SoftQuad, Inc. The SGML World Tour. CD-ROM. Toronto, Ontario: SoftQuad, Spring, 1994. ISBN: 1-896172-01-6. A large library of SGML resources on CDROM disk, which may be free. Tel: 1-800-387-2777 (1 416 239-7105)


Sperberg-McQueen, C. Michael. "Specifying Document Structure: Differences in LaTeX and TEI Markup." TUGboat 12/3 (= Proceedings of the 1991 Annual Meeting) 415-421 (also available as TEI document TEI EDW22, June 9, 1991).


Sperberg-McQueen, C. Michael. "The Standard Generalized Markup Language (SGML): A Brief Introduction." Proceedings of the American Society for Information Science = Proceedings of the ASIS Annual Meeting 30 (1993) 285. [Proceedings of the 56th Annual Meeting of the American Society for Information Science, October 24-28, 1993, Columbus, OH.] ISSN: 0044-7870.


Sperberg-McQueen, C. Michael. "Text in the Electronic Age: Textual Study and Text Encoding, with Examples from Medieval Texts." Literary and Linguistic Computing 6/1 (1991) 34-46. ISSN 0268-1145. Abstract: This paper discusses characteristic problems in designing methods of encoding texts in machine-readable form for textual study. Any electronic representation of a text embodies specific ideas of what is important in that text. A well-developed encoding scheme is thus in some sense a theory of the texts it is intended to mark up. This paper describes, with examples, the theory implicit in the Text Encoding Initiative (TEI), a project to develop guidelines for the encoding of machine-readable texts. Any machine-readable representation of texts must use markup, but no finite vocabulary of markup items can be complete, since neither the set of textual features worth marking nor the set of texts to be studied is finite. Any useful markup scheme must therefore be extensible. Additionally, a markup scheme must allow several discrete views of texts. Texts are both linguistic and physical objects. They have simultaneously a linear, a hierarchical and a directed-graph structure. They refer to objects in real or fictive universes. Texts, finally, are cultural and thus historical objects: a useful encoding scheme must be able to represent textual variation, parallel texts, and the gradual accretion of interpretation and commentary with which human culture adorns venerated texts.


Sperberg-McQueen, C. Michael; Goldstein, Robert F. "HTML to the Max: A Manifesto for Adding SGML Intelligence to the World-Wide Web." Presentation at WWW-2 '94. September 15, 1994. Link to the authoritative version of the document at UIC, or see a mirrored copy here.

Abstract: HTML demonstrates that SGML markup is useful for networked information. How can it be made even more useful? One way is to extend the tag set from HTML to HTML2, etc. We argue here for a more radical approach: full SGML awareness in WWW. We believe the difficulties are small, the cost affordable, and the advantages overwhelming. SGML is a metalanguage for defining markup languages; HTML is just one instance of this infinite family. At present, documents in other SGML document types must be translated into HTML for display by a Mosaic client; sometimes this imposes unacceptable information loss. WWW browsers could handle other SGML document types without translation by launching a general-purpose SGML browser to view them, as they now launch graphics viewers; a better solution overall would be to build SGML display into the WWW browsers themselves. Either way, display of an SGML document would be controlled by a style sheet using a small number of display primitives ('bold', 'line break', etc.) to specify the rendition of each element type. For 'well-known' document type definitions (DTDs) like HTML, style sheets could be distributed with the browser, or built in. For other DTDs, the browser would fetch a style sheet from the server. Using style sheets, browser software can also make it easy to customize document display. DTDs and style sheets can be designed to accommodate extensions, ensuring that authors can make small extensions to the tag set with no change whatsoever in the target browsers and virtually no performance penalty.


"Standard Generalized Markup Language (SGML; ISO/IEC 8879/1986)." Communications of the Association for Computing Machinery 34/11 (November 1991) 72-73. ISSN 0001-0782. Sidebar to the article on HyTime, by Steven R. Newcomb; see here. Abstract: The Standard Generalized Markup Language (SGML) is designed to describe documents in terms of their logical structure. SGML provides a meta-syntax for expressing agreed-upon syntaxes for individual document types, and for the syntax of the generic coding in the documents themselves. The language allows one document to appear transparently on dissimilar systems, even when those systems require distinct distribution methods among various files. Both private and public enterprises are turning to SGML as a general solution for their information-handling problems; SGML is amenable to certain kinds of processing, and all SGML documents can be validated by a single validating parser. The biggest commercial user of SGML today is perhaps the US Defense Department's Computer-aided Acquisition and Logistic Support Initiative.


Szillat, Horst. SGML - Eine praktische Einführung. Place: International Thomson Publishing GmbH, 1994. ISBN: 3-929821-75-3. 226 pages. Abstract [supplied by the author] [English] This German SGML book gives an introduction to SGML. The material is discussed by means of examples. In the second part of the book the author explains his ideas of what formatting an SGML document means and shows that these ideas can be realized with LaTeX. [German] Dieses SGML-Buch gibt eine Einführung in SGML. Das Material wird an Hand von Beispielen diskutiert. Im zweiten Teil des Buches erklärt der Autor seine Idee, was Formatierung eines SGML-Dokumentes bedeutet und zeigt, daß diese Ideen mit LaTeX realisiert werden können.


Tompa, Frank W. "What is (Tagged) Text?" In Dictionaries in the Electronic Age: Proceedings of the Fifth Annual Conference of the UW Centre for the New Oxford English Dictionary (18-19 September 1989, St. Catherine's College, Oxford). Volume 2. Pages 81-93. Waterloo, Ontario: UW Centre for the New OED, 1989. Note: for further details on the Waterloo Centre, see Gonnet.


Travis, Brian E.; Waldt, Dale C. The SGML Implementation Guide. Springer-Verlag, 1995. ISBN: [unknown] Approx 350 pages. See the [provisional] Table of Contents.

Author's abstract: This is the book the authors needed when they were first implementing SGML. At that time, and up until now, there has not been a complete source of information for the SGML implementor. We had to perform major research at every single phase of our implementation process using time-honored systems analysis techniques. While this approach worked, we would have gladly embraced any help we could have found.

The philosophy behind this book is to provide a pragmatic working knowledge of SGML and related disciplines and techniques needed to actually achieve a successful implementation.

The book is not a review of products, but it does mention some products as examples of what is available. It is not an executive briefing offering a high-level view of the advantages of implementing a structured approach to data, nor is it a nuts-and-bolts description of how to write SGML applications. Rather, it strikes a middle ground between those two extremes, offering the people who must make the decision to implement, and then the implementors, enough information to get well down the road to SGML.


Tucker, Hugh A. and Bogh, Torkil. SGML & ODA. Standards for Document Processing and Interchange. DS/INF 14, 1989. Dansk Standardiseringsrad, 1989. Book form of the technical report SGML/ODA: Standards for Document Processing and Interchange. See the summary and review in "New Book on SGML and ODA Published. SGML & ODA. Standards for Document Processing and Interchange. DS/INF 14, 1989," <TAG> 12 (December 1989) 17-18.


Turner, Ron; Douglass, Tim; Turner, Audrey. README FIRST: SGML for Writers and Editors. Charles F. Goldfarb Series On Open Information Management. Englewood Cliffs, NJ: PTR Prentice Hall, [forthcoming May] 1995. ISBN: 0-13-432717-9.

Summary: This is a non-technical introduction to SGML for writers and editors who need to work in an SGML environment. The focus is not on the technical details of the standard but rather on how writers and editors can benefit from and work effectively with SGML. Included with the book is a diskette that contains SGMLAB, a DOS-based SGML application that includes a parser and browser and numerous sample SGML documents. Using SGMLAB, readers can view on-line both the structure and output of SGML documents, and validate those documents. [publisher's pre-publication description]


Vignaud, Dominique. L'édition structurée des documents: SGML application à l'édition française. Paris: Éditions du Cercle de la Librairie, 1989. ISBN: 2-7654-0420-8. This volume was prepared to assist French publishers with application of the SGML standard. It supplies a basic DTD, and additional materials are available (including electronic files) for extending the DTD. The book is said to be the first volume in a series L'édition structurée des documents, published by Éditions du Cercle de la Librairie. For availability, contact the Syndicat national de l'édition (SNE) or: Éditions du Cercle de la Librairie, 35 rue Grégoire-de-Tours, 75006 Paris, France. Additional details: see "SGML: application à l'édition français," SGML Users' Group Newsletter 13 (August 1989) 9; Yuri Rubinsky's brief review, "Can Imaginative Objects Have Intentions?" <TAG> 10 (July 1989) 11; or "French Book DTD Available," <TAG> 9 (March/April 1989) 15. The book is similar in purpose to the American (EPSIG/AAP) volume "Standard for Electronic Manuscript Preparation and Markup" published by NISO, and to the British volumes written by Joan Smith: Smith and Smith. Whereas the EPSIG/AAP standard for electronic publishing defined some 220 tags, Vignaud's DTD deliberately defines only 60 tags.


Vooren, Ludo van; Severson, Eric C. "SGML Architectural Forms." <TAG> 5/2 (February 1992) 1-3. "A concept emerging from the HyTime Committee, called 'SGML Architectural Forms,' provides SGML users with a new tool for describing document semantics. The essence of the Architectural Forms idea is that it allows users to extend the attribute set for an element without doing violence to the basic processing, parsing and integrity of the DTD or associated document instances. Extending the attribute set allows users to express and preserve information that would otherwise require use of external files. The attraction of the approach is that it does not require use of new structures and processes; it uses the SGML parser and an extended form of the DTD to convey the desired information. . . . for hard-wired SGML applications that work with only one DTD, the SGML architecture approach provides a simple, low-cost way to connect to other SGML applications. The idea behind SGML architectural forms is to directly code the relationship between the SGML elements and target applications semantics in the DTD of the document instance to be converted. . . . An immediate application could address an area that has been overlooked by developers: SGML searches across multiple document types. For example, the user might want to find the chapter titles but does not know how these are tagged in the various DTDs. A search engine could look for architectural forms instead of tag names." (extract)


Vooren, Ludo van. "Implementing SGML: Where Do You Start?" <TAG> 13 (February 1990) 5-7. This contribution proposes implementing SGML in several stages: Document Analysis, Process Design, Document Type Declaration Writing, Document Preparation. Published in similar format in SGML Users' Group Newsletter 17 (August 1990) 5-7.


Walter, Mark. "OSU's Chameleon Architecture: A Grammatical Approach to Translation and DTDs." Seybold Report on Publishing Systems 20/7 (December 24, 1990) 17-23. Describes the approach taken by the Chameleon Research Group at the Department of Computer and Information Science at Ohio State University in building SGML translators and DTDs. See more on Chameleon sub Mamrak and O'Connell.


Warmer, Jos; Van Vliet, Hans. "Processing SGML Documents." Electronic Publishing: Origination, Dissemination and Design (EPOdd) 4/1 (March 1991) 3-26. Received 10 January 1990, revised 18 October 1990. ISSN: 0894-3982. Authors' affiliation: [Warmer] PTT Research, Dr. Neher Laboratories, Leidschendam, Netherlands; [Van Vliet] Faculteit Wiskunde en Informatica, Department of Mathematics and Computer Science, Vrije Universiteit, Amsterdam.

Abstract: SGML (Standard Generalized Markup Language) is an ISO standard that specifies a language for document representation. The main idea behind SGML is to strictly separate the structure and contents of a document from the processing of that document. This results in application-independent and thus reusable documents. To gain the full benefit of this approach, tools are needed to support a wide range of applications. The ISO Standard itself does not define how to specify the processing of SGML documents. Many existing SGML systems allow for a simple translation of an SGML document, which exhibits a 1-1 correspondence between elements in the SGML document and its translation. For many applications this does not suffice. In other systems, the processing can be expressed in a special-purpose programming language. In this paper the various approaches to processing SGML documents are assessed. We also discuss a novel approach, taken in the Amsterdam SGML Parser. In this approach, processing actions are embedded in the grammar rules that specify the document structure, much like processing actions are embedded in grammars of programming languages that are input to a parser generator. The Appendix contains an extended example of the use of this approach. [check]


Warmer, Jos; Egmond, Sylvia van. "The Implementation of the Amsterdam SGML Parser." Electronic Publishing: Origination, Dissemination and Design (EPOdd) 2/2 (July 1989) 65-90. ISSN: 0894-3982. Abstract: The Standard Generalized Markup Language (SGML) is an ISO Standard that specifies a language for document representation. This paper gives a short introduction to SGML and describes the (Vrije Universiteit) Amsterdam SGML Parser and the problems we encountered in implementing the Standard. These problems include the interpretation of the Standard in places where it is ambiguous and the technical problems in parsing SGML documents. Note: the "Amsterdam parser" is available electronically via Internet anonymous-FTP.


Warren, P. T. "SGML and Style Sheets: the Implications for Electronic Document Preparation." University Computing 9/2 (June 1987) 81-86. ISSN: 0265-4385. Author affiliation: Leicester University, England. Abstract: Standards have been a long time coming in the field of text processing and the recent publication of the standard generalized markup language starter set has attracted some interest. This is a generic mark-up system for the structural, as opposed to the presentational, features of documents. It can then be implemented on a variety of output devices according to the facilities available. Style sheets help enforce uniformity of style throughout a document, and across documents from different authors, by allowing the author to write without attention to formatting. The paper shows how the style sheet feature of a proprietary word processor may be configured to simulate most of the features of the SGML starter set.


Watson, Bradley C.; Davis, Robert J. "ODA and SGML: An Assessment of Co-existence Possibilities." Computer Standards and Interfaces 11 (1990-1991) 169-176. (8) references. ISSN: 0920-5489. Authors' affiliation: Online Computer Library Center [OCLC], Dublin, Ohio. [needs abstract]


Weitzman, Louis; Wittenburg, Kent. "Automatic Presentation of Multimedia Documents Using Relational Grammars." [To appear as] Pages xx-xx in Proceedings of ACM Multimedia '94 [San Francisco, CA, October 15-20, 1994]. New York: ACM, 1995. Abstract: This paper describes an approach to the automatic presentation of multimedia documents based on parsing and syntax-directed translation using Relational Grammars. This translation is followed by a constraint solving mechanism to create the final layout. Grammatical rules provide the mechanism for mapping from a representation of the content of a presentation to forms that specify the media objects to be realized. These realization forms include sets of spatial and temporal constraints between elements of the presentation. Individual grammars encapsulate the "look and feel" of a presentation and can be used as generators of that style. By making the grammars sensitive to the requirements of the output medium, parsing can introduce flexibility into the information realization process.


Wohler, Wayne L. "The DTD May Not Be Enough: SGML Declarations." <TAG> 5/10 (October 1992) 6-9. Part one of a three-part serialized article in <TAG>'s occasional tutorial series. The full text of this tutorial is available online. See Part 2 and Part 3. See the SGML Declaration main entry for other information. This first article covers: Introduction, Document Character Sets, Defining Character Sets, System Character Set, Using Character Sets from the Far East, Conclusion. Author affiliation: Wayne L. Wohler is an Advisory Engineer with Publishing Solutions, IBM Corporation [Boulder, Colorado], and represents IBM's SGML interests in various working groups.


Wohler, Wayne L. "The DTD May Not Be Enough: SGML Declarations." <TAG> 6/1 (January 1993) 1-7. Part two of a three-part serialized article in <TAG>'s occasional tutorial series. The full text of this tutorial is available online. See Part 1 and Part 3. See the SGML Declaration main entry for other information. This second article covers: Declaration of a Concrete Syntax, How is a Syntax Defined?, Defining the Concrete Syntax, Conclusion. Author affiliation: Wayne L. Wohler is an Advisory Engineer with Publishing Solutions, IBM Corporation [Boulder, Colorado], and represents IBM's SGML interests in various working groups.


Wohler, Wayne L. "The DTD May Not Be Enough: SGML Declarations." <TAG> 6/2 (February 1993) 1-6. Part three of a three-part serialized article in <TAG>'s occasional tutorial series. The full text of this tutorial is available online. See Part 1 and Part 2. See the SGML Declaration main entry for other information. This third article covers: Feature Usage Declaration, Application Specific Information, Using the Concrete Syntax Scope, Capacity Sets, Reference Capacity Set, A Few Final Notes, Putting It All Together. Author affiliation: Wayne L. Wohler is an Advisory Engineer with Publishing Solutions, IBM Corporation [Boulder, Colorado], and represents IBM's SGML interests in various working groups.


Wolfsthal, Y. "Style control in the Quill document editing system." Software -- Practice and Experience 21/6 (June 1991) 625-638. (14) references. Author affiliation: IBM Palo Alto Science Center, CA. Abstract: A critical problem in the design of editors for structured documents is that of style control, i.e. mapping the logical elements of the documents to their physical appearance on pages. This paper presents a novel approach to style control, used in the Quill document editing system that has been prototyped at the IBM Almaden Research Center. The style control mechanism is an integral part of the editing system and consistent with the overall system architecture, in both its inner structure and its user interface. Properties that specify the formatting process, together with action routines for specifying complex semantics, are the basic style control primitives in the proposed approach. See also on Quill in Chamberlin.


Wonneberger, Reinhard. "Approaching SGML from TeX." TUGboat 13/2 (July 1992) 226-227.


Wonneberger, Reinhard; Mittelbach, Frank. "SGML -- Questions and Answers." TUGboat 13/2 (July 1992) 221-223.


Wright, Haviland. "SGML Frees Information: Escape a World Where There is Too Much Data and Go to a Place Where You Can Access the Information Hidden Within It." Byte Magazine 17/6 (June 1992) 279-286. [In Byte special section "Managing Infoglut: How to Add Value to Your Data"] Author affiliation: Avalanche Development Corporation.


Wu, Gilbert. SGML Theory and Practice. British Library Research Paper 68. British Library Research and Development Department, 1989. ISSN: 0269-9257 No. 68. ISBN 0-7123-3211-1. 93 pages.


Document URI: http://xml.coverpages.org/sgmlbib0.html
Robin Cover, Editor: robin@oasis-open.org