JASIS Special Issue on SGML
The July 1997 issue of JASIS, guest edited by Elisabeth Logan and Marvin Pollard, was dedicated to the theme "Structured Information/Standards for Document Architectures." The online Table of Contents for JASIS 48/7 (July 1997) is available from ASIS, and ordering information for the volume is available from the Wiley WWW server: http://www.wiley.com/compbooks/compjournals/jasis.html. The collection of bibliographic entries below will be integrated into the main bibliographic reference database of the SGML Web Page.
Abstract: "The need for organizations and industries to increase the efficiency of using document information has led to the development and adoption of standards for document architectures. The use of networked computers to author, exchange, manipulate, store, retrieve, present, use, and re-use information has simultaneously created the possibility and the need for adopting standards for interchanging digital document information. Structured document information systems require the attention of producers and users of information today because growing document repositories are recognized as valuable information assets. Implementing standards-conforming, structured information systems increases the value of these document repositories, but doing so requires serious rethinking of the ways document information is produced, stored, and distributed. This Special Issue of JASIS addresses the standards of structured information and document architectures, the issues surrounding the implementation of these standards for organizations and persons working towards the goal of using document information more efficiently, and explores the future of structured document information systems."
See other details concerning the original call for papers and the significance of this special issue, appearing as the ninth in a series of several special topics issues of JASIS, following the announcement by Donald H. Kraft (April, 1992 issue of JASIS: "A Call to Action in Response to Happy Days," Editorial, Journal of the American Society for Information Science 43/3, April 1992, page 302).
A related version of Weibel's tribute to Yuri Rubinsky is available online. See also the larger collection of tributes to Yuri Rubinsky in the SGML Web Page, and the Yuri Rubinsky Insight Foundation, "dedicated to commemorating the genius of the late Yuri Rubinsky by bringing together workers from a broad spectrum of disciplines to stimulate research and development of technologies which will enhance access to information of all kinds."
See the predecessor to this article in French: "Pourquoi SGML? Pourquoi maintenant?" in Actes de la conférence "SGML et inforoutes; pour la diffusion optimale de l'information gouvernementale et juridique" organisée par le Centre de recherche en droit public de l'Université de Montréal et l'EBSI, CRDP, 1995, pages 55-69.
Abstract: "In 1995 and early 1996, the ISO standards process that includes SGML and related standards has seen a remarkable coalescence of efforts that should be beneficial to all of us. Most notably, DSSSL and HyTime are developing a shared approach to tree structures and query languages. A consequence of this may be the development of a set of general facilities that can be shared among all SGML-based standards and that, when incorporated into products, will make our documents easier to work with and more powerful in their ability to deliver information."
See also the ISO/IEC JTC1/SC18/WG8 Web Service, WWW server for 'Information Technology -- Document Processing and Related Communication.' James Mason is the Convenor for WG8.
Abstract: "DSSSL, the Document Style Semantics and Specification Language, is ISO/IEC 10179, an International Standard for the formatting and other processing of SGML documents. DSSSL was completed in January 1996 after eight (8) years of development. From its inception, DSSSL was conceived as a companion standard to SGML, where SGML is a language for standardizing the way we represent document structures without regard to form or presentation. It is possible to use SGML markup to represent formatting information, but this is discouraged, since doing so makes a document more difficult to reuse and reprocess. Reuse is generally a significant requirement for SGML data so it is not a good idea to 'pollute' your document with presentational markup. Yet formatting of some nature is still desirable, and sometimes critical, for all documents, and in some cases users want to interchange this formatting information (informally known in the industry as style sheets) in a standardized, non-proprietary format. DSSSL is key to enabling this interchange."
See also Anders Berglund and Sharon Adler ("ABCs of DSSSL") in the Conference Proceedings of SGML '95.
Abstract: "Defines a method of using the constructs defined by the HyTime standard (ISO/IEC 10744:1992) to both structure scholarly writing by capturing the abstract relationships within it and to affect its presentation in ways that express those relationships through the use of dynamic multimedia presentations. The design assumes that the data to be accessed comes from an essentially unbounded set of networked resources, rather than from a self-contained database. By using HyTime, the design separates the logical structuring and abstract functional definition of the system from any specifics of implementation, including details of data location and access, with the specific goal of enabling interchange of both structured source data and presentation specifications among disparate systems, or implementations of the same basic system, while also enabling the use of the data by other SGML or HyTime applications for other unanticipated uses."
Abstract: "The World Wide Web has had over 5 years of intensive development, and has expanded from a text-only technical documentation system to a multimedia information base distributed across the planet. Although its tool for structural definition, the Hypertext Markup Language (HTML), has been under constant development throughout this period, most browsers have been slow to take advantage of all the facilities it offers. At a time when there is much debate over the public future of the Web, it is in danger of partial stagnation. Despite significant innovations in some areas, the field is still open for software developers who are capable of harvesting the benefits of SGML, the language in which HTML is written. This analysis of HTML Document Type Descriptions (DTDs) reveals where some opportunities may lie."
Abstract: "The Text Encoding Initiative is an international collaboration aimed at producing a common encoding scheme for complex texts. The diversity of the texts used by members of the communities served by the project led to a large specification, but the specification is structured to facilitate understanding and use. The requirement for generality is sometimes in tension with the requirement to handle specialized text types. The texts that are encoded often can be viewed or interpreted in several different ways. While many electronic documents can be encoded in very simple ways, some documents and some users will tax the limits of any fixed scheme, so a flexible extensible encoding is required to support research and to facilitate the reuse of texts."
See also the bibliographic entry for Barnard and Ide, "The Text Encoding Initiative: Flexible and Extensible Document Encoding," Technical Report 96-396, Kingston, Ontario, Department of Computing and Information Science, Queen's University. December 1995. This version is available in PostScript format on the Internet.
Complete information on the Text Encoding Initiative is accessible via the main entry in the SGML Web Page, or on the TEI Web Site.
Abstract: "A method for augmenting an SGML document repository with database functionality is presented. SGML (ISO 8879:1986) has been widely accepted as a standard language for writing text with added structural information that gives the text greater applicability. Recently there has been a trend to use this structural information as meta-data in databases. The complex structure of documents, however, makes it difficult to directly map the structural information in documents to database structures. In particular, the flat nature of relational databases makes it extremely difficult to model documents that are inherently hierarchical in nature. Consequently, documents are modeled in object-oriented databases (Abiteboul, Cluet, & Milo, 1993), and object-relational databases (Hoist, 1995), in which SGML documents are mapped into the corresponding database models and are later reconstructed as necessary. However, this mapping strategy is not natural and can potentially cause loss of information in the original SGML documents. Moreover, interfaces for building queries for current document databases are mostly built on form-based query techniques and do not use the 'look and feel' of the documents. This article introduces an implementation method for a complex-object modeling technique specifically for SGML documents and describes interface techniques tailored for text databases. Some of the concepts for a Structured Document Database Management System (SDDBMS) specifically designed for SGML documents are described. A small survey of some current products is also presented to demonstrate the need for such a system."
A PostScript version of the article is available online (also, online abstract); [local archive copy].
Abstract: "SGML is billed as a key to making your data vendor-independent. 'Freedom!' is a rallying cry of the SGML community. Inspired, you migrate your data to SGML, only to discover that important clients and business partners still want it in the format of their favorite word processor, WWW browser, or publishing system and they expect you to translate it for them. How will you translate your data from SGML to other formats? In this article, we discuss several solutions to this translation problem. Along the way, we visit some key features and concepts of tools that address this problem, and we relate the problem to the DSSSL standard. Finally, we investigate the translation problem and the roles of SGML and DSSSL in the context of digital libraries."
Abstract: "A great deal of the collective knowledge of organizations is stored in documents. To be able to use documents effectively, the information structure in the documents should be carefully planned. International standards, for example SGML, have been developed for defining document structures. The definition method however is not enough. For defining effective document standards for an organization, a profound document analysis is needed. In the analysis, current documents and document management practices should be studied and described before developing new document structures and document management practices. The development of a methodology for document analysis is going on in a project studying legislative documents produced in the Finnish government and parliament. The article describes the first results of the project. As the document structure definition method, SGML is used in the project. The analysis method is developed and extended from an object-oriented method. The article introduces the main phases of the analysis: Domain definition, object modeling, state modeling, and content modeling. The application of the methodology in the case project and the data gathering methods used are also described."
Abstract: "This article is a commentary -- over a quarter-century after the fact -- on the first published paper to summit the need for (and hint at the existence of) what is now the Standard Generalized Markup Language. It was presented at the 33rd Annual Meeting of the American Society for Information Science in Philadelphia, October 15, 1970, and published in Volume 7 of the ASIS Proceedings. The editors of this Special Issue of JASIS felt that that meeting was worth remembering here because of its hitherto unpublicized connection with the origin of SGML. In addition, it is also worth remembering because of its closing banquet, which featured an erudite and witty speech by a professor with two doctorates, a piece balalaika orchestra, the entire Philadelphia Mummers band (replete with banjos, saxophones, and feathered headdresses), and a middle-eastern belly dancer who worked on the table tops! I've spoken at some hundred conferences since then and none of them has even come close."
See the bibliographic entry for the original publication: Goldfarb, Charles F.; Mosher, E. J.; Peterson, T. I. "An Online System for Integrated Text Processing." Proceedings of the American Society for Information Science Volume 7 (1970) 147-150.