Bibliography: SGML '96 Conference Proceedings. Celebrating a Decade of SGML
The bibliographic records in the following document have been created for the volume SGML '96 Conference Proceedings. Celebrating a Decade of SGML. As described in the first record (for the proceedings volume itself): each bibliographic entry includes the published abstract, author contact information, an indication of the "track" in which the presentation was delivered, and additional annotations or relevant hypertext links. The published abstracts for the papers, in many cases, are considerably more detailed than the brief abstracts which accompany the online conference program. This bibliographic information is being moved into the bibliographic database of the SGML/XML Web Page [now complete, March 13, 1997]. As a convenience to readers, it will also be retained online as a single document. Kindly report any errors by electronic mail. Authors are also invited to communicate about updates, revisions, retractions, new URLs, etc.
The SGML '96 Conference celebrated a decade of SGML, reckoned from the first publication of SGML as an ISO standard in 1986. The seventy-eight (78) published papers in the proceedings volume are divided into seven major sections, and represent a majority of the eighty-five (85) papers read at the conference. The collection not only documents an impressive milestone for the ISO 8879 standard, but serves as a valuable resource for SGML users. The SGML '96 conference itself was attended by over 1400 people, and included more than 120 speakers, and 100+ poster sessions in addition to conference sessions and exhibits.
Introductory essays in the proceedings volume are from the conference Co-Chairs B. Tommie Usdin and Deborah A. Lapeyre, and from Charles F. Goldfarb ("The Roots of SGML - A Personal Recollection"). The full inventory of published papers includes: Introductions (3 papers); Newcomer (11 papers), User (21 papers), Expert (16 papers), Business Management (5 papers), Case Studies (16 papers), "And More" (6 papers). The volume has complete title and author indexes. It was produced directly from the SGML source (based upon the "GCAPAPER" DTD) using ArborText's ADEPT Series SGML software.
Most of the published conference papers are referenced (by author) in the online bibliography of the SGML/XML Web Page. Each bibliographic entry includes the published abstract, author contact information, an indication of the "track" in which the presentation was delivered, and additional annotations or relevant hypertext links. The published abstracts for the papers, in many cases, are considerably more detailed than the brief abstracts which accompany the online conference program. The SGML '96 Conference Proceedings volume containing the full text of the papers may be obtained from GCA. GCA may also be reached at: GCA Publications, 100 Daingerfield Rd, Alexandria, VA 22314-2888 USA.
Abstract: "This talk looks at different approaches to introducing SGML, at different perceptions of the language and related technology, and at the changing nature of the audience for SGML. It is for those who are just being introduced to SGML and for those who must now make the case for SGML within their organization or industry."
Note: The above presentation was part of the "SGML Business Management" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.
Abstract: "The paper introduces a new initiative for SGML in the medical informatics industry. It describes the current state of information processing in medicine, gives some of the requirements for a new, SGML-based approach to medical information processing, introduces the group working for the introduction of SGML into medical informatics and gives a brief description of the umbrella medical information standard called HL7 under which the new initiative is working. The paper concludes with a summary of the challenges facing the new initiative and an invitation to all to participate and contribute. Up-to-date information on contacts and programs will be available at the conference session."
Further information on the SGML Initiative in Health Care (HL7 Health Level-7 and SGML) can be found in the main entry of the SGML Web Page.
Note: The above presentation was part of the "SGML User" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.
Abstract: "The annual SGML Conference provides the opportunity to focus on technology, expand our level of knowledge, exchange ideas and experiences with others in similar or related environments. Once the week is concluded, however, we are challenged to sustain the momentum that's been attained. This does not mean that one should only wait for the next year's conference; much can be done in the interim to continue the pursuit of knowledge and exchange of information. Many communities have organized local forums, specifically designed to address these concerns. This talk will focus on some of the major issues in establishing and maintaining such an organization."
Another paper discussing the role and operation of SGML user groups was presented at SGML '96 by Holly Smith.
Note: The above presentation was part of the "And More..." track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.
Abstract: "This presentation looks at using multiple DTDs for different stages in the life of a given piece of information, and examines the issues that should be taken into account when designing DTDs for a given application and deciding just how many DTDs are required.
A number of different models (e.g., a single DTD for the entire process; one DTD for authoring, another for storage, another for output, etc.) are examined, and the pros and cons for each are discussed. These considerations include the costs for each model (cost of maintaining multiple DTDs as well as the transform filters placed between them, versus the inefficiency of authoring with a single huge DTD), as well as the question of 'roll your own' versus using industry-standard DTDs."
Note: The above presentation was part of the "SGML User" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.
Abstract: "Several studies have tried to address the topic of Object Orientation around SGML.
The question asked was too simple and dichotomic; the answer given far too simple 'yes' or 'no'. The SGML application aspect, that is not covered by the standard, was not considered when searching for commonalities.
This paper intends to show that some application architectures coupled with an SGML parser offer an object mechanism with embedded SGML.
The relation between the parsed tokens and the application methods shows that application objects are connected to parsing objects in a simple and efficient paradigm which fully conforms to the LINK feature of the SGML language.
Adopting this view of an SGML application makes all the facilities offered by the LINK feature suddenly self-evident and useful."
Note: The above presentation was part of the "SGML Expert" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.
Session Abstract: "The DSSSL (Document Style Semantics and Specification Language) Online session will consist of a 45 minute orientation session followed by two or more hours of interactive discussion and a demonstration of Jade, a DSSSL engine. Since the basic motivation behind dsssl-o is the application of semantics to generic SGML documents served out over the Internet, some time will be spent reviewing the case for SGML on the Web and the need for semantic specification methods beyond those being currently developed for HTML before presenting the Application Profile itself.
It is assumed, but not required, that session participants will have already gained some familiarity with the DSSSL standard. The DSSSL tutorial on Sunday, November 17, is highly recommended for persons planning to attend the DSSSL Online workshop."
Further information on DSSSL Online may be found: (1) in the DSSSL entry of the SGML Web Page, or (2) on the SGML Open Web site ("The Case for DSSSL Online," by Jon Bosak).
Note: The above presentation was part of the "And More..." track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.
Abstract: "There are three major components to an SGML Document - the SGML Declaration, Prolog and Document Instance. An understanding of their roles, their inter-dependencies, and their arrangement within a practical working environment is essential for all users of SGML based systems. As well as describing the purpose and content of each major component of an SGML Document, this paper explains how they are managed by an entity manager, and how they integrate with a parser."
Note: The above presentation was part of the "SGML Newcomer" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.
Abstract: "Extensible Markup Language (XML for short) is being designed under the auspices of the World Wide Web Consortium; the larger goal of this effort is 'to enable future Web user agents to receive and process generic SGML in the way that they are now able to receive and process HTML. As in the case of HTML, the implementation of SGML on the Web will require attention not just to structure and content (the domain of SGML per se) but also to link semantics and display semantics.' [from the W3C 'Activity' Page] As a subgoal, we are creating an SGML application profile, XML, that is designed to provide many of the benefits of SGML in a lightweight, easy-to-use, easy-to-implement dialect that omits many of the difficult or problematic features of the full standard. This paper is an interim report on the progress of the work on creating an XML specification. This work is proceeding rapidly and we anticipate a draft of the specification being available at the time of SGML '96."
Further information on XML is available in the main XML entry of the SGML Web Page.
Note: The above presentation was part of the "SGML Expert" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.
Abstract: "A variety of SGML authoring and editing tools exist on the market today and new ones are being added all the time. Initially, there seemed to be the need for only one type of tool but as a result of market need there are now a number of different 'flavors' each best suited for a particular SGML application.
This session will discuss the role of SGML authoring within a total publishing system. It will also describe the various types of tools available today for editing and authoring and what broad category each fits into in terms of its 'flavor'. A list of all known authoring and editing tools will be provided."
Note: The above presentation was part of the "SGML Newcomer" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.
Abstract: "The British National Corpus (BNC) is a rather large SGML document, comprising some 4124 samples taken from a rich variety of contemporary British English texts of every kind, written and printed, famous and obscure, learned and ignorant, spoken and written. Each of its hundred million words and six and a quarter million sentences is tagged explicitly in SGML and carries an automatically-generated linguistic analysis. Each sample carries a TEI-conformant header, containing detailed contextual and descriptive information, as well as more conventional SGML mark-up.
The corpus was created over a four year period by a consortium of leading dictionary publishers and academic research centres in the UK, with substantial funding from the British Department of Trade and Industry, the Science and Engineering Research Council, and the British Library. It is currently available under licence within the European Union only, where it is increasingly used in linguistic research and lexicography, in applications ranging from the construction of state of the art language-recognition systems, to the teaching of English as a second language.
This paper begins by describing how the corpus was constructed, and gives an overview of some of the SGML encoding issues raised during the process. A description of the special purpose SGML aware retrieval system developed to analyse the corpus is also provided."
See a longer abstract [mirror copy], and an online version of the SGML '96 presentation: Using SGML for Linguistic Analysis: the case of the BNC [mirror copy, pis aller, but see the canonical source if possible].
Note: The above presentation was part of the "SGML Expert" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.
Abstract: "Currently, most mathematics DTDs in widespread use are presentation-based, that is the markup relates to the layout of the mathematics on the page or screen rather than to the mathematical content. Such an approach makes the interchange between different SGML applications, and between SGML applications and computational applications, very difficult. This paper proposes a semantics-based DTD for mathematics, and describes a mechanism for selection of the particular branch of maths in use and extension of the DTD to cover areas of maths not as yet covered. Issues related to presentation, and the implications for applications, are discussed. Examples of possible mappings between the DTD and notations used by a typical computational program are given.
The meeting of the ISO 12083 committee in Munich in May 1996 accepted the proposal as the basis for the Mathematics fragment of the coming revision of the 12083 Standard. The paper reviews the issues raised and the resulting implications for the Mathematics fragment.
Significant progress has been made since the Munich meeting. The DTD has evolved following comments and test cases sent to the authors. Contacts with other interested organisations, such as the OpenMath consortium and the W3 HTML mathematics group, have been pursued."
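As a rough illustration of why semantics-based markup eases interchange with computational programs, the following Python sketch maps one small semantic tree to two target notations. It is not drawn from the paper: the element names (`power`, `plus`, `var`, `num`) are hypothetical and do not come from the proposed DTD, and XML syntax parsed with the standard library stands in for SGML.

```python
import xml.etree.ElementTree as ET

# Hypothetical semantic markup for (x + 1)^2; element names are illustrative
# only and are not taken from the DTD proposed in the paper.
SEMANTIC = "<power><plus><var>x</var><num>1</num></plus><num>2</num></power>"

# Assumed mapping from semantic element names to infix operators.
INFIX = {"plus": "+", "times": "*", "power": "^"}


def to_infix(node):
    """Map a semantic element tree to a fully parenthesized infix string."""
    if node.tag in ("var", "num"):
        return node.text
    operator = INFIX[node.tag]
    operands = [to_infix(child) for child in node]
    return "(" + (" %s " % operator).join(operands) + ")"


def to_prefix(node):
    """Map the same tree to a Lisp-like prefix notation."""
    if node.tag in ("var", "num"):
        return node.text
    return "(%s %s)" % (node.tag, " ".join(to_prefix(child) for child in node))


tree = ET.fromstring(SEMANTIC)
print(to_infix(tree))
print(to_prefix(tree))
```

Because the markup records the operation rather than the layout, both target notations are derived from the same source without guessing at what a superscript means.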
Further information on SGML markup for maths may be found in the main SGML-Math entry of the SGML Web Page; see also the entry for the EUROMATH Project.
Note: The above presentation was part of the "SGML Expert" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.
Abstract: "Generation of SGML-coded documents as a result of database query processes is a commonly used practice. In most cases, however, the contents of such documents are entirely built from scratch as an SGML-formatted image of the query results. We present an extension to this practice, in cases when documents are made of a combination of human-generated parts and database originated parts. When such documents are updated, human-generated parts should remain untouched, while database originated parts (text, tables and graphics) should be regenerated or updated.
The method used here is that of SGML templates, which embed links targeted to a database. Such a technique can be used in many application fields, ranging from Web applications to industrial catalog publishing, where complex, human-generated document structures coexist with database extracts."
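The template idea in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `<dbdata>` element, its `query` attribute, and the `parts` table are hypothetical names chosen for illustration, and an in-memory SQLite database stands in for the real data source. Only the content of database-originated elements is regenerated; human-written text is left untouched.

```python
import re
import sqlite3

# Hypothetical template: <dbdata> marks a database-originated part; everything
# else is human-written and must survive regeneration unchanged.
TEMPLATE = (
    "<para>Hand-written introduction.</para>\n"
    '<dbdata query="SELECT name FROM parts ORDER BY name">stale content</dbdata>\n'
    "<para>Hand-written conclusion.</para>"
)

# Matches a whole <dbdata> element and captures its tags, query, and content.
DBDATA = re.compile(r'(<dbdata query="([^"]*)">)(.*?)(</dbdata>)', re.DOTALL)


def regenerate(document, conn):
    """Re-run each embedded query and replace only the element content."""
    def refresh(match):
        open_tag, query, _old, close_tag = match.groups()
        rows = conn.execute(query).fetchall()
        items = "".join("<item>%s</item>" % row[0] for row in rows)
        return open_tag + items + close_tag
    return DBDATA.sub(refresh, document)


# Stand-in database with two rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parts (name TEXT)")
conn.executemany("INSERT INTO parts VALUES (?)", [("bolt",), ("nut",)])

updated = regenerate(TEMPLATE, conn)
print(updated)
```

Because the query stays embedded in the template, the same document can be regenerated whenever the database changes, and running the update twice against unchanged data yields the same result.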
The document is available online in HTML format: http://www.balise.com/current/articles/chahun.htm; [mirror copy].
Note: The above presentation was part of the "SGML Expert" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.
Abstract: "Organizational decisionmaking patterns determine SGML investment strategies and potential benefits. A framework for understanding the primary policy objectives that can influence the selection of SGML (inherent policy effects) and application design (user-defined policy goals) will be presented. Competing and often contradictory goals and perceptions of value often make the development of a business case for SGML very difficult. Methods for integrating stakeholder principles, interests, and expectations in the early stages of application conceptualization and design will increase real and perceived benefits and de-fuse potential political problems before they develop."
See the bibliography entry for a related article by Kurt Conrad, "SGML, HyTime, and Organic Information Management Models."
Note: The above presentation was part of the "SGML Business Management" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.
Abstract: "This paper/presentation is an update of the one which was delivered at SGML'95. It is intended to be a general introduction to the issues and concepts involved in the selection of software tools for the electronic delivery and retrieval of SGML (Standard Generalized Markup Language) documents. In addition, some of the issues unique to publishing to CD-ROM or via the World Wide Web will be explored."
A similar version of this paper is available online: "Tools for Implementing SGML-Based Information Systems: Viewers and Browsers, Text Retrieval Engines, and CD-ROMs," based on a paper which was presented at SGML'95, December 4-7, 1995 and published in the conference proceedings. URLs: http://www.3-cities.com/~conrad/delivery.htm, [mirror copy].
Note: The above presentation was part of the "SGML Newcomer" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.
Abstract: "The joint Air Transport Association/Aerospace Industries Assn (ATA/AIA) Graphics Working Group has developed a specification for Intelligent Graphics (IGEXCHANGE) to support the interchange of graphical application structures containing information which is non-graphical in nature. This paper will cover the development of industry requirements for intelligent graphics, describe Amendment 2 to the Computer Graphics Metafile (CGM) Standard developed to support application structuring of graphics, and describe the ATA industry profile of that standard. In addition, the use of SGML syntax to describe attributes associated with application structures will be discussed."
See: ATA profile -- ATA Specification 2100 Graphics Exchange, and EPCES relationship to ATA 2100 for other information on the ATA profile.
Note: The above presentation was part of the "SGML User" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.
Abstract: "Consleg Interleaf is an example of an SGML application that is used in a production environment. On a daily basis, operators use the application in order to provide lawyers from the European Community with the most accurate information on the existing legislation. As such, it is an application that illustrates how the SGML concepts can be applied in order to obtain a sophisticated document handling system."
Note: The above presentation was part of the "SGML User" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.
Abstract: "There has been much discussion as well as work accomplished worldwide regarding the adoption of SGML as a methodology for the development of information standards in the pharmaceutical industry. This paper describes an example of how SGML-based tools that exist today were used to produce a complete Supplemental New Drug Submission for the Health Protection Branch, Health Canada. The submission was SGML browser-based, running on a Windows 3.1 PC. The system allowed the reviewer to navigate and comment electronically on all the textual documentation, clinical data and Case Record Form images required for the submission, and compiled all comments and relevant information collected during the review process for use in the reviewer's report. Summary tables were linked to the underlying clinical data from the browser so that tables could be verified, the underlying database queries modified and analysis redone as the document was reviewed.
A paper-based submission was made simultaneously to Health Canada to satisfy legal requirements. The electronic version used the same SGML-based instance as the paper, ensuring a one-to-one correspondence between the paper and the electronic data. This made possible, for example, the generation of HyTime hyperlinks for the table of contents and other cross-references required for navigation of the electronic version without any additional authoring or manual markup. The relative ease with which the source documents were taken from the authoring to the publishing phase greatly facilitated the incorporation of late changes to the submission resulting from electronic and manual in-house review.
It was concluded that the adoption of electronic document management and review techniques early in the submission development process greatly enhanced the quality of the final document which also eliminated unnecessary review delays at the government agency due to missing or inaccurate information. Although a government approved DTD was not available for this project, it is clear that the ability to be able to parse a document before submitting it for review is enough justification alone for using SGML in the standardization of information of this type."
Note: The above presentation was part of the "SGML Case Studies" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.
Abstract: "FORMEX (FORMALIZED EXCHANGE) is one of the very first initiatives that adopted the SGML notation. Initially designed around the UNESCO CCF standard (COMMON COMMUNICATION FORMAT), the original FORMEX specification (1986) and its first revision FORMEX V2 supported both notations. This year, the Office for Official Publications of the European Communities EUR-OP released a new version of FORMEX V3 which incorporates more than ten years of experience in the SGML field. FORMEX V3 is based exclusively on the SGML notation and SDIF is the communication standard encapsulating the exchange of data. Though the FORMEX specification is able to support any kind of document, it has a specific target: Legal Publications. The set of tags exhibited in FORMEX V3 is highly semantic and can be combined into a wide variety of legal publications doctypes. FORMEX V3 is the basic mechanism of the EUR-OP editorial work and information exchange. The global workplace is articulated around specialised workshops, handling production, housekeeping, consolidation of law, etc. The consistency of the system is a reference database which links the different workshops logically for document production, archiving and distribution. Whenever required, images are embedded in the SGML tagging. The SGML-structure information is distributed via different media and can be targeted for different users. Workshops for authoring, translating, editing and proof-reading, indexing and cataloging, etc. can be specific systems; some high co-operative workshops are connected to more than 500 workstations. EUR-OP releases the FORMEX V3 specification as a PUBLIC tagging scheme that can be shared by many EUROPEAN and non-EUROPEAN legal and governmental publishers."
Note: The above presentation was part of the "SGML User" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.
Abstract: "Free SGML software is available to create document instances and SGML processing applications, as well as to analyze complex DTDs. This article describes the origin and use of SGMLC-Lite, Near and Far Lite, the PSGML add-in to the Emacs text editor, the NSGMLS parser, Earl Hood's perlSGML tools, and the sgmls.pl and SGMLS.pm perl application development tools. This paper is excerpted from the book "SGML for Free," available soon from the Prentice-Hall Charles F. Goldfarb Series on Open Information Management."
Bob DuCharme maintains an online resource entitled "DBMS Support of SGML Files." It includes "information collected about database systems that present themselves as reasonable solutions for storing SGML data." See: http://cs.nyu.edu/cs_alumni/duchar96/sgmldbms.html
Note: The above presentation was part of the "SGML User" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.
Abstract: "We consider the syntax and semantics of the TL (Transformation Language) in the DSSSL (Document Style Semantics and Specification Language) specification (DSSSL96). At present TEs (Transformation Expressions) are less than first-class language objects - they must all reside at the top level, and cannot be manipulated like other DSSSL/Scheme objects. In particular, there is no means of passing information among TEs, so one TE cannot take advantage of information derived by another, such as passing data about parent nodes to direct the transformation of child nodes. We propose extending the DSSSL syntax to allow a DSSSL program to better exploit the tree-like nature of the source grove by providing a semantics for nesting query expressions, allowing information to be passed around while retaining DSSSL's functional nature. The TEs would also come closer to being first-class objects. We suggest these extensions will make DSSSL programs easier to write and probably easier to optimize."
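The general idea of letting data about a parent node direct the transformation of its children can be sketched in Python. This is not DSSSL and not the syntax the authors propose: the tuple-based tree, the `transform` function, and the renaming rule are hypothetical stand-ins chosen only to show context flowing down a tree while the transformation stays purely functional.

```python
# Each node is a (name, children) tuple; a tiny stand-in for a source grove.
SOURCE = ("book", [("chapter", [("title", []), ("para", [])]),
                   ("appendix", [("title", [])])])


def transform(node, inherited=None):
    """Transform a node, handing data derived from it down to its children."""
    name, children = node
    # Information derived at this node that child transformations may use.
    context = {"parent": inherited["name"] if inherited else None, "name": name}
    # Illustrative rule: a title is renamed according to its parent element.
    if name == "title" and context["parent"] is not None:
        out_name = context["parent"] + "-title"
    else:
        out_name = name
    # No state is mutated; the result is built from return values only.
    return (out_name, [transform(child, context) for child in children])


result = transform(SOURCE)
print(result)
```

Here the nesting of the recursive calls mirrors the nesting of the tree, which is the property the abstract argues top-level-only expressions cannot exploit.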
An online version of the presentation is available: "Why Isn't DSSSL a Tree?", in SGML format; [mirror copy]
Note: The above presentation was part of the "SGML Expert" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.
Abstract: "In this paper we report the use of SGML for the documentation of highly structured engineering data in the telecommunication area. These structures are built by using a method, called Macro Modeling Technique. Macro Modeling Technique provides means for structuring the information about complex technical domains in a most unambiguous and nonredundant way. Models built by using Macro Modeling Technique are highly modular and can be refined and aggregated without overlap. The models also allow very precise access to engineering information because of their elaborated detailed structures.
It was a challenge to use the SGML language to map structures of the Macro Models onto document structures and support certain operations on a model within a document. For this purpose we have defined an unambiguous mapping from our models to content-oriented DTDs. We have developed a systematic approach to construct specifically tailored DTDs by combining parts of various model-based DTDs.
We have successfully applied this approach to the documentation for large systems in the telecommunication area and we implemented a prototype version of the required operations."
Note: The above presentation was part of the "SGML Case Studies" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.
Abstract: "During development of our first-generation online documentation conversion and delivery system, we addressed most of the obvious problems and requirements we foresaw. After the system was in place, we discovered other less obvious areas for improvement. We implemented the changes in a second-generation system and are planning additional changes in a third-generation system. This paper addresses the plans and realities of each of these systems."
Note: The above presentation was part of the "SGML User" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.
Abstract: "Lately it seems that everyone is talking about HTML. Some of you SGML '96 attendees may believe that this hot topic has nothing to do with SGML. Some of you may believe it has everything to do with SGML. And the rest of you may not be sure whether it's relevant or not.
Whether you consider HTML as a critical element of your information delivery strategy or not, you are probably reacting to .... thinking about ... being asked to put your SGML content on an intranet. This brings up many challenges: how to track revisions, how to manage relationships and links between objects, how to reuse information effectively and efficiently, and how to retain your investment without transforming to HTML.
Getting the most out of your SGML source means exploiting your investment by using that source as the same source for your intranet delivery needs. There's a big payoff in combining HTML, SGML, document component management and internet technologies to achieve a diversity of document products, increase quality of customer service, and ensure accuracy and timeliness. Imagine automatically assembling pieces of information that exactly match a customer's need, and delivering the most up-to-date information in the form and format requested. Achieving this is possible today.
To help you achieve this 'jackpot' of capabilities, this presentation will:
- describe the need and business case for intranets
- identify a roadmap for exploiting SGML
- list key capabilities of such a system
- identify key technologies that should be integrated
This presentation, aimed at a managerial audience, will examine the aspects, value and impact of several real-world intranet applications. It will describe the relevant technologies and offer guidance on enabling your current technology investment to drive this new type of information delivery. It will also discuss critical features and functions of such a system. You will leave this presentation with a deep understanding of how to build a complete information delivery strategy."
Note: The above presentation was part of the "SGML Business Management" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.
This presentation was the text of a keynote address at SGML '96, and is printed in the Introductory section of the proceedings volume.
From the Conclusion: "I like to think of the history of SGML as - what else - a tree structure. One root - from Rice to GML to my basic SGML invention - joined at the base of the trunk by the other - Tunnicliffe to Scharpf and GenCode. The trunk, of course, is the extraordinary 8-year effort to develop ISO 8879, involving hundreds of people from all over the world. The products and tools that came after are the branches, the many applications the leaves, and they are all still growing.
And in all these 30 years, while the technologies of both computers and publishing have undergone overwhelming and unpredictable changes, the tree continues to bear the fruit that I described in 1971:
The principle of separating document description from application function makes it possible to describe the attributes common to all documents of the same type. . . [The] availability of such 'type descriptions' could add new function to the text processing system. Programs could supply markup for an incomplete document, or interactively prompt a user in the entry of a document by displaying the markup. A generalized markup language then, would permit full information about a document to be preserved, regardless of the way the document is used or represented."
Note: The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.
Abstract: "There are many hypertext authoring tools available for specific outputs. For example, software that enables HTML or Windows Help authoring. These tools provide easy to use solutions for specific outputs, but they lack the benefit of a tailored, structured environment, and of course they do not allow the creation of multiple outputs from raw content stored as SGML -- a requirement we have at Novell.
However, these tools provide distinct advantages to the author that an SGML-based authoring system should strongly consider. To ignore these capabilities is to risk the SGML system being unusable, or incapable of handling large hypertext projects. These advantages center around the management of small information objects we call topics, and the links between them that are inherent in hypertext systems. To combine the power of SGML with the advantages of off-the-shelf authoring tools, Novell has developed a hybrid, named HelpWise. Novell's goal with HelpWise is to leverage the benefits of a structured SGML authoring system, and retain the link management that is crucial while creating hypertext documentation."
Note: The above presentation was part of the "SGML Case Studies" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.
Abstract: "Using simple examples concentrating on five characters from an exotic character set, the author shows techniques for describing a document's character set in the SGML Declaration and how different document character sets are treated by the parser. The presentation concludes with examples of how the techniques are used in real life."
Available online in HTML format: "Document Character Sets by Example", by Tony Graham, Consultant, Mulberry Technologies, Inc.
Note: The above presentation was part of the "And More..." track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.
In the category of "Free SGML Transformation Tools" are free software packages "for transforming an SGML instance into something else, be that another SGML instance or a file in some other format." Graham discusses "the criteria for selecting an SGML transformation processing tool."
Available online in HTML format: "Free SGML Transformation Tools", by Tony Graham, Consultant, Mulberry Technologies, Inc. [local archive copy]
Note: The above presentation was part of the "SGML User" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.
Abstract: "Book publishing is a conservative industry that relies on a tried-and-true process, characterized by a strong division between 'editorial' functions (obtaining and preparing manuscript) and 'production' functions (turning manuscript into printed books), a division commonly known as 'the wall'. SGML has been relegated to the production side in most implementations. While there is much to be gained here, this limited approach also involves a considerable sacrifice of potential benefit. This paper presents a blue-print for maximizing the benefits of SGML in a commercial book-publishing setting by showing how SGML can be leveraged on both sides of the wall, with consideration of practical implications for both process modification and the implementation of technology.
"The proposed approach for taking full advantage of SGML in a publishing setting involves the mark-up of manuscript not at the stage when it is traditionally keyboarded for typesetting, within production, but at the intake stage. Because there is no way to enforce author compliance with an SGML authoring strategy, it must be handled on submission by an 'intake unit' that is under the control of the editorial departments. This association allows those responsible for DTD creation and initial tagging of manuscript to be in direct contact with those whose job it is to dictate the structure of the documents, and who are most familiar with its content.
"Further, this connection makes the editors in charge of decisions regarding repurposing (electronic versions of existing titles on Web or CD-ROM) and reuse (ancillaries, subsequent editions) directly aware of the potential of SGML to help ease the costs of these (often low-profit) publications. If properly implemented, they also avoid the need to learn the more arcane and unfamiliar aspects of SGML; they can rely on their own staff (NOT answerable to the head of production) to supply them with the necessary technical guidance. The 'structured manuscript' allows the automation of repetitive and labor-intensive tasks in the development process, while making sample material readily available for delivery in print or on the Web for early promotion efforts and expert review.
"By the time the manuscript passes to production, many time-consuming production chores (typecoding, identification of ambiguous structural elements, consistency checks) have already been performed. The editorial departments are brought into closer touch with the realities of scheduling (a constant bone of contention between editorial and production arms), while the production department can now create the printed book at a much accelerated rate, again through automated processes enabled by SGML.
"The introduction of the SGML 'intake unit' into what is traditionally a non-technical branch of a publishing company could be a difficult change to implement; through the proper use of conversion and authoring technology for both initial tagging and subsequent development (and with appropriately designed document types), many of these challenges can be overcome. The gains realized in giving the power of SGML to those who can best make use of it will also help to enable the success of this tricky aspect of implementation.
"The antagonism between editorial and production units within a commercial publishing company has many negative effects. The proper implementation of SGML in this setting could actually help to ease these antagonisms, by adjusting the responsibilities and power that accompany the use of this technology. At the same time, such an implementation would allow publishers to realize the full promise of SGML, in terms of reuse, repurposing, and in faster time-to-market, not just in the final phases of book publication, but throughout the publication process."
See the bibliographic entry for a related article by Arofan Gregory: "Commercial Book Publishing and Author Control."
Note: The above presentation was part of the "SGML User" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.
Abstract: "Success of legacy conversion might be the single most important determinant of your organization's success in a move towards an SGML environment. It can also be the single most costly aspect of the project. This session's goal will be to dispel the myths. We will present an overview of the key issues and illustrate them with real-life experience. We will discuss: keying vs. OCR vs. software conversion; what software can really accomplish; what you can expect in quality and how you measure it; what a 'ballpark' quote includes and what it doesn't; and how to improve the probability of success.
Data Conversion Laboratory prepares data and text for CD-ROM and Web publishing. Going beyond conversion, DCL specializes in enhancing your legacy documents to meet the new demands of SGML, HTML, PDF, and other structured formats. The company supports all major electronic source formats as well as paper and microfilm."
Note: The above presentation was part of the "SGML Newcomer" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.
Abstract: "Information is the raw material from which information products are produced. Nowadays, new information products are needed, including CD-ROM, online databases, World Wide Web pages, and electronic browsers, in addition to printed documents, which impacts production processes. The reasons why SGML is ideal for supporting multiple outputs are discussed. Because of the many process changes involved, it is important to cost justify your SGML project. The three keys to a successful cost justification proposal are: 1) understanding your company's goals, 2) understanding your contribution, and 3) understanding your readers. Return on investment and cost/benefit analysis approaches to a cost justification proposal are discussed. Some formulas for associating cost savings with some tangible SGML benefits are presented."
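The abstract above mentions return-on-investment and cost/benefit approaches but does not reproduce the paper's formulas. As a rough illustration only, the sketch below uses the textbook definitions of simple ROI and payback period; the function names and all dollar figures are hypothetical and are not taken from the paper.

```python
def return_on_investment(total_cost, annual_savings, years):
    """Simple ROI: net benefit over the period divided by the investment."""
    total_benefit = annual_savings * years
    return (total_benefit - total_cost) / total_cost


def payback_period_years(total_cost, annual_savings):
    """Years of savings needed to recover the initial investment."""
    return total_cost / annual_savings


# Hypothetical figures: a 200,000 investment that saves 80,000 per year.
roi = return_on_investment(total_cost=200_000, annual_savings=80_000, years=5)
payback = payback_period_years(total_cost=200_000, annual_savings=80_000)
print(f"ROI over 5 years: {roi:.0%}, payback after {payback} years")
```

With these made-up inputs the investment is recovered after 2.5 years and returns 100% over five years; a real proposal would substitute measured costs and savings.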
Note: The above presentation was part of the "SGML Business Management" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.
Abstract: "The Electronic Publishing Solutions department at Northern Telecom (Nortel) transformed product and price publications from paper to electronic media within a short period of time. Electronic publishing radically improved Nortel's ability to control document quality and reduce information time-to-market. This department incorporated many significant production changes, such as:
- The use of Standard Generalized Markup Language (SGML)
- The sourcing of information directly from legacy and new product and price databases
- The distribution of documents in multiple forms, including CD-ROM, Nortel's Intranet, and paper across multiple systems and platforms
Nortel's previous publication production methods required the use of word processors to replicate and edit large product documents. Document publication was dependent on manual entry via word processors across several departments. Data entry errors and constantly shifting page layout due to changes, updates, and deletions created a vicious cycle of self-generated re-work and ever expanding schedules. Generally, information accuracy and update timeliness prevented consistent publication and use of resultant publications.
Publication is now produced directly from an SQL database source using SGML with embedded SQL statements. Both the source and the resultant documents are true SGML documents compliant to ISO 8879 standards. These SGML documents were created without modification of the legacy database. Replacing the existing database structure was not an option because it would have required re-engineering all of the existing processes that use the database. However, by using an internally developed toolset that expands SGML with embedded SQL statements, Nortel is able to produce SGML documents from legacy databases. These embedded SQL queries produce variable-length documents on-the-fly for printing or for display by the common Internet or CD-ROM browser.
Today, using an Internet or CD-ROM browser, Nortel's marketing and production engineers, sales support staff, distribution managers, and external distributors and customers can immediately access accurate product and price information. In addition, on-line access enables users to query and generate live reports dynamically from legacy information so that they can further target desired information. Information is kept up-to-date in an Automated Price Action application that is accessible on the Internet. Product adjustments are introduced for approval via this Internet service, and once approved, changes to product and price databases become instantaneously available for use. Although paper publishing is still required, Nortel anticipates substantial savings in time, labor, and cost by using SGML in a unique way."
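The abstract above does not describe the syntax of Nortel's internally developed toolset. The following is only a minimal sketch of the general idea of expanding queries embedded in a marked-up template: the "sql" processing-instruction convention, the table and column names, and the sample data are all invented for illustration, and values are inserted without any escaping.

```python
import re
import sqlite3

# Hypothetical template: the "<?sql ... ?>" marker is an invented convention,
# not the syntax of the toolset described in the abstract.
TEMPLATE = """<pricelist>
<?sql SELECT name, price FROM products ORDER BY name ?>
</pricelist>"""

SQL_PI = re.compile(r"<\?sql\s+(.*?)\s*\?>", re.DOTALL)


def expand(template, conn):
    """Replace each embedded query with one <row> element per result record."""

    def run_query(match):
        cursor = conn.execute(match.group(1))
        columns = [description[0] for description in cursor.description]
        rows = []
        for record in cursor:
            # One child element per column, named after the column.
            cells = "".join(f"<{c}>{v}</{c}>" for c, v in zip(columns, record))
            rows.append(f"<row>{cells}</row>")
        return "\n".join(rows)

    return SQL_PI.sub(run_query, template)


def demo():
    """Build a throwaway in-memory database and expand the template against it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE products (name TEXT, price REAL)")
    conn.executemany(
        "INSERT INTO products VALUES (?, ?)",
        [("switch", 1200.0), ("handset", 45.5)],
    )
    return expand(TEMPLATE, conn)


print(demo())
```

Because the query runs at expansion time, the length of the output document follows the current contents of the database, which is the property the abstract emphasizes.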
Note: The above presentation was part of the "SGML Case Studies" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.
Abstract: "SGML, which is used for document interchange among various environments, is a metalanguage for describing documents. Before marking up a document, we need to prepare a DTD that defines a document structure.
In general, a DTD applicable to diverse document classes is incompatible with a DTD focusing on the semantic features of documents. If the number of DTDs grows, the costs of developing application programs for the DTDs would also skyrocket.
To apply a DTD focusing on the semantic features to diverse document classes, we developed a system which, from a base generic DTD, derives a different DTD for each document class. Our system also has a function that translates derived DTD instances to base DTD instances. This function frees us from the burden of developing application programs separately for each of the derived DTDs."
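The abstract above does not say how derived-DTD instances are translated back to base-DTD instances. One simple way to picture such a translation is an element-name mapping, sketched below. The element names, the "role" attribute, and the use of well-formed XML-style markup as a stand-in for SGML are all assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from element types in a derived (class-specific) DTD
# to the element types of the generic base DTD.
DERIVED_TO_BASE = {
    "memo": "document",
    "subject": "title",
    "body": "section",
}


def to_base(derived_text):
    """Rename derived elements to base names, keeping the old name as a role."""
    root = ET.fromstring(derived_text)
    for element in root.iter():
        base_name = DERIVED_TO_BASE.get(element.tag)
        if base_name is not None:
            # Preserve the semantic (derived) name so no information is lost.
            element.set("role", element.tag)
            element.tag = base_name
    return ET.tostring(root, encoding="unicode")


derived = "<memo><subject>Budget</subject><body><p>Text</p></body></memo>"
print(to_base(derived))
```

An application written once against the base element names can then process instances of any derived document class after this translation step.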
Note: The above presentation was part of the "SGML Case Studies" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.
Abstract: "Implementing SGML can be an enormous task. To be successful, an implementor must have a good technical background in SGML and must have a clear understanding of data flow and SGML system functionality. Gaining an understanding of the key components of an SGML system is critical. This afternoon's presentations are designed to provide the SGML newcomer with an overview of the major classes of SGML tools and a brief review of the products commercially available today. Presenters for this session are independent SGML consultants who specialize in the design and implementation of SGML-based information systems."
Note: The above presentation was part of the "SGML Newcomer" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.
Abstract: "Information access for people with disabilities is creating numerous opportunities and challenges within the SGML (Standard Generalized Markup Language) community. Additionally, as a result of the increasing paradigm shift by the publishing industry toward Internet and WWW-based document delivery systems, the importance of producing accessible information using SGML mechanisms has increased immeasurably.
The primary focus of this paper involves the production of electronic documents. However, the key principles involved in the design, production, and delivery of information apply regardless of the document medium.
In this showcase the presenters will: identify major problems in information and software design that deny access, demonstrate successful products that can be used by people with disabilities to access publications, and point to resources that assist developers in creating accessible products in the future. The goals of the showcase are to educate participants about accessible electronic text delivery systems, and direct participants toward resources which help them create or choose accessible products."
Note: The above presentation was part of the "SGML Newcomer" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.
Abstract: "This paper discusses the issues of SGML re-use and shows why they can only be solved generally through the use of subdocuments. The paper explores the following general issues:
- General text entities are not re-usable
- How to enable interoperation of documents with possibly different document types?
- How to effect the cross-document addressing needed when a single document is composed of many subdocuments?
The SGML standard only defines two object types that can have independent existence: documents and subdocuments. Thus it is clear that only documents and subdocuments can be reliably re-used. In particular, external general text entities are not useful candidates for general re-use. My plea then is for tools to add the functions necessary to support the use of subdocuments for the re-use of semantic fragments. For most applications, such as browsers, this means treating the content of subdocument entities as though it had occurred in a general text entity for the purpose of processing (not parsing). For parsers, it means providing a mechanism to either parse multiple documents in parallel or to suspend the parsing of the parent document while the subdocument is parsed and then integrating the parsing result of the subdocument with the data resulting from the parsing of the parent document. For editors, it means allowing the declaration and editing of subdocument entities. Editors, in particular, may also need to provide ways to define constraints on what document types or architectures are to be allowed for subdocuments in specific application environments (families of DTDs).
I think that these conventions provide a clear and simple way to make the use of subdocuments in general less problematic and more fruitful. The full promise of SGML cannot be realized until the problem of fragment re-use is solved and I am firmly convinced that subdocuments are the key to that solution."
See the online version of the paper: "Re-Usable SGML: Why I Demand SUBDOC", SGML '96 presentation by W. Eliot Kimber of ISOGEN International Corp.; [mirror copy]. An SGML version is also accessible via the ISOGEN server, as well as a package containing HyBrowse styles and instructions for using HyBrowse.
Note: The above presentation was part of the "SGML Expert" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.
Abstract: "Thompson Legal Publishing has re-engineered aging SGML-based systems to meet current needs. Tools were chosen from solid companies that did not expose the SGML to users, did not restrict the use of SGML in any way, that have the capacity to emulate structure and that have APIs. Users now work in an environment that does not force them to place thirty elements/attributes in the data to enter one judicial case citation. Instead, a couple of clicks of the mouse, and in goes the case cite. Our savings in output processing have been enormous; a process that used to cost $18.00 per page now costs $0.95 per page. The system's simplicity from the user's point of view will be demonstrated, and the complexity of the data created and the resulting flexible output will be shown."
Note: The above presentation was part of the "SGML Case Studies" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.
Abstract: "SGML is the logical choice for encoding electronic documents, and Virginia Tech encourages (and will later require) students to submit Electronic Theses and Dissertations (ETDs) in SGML. Our DTD must work with translators as well as be usable for students preparing SGML directly. A usability test for tagging ETDs according to our DTD involved teaching SGML-novice graduate students to code using our DTD, observing them tagging their own documents, and having them narrate their thoughts during the process. Our results show that subjects require high-quality system documentation (replete with examples of correct usage), that learning to author the simplest hypermedia in SGML is inherently nonintuitive, and that our line-edited, batch-processed ETD formatting system is easy to use.
This work was funded in part by the Southeastern Universities Research Association (SURA) 1996 project, 'Development and Beta Testing of the Monticello Electronic Library Thesis and Dissertation Program'."
More detailed information on the Electronic Theses and Dissertations project may be found at: http://etd.vt.edu/etd/. See especially the brief project description [mirror copy], and a related write-up in the September 1996 issue of D-Lib Magazine [mirror copy, December 1996]
Note: The above presentation was part of the "SGML Expert" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.
Abstract: "Ten years after SGML was adopted as an international standard, more organizations than ever before are investigating its possibilities. The reason is simple. The problems addressed by Total Quality Management in the manufacturing and general service industries are magnified enormously in knowledge work and are much more difficult to address. Accessibility and reusability of information are important, and so are the relevance and applicability of information in a particular problem-solving context. Redundant knowledge creation and information rework waste organizational effort and dollars and have a profoundly negative effect on programs, processes, and systems. To combat redundancy and rework, organizations are seeking solutions in standard tools and standard data representations."
Note: The above presentation was part of the "SGML Business Management" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.
Abstract: "Technical and Management Services Corporation (TAMSCO) and Warner Robins Air Logistics Center/LB/LU Directorate recently began a cooperative effort to develop a more efficient way to manage the data for the C-130 flight manuals. WR ALC/LB/LU recognized the tremendous cost and inefficiencies in managing the existing C-130 data. With the assistance of TAMSCO, this cooperative effort is currently reengineering the existing process for creating, distributing, accessing, and reusing the technical information. By using Standard Generalized Markup Language (SGML), this effort will realize the ability to store and reuse technical procedures more efficiently. The SGML data will be accessible to the end users through an electronic information base both digitally and hard-copy. Using SGML and the AF Standards will bring many benefits and lower maintenance costs. The future success of the USAF's C-130 Technical Manual program depends on how effectively and efficiently the existing data is identified, maintained, managed, and used."
Note: The above presentation was part of the "SGML Case Studies" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.
Abstract: "Two approaches are available for specifying transformation processes on SGML documents: a declarative approach, based on context-sensitive rules triggered on SGML parsing events, and a procedural approach, based on explicit manipulation of the document tree."
"This paper shows that each approach is optimal for a certain class of problems, but that both are actually needed and that maximum expressive power is achieved when both can be combined in a same program."
The document is available online in HTML format: http://www.balise.com/current/articles/lecluse.htm; [mirror copy].
An alternative source for information presented in this paper is the Proceedings of SGML Finland '96; see the paper by François Chahuneau, "Event driven or Tree Manipulation Approaches to SGML Transformation - You Should Not Have to Choose."
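To make the two approaches named in the abstract concrete, here is a small sketch that contrasts them. It is not taken from the paper: it uses Python's standard XML libraries on a well-formed sample as a stand-in for an SGML parser, and the sample document, the class name, and both transformation tasks are invented for illustration.

```python
import xml.sax
import xml.etree.ElementTree as ET

SAMPLE = "<doc><title>Intro</title><para>First</para><para>Second</para></doc>"


class TitleUppercaser(xml.sax.ContentHandler):
    """Event-driven style: rules fire as parsing events arrive."""

    def __init__(self):
        super().__init__()
        self.stack = []  # names of the currently open elements (the context)
        self.out = []    # output fragments, emitted in document order

    def startElement(self, name, attrs):
        self.stack.append(name)

    def endElement(self, name):
        if name in ("title", "para"):
            self.out.append("\n")  # end each block-level element with a newline
        self.stack.pop()

    def characters(self, content):
        # Context-sensitive rule: text directly inside <title> is uppercased.
        if self.stack and self.stack[-1] == "title":
            self.out.append(content.upper())
        else:
            self.out.append(content)


def event_driven(text):
    """Transform in a single pass without building a tree."""
    handler = TitleUppercaser()
    xml.sax.parseString(text.encode("utf-8"), handler)
    return "".join(handler.out)


def tree_based(text):
    """Tree-manipulation style: reordering needs the whole document in memory."""
    root = ET.fromstring(text)
    paras = root.findall("para")
    for para in paras:
        root.remove(para)
    for para in reversed(paras):
        root.append(para)
    return ET.tostring(root, encoding="unicode")


print(event_driven(SAMPLE))
print(tree_based(SAMPLE))
```

The event-driven rule handles a local, context-dependent rewrite in streaming fashion, while reversing the paragraph order is awkward without explicit access to the tree; this is the kind of complementarity the abstract argues for.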
Note: The above presentation was part of the "SGML Expert" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.
Abstract: "Maintaining large amounts of SGML data in separate files on a file system has always been a difficult proposition. Trying to coordinate a distributed workgroup environment is even more difficult. Simple mechanisms such as ID and IDREF can become a nightmare on even small projects. A database environment offers many exciting possibilities for features such as version control, sharing, validation, and distribution. The challenge is to develop a system that is capable of accepting any SGML document and flexible enough to support many different SGML database applications."
Note: The above presentation was part of the "SGML Case Studies" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.
Abstract: "Developing SGML applications involves making choices driven by end user requirements and by the availability and functionality of third party SGML parsers, authoring tools, search engines, browsers, and data converters. Capabilities of HTML and the World Wide Web should factor into these decisions as well if users are geographically dispersed or have diverse computing platforms. SGML application developers typically build some or all of the following components: a DTD; legacy data conversion tools; a DTD-tailored authoring environment; a document repository; browsing and searching interfaces; and tools for producing formatted output. For each component, we discuss design and implementation alternatives, the approach we decided to use in building our SGML environment for authoring and accessing STEP product data exchange standards, and our rationale for choosing that approach."
More information on SGML and STEP (ISO 10303 Standard for the Exchange of Product Data) is available in the dedicated entry of the SGML Web Page.
Note: The above presentation was part of the "SGML Case Studies" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA. See the NIST server for an online version of the document.
Abstract: "This talk will describe how the US National Security Agency, the Central Intelligence Agency, the Defense Intelligence Agency, the National Reconnaissance Office and other top agencies that collectively are known as the United States Intelligence Community are significantly improving their intelligence gathering and reporting operations through the development and implementation of advanced technology including networking concepts and international information standards such as SGML.
The central focus of this talk will be a description and discussion of Intelink, the classified, world wide 'Intranet' for the Intelligence Community. Intelink, and the Intelink community address one of the world's largest data management problems, involving demanding requirements that are at the extreme of what normal enterprises require.
Intelink is now operational for a broad base of intelligence customers and consumers from the warfighter to the White House. Intelink is currently being used in support of several basic and key functional areas. Perhaps the most significant of these areas is the electronic publishing and distribution of our nation's intelligence reports. This talk will discuss how our 'Signals Intelligence' (SIGINT) Reports have gone from the world of reports in only ASCII text to robust multimedia formats with distribution, using SGML, over Intelink. The talk will also address other key functional areas including analytical research, collaboration facilities, and training.
The talk will address several of the unique problems, concerns, challenges and special features that distinguish Intelink from other Intranet applications. These issues include networking; architecture and standards; analyst collaboration issues; and finally encryption and other security considerations that are unique to this special environment.
The talk also will provide specific examples of Intelink SGML applications in several agencies within the US Intelligence Community. These examples will present insights into the issues, problems, and solutions for organizations desiring to take advantage of emerging technology allowing them to realize tangible cost savings as well as to enjoy significantly improved capabilities.
The talk will conclude with an examination of the future for Intelink, including plans for enhanced analyst collaboration, security boundaries/access control, and an improved Graphical User Interface."
Note: The above presentation was part of the "SGML Case Studies" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.

