The OASIS Cover Pages: The Online Resource for Markup Language Technologies
Last modified: November 16, 2000
SGML/XML Bibliography Part 7, S - Z

[CR: 19961009]

Sabasteanski, Anna. "Use of the Electronic Manuscript Standard at the New England Journal of Medicine." EPSIG News 2/1 (March 1989) 1-2. ISSN: 1042-3737. Author's affiliation: Medical Publishing Group, Massachusetts Medical Society.

The New England Journal of Medicine is a key publication of the Medical Publishing Group, Massachusetts Medical Society, and it plays a part in the Society's adoption of SGML-based publishing technologies. Canonical database files using SGML encoding are used to produce different versions of the journal for CDROM, paper, and online access. The AAP's SGML DTD is the basis for the information structuring in the knowledgebase.

See the entry for Elsevier Scientific Publishers for other information, or the bibliographic entry for the Elsevier DTD documentation.



[CR: 19960812]

Sacks-Davis, R.; Arnold-Moore, T.; Zobel, J. "Database Systems for Structured Documents." IEICE Transactions on Information and Systems E78-D/11 (November 1995) 1335-1342 (with 26 references). Authors' affiliation: Collaborative Information Technology Research Institute (CITRI), Carlton, Victoria, Australia. Contact: Ron Sacks-Davis.

"Abstract: Documents stored in a database system can have complex internal structure described by languages such as SGML. How to take advantage of this structure presents challenges for database system implementers. We classify the types of queries that need to be supported by SGML conformant database systems. We then describe several data models that have been proposed for representing documents in a database system and discuss the support these models provide for SGML. Finally we consider query evaluation."

For further information on SGML-related research at RMIT/CITRI, see the main entry for RMIT - MDS.



Sacks-Davis, Ron; Arnold-Moore, Timothy; Zobel, Justin. Database Systems for Structured Documents. Technical Report. [Prepared for] International Symposium on Advanced Database Technologies and their Integration (ADTI'94), 1994. Nara, Japan: [PUBLISHER?], 1994. 13 pages, 33 references.



Said, Carolyn; McManus, Neil. "SGML Standard Will Star in Boston Seybold Show." MacWeek 7/15 (April 12, 1993) 1,124. ISSN: 0736-7260.

Note on the prominence of SGML publishing technologies at the Seybold Seminars '93 trade show.



[CR: 19970817]

Salminen, Airi; Kauppinen, Katri; Lehtovaara, Merja. "Towards a Methodology for Document Analysis." Pages 644-655 (with 24 references) in Structured Information/Standards for Document Architectures. Edited by Elisabeth Logan and Marvin Pollard. = Journal of the American Society for Information Science, Special Issue. Volume 48, Number 7 (July 1997). New York: John Wiley & Sons Inc., 1997. ISSN: 0002-8231. Authors' affiliation: Department of Computer Science and Information Systems, University of Jyväskylä, P.O. Box 35, FIN-40351, Jyväskylä, Finland. Email: airi@cs.jyu.fi.

Abstract: "A great deal of the collective knowledge of organizations is stored in documents. To be able to use documents effectively, the information structure in the documents should be carefully planned. International standards, for example SGML, have been developed for defining document structures. The definition method however is not enough. For defining effective document standards for an organization, a profound document analysis is needed. In the analysis, current documents and document management practices should be studied and described before developing new document structures and document management practices. The development of a methodology for document analysis is going on in a project studying legislative documents produced in the Finnish government and parliament. The article describes the first results of the project. As the document structure definition method, SGML is used in the project. The analysis method is developed and extended from an object-oriented method. The article introduces the main phases of the analysis: Domain definition, object modeling, state modeling, and content modeling. The application of the methodology in the case project and the data gathering methods used are also described."

See the main document entry for the complete list of articles and contributors, as well as other bibliographic information.



[CR: 19951110]

Salminen, Airi; Tompa, Frank Wm. "PAT Expressions: An Algebra for Text Search." Acta Linguistica Hungarica 41/1-4 (1992-1993) 277-306 (with 25 references). Authors' affiliation: [Salminen] Department of Computer Science, University of Jyväskylä; [Tompa] Department of Computer Science, University of Waterloo.

Summary: "Text search operations are used to locate and retrieve needed information from some text collection. In traditional information retrieval, text search is a means for identifying relevant documents. By specifying selection criteria for the text of a document, the reader can choose a subset of a given set of documents. If the text collection is defined not as a set of documents, but more generally as a structure containing some parts, then text search involves the specification of those parts of interest to the reader.

The structure of the documents may be determined by the search system, by the author, by the text installer, or by the reader. In the PAT (TM) system, text search operations are expressions that efficiently combine traditional search capabilities with some new, powerful facilities. PAT contains means for lexical search, proximity search, contextual search and Boolean search. It also contains more rare operation types, including position and frequency search. Furthermore, a novel feature in PAT is the capability by which a reader can define structures for a text and use these structures in subsequent operations. One of the goals of this paper is to introduce the powerful search capabilities of PAT expressions.

Text search is usually considered so simple that only a rough description of the operations is given. For example, when word search is discussed, we are seldom told what is meant by a 'word'. The reader has to find out through experimentation how many words are contained in the strings 'Jean-Marie' and 'O'Hara'. However, a careless description of search operations may lead to search errors or unnecessarily long retrieval sessions. A second goal of the paper, therefore, is to introduce a mechanism for precise specification of text search semantics.

Text search using PAT is typically simple and straightforward. However, because of the powerful definition capabilities included in PAT, explaining and understanding the semantics of some operations may be difficult. As a side-effect of our systematic specification of PAT, we have identified some features of PAT expressions that cause problems and thus would benefit from further development. From this we see that precise specification also serves as a means for evaluation and offers a means for comparing text search systems. As is common in information retrieval systems, a PAT search is applied to indexed text. Indexing is usually described from the point of view of implementation, for example, by giving an algorithm for the indexing. However, since the way text is indexed affects search behaviour, our systematic approach to precise description must include mechanisms that accommodate indexing definition capabilities." [adapted from the Introduction]

The authors describe the query capabilities of the PAT system, dividing PAT expressions into six classes, and supplying a discussion of the syntax and semantics for each class. PAT indexing can be specified by productions as a view of PAT text.
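The summary's point about word search can be illustrated with a minimal Python sketch. The two regular expressions below are hypothetical definitions of a "word" chosen for illustration; neither is taken from PAT or from the paper. The sketch only shows that the word count of 'Jean-Marie' and 'O'Hara' depends on which definition a search system adopts.

```python
import re

# Two hypothetical definitions of a "word" (illustrative assumptions, not PAT's own rules).
# Definition A: a word is a maximal run of ASCII letters.
LETTERS_ONLY = re.compile(r"[A-Za-z]+")
# Definition B: runs of letters joined by an internal hyphen or apostrophe form one word.
LETTERS_WITH_JOINERS = re.compile(r"[A-Za-z]+(?:['-][A-Za-z]+)*")


def count_words(text, pattern):
    """Count the non-overlapping matches of `pattern` in `text`."""
    return len(pattern.findall(text))


if __name__ == "__main__":
    for sample in ("Jean-Marie", "O'Hara"):
        print(
            sample,
            count_words(sample, LETTERS_ONLY),
            count_words(sample, LETTERS_WITH_JOINERS),
        )
```

Under definition A each string counts as two words, while under definition B each counts as one, so the same query can return different results unless the search semantics are specified precisely.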

Available on the Internet [or mirror copy].



[CR: 19970106]

Sampson, Craig R. "SASOUT: A Context Based Table Model." Pages 235-264 in SGML '96 Conference Proceedings. Celebrating a Decade of SGML. SGML '96 Conference, Boston, MA, November 18-21, 1996. Sponsored by The Graphic Communications Association (GCA). [Edited by] Conference Co-Chairs: B. Tommie Usdin and Deborah A. Lapeyre. Alexandria, VA: GCA, 1996. Extent: 711 pages. Author's affiliation: SAS Institute Inc., SAS Campus Drive, Cary, North Carolina, USA 27513; Tel: (919) 677-8000 x-7417; FAX: (919) 677-4444; Email: sascrs@unx.sas.com; WWW: http://www.sas.com.

Abstract: "The SASOUT table model was developed to support the tabular documentation needs of the Publications Division of SAS Institute Inc. SASOUT instances contain sufficient meta information to allow them to be presented in both hard and soft copy. The meta data also makes possible non-traditional and interactive online presentations of the tabular data.

In 1995, research on tables produced by SAS software and on the tables previously used in our documentation resulted in our identification of four table types: simple, intersection, drill-down, and show-all. Imaging these tables on paper, as in the past, presented no significant problems even with SGML source data. However, we anticipated problems presenting our tables in soft copy after experimenting with the capabilities of the CALS table model, which was supported by our SGML software tools.

The CALS model does not support markup for indicating relationships between cells in a table nor directly support row header formatting. These relationships are not critical for producing hard copy, but are very important to our interactive online presentations. Header formatting is important for both hard copy and online presentations from a single source.

The SASOUT table model was developed to provide a means of marking up our tabular data while preserving its characteristics. The markup supports row headers and cell relationships in addition to all CALS features, such as column heads, spanning rows and columns, and alignment of data. The SASOUT model also supports behavior characteristics that allow the specification of online presentation methods.

This paper describes our table types, our platform presentation requirements, extensions we added to the CALS model, and the processing we designed to meet our formatting requirements so far.

The SASOUT DTD is freely available and we look forward to vendors providing support for it and other table DTD's that provide the means to fully identify tabular data."

The document is also available online in SGML format: see the download instructions from Craig Sampson, which contain the associated GCAPAPER DTD. URLs for the paper are: ftp://ftp.sas.com/incoming/sasout.tar.Z (UNIX tar compressed) or ftp://ftp.sas.com/incoming/sasout.zip (.ZIP format); [UNIX format mirror copy] and [.ZIP format mirror copy]. The SASOUT table DTD has been made available publicly by Craig Sampson on the Usenet News forum comp.text.sgml (CTS): see the local document. A related presentation describing the implementation of SGML by the Publications Division of SAS Institute was given at SGML '96 by Leonard P. Olszewski, "Modular DTD Development and Maintenance at SAS Institute: Implementing an Efficient SGML System Using Software Engineering Principles."

Note: The above presentation was part of the "SGML User" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.



[CR: 19961226]

Samuels, Eloise. "Case Study: Key Learnings from Converting Complex Technical Documents to SGML." Pages 57-64 in SGML '96 Conference Proceedings. Celebrating a Decade of SGML. SGML '96 Conference, Boston, MA, November 18-21, 1996. Sponsored by The Graphic Communications Association (GCA). [Edited by] Conference Co-Chairs: B. Tommie Usdin and Deborah A. Lapeyre. Alexandria, VA: GCA, 1996. Extent: 711 pages. Author's affiliation: Senior Information Design Manager, Bellcore, 6 Corporate Place, PYA-1N158, NJ 08854, USA; Tel: 908-699-6853; FAX: 908-336-2605; Email: ews@rangers.lso.bellcore.com.

Abstract: "Unlocking the benefits of information in your documents may mean that you invest more than human resources, money and equipment. What about the pre-process planning time? Investment in Standard Generalized Markup Language (SGML) does not guarantee you immediate return and does not happen at the drop of a hat without a major management investment in pre-planning.

Because most corporations information management's top goal is to produce a much richer information environment, we must make a management commitment to a document analysis process. The process should identify what information in these documents is important enough to migrate to a rich electronic format such as SGML. It may seem obvious that the way to maximize your information is to break it into intelligible chunks in a data base. However, to get those chunks of information into a format that is acceptable by most applications is not a simple process. When that is complete, next comes the targeted conversion by document type.

For Learning Support the goal was to establish an information database that yielded benefits in the area of:

  • document creation,
  • document updating and revising,
  • database review and validation,
  • information reuse, and
  • on-line full-text retrieval and distribution of information.

The objective was to convert annually some 300,000 pages of technical documents containing complex tables and graphics from several different authoring environments into an industry standard Document Type Definition (DTD), called the Telecommunications Industry Markup (TIM DTD).

This industry standard format, Telecommunication's Industry Markup Document Type Definition (TIMDTD) is an explicit and neutral form of markup. The BCCs, in conjunction with the Telecommunications Industry Forum consisting of representatives from telecommunication vendors, such as Ericsson, Siemens, AT&T, and Northern Telecom have unanimously endorsed it as their standard list of SGML markup tag definitions.

This paper identifies key learnings grasped from project management of the SGML Implementation Plan [for] the Learning Support organization at Bellcore. Key outcomes determined were:

  1. Document analysis was critical to the success of the [project]
  2. The DTD writer's interpretation of the data and its structure required an iterative process with document developers and users. DTDs will change.
  3. It was important for acceptance to maintain the document developers view of the textual layout and format of the data while enforcing structure.
  4. Management's buy-in was needed at all points in the process
  5. Not everyone will be on board the train at the same time."

For more information on the TIM DTD as part of the TCIF/IPI (Telecommunications Industry Forum Information Products Interchange) standard, see the main entry in the SGML/XML Web Page.

Note: The above presentation was part of the "SGML Newcomer" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.



[CR: 19951113]

Sandoval, Victor. SGML: un outil pour la gestion électronique de documents. Techniques de l'information. Paris: Hermes, 1994. Extent: 174 pages, bibliographie, index. ISBN: 2866014405.



[CR: 19971008]

Schatz, Bruce; Mischo, William H.; Cole, Timothy W.; Hardin, Joseph B.; Bishop, Ann P. "Federating Diverse Collections of Scientific Literature." IEEE Computer 29/5 (May 1996) 28-36 (with 12 references). ISSN: . Authors' affiliation: Grainger Engineering Library and Information Center, Illinois University, Urbana, IL, USA.

"Abstract: The Digital Library Initiative (DLI) project at the University of Illinois at Urbana-Champaign is developing the information infrastructure to effectively search technical documents on the Internet. The authors are constructing a large testbed of scientific literature, evaluating its effectiveness under significant use, and researching enhanced search technology. They are building repositories (organized collections) of indexed multiple-source collections and federating (merging and mapping) them by searching the material via multiple views of a single virtual collection. Developing widely usable Web technology is also a key goal. Improving Web search beyond full-text retrieval will require using document structure in the short term and document semantics in the long term. Their testbed efforts concentrate on journal articles from the scientific literature, with structure specified by the Standard Generalized Markup Language (SGML). Research efforts extract semantics from documents using the scalable technology of concept spaces based on context frequency. They then merge these efforts with traditional library indexing to provide a single Internet interface to indexes of multiple repositories."

Available online in HTML format: http://computer.org/computer/dli/r500280/r50028.htm; [archive copy, text only].



Scheller, Angela. "Document Standards: Availability and Products." Computer Networks and ISDN Systems 16/1-2 (September 1988) 138-142. ISSN: 0169-7552. CODEN: CNISE9.

Abstract: With the growth in the spread of computer networks the demand by users for document interchange features is becoming increasingly apparent. The prerequisites for the realization of document interchange in a heterogeneous computer environment are internationally accepted standards for the description of documents. Already in early 1986, the Standard Generalized Markup Language SGML was published as an international standard for the structuring of documents. The publication of the Office Document Architecture ODA is expected in the course of 1988. The final text is already available. ODA was originally developed for the pure office environment, whereas the concept for SGML addressed the author/publisher environment. This fact is mirrored in the current pilot projects testing the standards: the manufacturers of office and word-processing systems mainly work with ODA, whereas in the technical scientific and publishing sectors SGML is often implemented. Users requiring an interface both to the office sector and to the publishing sector will therefore be confronted with the problems related to working with two different, only partially compatible standards.



Scheller, Angela. "Experience with SGML in the Real World: DAPHNE, a System Integrating Computer Graphics Metafiles into SGML Documents." In Document Exchange: The Use of SGML in the UK Academic and Research Community. Workshop Proceedings 5-7 March 1990. Edited by Anne Mumford. Advisory Group on Computer Graphics, 1990.

Abstract: DAPHNE is a document processing system implemented to support joint editing within the German Research Network DFN. It is based on two international standards in the area of document and graphics processing, the Standard Generalized Markup Language SGML and the Computer Graphics Metafile CGM. This paper presents the functionality offered by DAPHNE today as well as plans for future extensions. It also describes the experience gained with a distributed environment of commercial products for processing SGML documents in general and DAPHNE documents in particular.



Schettini, Stephen; Alschuler, Liora. "SGML is Here to Stay. Coding Documents with Standard Generalized Markup Language Lets You Manipulate and Format Text in Limitless Ways." Publish ? (June 1994) 71-78.



[CR: 19970212]

Schietekat, Raf. "DSSSL: The Promise FOSI Did Not Fulfill." In: Proceedings of the 3rd Annual Conference on the Practical Use of SGML. "A Decade of Power." Third Annual [Belux] Conference on the Practical Use of SGML. Business Faculty, Sint-Lendriksborre 6, Brussels, Belgium. October 31, 1996. Sponsored by SGML Belux (Belgian-Luxembourg Chapter of the International SGML Users' Group). Leuven, Belgium: Belux, 1996. Author's affiliation: Fotek NV, Entrepotstraat 3, B-9100 Sint-Niklaas, Belgium. Email: raf@fotek.com.

Abstract: "SGML (Standard Generalized Markup Language) is designed to encode information at the content level, abstracting away from formatting issues. In a well-designed SGML application, font details are not part of the SGML document, and contents may be rearranged or automatically generated. In this light, for professional purposes HTML (HyperText Markup Language) is better considered to be a presentation language for contents that have been stored separately (e.g. in a dedicated SGML environment) than a reliable repository data storage format in itself. This is now probably well understood, at least in the SGML community. HTML is tied to a particular, very limited DTD (even ignoring the reality of emerging Microsoft and Netscape dialects), and requires independently specified semantics (i.e. how the Web browser has to interpret the HTML tags)."

"In general, SGML documents will be typeset using informal descriptions of the style semantics for the various elements that occur in a document instance, which requires good communication between the document publisher and the document typesetter. Once a system is set up to process a particular kind of SGML input, that style specification is generally not portable.

"The new ISO standard DSSSL (Document Style Semantics and Specification Language) aims to become the standard way of linking up SGML information containers with their graphical representation, as part of a suite of complementary ISO standards: SGML, HyTime, SPDL (Standard Page Description Language, based in part on the PostScript language from Adobe). This paper will elaborate on the details of how DSSSL achieves this, how it fits into a complete document production process, how detailed the specification is and what it leaves open, what other functionality is available, and why it was worth waiting for."

Available online in HTML format: "DSSSL, the promise FOSI did not fulfill", by Raf Schietekat; [mirror copy]. For further information on the conference, see: (1) the description in the conference announcement and call for papers, and (2) the full program listing, or (3) the main conference entry in the SGML/XML Web Page.



[CR: 19971125]

Schiller, Jörg. "SGML and Development Documentation." Page(s) 159-160 in SGML '97 Conference Proceedings. SGML Europe '97. "The Next Decade - Pushing the Envelope." Princesa Sofia Intercontinental, Barcelona, Spain. 11-15 May, 1997. Sponsored by Graphic Communications Association (GCA) and SGML Open. Conference Chair: Pamela L. Gennusa (Director, Database Publishing Systems Ltd). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 342 pages, CDROM. Author's affiliation: Project Manager, debis Systemhaus GEI, Ulm, Germany; Email: jschiller@gei-ulm.daimler-benz.com.

Abstract: "Increasing requirements in development documentation for automotive manufacturers such as the number of world-wide development sites (leading to the support of different languages); the speed of the development cycle; and the number of variants of products (re-use of base documentation), has led to the definition of exchange formats for information. This paper examines how SGML technology can be a good solution for problems in this area." [from the published program]

"The development of ECU (electronic control unit) for cars is a highly parallel and complex job. Requirements from different parts of an automotive manufacturer have to be fulfilled. The interfaces between correlated persons are not defined in a way, that an exchange of information is done easily. Business process reengineering activities discovered a big potential for enhancements by using SGML technology as an overall exchange format in these areas.

"Several projects were started, to implement a new process model. This article describes our experiences in projects we realized since the beginning of 1995. The biggest project deals with diagnosis data that is needed to describe parameters to communicate with ECUs. Today you can get many informations about the actual state of internal and external variables of ECUs (for example a coolant temperature). These informations are used to guide a diagnosis process to determine erroneous behaviour of components of a car. The diagnosis data is used in different parts of the company (development, production, service). Even companies that deliver ECUs can be involved in this process. We started a case study to determine the best format for the description of structures. As a result we decided to take SGML. The Document Type Structure (DTD) is presented to the ASAM/ASAP consortium for standardization. This consortium represents the German automotive industry, suppliers and tool companies.

"The system is now in use by 10 to 15 users and will grow in 1997 to 30 to 50 users. There is a process of standardization of diagnosis data in the moment in a consortium called ASAM/ASAP. Our DTD is a proposal to that committee and we think it will be fixed till the end of 1997. Our experience with the technologie SGML are quite good. We can transport the concept very easily to the users."

Note: The electronic conference proceedings in hypertext were produced by Inso Corporation (DynaText) and by High Text (EnLIGHTeN). Information about the SGML Europe '97 Conference may be found in the main database entry.



[CR: 19971125]

Schmitt-Rennekamp, Walter. "Digital Documentation Trends for Aircraft Maintenance." Page(s) 153-154 in SGML '97 Conference Proceedings. SGML Europe '97. "The Next Decade - Pushing the Envelope." Princesa Sofia Intercontinental, Barcelona, Spain. 11-15 May, 1997. Sponsored by Graphic Communications Association (GCA) and SGML Open. Conference Chair: Pamela L. Gennusa (Director, Database Publishing Systems Ltd). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 342 pages, CDROM. Author's affiliation: Senior Consultant, Aircraft Maintenance and Engineering Documentation, Lufthansa Systems GmbH, Hamburg, Germany.

Abstract: "The aviation industry has a long tradition for information interchange standardization. The first generation of on-line documentation was a paper document duplicate based on SGML. In the future, documentation has to move from the document paradigm to an information paradigm. Then the user will get an 'Information Web' and exactly the information he is looking for. This presentation looks at the challenges and trends in aircraft maintenance documentation."

"In aircraft maintenance and operations documentation structure and form are well defined by the ATA SPEC 100 specification. It was a good foundation bringing that documentation into electronic form using SGML. SGML is today the foundation for ATA SPEC 2100, the aviation standard for electronic document interchange. [...] Tagged information at a well defined granularity makes incremental revisions easy. Taking advantage of the progress in electronic networks, an on-line document update will be possible and leads to totally new worksharing concepts between aircraft manufacturers product support organization and airline engineering."

Note: The electronic conference proceedings in hypertext were produced by Inso Corporation (DynaText) and by High Text (EnLIGHTeN). Information about the SGML Europe '97 Conference may be found in the main database entry.



[CR: 19961108]

Schopen, Michael (Dr. med). "Die logische Struktur der ICD-10 (Systematik) und ihre Beschreibung mit SGML [The Logical Structure of ICD-10 (Tabular List) and its Description with SGML]." Informatik, Biometrie und Epidemiologie in Medizin und Biologie 26/2 (1995) 121-133 (with bibliography and 4 pages of figures). Author's affiliation: Deutsches Institut für medizinische Dokumentation und Information (DIMDI) [German Institute for Medical Documentation and Information].

"Zusammenfassung: Das Deutsche Institut für medizinische Dokumentation und Information, DIMDI, ist beauftragt, die amtliche deutschsprachige Ausgabe der ICD-10 herauszugeben. Ausgehend von den Anforderungen an die maschinenlesbare Version der ICD-10 (Systematik) wird die Standard Generalized Markup Language (SGML) vorgestellt als ein Formalismus, mit dem die logische Struktur der Klassifikation beschrieben werden kann. Der im DIMDI für die ICD-10 verfolgte Ansatz - Reduktion auf die logische Struktur und Verzicht auf Layoutinformation - macht die maschinenlesbare Fassung unabhängig von spezifischer Hard- und Software und hält sie offen für unterschiedlichste Anwendungen. Die vorliegende Arbeit beschreibt in SGML die Grobstruktur der ICD-10-Kapitel sowie den Aufbau der Krankheitsklassen und einiger ihrer Elemente. An einem Beispiel wird gezeigt, wie die SGML-Daten für spezielle Anwendungen der Klassifikation restrukturiert werden können."

Abstract: The German Institute for Medical Documentation and Information, DIMDI, has been authorized to publish the official German language edition of ICD-10. [Internationale Klassifikation der Krankheiten = International Classification of Diseases, ICD.] Based on the requirements for a machine-readable version of ICD-10 (Tabular List), SGML - the Standard Generalized Markup Language - is introduced as a formalism to describe the logical structure of this classification. By specifying the mere logical structure and abandoning layout information, DIMDI's concept for ICD-10 makes the machine-readable version independent of any hardware or software and keeps it open to a broad range of applications. This paper uses SGML to describe the structure of ICD-10 - the chapters, the disease categories and some of their elements. An example is given how to rearrange SGML data for specific applications of the classification.

The paper explains DIMDI's approach to processing the ICD-10 data on an SGML basis.

Available in Postscript format on the Internet: ftp://193.174.240.221/pub/klassi/icdsgml.zip. Mirror copies: original from DIMDI; alternate - edited Postscript that worked locally. For more information, see the database entry for DIMDI.



[CR: 19961210]

Schouten, Han. "Documents in Databases." SGML Users' Group Newsletter 15 (January 1990) 8-11. ISSN: 0952-8008. Author's affiliation: Research Center for Technical and Physical Engineering in Agriculture (TFDL), The Netherlands.

The article explains "why documents should be stored in databases rather than in sequential files." [Because:] "Database technology provides us direct access to facts stored in a database. Here too the application-independent logical structure of information determines how we can get access to and process stored facts. The verification of manipulating such information according to its logical structure is, unless explicitly prescribed, not sequence-specific. Therefore, the storage of documents in databases seems to be the correct answer to our requirements of interaction with respect to document processing in the office environment." [extracted]

This article should be read in conjunction with a second article by Han Schouten, "Draft Tender Re: 'Documents in Databases'", also in number 15 of SGML Users' Group Newsletter.



[CR: 19961210]

Schouten, Han. "Draft Tender Re: 'Documents in Databases'." SGML Users' Group Newsletter 15 (January 1990) 12-14. ISSN: 0952-8008. Author's affiliation: Research Center for Technical and Physical Engineering in Agriculture (TFDL), The Netherlands.

A major draft proposal for SGML DSIG-sponsored development of a prototype document processing environment in which documents are stored as databases. The environment would support SGML, but also other SGML-related standards like DSSSL -- "as an alternative for the sequential access strategy characteristic of standard SGML." Details on the objectives, tasks, funding, deliverables, rights and duties of participants, project management, etc. are described. Proposed tasks include specification of a gross system architecture, definition of modelling techniques, building and verifying semantic equivalence of all models with SGML and DSSSL, facilities for loading SGML DTDs, facilities to unload DTDs without loss of information, creation of a DTD editor, creation of a structured document editor, building of retrieval facilities, and building a document formatter.

This document, as a draft tender, is to be read in conjunction with the companion article in issue 15 of the SGML Users' Group Newsletter, "Documents in Databases," also by Han Schouten.



[CR: 19961210]

Schouten, Han. "Meeting of the [SGML] Database SIG." SGML Users' Group Newsletter 15 (January 1990) 11-12. ISSN: 0952-8008. Author's affiliation: Ministry of Agriculture and Fisheries, Research Center for Technical and Physical Engineering in Agriculture (TFDL), Expert Center for Information Technology (ECIT), the Netherlands.

The article is a report on the meeting of the SGML Database SIG on October 26, 1989 at Alphen aan den Rijn, Netherlands. Presentations included: (1) Han Schouten, "The Storage of Documents in Databases at the Ministry of Agriculture and Fisheries"; (2) François Chahuneau -- experiences with the implementation of a document database for production of the Journal of the EEC, in nine languages, with an emphasis upon support for version management; (3) Ian Williams' presentation "An architecture for hypertext object management" -- this presentation focused on GUIDE, IDEX and OWL's hypermedia products in relation to SGML. OWL is researching SGML applications for information retrieval, object indexing and maintenance of database links; (4) Other meeting participants included Lou Burnard, Frank Dros, Harry Gaylord, Jurgen de Jonghe, Jan Maasdam, Hans Mabelis, Jon Maslin, Koen Mulder, and Gert van der Steen.



[CR: 19961210]

Schouten, Han. "SGML*CASE: The Storage of Documents in Databases." SGML Users' Group Bulletin 4/1 (1989) 1-14 (with 5 references). ISSN: 0269-2538. Author's affiliation: Ministry of Agriculture and Fisheries, Research Center for Technical and Physical Engineering in Agriculture (TFDL), Expert Center for Information Technology (ECIT); POB 356, Mansholtlaan 12, 6700 AJ Wageningen, The Netherlands. TEL: +31-8370-19143; FAX: +31-8370-11312.

Abstract: "Despite recent achievements in text editing, desktop publishing, and the hypermedia approach toward information processing, the developments in document processing remain in arrears when compared to data processing. This is highly remarkable, since today 99 percent of all information is still archived as documents on paper."

"Here we analyse the possible causes for this apparent backlog in document processing and the damage it inflicts on office automation. Hitherto the logical structure, the layout, and the presentation of documents have often been insufficiently distinguished. Documents are typically stored and accessed as sequential files. These characteristics strongly remind us of most data processing environments of about twenty years ago. Then, file structures were mainly application-dependent and files could only be processed in batch, because the possibilities for accessing their contents directly were absent. The information systems of those days featured all the bad qualities that most document processing systems feature today; many types of conversion from one application-dependent form to another, loss of information with these conversions, and the practical impossibility of managing stored information as a corporate resource. Conversely, many document processing applications such as document editing, hypermedia applications, and the integrated processing of data and text also require direct access to individual elements of stored documents."

"The logical consequence seems, therefore, to be to devise some application- and device-independent, directly accessible, storage facility for documents and to stimulate developments similar to those that caused data processing to become the success it is today. Building on the results of our analysis, we have made an attempt to store documents in a database and, consequently, have direct access to their structure and contents, maintain information integrity and optimally integrate data and text. A conceptual schema for the storage of documents is proposed here. The obvious advantages of the model are discussed, as well as the topics which remain to be investigated."

See the main entry for the SGML Database Special Interest Group (SGML DSIG/DBSIG) for further information. Note: The volume editor for SGML Users' Group Bulletin 4/1 is David W. Penfold (Edgerton Publishing Services, Huddersfield, UK).



[CR: 19970312]

Schouten, Han. "A Utility for the Combined Use of SGML and Ventura ®." SGML Users' Group Bulletin 3/2 (1988) 27-36 (with 4 appendices). ISSN: 0269-2538. Author's affiliation: TFDL/ECIT [Ministry of Agriculture and Fisheries, Research Center for Technical and Physical Engineering in Agriculture (TFDL), Expert Center for Information Technology (ECIT), The Netherlands].

The author explains a strategy used at the Dutch Ministry of Agriculture and Fisheries for converting SGML documents into Ventura documents for printing. The appendices contain examples of the SGML source code, the conversion scripts, and the corresponding representations in Ventura format.

Note: The volume editor for SGML Users' Group Bulletin 3/2 is Anders Berglund (ISO Central Secretariat, 1 Rue de Varambé, CH-1211 Geneva 20, Switzerland).



[CR: 19971125]

Schreier, Richard A. "Supporting SGML in Document Management Systems." Page(s) 95-101 in SGML '97 Conference Proceedings. SGML Europe '97. "The Next Decade - Pushing the Envelope." Princesa Sofia Intercontinental, Barcelona, Spain. 11-15 May, 1997. Sponsored by Graphic Communications Association (GCA) and SGML Open. Conference Chair: Pamela L. Gennusa (Director, Database Publishing Systems Ltd). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 342 pages, CDROM. Author's affiliation: Director of Professional Services, Microstar Software Ltd, Nepean, Ontario, Canada; Email: ras@microstar.com.

Abstract: "Most Document Management System architectures can be categorized by the ability to handle and organize information of different kinds. Supporting information based on the Standard Generalized Markup Language (SGML) involves unique requirements that bear on the tasks of managing structured documents."

"This report overviews approaches to support SGML documents in a number of Document Management System architectures that were candidates to be used in an actual publishing system supporting the publishing and re-purposing of shared information for technical manuals. This publishing system supports content- and presentation-oriented SGML documents for a supplier of military equipment to a Canadian Department of National Defence (DND) Project Office."

This paper was originally prepared under a slightly different title by G. Ken Holman (Crane Softwrights Ltd.), formerly the Chief Technology Officer of Microstar Software Ltd. Slides for the related paper are among the collection of slide show presentations from Microstar.

Note: The electronic conference proceedings in hypertext were produced by Inso Corporation (DynaText) and by High Text (EnLIGHTeN). Information about the SGML Europe '97 Conference may be found in the main database entry.



[CR: 19980413]

Schroeder, Bethany. "HL7 Focuses on XML in New Orleans." XML Files: The XML Magazine Issue 04 (March 17, 1998) 14.

A brief note on the January 1998 meeting of HL7's SGML/XML SIG, with updates on DICOM, KONA, and other health industry standards efforts.

Available online



[CR: 19971206]

Schroff, Thomas; Brüggemann-Klein, Anne. "Grammar-Compatible Stylesheets." Pages 51-88 (with 11 references) in Principles of Document Processing. Proceedings of the Third International Workshop. PODP '96, Third International Workshop. Palo Alto, California. September 23, 1996. Edited by Charles Nicholas (Department of Computer Science and Electrical Engineering, UMBC, Baltimore, MD) and Derick Wood (Department of Computer Science, HKUST, Clear Water Bay, Kowloon, HONG KONG). Lecture notes in artificial intelligence. Lecture notes in computer science, 1293. Berlin / London: Springer-Verlag, 1997. ISBN: 354063620X. Authors' affiliation: [Schroff]: Technische Universität München; [Brüggemann-Klein]: Technische Universität München.

Abstract: "Stylesheets have been used to convert the document type of SGML documents. With a stylesheet, a document conforming to a source grammar can be transformed into a document conforming to a target grammar. The paper discusses the following problem: given a stylesheet, a source and a target SGML grammar, is it decidable whether or not all documents conforming to the source grammar are transformed into documents conforming to the target grammar? Using context free extended context free grammars we give a decision procedure for this problem."



Seaman, David. "Campus Publishing in Standardized Electronic Formats -- HTML and TEI." Pages xxx-xxx in Filling the Pipeline and Paying the Piper: Proceedings of the Fourth Symposium [November 5-7, 1994, the Washington Vista Hotel, Washington, DC]. Edited by Ann Okerson. Symposium co-sponsored by the Association of Research Libraries and the Association of American University Presses in collaboration with the University of Virginia Library, the Johns Hopkins University Press, and the American Physical Society. Washington, D.C.: Association of Research Libraries, Office of Scientific & Academic Publishing, 1995. ISBN: 0918006252. Author's affiliation: David Seaman is the Director of the University of Virginia Library's Electronic Text Center.

"Introduction: In the past year, HyperText Markup Language (HTML) has done more to popularize the notion of Standard Generalized Markup Language than any single preceding use of SGML. Used on the World Wide Web through a graphical client such as Netscape or NCSA Mosaic, HTML documents and their associated image, sound, and digital video files result in sophisticated network publications and services. And even when viewed through the plain text (VT100) client Lynx, HTML files can still be exciting clusters of interlinked documents.

In common with Internet users all over the world, the University of Virginia Library now uses and produces HTML documents; unlike most other academic institutions, however, we came to HTML with practical experience in another, more sophisticated, form of SGML -- that of the Text Encoding Initiative Guidelines. For two years the Electronic Text Center has been using the TEI Guidelines, through several drafts, to tag and distribute hundreds of electronic texts. The purpose of this paper is both to explain how we are using these various forms of SGML mark-up to publish a variety of documents, and to sound a cautionary note about the wholesale use of HTML as a primary authoring language."

An online version is also available at URL in HTML format, and in (only partially-linked) mirror copy here (May 1995). An abstract of the paper by Mary Mallery is available here.



[CR: 19971018]

Seaman, David. "The Electronic Archive of Early American Fiction (1775-1850)." Page 150 in ACH-ALLC '97. The 1997 Joint International Conference of the Association for Computers and the Humanities and the Association for Literary and Linguistic Computing. Conference Abstracts. ACH-ALLC '97. Queen's University at Kingston, Ontario, Canada. June 3 - 7, 1997. Compiled by Greg Lessard and Michael Levison. Ontario, Canada: Queen's University, 1997. ISBN: 0-88911-760-8. Author's affiliation: University of Virginia.

[Extract:] "This 125,000-page project takes the University of Virginia Library into a level of archival-quality text and image production rarely seen in rare books archives. In preparing for this project we have tackled issues of funding, production-level digital equipment and practices, partnerships with commercial publishers to disseminate the results, and large-scale storage issues. This paper will outline the project, explain the workflow, equipment, and text and image standards that we think appropriate for creating data of long-term viability, and explore the lessons we are learning (and expect to learn) regarding the economics of undertaking a cost-recovery process. The project will combine high-quality color page images of all 125,000 pages (including covers and spines) with TEI-encoded text versions, allowing scholars all over the world a rare sense of the physical reality of the volumes being studied as well as providing a fully-searchable SGML database."

Abstract available online in HTML format: "The Electronic Archive of Early American Fiction (1775-1850)", by David Seaman; [archive copy]. See the Early American Fiction Home Page, or the main SGML/XML Web Page database entry for The Electronic Archive of Early American Fiction (UVA).

Additional information on the ACH-ALLC '97 Conference is available in the SGML/XML Web Page main conference entry, or [August 1997] via the Queen's University WWW server.



[CR: 19961217]

Seaman, David. "The Electronic Text Center and On-line Archive of Electronic Texts." Pages 55-57 in Elektronisches Publizieren und Bibliotheken, die Herausforderung neuer Partnerschaften [Electronic Publishing and Libraries, The Challenges of New Partnerships]. [Conference] 'Elektronisches Publizieren und Bibliotheken'. Bielefeld, Germany. February 5-7, 1996. Frankfurt am Main, Germany: Vittorio Klostermann, 1996. Author's affiliation: University of Virginia, Electronic Text Center.

Abstract: "The Electronic Text Center is both a physical space within the university library, open to all University of Virginia members and also an on-line collection of many thousands of Internet-accessible texts and images. It is important to us that we perform two tasks simultaneously in order to build our digital library: we are both creating a set of electronic resources, and also creating a user community for it, by training our users to become effective consumers and producers of electronic texts and images. Since 1992, the Etext Center has made available hardware and software for the creation and analysis of electronic texts; it provides training for these new tools and techniques; it acts as a focal point for HTML and SGML development in the humanities at Virginia; and it provides a place in which to use those texts that are not yet accessible on the Internet."

The document is available online: in HTML format; [mirror copy]. See also the main entry for the UVA Electronic Text Center.



Seaman, David M. "From Margin to Mainstream: Creating a Broad-Based Humanities Computing User Community at the University of Virginia." Pages 213-214 [partial abstract] in Colloque International "Consensus ex Machina?". Abstracts International Joint Conference of the ALLC (Association for Linguistic and Literary Computing) and ACH (Association for Computers and the Humanities), Sorbonne, Paris, 19-23 avril 1994. Paris: Laboratoire "Lexicométrie et textes politiques" (INaLF, CNRS), and Ecole Normale Supérieure de Fontenay - Saint Cloud, 1994. 244 pages. Author Affiliation: University of Virginia.



[CR: 19950716]

Seaman, David. "Gate-Keeping A Garden of Etext Delights: Electronic Texts and the Humanities at the University of Virginia Library." Pages 63-67 in Gateways, Gatekeepers, and Roles in the Information Omniverse: Proceedings of the Third Symposium. Third [ARL] Symposium, Washington Vista Hotel, Washington, DC, November 13-15, 1993. Edited by Ann Okerson and Dru Mogge. Washington, D.C.: Association of Research Libraries, 1994. Author's affiliation: University of Virginia.

The paper discusses the use of SGML by the Electronic Text Center. "All the electronic texts are encoded with Standard Generalized Markup Language (SGML). The large-scale electronic text databases -- the OED, the Chadwyck-Healey items - come fully marked up, and increasingly we are seeing producers of individual titles (such as Oxford University Press) also offering them in SGML form. The SGML markup not only means that texts can be added together in conglomerations but also that the data, with all its structural and typographic information, is not inherently wedded to a piece of software. It is, in a real sense, data that will outlive the software we currently use to explore and present it."

Available online from the UVA WWW server.



Seaman, David M. "'A Library and Apparatus of Every Kind': The Electronic Text Center at the University of Virginia." Information Technology and Libraries 13/1 (March 1994) 15-19. 1 reference. Author affiliation: Coordinator of Electronic Texts, University of Virginia Library, Charlottesville, VA.

Abstract: The Electronic Text Center at the University of Virginia combines an online archive of thousands of SGML-encoded electronic texts, all available through a single piece of search software, with a library-based center housing hardware and software suitable for the creation and analysis of text. Through ongoing training sessions and support of individual teaching and research projects, the Center is now building a diverse and expanding user community locally, and providing a potential model for similar enterprises at other institutions.



[CR: 19950716]

Seaman, David. "The University of Virginia's Electronic Text Center: An Interview with David Seaman." Virginia Librarian 39/2 (April/May/June) 6-10 (with sidebar: "Standard Generalized Markup Language"). Author's affiliation: David Seaman is Director of the Electronic Text Center, Alderman Library, University of Virginia, Charlottesville, Virginia.

"We are also concerned to maintain our on-line data in a standard tagged format-known as SGML, or Standard Generalized Mark-up Language-that will ensure that the electronic texts, with all their typographic, spacial, and structural instructions, will outlive the software we currently use to search and display them. . .The texts in our on-line collection are marked up with SGML tags that use letters and phrases within angled brackets to convey such information as structural divisions-title page, main body of text, scene, stanza, page, paragraph, etc. and typographical elements- changes in typeface, special characters, etc. . ."

Available online: http://www.lib.virginia.edu/etext/articles/VirgLib/virglib.html from the UVA WWW server.



[CR: 19971024]

Seaman, David. "The User Community as Responsibility and Resource: Building a Sustainable Digital Library." D-Lib Magazine ( ). ISSN: 1082-9873. Author's affiliation: Electronic Text Center, University of Virginia.

Summary: "Since opening as a full-time service in 1992, the Electronic Text Center at the University of Virginia Library has pursued twin missions with equal seriousness of purpose: (1) to create an on-line archive of SGML texts; (2) to build a community of humanists adept at the creation and use of online full-text resources. . . this article will focus on the integral place that our user community has in shaping the work of our library-based Etext Center."

See the Web site for The Electronic Text Center.

The article is available online in HTML format; local archive copy. Note that the July/August 1997 double issue of D-Lib Magazine (Amy Friedlander, editor) contains several articles referencing the use of SGML encoding in digital library research.



[CR: 19971201]

Selber, Stuart A. "First Commentary. The OHCO Model of Text: Merits and Concerns." Journal of Computer Documentation 21/3 (August 1997) 26-31 (with 21 references). ISSN: 0731-1001. Author's affiliation: Technical Communication and Rhetoric Program, Department of English, Box 43091, Texas Tech University, Lubbock, Texas 79409-3091; Email: selber@ttu.edu; WWW: http://english.ttu.edu/faculty/selber/vitae.html.

Abstract: "The author discusses the ordered hierarchy of content object (OHCO) model for text representation on the computer. [I have a concern about the OHCO model of text...] Although the model has explanatory power computationally, the way it defines what a text is, what a writer is, and what a reader is may serve to diminish, in potentially damaging ways, what is involved in the processes and practices of technical communication. But before discussing his concerns with the OHCO model of text, the author considers some of its merits, because he would not dismiss the model as invaluable to students and professionals. At times he found the model quite compelling, particularly in its focus on how text can be both productively and unproductively represented in online information space. [In fact, I plan on including this article in a graduate-level course I teach in technology and discourse.]"

The article is a response (commentary) on the publication of DeRose (et al), "What is Text, Really?" reprinted from Journal of Computing in Higher Education 1/2 (Winter 1990) 3-26.

This article appeared with four others in a special issue of JCD which focused upon 'the OHCO model of text [ordered hierarchy of content objects]'. The Journal of Computer Documentation (JCD) is a quarterly publication of the Association for Computing Machinery, Special Interest Group on Systems Documentation [SIGDOC], published by the Association for Computing Machinery. Editor in Chief: Tony R. Girill, Lawrence Livermore National Laboratory and University of California.



[CR: 19960716]

Sengupta, Arijit. "Demand More from Your SGML Database! Bringing SQL Under the SGML Limelight." <TAG>: The SGML Newsletter 9/4 (April 1996) 1-7, with 11 references. ISSN: 1067-9197. Author's affiliation: PhD candidate at Indiana University, Department of Computer Science.

"Abstract: Have you ever been frustrated by how inadequate SGML databases are in terms of searching or querying your documents? With the current state of the art, you will easily be able to search for a word, phrase, or keywords in the whole document. Some systems let you perform approximate searches or regular expression searches. Even fewer systems let you search for keywords or phrases in certain SGML regions. However, there is much more information already in SGML documents that one can utilize cleverly to design a proper SGML database system. The current trend of modeling SGML documents with object-oriented and object-relational databases has certainly brought SGML closer to a complex object database model, but much research and development remains to be done in this area. This article introduces the popular relational database query language SQL (Structured Query Language) and its applicability in the SGML domain.The capability of this query language to express complex queries with a not-so-complex syntax gives relational databases that support SQL an advantage over other similar systems. The ability to use SQL or an SQL-like query language with SGML has the potential of giving much more power to SGML repositories. This article shows how we can pose complex document-related questions easily with SQL. SQL-capable systems will let you solve problems that would otherwise seem impossible, or at least, tedious."

The author believes that SQL ought to be implemented more completely in SGML systems, as it supplies a widely accepted and powerful language for expressing queries -- many of which are difficult to express in current SGML systems.

Available online in postscript format; [mirror copy]



Sengupta, Arijit. Design and Implementation of a Database Environment for the Manipulation of Structured Documents. 1993. Extent: 30 references.

"Abstract: A method for implementing a structured document database system is presented. The present-day systems dealing with structured or tagged documents have not been able to produce capabilities that even simple database systems possess - the ability to query the database based on the various properties of the database. Research in this area also has not been able to produce query languages and visual query interfaces similar to those that exist in the relational domain. The goal for the present research is to develop a complete database system for structured documents having data definition, manipulation and querying capabilities similar to those in the relational world. Only structured documents tagged with the SGML [13] have been considered, in which detailed and complete information about the document structuring can be obtained from the Document Type Definition (DTD). Special systems that have been considered, used and evaluated are PAT (Open Text 5.0) [22], sgmls 1.1, Exodus [28], Shore [6] for purposes of data structures, parsing, data storage and retrieval, etc. Special considerations have been given to three special cases of data for experimentation purposes: (a) the Oxford English Dictionary (OED) database, (b) the Chadwyck Healy English Poetry full-text database, and (c) an experimental movie database." [from the online text]

Available online at URL http://www.cs.indiana.edu/hyplan/asengupt/thesis/oral/oral.html. [further details are requested from the author]



[CR: 19961226]

Sengupta, Arijit. "Standardizing the Querying Process with SGML The SQL DTD." Pages 323-338 in SGML '96 Conference Proceedings. Celebrating a Decade of SGML. SGML '96 Conference, Boston, MA, November 18-21, 1996. Sponsored by The Graphic Communications Association (GCA). [Edited by] Conference Co-Chairs: B. Tommie Usdin and Deborah A. Lapeyre. Alexandria, VA: GCA, 1996. Extent: 711 pages. Author's affiliation: Indiana University, Computer Science Department, Lindley Hall 215, Bloomington, IN 47405, USA; Tel: 812 855 4318; Fax: 812 855 4829; Email: asengupt@indiana.edu; WWW: http://www.cs.indiana.edu/hyplan/asengupt.html.

Abstract: "One of the most exciting applications of SGML which has emerged in the recent years is its use in document databases. The structural information embedded in SGML documents makes it possible to query SGML documents and extract information in an automatic manner; however, this querying process has not been standardized. As a result, different SGML database implementations use their own query language syntax, thus making the migration from one system to another a difficult process. In the relational database domains, however, the query language SQL has been a standard for over ten years and is universally used in most relational database systems. Although originally designed for relational databases, SQL is quite powerful for specifying complex queries in a relatively easy-to-understand syntax. With a small set of extensions to take advantage of the hierarchical structure of SGML, SQL can be easily adapted for use with SGML document databases (TAG-496).

The powerful 'generalized' nature of SGML makes it easy to implement SQL as an SGML DTD, so that queries can be expressed as document instances of the SQL DTD. Current SGML authors and users can write queries expressed in this DTD without learning a different language or using a separate editor. Moreover, because of the portable nature of SGML, these queries can be used in any SGML database system and can be converted to regular SQL for use in a relational or Object-Relational/Object-Oriented database system, if necessary. Databases that support the SQL DTD can also store the queries without any extra effort, and subsequently query them for inferring optimization parameters.

This paper presents a representative DTD for the SQL query language, with extensions for use with hierarchically structured documents. It also compares this language with languages proposed and implemented, including SDQL - the query language in the DSSSL standard (DSSSL95). This paper explains the advantages of using this language as a query language in document database systems and the necessity for standardizing the querying process in document databases. Finally, it discusses some implementation issues and complexity measures."

Note: The above presentation was part of the "SGML Expert" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.

Available in postscript format, or SGML format; [mirror copy, postscript]



[CR: 19970817]

Sengupta, Arijit; Dillon, Andrew. "Extending SGML to Accommodate Database Functions: A Methodological Overview." Pages 629-637 (with 27 references) in Structured Information/Standards for Document Architectures. Edited by Elisabeth Logan and Marvin Pollard. = Journal of the American Society for Information Science, Special Issue. Volume 48, Number 7 (July 1997). New York: John Wiley & Sons Inc., 1997. ISSN: 0002-8231. Authors' affiliation: [Sengupta:] Computer Science, Indiana University, Bloomington, IN; Email: asengupt@indiana.edu, WWW: http://www.cs.indiana.edu/hyplan/asengupt.html; [Dillon:] School of Library and Information Science, Indiana University, Bloomington, IN; Email: adillon@indiana.edu, WWW: http://www-slis.lib.indiana.edu/adillon/adillon.html.

Abstract: "A method for augmenting an SGML document repository with database functionality is presented. SGML (ISO 8879, 1986) has been widely accepted as a standard language for writing text with added structural information that gives the text greater applicability. Recently there has been a trend to use this structural information as meta-data in databases. The complex structure of documents, however, makes it difficult to directly map the structural information in documents to database structures. In particular, the flat nature of relational databases makes it extremely difficult to model documents that are inherently hierarchical in nature. Consequently, documents are modeled in object-oriented databases (Abiteboul, Cluet, & Milo, 1993), and object-relational databases (Hoist, 1995), in which SGML documents are mapped into the corresponding database models and are later reconstructed as necessary. However, this mapping strategy is not natural and can potentially cause loss of information in the original SGML documents. Moreover, interfaces for building queries for current document databases are mostly built on form-based query techniques and do not use the 'look and feel' of the documents. This article introduces an implementation method for a complex-object modeling technique specifically for SGML documents and describes interface techniques tailored for text databases. Some of the concepts for a Structured Document Database Management System (SDDBMS) specifically designed for SGML documents are described. A small survey of some current products is also presented to demonstrate the need for such a system."

A Postscript version of the article is available online (also, online abstract); [local archive copy].

See the main document entry for the complete list of articles and contributors, as well as other bibliographic information.



[CR: 19970627]

Sengupta, Arijit; Dillon, Andrew. Query By Templates: A Generalized Approach for Visual Query Formulation for Text Dominated Databases. Technical Report. To appear in the Proceedings of the Conference on Advanced Digital Libraries (ADL'97), Library of Congress, Washington, D.C. May 7-9 1997. []: [], May 1997. Extent: approximately 13 pages.

"Abstract: With the advent of the World Wide Web (WWW), the concept of document databases is becoming more popular. This makes the idea of a globally distributed digital document library realizable. The standard encoding format for the WWW is HTML (HyperText Markup Language), which embeds some structural information in otherwise text-dominated documents. HTML can be viewed as a special instance of SGML (Standard Generalized Markup Language), a very powerful document encoding language capable of describing may different types of languages and formats. The current work is based on designing query languages, processing and visualizing mechanisms for structured documents in general, and SGML documents in particular. We are using the World Wide Web as a platform for this querying mechanism, especially because of its popularity and world-wide availability. However, because of the wide range of users, these systems need to be easy to use. In particular, it is important that users can easily search for information from the database without prior knowledge of the internal structure of the database. This paper outlines a visual query constructing technique for application in databases containing hierarchically structured documents. In this paper, we describe the visual component of this query language, which is essentially a generalization of the Query By Example (QBE) language for relational databases. We call this method ``Query By Templates(QBT)''. Further, we describe the basic properties and usefulness this visual query technique, and show how queries on structured document databases can be performed using this method. We also describe an implementation of QBT on the Web using the Java{TM} programming language."

Available online in PostScript format; [mirror copy].



[CR: 19951122]

Severson, Eric. The Art of SGML Conversion: Eating Your Vegetables and Enjoying Dessert. Avalanche Development Corporation/Interleaf, January 1995. 34K (computer file), ca. 15 pages. Author's affiliation: Executive Vice President, Avalanche Development Corporation [Interleaf]; email: eric@avalanche.com; Tel: (303) 449-5032.

"SGML conversions have a reputation for being worthwhile but not necessarily lots of fun. Much like the problem of having to eat your vegetables before you get dessert.

"SGML conversion typically involves building a bridge between the world of hardcopy and word processing documents (where logical structure is perceived visually by the reader) and "intelligent" documents (where logical structure is explicitly encoded). The whole point of SGML conversions is that they necessarily involve information enrichment, adding more than was originally there.

"This white paper explores the issues involved in moving to SGML and offers advice for making the process as effective and painless as possible. It demonstrates how the steps in the SGML conversion process are directly related to the benefits you get once conversion is complete."

Available on the Internet from the Interleaf/Avalanche WWW server: "The Art of SGML Conversion" [mirror copy November 1995]. Apparently also to be available as an SGML Open White Paper, #4001-II.



Severson, Eric. How SGML and HTML Really Fit Together: A Case for a Scalable HTML. Avalanche Development Corporation/Interleaf, January 1995. 24K (computer file), ca. 8 pages. Author's affiliation: Avalanche Development Corporation/Interleaf; email: eric@avalanche.com; Tel: (303) 449-5032.

Note: Version 2 (April 1995) is available from this WWW server.

This (white) paper was distributed on Newswire, and is available as item 143.1995-01-09 in the Newswire archives, or here. Discussion of the paper took place on the sgml-internet discussion list.



[CR: 19971227]

Severson, Eric. "The Proper Role of SGML and XML in an Enterprise I/T and Intranet Strategy." Pages 513-518 in SGML/XML '97 Conference Proceedings. SGML/XML '97. "SGML is Alive, Growing, Evolving!" The Washington Sheraton Hotel, Washington, D.C., USA. December 7 - 12, 1997. Sponsored by the Graphic Communications Association (GCA) and Co-sponsored by SGML Open. Conference Chairs: Tommie Usdin (Chair, Mulberry Technologies), Debbie Lapeyre (Co-Chair, Mulberry Technologies); Michael Sperberg-McQueen (Co-Chair, University of Illinois). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 691 pages, CDROM; print volume contains author and title indexes, keyword and acronym lists. Author's affiliation: [Eric Severson]: IBM Global Services.

Abstract: "Up to now SGML has tended to be used primarily in technical publishing applications, usually at a departmental level. However, with today's focus on web-based enterprise information management, and the recent introduction of XML, many more opportunities for SGML have become apparent. This whitepaper surveys the current state of the information industry, from both a business and technical point of view, and shows how SGML and XML technology can and should be positioned within an organization's overall I/T and intranet strategy."

This paper was delivered as part of the "Business Management" track in the SGML/XML '97 Conference.

Note: The SGML/XML '97 conference proceedings volume is available from the Graphic Communications Association, 100 Daingerfield Road, Alexandria, VA 22314-2888; Tel: +1 (703) 519-8160; FAX: +1 (703) 548-2867. Complete information about the conference (e.g., program listing, tutorials, show guide, DTDs, conference reports) is provided in the dedicated conference section of the SGML/XML Web Page and via the GCA Web server. The electronic proceedings on CDROM were produced courtesy of Jouve Data Management (Jouve PubUser).



[CR: 19960904]

Severson, Eric; Bingham, Harvey (editors). Table Interoperability: Issues for the CALS Table Model. SGML Open Technical Research Paper 9501:1995. Coraopolis, PA: SGML Open, November 21 1995. Extent: approximately 25 pages. Authors' affiliation: [Eric Severson] Co-chair, Table Interchange Subcommittee, SGML Open; [Harvey Bingham] Co-Chair, Table Interchange Subcommittee, SGML Open; also: Interleaf.

"Abstract: To help address the existing interoperability issues when using tabular material ("tables") in SGML implementations, SGML Open's Technical Committee formed a Table Interchange subcommittee to research these issues.

"Because the CALS table model has proliferated widely, it was chosen as the initial starting point. Although it has evolved to the point of a de facto standard, the specification leaves a large number of semantics open to interpretation which in turn has made interoperability difficult to achieve. As its first major task, the Committee therefore set out to identify and document ambiguities in the CALS table model specifications, identify and document related interoperability issues between SGML Open vendor products, and lay the groundwork for developing a proposed clarification of the standard that will minimize ambiguity and maximize interoperability.

"This paper summarizes the results of this initial work, identifies the sources of current interoperability issues for the CALS model, and summarizes the most common set of practices currently followed by SGML Open vendors."

Available in HTML format: SGML Open - TRP 9501:1995 - "TABLE INTEROPERABILITY: Issues for the CALS Table Model" [mirror copy, December 28, 1995]. Also available from the FTP server at Exoterica Corporation in compressed PostScript format, ftp://ftp.exoterica.com/sgmlopen/9501/9501.ps.Z [mirror copy], or in other formats (files: 9501pack.tar.Z, 9501pack.zip, 9501ps.zip). Document revisions: Technical Research Paper 9501:1995; Committee Draft: 1995 May 10; Committee Draft: 1995 August 5; Final Draft Technical Research Paper: 1995 September 15; Final Technical Research Paper: 1995 November 21.



[CR: 1995]

Sévigny, Martin. Conception et réalisation d'une interface-utilisateurs pour l'interrogation de bases de documents structurés. Travail dirigé présenté à la Faculté des études supérieures de l'Université de Montréal. Quebec: Faculté des études supérieures de l'Université de Montréal, 1996. Advisor: Yves Marcoux. Affiliation: École de bibliothéconomie et des sciences de l'information (EBSI), de l'Université de Montréal, Québec, Canada.

See the summary for the thesis in French: Conception et réalisation d'une interface-utilisateurs pour l'interrogation de bases de documents structurés. Travail dirigé présenté à la Faculté des études supérieures de l'Université de Montréal, par Martin Sévigny. Pour l'obtention de la Maîtrise en bibliothéconomie et en sciences de l'information de l'EBSI.



[CR: 19970531]

Sévigny, Martin; Marcoux, Yves. "Conception et réalisation d'une interface-utilisateurs pour l'interrogation de bases de documents structurés [The Creation and Evaluation of a Human-computer Interface for Information Retrieval in Structured-document Bases; in French]." Revue canadienne des sciences de l'information et de bibliothéconomie [Canadian Journal of Information and Library Science] 21/3-4 (September-December 1996) 59-77 (with 23 references). Authors' affiliation: École de bibliothéconomie et des sciences de l'information (EBSI), de l'Université de Montréal, Québec, Canada. WWW [Sévigny]: http://www3.sympatico.ca/msevigny/; WWW [Marcoux]: http://tornade.ere.umontreal.ca/~marcoux/.

Abstract: "The creation of electronic information in the form of structured documents is steadily gaining popularity. It is thus necessary to develop information retrieval tools fitted to this type of document. In this article, we present the results of a research project aimed at identifying human-computer interface elements that can support information retrieval in structured document bases. The research included a review of the literature and of existing systems, as well as the design, development, and user testing of a prototype information retrieval system for SGML (ISO 8879) document bases. We make five recommendations for the design of structured-retrieval systems."

See the summary for the thesis in French: Conception et réalisation d'une interface-utilisateurs pour l'interrogation de bases de documents structurés. Travail dirigé présenté à la Faculté des études supérieures de l'Université de Montréal, par Martin Sévigny. Pour l'obtention de la Maîtrise en bibliothéconomie et en sciences de l'information de l'EBSI.



[CR: 19960313]

Seybold, Jonathan. "Remembering Yuri Rubinsky (1952-1996) [In memoriam: Yuri Rubinsky]." Seybold Report on Publishing Systems 25/11 (February 29, 1996) 20. ISSN: 0736-7260.

The full-page article includes a photograph of Yuri Rubinsky. He was familiar to Seybold readers mostly in connection with his role as cofounder of SoftQuad, Inc. The article summarizes the highlights of Yuri's career in the publishing industry, including annotation of his major publications. It also includes eulogies from several friends and colleagues (Jonathan Seybold, Charles Goldfarb, Tim Berners-Lee, Chet Ensign, Tim Bray, Tommie Usdin, Elaine Brennan, and Pam Gennusa). For other memorial tributes to Yuri, see the collection elsewhere in this database.



[CR: 19971008]

[Seybold Publications Staff]. "Inso Adds Math to DynaWeb. IEEE Uses it to Go Live with Online Digital Library." Seybold Report on Internet Publishing 2/2 (October 1997) 28. ISSN: 1090-4808.

"Inso recently beefed up the math support in DynaWeb, its SGML-to-Web publishing system, by enabling Web browsers to display mathematical equations stored in DVI (TeX) format."

Representing and rendering math has always been a special challenge. The DynaWeb technology described in the article is featured in the IEEE (Institute of Electrical and Electronics Engineers, Inc.) Computer Society Digital Library (CSDL). The online database introduction explains: "For those interested in the publication technology, we have created a database of SGML files and linked images. These files are converted and displayed as HTML on the fly. This allows subscribers to manipulate and view the content -- including math -- with standard web browsers without any helper applications or plug-ins." According to the Seybold article, the online IEEE collection is an "SGML-encoded text library [which] offers the equivalent of 35,000 periodical pages and more than 250,000 images. With the new [DynaBase] 3.1 software, the Web database automatically converts DVI math equations into GIF images on the fly as articles are served to visitors...DynaWeb is the first commercial product to generate GIFs from DVI math equations on the fly."

Note: DynaWeb has been chosen by SOFTBANK NetForums as part of a dynamic Web publishing solution. See the press release, [archive copy].



[CR: 19971008]

[Seybold Publications Staff]. "Microsoft, Inso, ArborText Propose Style Sheet Language for XML." Seybold Report on Internet Publishing 2/2 (October 1997) 19. ISSN: 1090-4808.

The article describes the proposal for a style sheet language XSL (Extensible Style Language), comparing it to CSS and DSSSL. See the database entry for Extensible Stylesheet Language (XSL) for further description.



[CR: 19970329]

[Seybold Publications Staff]. "Netscape Replies on XML [editorial]." Seybold Report on Internet Publishing 1/5 (January 1997) 2. ISSN: 1090-4808.

The article summarizes a response from Netscape Communications Corporation to the Seybold offices regarding Netscape's disposition on XML (Extensible Markup Language). Netscape clarifies that it is not currently [ca. January 1997] working on SGML or XML for two reasons: (a) customers have spoken in favor of HTML over SGML, so Netscape believes these customers are better served by improving HTML functionality than by adopting XML; (b) XML can theoretically be supported indirectly via XML-compliant Netscape plugins: "...it is possible and quite straightforward to incorporate SGML-based layout engines into the Netscape Navigator and Netscape Communicator environments using inline plug-ins or Java. This makes it possible for the 40 million users of Netscape Navigator to access XML information today."

According to the article, Alex Edelstein, Netscape's Group Product Manager, also expressed the opinion that the need to deal with user-defined tags and styles (as XML does) could be met via JavaScript: "...JavaScript is rapidly being accepted as a great means to provide user-defined flexibility. User-defined variables can be created dynamically and passed between client and server in this way."

Editorially, the article expresses doubt as to whether Netscape is "up to speed on publishing issues," at least, with respect to Seybold's readership: "that product managers at Netscape think HTML somehow will work as a way to define all the Web documents that we need, or that SGML's main benefit is better layout, illustrates just how far behind Netscape will be should Microsoft decide to leverage its expertise in SGML and XML."

See more on the Extensible Markup Language in the SGML/XML Web Page main entry.



[CR: 19971121]

[Seybold Publications Staff]. "Seybold San Francisco '97: PDF and XML Emerge [Alternate title: 'Shaping the Future: PDF, XML and the Men of the Hour, Gates and Jobs']." Seybold Report on Publishing Systems 27/5 (November 17, 1997) 1, 3-38. ISSN: 0736-7260.

"By most calculations, the two areas of sharpest focus at Seybold San Francisco 97 were PDF, which increasingly is moving into a role as the format for production workflow, and XML, which is picking up support as a standard for tagging documents intended for use in multiple-media environments." Sections in this issue of Seybold Report on Publishing Systems covering SGML/XML include: "Asset Management, SGML and Database Publishing" (pages 29-38), beginning with: "The boundaries between asset management and document management are starting to blur. So are the boundaries between SGML publishing tools and database-publishing tools." The section "Publishing with SGML" (pages 33-36) provides an update on ArborText's Willow and XML support, Chrystal's Astoria 3.0, Datalogics' Documentum-to-Frame solution, I4I S4-Desktop, PIT's Target 2000, and XyVision's support for FrameMaker+SGML.



[CR: 19971120]

[Seybold Publications Staff]. "XML, Collaborative Tools Shine at Seybold San Francisco '97 [alt. title: 'XML, Content Management Take Center Stage at SSF '97']." Seybold Report on Internet Publishing 2/3 (November 1997) 1, 3-19. ISSN: 1090-4808.

Abstract: This "Special Report" feature article in Seybold Report on Internet Publishing describes the rapidly-changing world of Internet tools and standards that bear on Web publishing, and particularly, the role of XML within the W3C's suite of Internet recommendations. "Seybold San Francisco '97 will be remembered as the first major conference and trade show where XML entered the mainstream vocabulary. It was the buzz of the conference and a draw on the show floor. The demo of XML support in Internet Explorer 4 was one of the highlights of Bill Gates's keynote address on Wednesday."



[CR: 19971120]

[Seybold Publications Staff]. "XML Comes into the Limelight." Seybold Report on Internet Publishing 2/3 (November 1997) 4-5. ISSN: 1090-4808.

Summary: The article describes the growing support for XML, as evidenced in the SSF presentations by Bill Gates (Microsoft), John Warnock (Adobe), and John Gage (Sun Microsystems). Bill Gates is quoted as saying "XML is important because you won't be able to afford to author for all of the screen form factors and interface techniques." Steve DeRose (chief scientist at Inso) is quoted as saying that the 'quiet revolution' (SGML now emerging in XML) "is no longer quiet, but boisterous, productive, and growing at Web speed."

This is a subsumed article in the longer feature article of the Special Report in this issue: "XML, Collaborative Tools Shine at Seybold San Francisco '97."



[CR: 19961213]

[Seybold Staff]. "W3C Publishes Draft of Simplified SGML. XML Allows User-definable Tags." Seybold Report on Publishing Systems 26/6 (November 30, 1996) 41. ISSN: 0736-7260.

"On the tenth anniversary of the adoption of SGML as an ISO standard, a band of SGML experts announced they have drafted a simplified subset of the language they hope will spur the use of SGML on the Internet. The new language, Extensible Markup Language, or XML, was prepared by a World Wide Web Consortium working group consisting of about 80 members, primarily representing vendors. The announcement was made at SGML '96, being held in Boston this week. The first published draft is available on the Web at http://www.w3.org/pub/WWW/TR/WD-xml-961114.html. XML, like SGML, is a meta-language for describing the markup of different types of documents. It is simpler than SGML, reducing a 500-page reference to 26 pages. Unlike HTML, which has a fixed (albeit changing) set of tags, XML lets you define your own tags and attributes." [extracted] See the main entry for XML in the SGML/XML Web Page for additional information.



[CR: 19961113]

Seybold Publications. Seybold San Francisco '96. Part III: Color Publishing, Page Composition and Hardware. Seybold Special Report, Volume 5, Number 4. Media, PA: Seybold Publications, October 28, 1996. ISSN: 1069-7217.

The Seybold San Francisco '96 Conference was held at the Moscone Convention Center, San Francisco, September 13-17, 1996. The Seybold Special Report Series (3 parts) covered the conference. Part III of the Seybold Special Report 5/4 includes a section entitled "Page Layout Software, SGML Systems and Other Aids to Publishing" (pages 31-39). Featured SGML software systems include: (1) "Document management for ArborText Adept" [The 'Willow Initiative' which places software between the editor and the document manager for the purpose of managing small document 'objects']; (2) "Autographics tackles library automation"; (3) "I4I offers on-the-fly SGML conversion" [delivering SGML documents to users lacking SGML software]; (4) "Microstar pursues 'Mainstream SGML'" [marketing initiative with Documentum, InfoAccess and Adobe to help SGML penetrate business environments by "making it simple for authors to create and maintain SGML-savvy documents"]; (5) "Passage Systems shows custom system" [PassageNet]; (6) "Xyvision sees market in telecommunication" [TEDD and TIM DTDs]. In Parts I and II of the SSF '96 report, coverage is given to HTML/SGML products for Web publishing (SoftQuad HIP and HoTMetaL; Electronic Book Technologies' DynaText (Matterhorn), DynaBase, and DynaWeb 3.0); the other volume titles are: Part I: Overview of the Show and Publishing on the Internet, and Part II: Output Technology and Workflow Developments.



Seybold Publications. Seybold Seminars Boston '95 [March 28-31, 1995. Hynes Convention Center, Boston, MA]. Part I: Electronic Delivery, SGML Issues, Catalogs and Output. Seybold Special Report, Volume 3, Number 8. Media, PA: Seybold Publications, April 21, 1995. ISSN: 1069-7217.

SGML was a major theme at Seybold Seminars once again, and details are available in the two Special Report issues. Part II is less relevant, being dedicated to images and color (Volume 3, Number 9: Seybold Seminars Boston '95 [March 28-31, 1995. Hynes Convention Center, Boston, MA]. Part II: Managing Color; Image Input, Editing and Output; Page Makeup, Etc.). The issue title for Part I includes "SGML", which is becoming more popular in light of widespread acquaintance with HTML. The volume Table of Contents for Part I (much abbreviated) is: I. Introduction. Electronic Publishing: Moving Past Fear and Greed to Commercial Realities (3-7); II. Publishing on the Internet: Strategies and Tools (8-28); A. Tools for Creating Web Pages; B. Other Electronic Delivery Tools [EBT Deals with Phoenix]; III. New Tools for Managing and Writing SGML Documents (29-39); A. SGML-Based Document Management Tools; B. SGML Authoring Tools; IV. Catalog Production Systems (40-43); V. Imagesetters, Platesetters and Digital Presses (44-66).

The section "Tools for Creating Web Pages" includes the following major presentations: (1) InContext's Spider [Web authoring program based upon InContext 2 SGML editor]; (2) NaviSoft [HTML and Webs authoring]; (3) Archetype [HTML viewer supporting multiple views]; (4) SoftPress' Uniqorn; (5) SoftQuad HoTMetaL [support for HTML 3.0] and SoftQuad Panorama [SGML browsing over the Internet]; (6) Electronic Book Technologies [DynaBase, SGML to HTML conversion on the fly]; (7) Open Text [WWW Indexing].

The section "Other Electronic Delivery Tools" includes a significant story under the title "EBT Deals With Phoenix" (pages 21-23; see also the graphic of the virtual digital library on page 7). Phoenix Publishing Systems Inc., a spinoff company from Phoenix Technologies, which produces documentation for some 40% of PCs shipped worldwide, has contracted with EBT [Electronic Book Technologies] to create online virtual "digital libraries" storing PC documentation. Virdox (the virtual documentation information system) supports advanced concepts in document versioning, including multilingual versions and multi-vendor versions. Other stories under "Other Electronic Delivery Tools" are: (1) "Frame Adds Olias [SGML browser], drops R&D"; (2) "Ntergaid's HyperWriter 4.2" [with SGML import facilities]; (3) "Open Text Latitude for delivery, retrieval" [Release 5 of PAT used for managing the display of a broad range of text formats, incorporating Panorama for SGML display].

Under "SGML-Based Document Management Tools" this Seybold Special Report includes description, evaluation, and references for the following products: Auto-Graphics [Smart Editor]; Berger-Levrault/AIS [SGML/Store]; CTMG (or Active Systems) [ActiveSearch]; Documentum [Enterprise Document Management System]; EBT - Electronic Book Technologies [DynaBase]; Ferntree [Structured Information Manager]; Frame [Frame SGML Toolkit]; InfoDesign [WorkSmart]; IDI - Information Dimensions [Basis SGMLServer]; Interleaf [Relational Document Manager]; Odesta [LiveLink]; Texcel [Information Manager]; XSoft [Astoria]; XyVision [Parlance Document Manager].

The section "SGML Authoring Tools" includes reviews of four products in particular: (1) ArborText plans Internet Addition; (2) Frame improves style, attributes handling; (3) Microstar improves on [SGML Author for] Word; (4) WordPerfect prepares SGML Edition.



Seybold Publications. Seybold San Francisco '94 [September 13-16, 1994]. Part I: Electronic Document Delivery and Output Issues. Seybold Special Report, Volume 3, Number 2. Media, PA: Seybold Publications, October 10, 1994. ISSN: 1069-7217.

The abbreviated Table of Contents: Introduction: Publishing on the Net Sparks Industry Resurgence (1-7); Electronic Document Delivery (8-29); Internet and Online Publishing (10-15); Tools for Internet Publishing (16-20); Fonts for Electronic Documents (20-21); Delivering Documents Through Digital Media (22-27); Digital Ad Delivery: Ready to Move Ahead (27-29); Output Issues (30-67).

The issue contains an in-depth discussion of the implications of HTML for the advance of SGML. There is a short presentation "HTML: Becoming an SGML Application" (14-15). SGML tools for the Web are treated in a discussion of three products: "EBT's DynaWeb Server" (16-17); "HaL Browser Shows SGML, HTML" (17); "IDI Adds Web Service to BasisPlus" (17-18). HTML authoring tools are treated in: "Tools for Making HTML" [Nice TagWizard; SoftQuad HoTMetaL; Free Tools] (20).

The section "Software for Delivering Document Collections" (24-27) includes discussion of SGML's role in several products: Bellcore's SuperBook [SGML import]; Folio support for SGML [SGML to flat-file conversion]; IBM's upgraded BookManager [migration to SGML support from underlying GML]; Sun Microsystems [replacing PostScript-based Answerbook documentation reader with SGML-based documentation using Electronic Book Technologies' DynaText and developer toolkit].



Seybold Publications. Seybold San Francisco '94 [September 13-16, 1994]. Part III: Composition, Font Issues, Platforms, SGML and Other Topics. Seybold Special Report, Volume 3, Number 4. Media, PA: Seybold Publications, October 31, 1994. ISSN: 1069-7217.

The abbreviated Table of Contents: Introduction: Text Composition, Page Layout, Font Issues, and Newspaper Systems (3); Composition Systems and Software (4-12); Newspapers and Magazines (13-20); Xtensions and Additions (21-25); Fonts: New Technology Keeps the Fires Burning (26-30); SGML Coming Into the Mainstream (31-37); SGML Tools: Microsoft Into the Act (32-35); Other Authoring Tools (35-38); Other Document Conversion Services and Tools (37); The Great Platform Debate Continues (38-50).

The special coverage of SGML publishing tools (pages 31-37) includes a major discussion of Microsoft's SGML Author for Word, including companion products for Author. The companion products include Avalanche SureStyle [conversion of text with direct formatting into SGML constructs, additional processing of tables, cross-references, OLE embedding information] and SoftQuad Enactor [cleans up SGML errors in Author, including support for SGML constructs not implemented in Author]. Other SGML products reviewed include: SoftQuad Enabler [SGML support for QuarkXPress], SoftQuad Explorer [SGML browser], SoftQuad HoTMetaL [HTML editor], ArborText's PowerPaste [SGML import facility] and Adept Electronic Review [SGML-based document review tools]; Auto-Graphics' Smart Editor version 5 [SGML-based editorial system]; Frame SGML Toolkit; Nice Technologies' TagWizard [Microsoft Word SGML tagging tool] and AIMS [IETM preparation software]. Also noted: DCL (Data Conversion Laboratories) product SGMLView [conversion tool] and Exoterica OmniMark 4.2 [SGML conversion/translation facilities].



[CR: 19950925]

Seybold Publications. Seybold Special Report. Show Preview: Seybold San Francisco. Seybold Report Editors Name Their Hot Picks. Seybold Special Report Volume 4, Number 1. Media, PA: Seybold Publications, September 26, 1995. ISSN: 1069-7217.

This issue of the Seybold Special Report is dedicated to a preview of publishing software (including SGML products) to be exhibited at the Seybold San Francisco 1995 show. Some SGML highlights include: ArborText (Adept Editor); Auto-Graphics (Smart Editorial System 5.3, and Impact [SGML document searching]); Electronic Book Technologies (EBT) DynaWeb 1.0, figleaf, and WebTap (HotJava Applet); Exoterica OmniMark release 2.5; FrameMaker+SGML; Infrastructures for Information (I4I), SGML DLL toolkit; Novell WordPerfect 6.1 for Windows, SGML Edition; Passage Systems' PassageHub (SGML conversion tool based upon Exoterica Corporation's OmniMark) and PassagePro; XSoft Astoria; SoftQuad's SGML Enabler. See the main conference and exposition entry for further details, or the Seybold Publications main entry.



[CR: 19951209]

Seybold Publications. Seybold San Francisco '95. Part III: Color Workflow, Image Databases, Page Layout, SGML, Other Topics. Seybold Special Report, Volume 4, Number 4. Media, PA: Seybold Publications, November 10, 1995. ISSN: 1069-7217.

One section of this Special Report is dedicated to the SGML scene visible at SSF '95. The section is titled "Stability and Growth for SGML Market" (pages 31-34). Subsections cover several new releases and announcements:

  • Arbortext expands Windows authoring functionality: Adept smart tag insertion, user control over screen formatting, generation of HTML from SGML, support for more graphics data types; DTDs for HTML 2.0, Docbook 2.2.1, ATI Article, and ATI Book
  • Auto-Graphics: records are not just for reference books: use of the SGML Smart Editorial System 5.3 for maintenance manuals
  • FrameMaker+SGML nears release: better table handling
  • I for I shows SGML services toolkit: SAS (SGML application server)
  • Microstar completes Word add-on: Near & Far Author (Word 6.0 SGML add-on)
  • Miles eases SGML composition: Genera composition facilities
  • Passage Systems shows how to search SGML files on the Web: presentation of the SGML Search Engine and PassageHub (SGML filters)
  • Xyvision readies new PDM version: version 2.3 of Parlance Document Manager to debut at the November Documation conference



[CR: 19960409]

Seybold Publications. Seybold Seminars Boston '96 [February 27 - March 1, 1996. Boston, MA]. Part I. Seybold Seminars Boston '96: When Worlds Collide. Seybold Special Report, Volume 4, Number 8. Media, PA: Seybold Publications, March 25, 1996. ISSN: 1069-7217.

This issue of the Seybold Special Report contains part 1 of three parts covering Seybold Seminars Boston '96: "State of the Industry, Internet Publishing, and Color Output." Several articles and notices in the report provide updates on SGML products and publishing trends that are impacted by SGML. Samples: "Seybold Editors' Awards: Electronic Book Technologies, for DynaWeb" (p. 8); "EBT Shows Netscape Plug-ins," (p. 18); "Jouve Releases GTI Publisher" (p. 20).



[CR: 19970726]

[Seybold Publications Staff]. "Grif Commits to XML Editing Tool." Seybold Report on Internet Publishing 1/9 (May 1997) 37. ISSN: 1090-4808.

[Summary] "...Grif has stepped forward as the first vendor to commit to developing an XML authoring tool. It will be receiving help from Cadmus, one of the largest U.S. suppliers of services to journal publishers... The early adoption of XML is a further indication that Grif intends to leverage its SGML expertise in the wider market of Web authoring. Grif also announced plans to open an office in Boston, its first in the U.S... At the WWW '97 conference held last month in Santa Clara, CA, Grif announced its intention to adopt XML in its product line. It previewed XML support in both its SGML Editor and Symposia, its HTML authoring tool. Symposia, based on Grif's WYSIWYG SGML Editor, provides both WYSIWYG and tag-based editing, with tag validation. It runs on both Windows and Unix platforms."



[CR: 19970815]

[Seybold Publications Staff]. "Web Publishing Systems Struggle for Identity. Seybold Seminars '97. Getting Down to Brass Tacks. SGML and the Web." Seybold Report on Internet Publishing 1/9 (May 1997) 23-24 [1-24]. ISSN: 1090-4808.

A comprehensive report on New York Seybold Seminars '97 is provided by Peter Dyson, Matt McKenzie, Victor Votsch, and Mark Walter. The article concludes with a section on SGML and the Web, covering products from Agave, Inso, and SoftQuad: "Agave links SGML and SQL", "Inso serves books through DynaWeb", and "SoftQuad: success with Panorama". Agave's SQml extensions to SGML are implemented in an SQml-CGI server, and link documents and legacy databases. Inso still has a large market share in the realm of serving electronic books (via DynaWeb), providing control down to the SGML element level. "The DynaWeb server is now client-aware, meaning it will serve html differently, depending on styles you set for different types of browsers. For example, DynaWeb can generate CSS style sheets for HTML documents on the fly." SoftQuad Inc. has released the SoftQuad Panorama Publishing System, and is marketing it for SGML-based document delivery on intranets. "SoftQuad reports installations at Hitachi, the U.S. Department of Defense and one of 35,000 seats at BellSouth. SoftQuad also recently got its Panorama Publishing System included on the U.S. Government's GSA schedule, making it easier for government agencies to order the system."

[Summary:] "For those who have encoded their text in SGML the Web remains an economical output medium. New York was our first opportunity to see Agave, which marries SGML to relational databases. We also looked at new versions of two different, and established, display options: DynaWeb and Panorama."



[CR: 19960826]

SGML Open. "SGML in Education: The TEI and ICADD Initiatives." Computers in Libraries 16/3 (March 1996) 26-28.

"Abstract: SGML Open is a group promoting adoption of the Standard Generalized Markup Language for exchange of data and documents as the international standard. Two groups working in the academic field to adapt and use SGML are the Text Encoding Initiative and the International Committee for Accessible Document Design. TEI uses SGML to encode literary and historical texts and ICADD makes them accessible to blind researchers and other impaired students. Both initiatives are discussed."

[Another abstract:] "SGML Open [http://www.sgmlopen.org/] is a consortium dedicated to promoting the use of SGML, an ISO standard for data encoding that enables value-added, reusable, platform-independent documents. This article highlights two international efforts which are using SGML. TEI (Text Encoding Initiative) provides guidelines for encoding literary and historical texts. The TEI guidelines are meant to be flexible and scalable, able to accommodate any body of text and delimit salient features with markup, adding intelligence and meaning. ICADD (International Committee for Accessible Document Design) focuses on making textbooks available in formats such as Braille, large print and voice synthesis. SGML encoding not only provides structured access to documents that could otherwise be unavailable, but also makes that access more democratic." [-- CJC Campbell Crabtree in Current Cites Volume 7, no. 4, April 1996, published by The Library, University of California, Berkeley; ISSN: 1060-2356.]



[CR: 19960310]

The SGML University Board of Regents. SGML Power Tools. Net-Virtual Location in Cyberspace [probably Denver, Colorado or Rochester, New York]: SGML University Press, 1995. ISBN: 0-9649602-0-6.

Abstract: "A CD-ROM full of information, applications, software demonstrations, and other resources needed to get started using SGML. The top companies in the SGML industry provide information about their products. Some have included demonstration software or outright free software on the disc. Also, the world's first Multimedia SGML Tutorial (number one in a series of five) is on the disc."

"SGML University is making this disc free for legitimate users who need to know more about SGML. To get your free copy, send an e-mail mesage telling us about your interest in SGML. Be sure to include your address and phone number. Qualified respondents will receive the disc immediately."

See further information on the SGML University WWW page.



International SGML Users' Group. "A Brief History of the Development of SGML." 3 June 1989. 2 pages.

The publication is available from the SGMLUG office as a separate document, and is printed in the SGML Users' Group Newsletter 14 (October 1989) 6-7. Being free of copyright restrictions, it is also published elsewhere: (1) The SGML Handbook, cited here, Appendix A: pp. 567-570; (2) The SGML Source Guide, also cited; (3) Joan Smith's Book on SGML and Related Standards, Appendix 1.



"SGML Open Establishes SGML/Internet Link." <TAG> 7/11 (November 1994) 5. ISSN: 1067-9197.



[CR: 19950716]

SGML Project (Exeter). What is SGML and Why Should I Use It? Exeter, UK: SGML Project, 1993. Extent: approximately 4 pages.

This brief document provides an excellent overview of SGML using non-technical language. It is available online from the University of Exeter WWW server: see the link to Exeter, or fetch the document in mirror copy from the local server. It was probably written by Michael G. Popham and/or Paul A. Ellison, both of whom are to be praised for their fine work in administering the SGML Project as long as funding was available to them. [Possible] contact: Paul A. Ellison, email P.A.Ellison@ex.ac.uk; Deputy Director, University of Exeter IT Services; Laver Building; North Park Road; Exeter EX4 4QE, UK; Tel: (+44) 1 392 263951; Fax: (+44) 1 392 211630.



"SGML Tips & Techniques: Using Noun and Verb Tags to Effect Proper Hyphenation." <TAG> 8/3 (March 1995) 10. ISSN: 1067-9197.

[Bibliographer's note:] The article is significant from the academic point of view in that the "tip" and its motivation arose (apparently) within the business sector -- not, as one might have guessed, within the TEI. How do we achieve proper (automatic) hyphenation when the hyphenation rule depends upon the word's part-of-speech attribute? The author suggests that for homographs like the English word "project" we might use the following kind of SGML tagging (e.g., when the word is a noun and not a verb): <word type='noun'>project</word>. According to the tip's proposal, "The SGML application can then take the words inside the WORD tag and pass appropriate instructions to the composition engine for proper hyphenation." In the case of the verb, we want 'pro-ject' while the noun is to be hyphenated 'proj-ect'. Whatever the merits of the proposal as a practical solution, it highlights an observation that has been made frequently within the segment of the academic community that has seen the value of SGML: text processing will fail if it treats textual data as a String rather than as Text Objects with linguistic attributes. From an Object perspective, the markup is correct in two ways: it delimits and names the text object ("word"), and it describes a real feature of the object in context ("noun"): knowledge from another domain is brought to bear when the word-noun needs to be hyphenated. This high-level strategy, from an Object point of view, represents an improvement over the procedural encoding that one finds in some word processing systems: control characters or other special characters to encode discretionary hyphens. Of course, SGML markup is just one way to represent information about text objects like "words" such that proper processing is effected.
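[Illustration:] The tip can be made concrete with a minimal sketch, assuming a simple lookup keyed on the word and its part of speech. The table `HYPHENATION`, the pattern `WORD_TAG`, and the function `hyphenate_tagged` are hypothetical names introduced here; only the two hyphenations of "project" come from the note above. A real composition engine would of course receive instructions rather than rewritten text.

```python
import re

# Hypothetical lookup table: (word, part of speech) -> hyphenated form.
# Both entries are taken from the "project" example in the note above.
HYPHENATION = {
    ("project", "noun"): "proj-ect",
    ("project", "verb"): "pro-ject",
}

# Matches the kind of tagging shown in the tip: <word type='noun'>project</word>
WORD_TAG = re.compile(r"<word type='(?P<pos>[^']+)'>(?P<text>[^<]+)</word>")


def hyphenate_tagged(text):
    """Replace each tagged word with its part-of-speech-specific hyphenation."""
    def replace(match):
        key = (match.group("text").lower(), match.group("pos"))
        # Fall back to the plain word when no rule is known for this pair.
        return HYPHENATION.get(key, match.group("text"))
    return WORD_TAG.sub(replace, text)


sample = ("The <word type='noun'>project</word> will "
          "<word type='verb'>project</word> a loss.")
print(hyphenate_tagged(sample))
```

The point of the sketch is the one the note makes: the string "project" alone cannot select a hyphenation, whereas the tagged text object can.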



[CR: 19960730]

Shafer, Keith. Creating DTDs via Fred. Paper presented at Digital Libraries Workshop 1996, Organized by Nancy Ide and Judith Klavans, Held in conjunction with the First ACM International Conference on Digital Libraries, Bethesda, Maryland. Poughkeepsie, New York / New York, NY: Vassar College, Department of Computer Science / Columbia University, Department of Information Services, 1996. Author's affiliation: OCLC Online Computer Library Center, Inc., 6565 Frantz Road, Dublin, Ohio 43017-3395. Email: shafer@oclc.org.

Abstract: "In this paper, we motivate and describe tools we have built to automatically create reduced structural representations of tagged text. These tools are novel in that they let one use the basic tenets of SGML without creating DTDs by hand." [Abstract]

"While the TEI Guidelines and corresponding DTD work provide a good framework from which to tag text, it is possible that the application of these guidelines may result in the creation of documents with no corresponding DTD. When this happens, mechanisms need to be in place to help them generate the appropriate DTDs. This does not imply that the TEI work is incomplete or non-extensible, only that it is difficult to provide a single framework (or set of DTDs) that covers all electronic sources. Many people now know how to tag documents and they may even follow the TEI Guidelines, but some will make mistakes or need to extend the model." [extracted]

The document is available online; [mirror copy]. See the main workshop entry or the program listing for other workshop details.



Shafer, Keith E. "SGML Grammar Structure." Annual Review of OCLC Research, July 1992 - June 1993 ? (1993) 39-40. Senior Research Scientist, Online Computer Library Center, Inc. (OCLC).



[CR: 19951207]

Shafer, Keith. Creating DTDs via the GB-Engine [General Grammar Builder] and Fred. Paper presented at SGML '95. Dublin, Ohio: OCLC Online Computer Library Center, Inc., 1995. Extent: approximately 14 pages. Author's affiliation: OCLC.

"Abstract: In this paper, we motivate and describe tools we have built to automatically create reduced structural representations of tagged text. These tools are novel in that they let one use the basic tenants of SGML without creating DTDs by hand."

Available on the Internet: Creating DTDs (SGML '95) [mirror copy, December 1995]. See the OCLC Fred entry or the OCLC Fred Home Page for other details.



Shafer, Keith E. "Manipulating Tagged Text." In Part 1: OCLC Project Reports, Annual Review of OCLC Research, 1994. Dublin, OH: OCLC Online Computer Library Center, 1995. approximately 3 pages. Author Affiliation: Senior Research Scientist, OCLC.

"Abstract: While the Standard Generalized Markup Language (SGML) is intended to offer freedom from vendor-dependent data, it is difficult to translate arbitrary SGML into multiple output formats. To address this problem, we have incorporated translation capabilities into the SGML Grammar Builder project."

Available via the Internet on the OCLC WWW server. [mirror copy, text only]



Shafer, Keith E. "Translating Mathematical Markup for Electronic Journals." In Part 1: OCLC Project Reports, Annual Review of OCLC Research, 1994. Dublin, OH: OCLC Online Computer Library Center, 1995. approximately 4 pages. Author Affiliation: Senior Research Scientist, OCLC.

"Abstract: While there is now an international standard for mathematical markup, no systems produce formatted documents from the complete standard. This report describes how mathematical markup is translated at OCLC."

"OCLC's Electronic Journals Online (EJO) provides a typeset quality display of journal articles via the Guidon document viewer. Guidon formats files coded in the TeX typesetting language to produce online pages. EJO accepts, however, source documents marked up via the Standard Generalized Markup Language (SGML); thus, source documents must be translated to TeX to produce displayable files. To facilitate this translation to TeX, we added translation capabilities to the SGML Grammar Builder interpreter, Fred. (See "Manipulating Tagged Text" for an overview of Fred's translation processes.) The goal of this project was to use Fred to translate the set of tagged structures that comprise the international standard for SGML mathematical markup (found in ISO 12083) to TeX for use in EJO." [extracted]

Available via the Internet on the OCLC WWW server. [mirror copy, text only]



[CR: 19970312]

Shepherd, Michael A.; Watters, Carolyn R.; Burkowski, Forbes J. "Digital Libraries for Electronic News." Pages 55-62 in Digital Libraries: Research and Technology Advances. ADL '95 Forum. Selected Papers. Forum on Research and Technology Advances in Digital Libraries, ADL '95. McLean, Virginia, USA, May 15-17, 1995. Sponsored by NASA. Edited by Adam, Nabil R.; Bhargava, Bharat K.; Halem, Milton; Yesha, Yelena. Lecture Notes in Computer Science, volume 1082. Berlin/Heidelberg, Germany: Springer-Verlag, 1996. ISBN: 3-540-61410-9. ISSN: 0302-9743. Authors' affiliation: [Shepherd:] Department of Mathematics, Statistics, and Computer Science, Dalhousie University, Canada, Email shepherd@cs.dal.ca and Web: Dalhousie's multimedia news research - http://bcr2.uwaterloo.ca/dal/; [Watters:] Jodrey School of Computer Science, Acadia University, Canada, Email cwatters@dragon.acadiau.ca; [Burkowski:] Department of Computer Science, University of Waterloo, Canada, Email fjburkow@plg.uwaterloo.ca.

Discussion of electronic news evaluates the "semantic attributes of news items" specified in the Universal Text Format (UTF). This standard, which used SGML encoding, was established by industry bodies as NITF (the News Industry Text Format). The paper was presented in the conference as part of the session "Visualization in Digital Libraries." The document is available electronically on the Internet in Postscript format; [mirror copy]



[CR: 19980430]

Thompson, Henry S.; Anderson, A. H.; Bader, M. "Publishing a Spoken and Written Corpus on CD-ROM: The HCRC Map Task Experience." Pages 168-182 in Spoken English on Computer. Transcription, Mark-up, and Application. Edited by Geoffrey N. Leech, Greg Myers, and Jenny Thomas. New York, London, and Essex, England: Longman, 1995. ISBN: 0582250218.

For additional references and a more recent project description, see David McKelvie, Cris Drew, and Henry S. Thompson: "Using SGML as a Basis for Data-Intensive Natural Language Processing [NLP]." See also the database main entry: The HCRC Map Task Corpus.



[CR: 19960226]

Thompson, Henry S.; Finch, Steve; McKelvie, David. The Normalised SGML Library (NSL). LRE Project 62-050 MULTEXT. Workpackage 2. Milestone C, Deliverable NSL. Edinburgh, Scotland: Human Communication Research Centre, November 14, 1995. Extent: 38 pages, 2 references. Authors' affiliation: Human Communication Research Centre, Edinburgh.

Abstract: "This document describes the Normalised SGML Library (NSL), which consists of a set of C programs for manipulating SGML files and a C application program interface (API) designed to ease the writing of C programs which manipulate SGML documents."

From the author's notice: "In pursuit of a development environment for SGML-based corpus and document processing, with support for multiple versions and multiple levels of annotation, LTG have developed an integrated set of SGML tools and a developers tool-kit, including a C-based API. This software described here contains everything required to process a very wide range of conformant SGML documents. Its initial parsing module incorporates v1.0.1 of James Clark's SP software, arguably the broadest coverage SGML parser available anywhere, commercial or not.

"The basic architecture is one in which an arbitrary SGML document is processed on the way in, as it were, yielding two results: 1) An optimised representation of the information contained in the document's DOCTYPE; 2) A normalised version of the document instance, which can be piped through any tools built using our API for augmentation, extraction, etc. The use of the cached DOCTYPE together with the normalisation of the SGML to nSGML means that applications processing nSGML streams can be very efficient."

This work comes out of the MULTEXT project. See the links to NSL (documentation and distribution) from Henry Thompson's Home Page. Mirror copy (of November 14, 1995 version).



[CR: 19951207]

Shafer, Keith; Thompson, Roger. Introduction to Translating Tagged Text via the SGML Document Grammar Builder Engine. OCLC Technical Report. Dublin, Ohio: OCLC Online Computer Library Center, Inc., 1995. Extent: approximately 18 pages. Authors' affiliation: OCLC, 6565 Frantz Road, Dublin, Ohio 43017-3395.

"Abstract: While the Standard Generalized Markup Language (SGML) promises freedom from proprietary data formats, it is still difficult to translate arbitrary SGML data to other formats. To address the SGML translation needs at OCLC, we have added translation capabilities to the SGML Document Grammar Builder programming engine. Several systems incorporate this programming engine, including our Fred interpreter. In this paper, we describe the general translation capabilities of this programming engine and relate it to Fred."

Available on the Internet: Introduction to Fred Translation [mirror copy, December 1995]. See the OCLC Fred entry or the OCLC Fred Home Page for other details.



[CR: 19960110]

Shafer, Keith; Thompson, Roger. Translating Mathematical Markup for Electronic Documents. OCLC Technical Report. Presented at the WWW4 Conference, Boston, December 1995. Dublin, Ohio: OCLC Online Computer Library Center, Inc., 1995. Extent: approximately 18 pages. Authors' affiliation: OCLC, 6565 Frantz Road, Dublin, Ohio 43017-3395.

"Abstract: In this paper, we describe a general translation tool that can transform tagged text into arbitrary output formats. Specifically, we describe how OCLC makes scientific documents containing mathematical markup available on the World Wide Web. The translation capabilities we developed to do this help realize the potential of the Standard Generalized Markup Language (SGML) to provide users with a single, non-proprietary document representation that can be translated on demand to other output formats. This enables publishers who target the WWW as a delivery medium to use the latest advances in HTML without constant revision of their document archives."

Available on the Internet: "Translating Mathematical Markup into HTML" [mirror copy, January 1996], or from: http://www.w3.org/pub/Conferences/WWW4/Papers/177/. See the OCLC Fred entry or the OCLC Fred Home Page for other details.



[CR: 19951113]

Shaw, Alan C. "Structure Editor Generators for Documents, Programs, and Other Structured Data." Pages 30-50 in Protext III. Proceedings of the Third International Conference on Text Processing Systems. International Conference on Text Processing Systems. Trinity College, Dublin. 22-24 October, 1986. Edited by J. J. H. Miller. Dublin, Ireland: Dún Laoghaire, Co., Boole Press Ltd., January 1987. ISBN: 0-906783-55-0 (hardback); 0-906783-56-9 (paperback).



Shaw, Alan C.; Furuta, Richard K.; Scofield, J. "Document Formatting Systems: Survey, Concepts and Issues." Pages 47-52 (with 20 references) in International Conference on Research and Trends in Document Preparation Systems. Abstracts of the Presented Papers. Conference on Research and Trends in Document Preparation Systems, Lausanne, Switzerland, February 27-28, 1981. Supported by the [Swiss] Conseil des Ecoles Polytechniques Fédérales, Organized by the Swiss Federal Institutes of Technology. J. D. Nicoud, Program Chair. Lausanne/Zürich: Swiss Federal Institutes of Technology, 1981. v + 130 pages. Authors' affiliation: FR-35 University of Washington, Department of Computer Science, Seattle, WA USA 98195.



[CR: 1995]

Shaw, Elizabeth. OCR and SGML Mark-up of Documents from the Making of America Project. Report on a Directed Field Experience at Humanities Text Initiative. Humanities Text Initiative (HTI) Technical Report. Ann Arbor, MI: University of Michigan HTI, . Extent: approximately 12 pages. Author's affiliation: University of Michigan.

Overview: "The purpose of this project has been to explore the feasibility and costs of doing an automated OCR (optical character recognition) conversion of scanned TIFF images for the Making of America Project and automating initial SGML mark-up of the documents. . .Using the automation that we have developed, we can process a CD-ROM with approximately 4,000 pages into roughly marked up documents with an average of less than 2 hours of human intervention per CD-ROM. Moving that unproofed rough markup to a finished valid SGML document takes an additional 2-3 minutes per page for mark-up and 8-9 minutes per page for proofing and entering corrections. Documents with significant differences (two column formats or a significant number of images) from the norm would take additional processing time. However initial analysis of the documents indicates that these anomalies are in the minority - ranging from 0 to 3 documents per CD-ROM. In addition, most of the two column documents are less than 30 pages in length."

For more information see the Making of America Project.

Available online: http://dns.hti.umich.edu/htistaff/pubs/1997/ejshaw.01/: "OCR and SGML Mark-up of Documents from the Making of America Project. Report on a Directed Field Experience at Humanities Text Initiative." By Elizabeth Shaw, December, 1996; [mirror copy]



[CR: 19971024]

Shaw, Elizabeth J.; Blumson, Sarr. "Making of America. Online Searching and Page Presentation at the University of Michigan." D-Lib Magazine (July/August 1997). ISSN: 1082-9873. Authors' affiliation: Digital Library Project, University of Michigan.

Summary: "In this paper, we will describe the unique aspects of the first phase of the University of Michigan's implementation of the Making of America Project (http://www.umdl.umich.edu/moa/), a collaborative effort with Cornell University. Using "raw" uncorrected results of automated optical character recognition (OCR) of the page images, and SGML-encoding of the ensuing textual information in minimal Text Encoding Initiative (TEI) conformant markup, we can provide a searchable database of the roughly 650,000 page images that comprise our portion of the Making of America Project. We provide access to the page images on the Web without special viewing tools through a page delivery system that converts the requested pages from TIFF to GIF format on the fly. We will also describe how our approach will allow us to extend functionality as time and resources become available."

The article is available online in HTML format; local archive copy. Note that the July/August 1997 double issue of D-Lib Magazine (Amy Friedlander, editor) contains several articles referencing the use of SGML encoding in digital library research.



[CR: 19990414]

Dongwook Shin; Hyuncheol Jang; Honglan Jin. "BUS: An Effective Indexing and Retrieval Scheme in Structured Documents." Pages 235-243 (with 16 references) in Digital Libraries '98. Proceedings of the Third ACM Conference on Digital Libraries. Third ACM Conference on Digital Libraries. Pittsburgh, PA. June 23-26, 1998. Sponsored by ACM Siglink and SIGIR. Edited by Ian H. Witten, Rob Akscyn, and Frank M. Shipman, III. New York, N.Y.: Association for Computing Machinery, 1998. ISBN: 0-89791-965-3. Authors' affiliation: Department of Computer Science, Chungnam National University, Taejon, South Korea. Email: shin@comeng.chungnam.ac.kr. Also [1999]: Visiting Scholar, Lister Hill National Center for Biomedical Communications.

Abstract: "In recent digital library systems or the World Wide Web environment, many documents are beginning to be provided in the structured format, tagged in mark up languages like SGML or XML. Hence, indexing and query evaluation of structured documents have been drawing attention since they enable to access and retrieve a certain part of documents easily. However, conventional information retrieval techniques do not scale up well in structured documents. This paper suggests an efficient indexing and query evaluation scheme for structured documents (named BUS) that minimizes the indexing overhead and guarantees fast query processing at any level in the document structure. The basic idea is that indexing is performed at the lowest level of the given structure and query evaluation computes the similarity at a higher level by accumulating the term frequencies at the lowest level in the bottom up way. The accumulators summing up the similarity play the role of accumulating all the term frequencies of the related part at a certain level. This paper also addresses the implementation of BUS and proves that BUS works correctly. In addition, along with several experiments, it shows that BUS facilitates efficient indexing in terms of space and time and guarantees the reasonable retrieval time in response to user queries."

"This paper proposes an indexing and query evaluation scheme (named BUS - Bottom Up Schenze) for structured documents that minimizes the indexing overhead and guarantees fast query response time. The basic idea is that indexing is performed at the leaf elements of the given structure and query evaluation computes the similarity at higher level by accumulating the weights at the lowest level in the bottom up way. It underlies the result of R. Wilkinson that 'the retrieval of whole documents can he carried out effectively using just their parts' in part and the idea of UID (Unique element IDentifier) that enables to compute ancestor element of a given element fast."

The Proceedings volume Table of Contents is available online. See also the main conference entry for Digital Libraries '98: Third ACM Conference on Digital Libraries. An online version is available in PostScript format; [local archive copy]. See: "BUS: An Effective Indexing and Retrieval Scheme in Structured Documents." Other references: BUS (Bottom Up Scheme) of indexing and retrieval for SGML/XML documents; "Looking for a partner in commercializing an efficient IR engine for SGML/XML data."



[CR: 19951221]

Shreve, Gregory M. "SGML Representation of Concept Systems -- Identifying, Tagging and Retrieving Term -- Concept Structures in Textual Context." Pages 157-168 (with 5 references) in Standardizing and Harmonizing Terminology: Theory and Practice, edited by S. E. Wright and Richard A. Strehlow. Philadelphia: American Society for Testing Materials [ASTM, Committee on Terminology], 1995. ISBN: 0803119844. ISSN: 0066-0558 [ASTM Special Technical Publication, volume 1223]. Author's affiliation: Kent State Univ, Kent, OH, USA.

"Abstract: The technical terminology used by the technical communicator or technical translator is encountered in texts. The terms in the texts are not randomly arranged but are used deliberately to invoke specific concept structures. SGML encoders and parsers can be used to identify and retrieve terminological structures in texts and help translators and terminologists better understand the relationship of terms in their textual context to abstract concept systems and knowledge organization." [abstract from author]



[CR: 19971106]

Siegel, David. "[Work in Progress: People & Projects.] The Web Is Ruined and I Ruined It." Pages 13-21 in XML: Principles, Tools, and Techniques. Guest Edited by Dan Connolly. World Wide Web Journal [edited by Rohit Khare] Volume 2, Issue 4. Sebastopol, CA: O'Reilly & Associates, Fall 1997. Extent: xxii + 248 pages. ISBN: 1-56592-349-9. ISSN: 1085-2301. Author's affiliation: Verso.

Summary: "In 'The Web Is Ruined and I Ruined It' self-proclaimed HTML Terrorist David Siegel discusses how proper separation of structure (HTML), style (CSS), and semantics (XML) makes content more compelling and design more effective."

A version of this document is available online in HTML format: http://webreview.com/97/04/11/feature/index.html; or http://xent.ics.uci.edu/FoRK-archive/spring97/0381.html [local archive copy, text only].



[CR: 19961226]

Simon, Sheila D. "How To Make Data Sharing Work." Pages 77-80 in SGML '96 Conference Proceedings. Celebrating a Decade of SGML. SGML '96 Conference, Boston, MA, November 18-21, 1996. Sponsored by The Graphic Communications Association (GCA). [Edited by] Conference Co-Chairs: B. Tommie Usdin and Deborah A. Lapeyre. Alexandria, VA: GCA, 1996. Extent: 711 pages. Author's affiliation: Encyclopædia Britannica, 310 South Michigan Avenue, Chicago, Illinois 60604, U.S.A.; Tel: +1 (312) 347-7064; FAX: +1 (312) 294-2187; Email: ssimon@eb.com.

Abstract: "Making data sharing work in a publishing system is not as easy as it sounds. There is much to take into consideration. I plan on discussing key points and factors that will enable you to have a better understanding of the concept of sharing data. I will also discuss what things need to be considered in deciding whether or not to share data. Also, key components will be defined as what is needed to make sharing data successful. Real life experience implementing SGML database systems that have the capability of sharing data is the basis of the following discussion."

Note: The above presentation was part of the "SGML Newcomer" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.



Simons, Gary F. "The Computational Complexity of Writing Systems." Pages 538-553 in The Fifteenth Lacus Forum 1988. Edited by Ruth M. Brend and David G. Lockwood. Lake Bluff, IL: Linguistic Association of Canada and the United States, 1989.

Abstract: In this article the author argues that computer systems, like their users, need to be multilingual. "We need computers, operating systems, and programs that can potentially work in any language and can simultaneously work with many languages at the same time." The article proposes a conceptual framework for achieving this goal.

Section 1, "Establishing the baseline," focuses on the problem of graphic rendering and illustrates the range of phenomena which an adequate solution to computational rendering of writing systems must account for. These include phenomena like nonsequential rendering, movable diacritics, positional variants, ligatures, conjuncts, and kerning.

Section 2, "A general solution to the complexities of character rendering," proposes a general solution to the rendering of scripts that can be printed typographically. (The computational rendering of calligraphic scripts adds further complexities which are not addressed.) The author first argues that the proper modeling of writing systems requires a two-level system in which a functional level is distinguished from a formal level. The functional level is the domain of characters (which represent the underlying information units of the writing system). The formal level is the domain of graphs (which represent the distinct graphic signs which appear on the surface). The claim is then made that all the phenomena described in section 1 can be handled by mapping from characters to graphs via finite-state transducers - simple machines guaranteed to produce results in linear time. A brief example using the Greek writing system is given.
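[Illustration:] The character-to-graph mapping described for Section 2 can be sketched as a tiny transducer. The example below is our own, not necessarily the one given in the article: it assumes that the underlying text stores a single sigma character and that the surface graph is the final form when a non-letter or the end of input follows. The function name `render` is hypothetical.

```python
def render(characters):
    """Map underlying characters to surface graphs in a single pass.

    The machine has two states: either no output is pending, or a sigma
    has been read and its graph is withheld until the next symbol shows
    whether the sigma is word-final.
    """
    graphs = []
    pending_sigma = False
    for ch in characters:
        if pending_sigma:
            # A following letter means the sigma is word-internal.
            graphs.append("σ" if ch.isalpha() else "ς")
            pending_sigma = False
        if ch == "σ":
            pending_sigma = True
        else:
            graphs.append(ch)
    if pending_sigma:
        graphs.append("ς")  # end of input counts as a word boundary
    return "".join(graphs)


print(render("σοφοσ σεισμοσ"))
```

Each input symbol is inspected once and the state is a single flag, so the running time is linear in the length of the text, in line with the claim summarized above.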

Section 3, "Toward a conceptual model for multilingual computing," goes beyond graphic rendering to consider the requirements of a system that would adequately deal with other language-specific issues like keyboarding, sorting, transliteration, hyphenation, and the like. The author observes that every piece of textual data stored in a computer is expressed in a particular language, and it is the identity of that language which determines how the data should be rendered, keyboarded, sorted, and so on. He thus argues that a rendering-centered approach which simply develops a universal character set for all languages will not solve the problem of multilingual computing. Using examples from the world's languages, he goes on to define language, script, and writing system as distinct concepts and argues that a complete system for multilingual computing must model all three.

Availability: Offprints of this article are available from the author at the following Internet address: gary.simons@sil.org. The volume itself is available from LACUS, P.O. Box 101, Lake Bluff, IL 60044.

See a related version of the document on the SIL WWW server. For other information on CELLAR, see the main CELLAR page at SIL and (more recently) "Computing Environment for Linguistic, Literary, and Anthropological Research (CELLAR)."



[CR: 19950716]

Simons, Gary F. A Computing Environment for Linguistic, Literary, and Anthropological Research [CELLAR]: Technical Overview. CELLAR Project, Internal Report. Dallas, TX: SIL Academic Computing, July, 1988. Extent: approximately 21 pages. Author's affiliation: Summer Institute of Linguistics, Department of Academic Computing.

"In this document, I propose the conceptual architecture of a computing environment designed to meet the particular needs of linguists, literary scholars, and anthropologists. In short: we need to process textual information which is (1) multilingual, (2) structured, (3) multidimensional, and (4) integrated, with a database manager that is: (1) seamless, (2) self-validating, and (3) knowledge-based, in a user environment which is: (1) extensible and (2) iconic." [from the Introduction]

Available on the SIL WWW server. Other information about CELLAR is accessible from the main CELLAR page.



[CR: 19960715]

Simons, Gary F. A Conceptual Modeling Language for the Analysis and Interpretation of Text. TEI [Text Encoding Initiative] Working Paper AIW1q2, Committee on Text Analysis and Interpretation. Dallas, TX: Academic Computing Department, Summer Institute of Linguistics, March 10 1990. Extent: approximately 29 pages. Author's affiliation: Summer Institute of Linguistics; Email: Gary.Simons@sil.org.

Abstract: "This document proposes a conceptual modeling language which could provide a framework for designing encoding schemes for the linguistic analysis and interpretation of text. Note the focus on 'designing encoding schemes.' The December 1989 meeting of the TEI-ANA committee concluded that the requirements for encoding linguistic analysis of text are considerably more complex than the requirements for encoding the text itself. While the metalanguage built into SGML (namely, the language for document type definitions) is adequate for expressing the design of the encoding for the text itself, it is not adequate for expressing the design of encoding for linguistic analysis. The committee thus concluded that we needed to begin by designing a metalanguage that would allow us to express the design of encoding schemes for text analysis. This document seeks to explain why this is needed and then gives an initial proposal for such a metalanguage with examples from two domains."

Note: some of the design principles articulated in this 1990 working paper find expression in the TEI's feature-structure markup, with its mechanism for feature-structure declaration. On TEI feature structures, see Simons, "Implementing the TEI's Feature-Structure Markup by Direct Mapping to the Objects and Attributes of an Object-Oriented Database System", below.



[CR: 19970421]

Simons, Gary F. "Conceptual Modeling Versus Visual Modeling: A Technological Key to Building Consensus." Computers and the Humanities (CHUM) 30/4 (1996/1997) 303-319 (with 16 references). ISSN: 0010-4817. Author's affiliation: Director of Academic Computing, Summer Institute of Linguistics; Email: Gary.Simons@sil.org.

Abstract: "Debate has long been a hallmark of the academic endeavor. The recent introduction of computers into academic life has not been the deus ex machina to bring sudden resolution to these debates. There is a new computing technology, however, that has some promise in this regard. It is called conceptual modeling. This paper (see endnotes) demonstrates how a computer-based model of a problem domain can lead to consensus when competing approaches to the domain can be encapsulated in different visual models that are applied to the same underlying conceptual model."

This published article is based upon the author's presentation at the 1994 Paris ACH/ALLC Meeting, "Consensus ex Machina" (Joint International Conference of the Association for Literary and Linguistic Computing and the Association for Computing and the Humanities Paris, 19 - 23 April 1994).

A version of the paper is also available online: connect via HTML client to the SIL WWW server (http://www.sil.org/cellar/ach94/ach94.html). See the associated bibliographic entry for discussion of the article's particular relevance to SGML.



[CR: 19950716]

Simons, Gary F. "Conceptual Modeling Versus Visual Modeling: A Technological Key to Building Consensus." Pages 217-218 [partial abstract] in Colloque International "Consensus ex Machina?" Abstracts. International Joint Conference of the ALLC (Association for Linguistic and Literary Computing) and ACH (Association for Computers and the Humanities), Sorbonne, Paris, 19-23 avril 1994. Paris: Laboratoire "Lexicométrie et textes politiques" (INaLF, CNRS), and Ecole Normale Supérieure de Fontenay - Saint Cloud, 1994. Extent: 244 pages. Author Affiliation: Summer Institute of Linguistics, Department of Academic Computing.

The paper does not treat SGML as a central issue, but demonstrates how an SGML view (linear representation) of linguistic information can be generated from an object-oriented knowledgebase which understands the data in its own terms semantically, and how to render the information with SGML tags and attributes according to a DTD.

The full text of the presentation is to appear in a volume of the series Research in Humanities Computing (Oxford University Press). It is also currently available online: connect via HTML client to the SIL WWW server (http://www.sil.org/cellar/ach94/ach94.html). [Note also the report on the ALLC/ACH '94 Conference.]



[CR: 19960714]

Simons, Gary F. "Implementing the TEI's Feature-Structure Markup by Direct Mapping to the Objects and Attributes of an Object-Oriented Database System." Pages 111-114 [extended abstract] in ACH/ALLC '95: The 1995 Joint International Conference. Conference Abstracts, Posters and Demonstrations. ACH/ALLC '95 Joint International Conference, July 11-15, 1995. Santa Barbara, California: University of California/ACH/ALLC, 1995.

The paper describes "how a generalized implementation of TEI feature-structure markup has been achieved by extending an object-oriented database system [CELLAR] to use TEI-style <fs> [feature-structure] tagging as a possible format for the representation of its objects." Initial points in summary: (1) "Feature structures can encode information of nearly any sort. This is because they are just instances of the more general data structure referred to by Donald Knuth as 'nodes' (here called feature structures) and 'fields' (here called features) [D. E. Knuth, The Art of Computer Programming 1:462, 1968]. . ." (2) "Feature structures with features are thus analogous to records with fields, objects with attributes, frames with slots, property lists with properties, and abstract data types with access functions. . ."

From the document Conclusion: "This paper has demonstrated that:" (1) "The TEI's feature-structure markup can be implemented by direct mapping onto the objects and attributes of an object-oriented database system." (2) "The CELLAR system, with its user-definable views for export formatting and parsers for import processing, has proven able to do this task." (3) "The FSD [Feature Structure Declaration of TEI] formalism has the potential for serving as a lingua franca among database systems for the interchange of basic data models."

See the article by D. Terence Langendoen and Gary Simons, "Rationale for the TEI Recommendations for Feature-Structure Markup," pages 191-209 in The Text Encoding Initiative: Background and Contents, edited by Nancy Ide and Jean Véronis [= Computers and the Humanities 29/3, 1995]. The feature structure notation is defined for the TEI in chapter 16 of the Guidelines for Electronic Text Encoding and Interchange; link to chapter 16 online via Electronic Book Technologies or link via UVA.



[CR: 19971125]

Simons, Gary F. Importing SGML data into CELLAR by means of architectural forms. SIL Academic Computing Working Paper. Dallas, TX: Summer Institute of Linguistics, November 12, 1997. Extent: approximately 7 pages [main document] with several subsidiary documents. Author's affiliation: SIL Academic Computing; Email: Gary.Simons@sil.org.

Abstract: "This working paper documents a process for importing SGML data into the CELLAR database. The process, which requires no change to the SGML data and no special-purpose programming on the CELLAR side, is based on a relatively new SGML feature named architectural forms. The user writes a meta-DTD that maps the elements in the SGML data onto architectural forms that express the corresponding objects and attributes in CELLAR. Then an SGML parser uses this to create an 'architectural document' that an existing CELLAR parser reads to build the corresponding structure of objects in the CELLAR database."

"This electronic working paper gives the full details of work that has been presented in two conference papers. Provisional references: (1) Proceedings of SGML/XML '97, Washington, D. C., 8-11 December 1997, and (2) "Using Architectural Forms to Map TEI Data Into an Object-oriented System," in TEI Tenth Anniversary Users' Conference: Conference Abstracts, Providence, R.I., 14-16 November 1997. The abstract for this TEI10 document is available online: http://www.stg.brown.edu/webs/tei10/tei10.papers/Simonspaper.html; [local archive copy]; see also the full bibliography entry.

The working paper is available online in HTML format. Further information about the CELLAR Project may be found on the SIL server. For other information on SGML architectures, see the database entry Architectural Forms and SGML Architectures.



[CR: 19971018]

Simons, Gary F. "Mapping from objects to markup: a springboard for multiple-strategy electronic publishing." Pages 151-153 in ACH-ALLC '97. The 1997 Joint International Conference of the Association for Computers and the Humanities and the Association for Literary and Linguistic Computing. Conference Abstracts. ACH-ALLC '97. Queen's University at Kingston, Ontario, Canada. June 3 - 7, 1997. Compiled by Greg Lessard and Michael Levison. Ontario, Canada: Queen's University, 1997. ISBN: 0-88911-760-8. Author's affiliation: Summer Institute of Linguistics; Email: gary.simons@sil.org.

[Extract:] "This paper reports on the experience of the Summer Institute of Linguistics in developing electronic publishing solutions for its LinguaLinks product (SIL 1996). LinguaLinks is an electronic performance support system designed to assist field workers with a wide range of tasks related to language learning, language analysis, and language development. The paper first introduces the LinguaLinks model of performance support and CELLAR -- the object-oriented database system that is used to implement it. Our approach to electronic publishing is to first build the information as a structure of objects in the database, and then to use multiple CELLAR stylesheets to map the information onto multiple markup schemes. The object database thus serves as a springboard that allows us to vault the information into any number of formats for publishing. The paper illustrates this approach to electronic publishing by focusing on one application area that LinguaLinks supports, namely, lexical database development. It first shows how the tutorial and reference documents that give help on how to build a dictionary are mapped onto different markup schemes for publication as a Folio Views infobase, a Windows help system, and an HTML Web document. It then shows how the dictionaries that are built by using LinguaLinks are mapped onto HTML markup to provide a display format on the Web and onto TEI markup to provide a richer format for information interchange and archiving."

Abstract available online in HTML format: "Mapping from objects to markup: a springboard for multiple-strategy electronic publishing", by Gary F. Simons; [archive copy]. Further information on CELLAR is available via the SIL Web server. Note that the author will present a paper at the SGML/XML '97 Conference on the use of architectural forms to achieve mapping of SGML data into databases: "Using architectural forms to map SGML data into an object-oriented database."

Additional information on the ACH-ALLC '97 Conference is available in the SGML/XML Web Page main conference entry, or [August 1997] via the Queen's University WWW server.



[CR: 19980606]

Simons, Gary F. "The Nature of Linguistic Data and the Requirements of a Computing Environment for Linguistic Research." Pages 10-25 (Chapter 1) in Using Computers in Linguistics. A Practical Guide. Colloquium: Computing and the Ordinary Working Linguist [Linguistic Society of America]. Philadelphia, 1992. Edited by John Lawler (Program in Linguistics, University of Michigan) and Helen Aristar Dry (Linguistics Program, Eastern Michigan University). London/New York: Routledge, [March] 1998. ISBN: 0-415-16792-2 (hardback) and 0-415-16793-0 (paper). Author's affiliation: Gary F. Simons is the Director of Academic Computing in the Summer Institute of Linguistics.

Summary: Simons "discusses language data and the special demands which it makes on computational resources. As Simons puts it: 1) The data are multilingual, so the computing environment must be able to keep track of what language each datum is in, and then display and process it accordingly; 2) The data in text unfold sequentially, so the computing environment must be able to represent the text in proper sequence; 3) The data are hierarchically structured, so the computing environment must be able to build hierarchical structures of arbitrary depth; 4) The data are multidimensional, so the computing environment must be able to attach many kinds of analysis and interpretation to a single datum; 5) The data are highly integrated, so the computing environment must be able to store and follow associative links between related pieces of data; 6) While doing all of the above to model the information structure of the data correctly, the computing environment must be able to present conventionally formatted displays of the data. This chapter prefigures most of the major themes that surface in the other chapters [of the book], and contains some discussion of the CELLAR prototype computing environment now under development by SIL. It should be read first, and in our opinion it should be required reading for anyone planning a research career in linguistics." [from the volume editors]

Simons discusses SGML in section 1.3, "The Hierarchical Nature of Linguistic Data." See the online Appendix for this chapter, with many links to Internet resources. A related version of the full paper is also online: see the following bibliographic entry.

An introduction and overview of the book may be found on the Routledge web site and [provisionally] at the University of Michigan. An online Table of Contents is provided, as well as an online appendix for each chapter in the book.



[CR: 19990114]

Simons, Gary F. The Nature of Linguistic Data and the Requirements of a Computing Environment for Linguistic Research. Paper accepted for publication in: Computers and the Ordinary Working Linguist, edited by John Lawler and Helen Dry. Draft of 28 July 1993. Dallas, TX: SIL, Academic Computing Department, July, 1993.

"This paper was originally drafted in 1993 as a chapter for a book proposed by Lawler and Dry, Computers and the Ordinary Working Linguist. The version presented here is a revision that was published in 1996 as an article in the journal Dutch Studies on Near Eastern Languages and Literature, volume 2, number 1, pages 111-128. (Note, however, that the bibliography has been annotated to add Web links and updated to report the eventual details of works originally cited as 'forthcoming'.) The book finally came out in 1998 with a new title and the paper was further revised and expanded by about 20%. The citation for the full published version is: Simons, Gary F. 1998. The nature of linguistic data and the requirements of a computing environment for linguistic research. In Using Computers in Linguistics: a practical guide, John M. Lawler and Helen Aristar Dry (eds.). London and New York: Routledge. Pages 10-25. Routledge maintains a Web site for the book which includes an on-line appendix that gives links to many information resources that are relevant to topics covered in this paper."

Section 3 of the document ("The hierarchical nature of linguistic data") discusses the important role played by SGML in focusing attention upon the hierarchical nature of many literary and linguistic data.

The paper is available in HTML format on the SIL WWW server. See also the preceding bibliography entry.



[CR: 19971227]

Simons, Gary F. "Using Architectural Forms to Map SGML Data Into an Object-Oriented Database." Pages 449-460 in SGML/XML '97 Conference Proceedings. SGML/XML '97. "SGML is Alive, Growing, Evolving!" The Washington Sheraton Hotel, Washington, D.C., USA. December 7 - 12, 1997. Sponsored by the Graphic Communications Association (GCA) and Co-sponsored by SGML Open. Conference Chairs: Tommie Usdin (Chair, Mulberry Technologies), Debbie Lapeyre (Co-Chair, Mulberry Technologies); Michael Sperberg-McQueen (Co-Chair, University of Illinois). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 691 pages, CDROM; print volume contains author and title indexes, keyword and acronym lists. Author's affiliation: [Gary F. Simons]: Director of Academic Computing, Summer Institute of Linguistics, Dallas, TX 75236; Email: gary.simons@sil.org; Phone: +1 (972) 708-7418; FAX: +1 (972) 708-7363.

Abstract: "This paper develops a solution to the problem of importing existing SGML data into an existing object-oriented database schema without changing the SGML data or the database schema. After investigating the general problem of where the mismatch lies between the SGML model and the object model, the paper proposes a solution based on architectural processing. Two meta-DTDs are used, one to define the architectural forms for the object model and another to map the existing SGML data onto those forms."

"Much of the promise of SGML lies in the fact that descriptively marked up data can be used by multiple applications. Given the fact that an SGML DTD has much in common with the conceptual model that results from an object-oriented analysis of a problem domain, it is logical to conclude that SGML data should be particularly amenable to being imported into software that uses an object-oriented data model. This is not a trivial task, however, since there are some fundamental differences between the SGML model of data and the object model.

"The paper explores this general problem as it develops a solution to a more specific problem, namely, how to import existing SGML data into an existing object-oriented database schema without changing either the SGML data or the database schema. The target system is an object-oriented database system named CELLAR (for Computing Environment for Linguistic, Literary, and Anthropological Research). The solution uses architectural processing to map the SGML data onto architectural forms that the CELLAR system can use to construct the appropriate structure of objects."

This paper was delivered as part of the "Expert" track in the SGML/XML '97 Conference.

See the related online presentation by G. Simons, Importing SGML data into CELLAR by means of architectural forms, published as an SIL Academic Computing Working Paper; also, "Using Architectural Forms to Map TEI Data Into an Object-oriented System," pages 123-129 in TEI 10: A Conference in Celebration of the Tenth Anniversary of the Text Encoding Initiative: Abstracts, from the conference of November 14-16, 1997 at Brown University.

Further information on architectural forms processing and SGML architectures is available in the dedicated database section of the SGML/XML Web Page, "Architectural Forms and SGML Architectures."

Note: The SGML/XML '97 conference proceedings volume is available from the Graphic Communications Association, 100 Daingerfield Road, Alexandria, VA 22314-2888; Tel: +1 (703) 519-8160; FAX: +1 (703) 548-2867. Complete information about the conference (e.g., program listing, tutorials, show guide, DTDs, conference reports) is provided in the dedicated conference section of the SGML/XML Web Page and via the GCA Web server. The electronic proceedings on CDROM were produced courtesy of Jouve Data Management (Jouve PubUser).



[CR: 19971205]

Simons, Gary F. "Using Architectural Forms to Map TEI Data Into an Object-oriented System." Pages 123-129 in TEI 10: A Conference in Celebration of the Tenth Anniversary of the Text Encoding Initiative. Abstracts. TEI 10: Text Encoding Initiative, Tenth Anniversary User Conference, Brown University, Providence, Rhode Island. November 14-16, 1997. Sponsored by Martin Hensel Corporation, Kluwer Academic Publishers, and MIT Press. Hosted by Brown University Library, and Computing and Information Services. Providence, RI: Brown University, 1997. Author's affiliation: Department of Academic Computing, Summer Institute of Linguistics; Email: Gary.Simons@sil.org.

Abstract: "This paper develops a solution to the problem of importing existing TEI data into an existing object-oriented database schema without changing the TEI data or the database schema. After investigating the general problem of where the mismatch lies between the SGML model and the object model, the paper proposes a solution based on architectural processing. Two meta-DTDs are used, one to define the architectural forms for the object model and another to map the existing SGML data onto those forms. A full example using a critical text in TEI markup is developed."

[from the Introduction]: "The paper explores this general problem as it develops a solution to a more specific problem, namely, how to import existing SGML data into an existing object-oriented database schema without changing either the SGML data or the database schema. The target system is an object-oriented database system named CELLAR (for Computing Environment for Linguistic, Literary, and Anthropological Research). The solution uses architectural processing to map the SGML data onto architectural forms that the CELLAR system can use to construct the appropriate structure of objects."

Section 1 of the paper discusses the basic differences between the SGML model of data and the object model, and illustrates why the mapping from SGML elements to objects is not a trivial one. Section 2 introduces the DTD for an architecture that maps SGML data onto objects. Section 3 gives a complete example of the automated process by which the SGML data are mapped onto this architectural DTD via an intermediate meta-DTD that encodes the mapping. The example used is that of a critical text edition encoded in TEI format. Finally, section 4 discusses the implementation and the results that have been achieved thus far.

The extended abstract for the document is available online: http://www.stg.brown.edu/webs/tei10/tei10.papers/Simonspaper.html; [local archive copy]. See the main database entry for additional information about the conference, or the Brown University web site.

A related paper Importing SGML data into CELLAR by means of architectural forms is available in HTML format: see http://www.sil.org/cellar/import/. For other information on SGML architectures, see the database entry Architectural Forms and SGML Architectures.



[CR: 19990111]

Simons, Gary. "Using Architectural Processing to Derive Small, Problem-Specific XML Applications from Large, Widely-Used SGML Applications." Pages 51-60 in Markup Technologies '98 Conference Proceedings. Markup Technologies '98 Conference. Hyatt Regency, McCormick Place, Chicago, Illinois, USA. November 19 - 20, 1998. Sponsored by GCA and co-sponsored by MIT Press. Edited by the program chairs, B. Tommie Usdin, Debbie Lapeyre, and Michael Sperberg-McQueen. Alexandria, VA: Graphic Communications Association (GCA), 1998. Author's affiliation: Director of Academic Computing, Summer Institute of Linguistics.

Abstract: "The large SGML DTDs in widespread use (e.g. HTML, DocBook, CALS, EAD, TEI) offer the advantage of standardization, but for a particular project they often carry the disadvantage of being too large or too general. A given project might be better served by a DTD that is no bigger than is needed to solve the specific problem at hand, and that is even customized to meet special requirements of the problem domain. Furthermore, the project might prefer for the data it produces to meet the different syntactic constraints of XML conformity. This paper demonstrates how architectural processing can be used to develop a problem-specific XML DTD for a particular project without losing the advantage of conforming to a widely used SGML DTD. As an example, the paper develops a small XML application derived from the Text Encoding Initiative DTD. The TEI Guidelines offer a mechanism for building TEI-conformant applications; the paper concludes by proposing an alternative approach to TEI conformance based on architectures."

Keywords: computing, humanities computing, SGML, XML, architectural forms, DTD design, conformance of derived DTDs, TEI (Text Encoding Initiative), lexicography, dictionary, Sikaiana language, Solomon Islands.

An online copy of this paper (HTML) is available in the SIL Electronic Working Papers Series.

Full abstracts and annotations for other presentations given at the Markup Technologies '98 Conference are provided in a separate document.



[CR: 19971216]

Simons, Gary F.; Thomson, John V. "Multilingual data processing in the CELLAR environment." Pages 203-234 in Linguistic Databases. [Conference on] Linguistic Databases. Centre for Language and Cognition and Centre for Behavioral and Cognitive Neuroscience, University of Groningen, Groningen, The Netherlands. March 23-24, 1995. Sponsored by the Dutch National Science Foundation (NWO), Royal Dutch Academy of Science (KNAW), et al. Edited by John Nerbonne (Computational Linguistics, and Humanities Computing, University of Groningen). CSLI Lecture Notes, Number 77. Stanford, CA: Center for the Study of Language and Information, 1998. ISBN: 1-57586-093-7 (hardback), 1-57586-092-9 (paper). Authors' affiliation: SIL Academic Computing.

Abstract: "This paper describes a database system developed by the Summer Institute of Linguistics to be truly multilingual. It is named CELLAR--Computing Environment for Linguistic, Literary, and Anthropological Research. After elaborating some of the key problems of multilingual computing (section 1), the paper gives a general introduction to the CELLAR system (section 2). CELLAR's approach to multilingualism is then described in terms of six facets of multilingual computing (section 3). The remaining sections of the paper describe details of how CELLAR supports multilingual data processing by presenting the conceptual models for the on-line definitions of multilingual resources."



[CR: 19950716]

Simons, Gary F.; Thomson, John V. Multilingual data processing in the CELLAR environment. Paper presented at: Linguistic Databases, 23-24 March 1995, University of Groningen, Centre for Language and Cognition and Centre for Behavioral and Cognitive Neurosciences. Dallas, TX: SIL Academic Computing, July, 1995. Extent: 99K, approximately 46 pages; 10 figures. Authors' affiliation: SIL Academic Computing, CELLAR Project.

Abstract: "This paper describes a database system developed by the Summer Institute of Linguistics to be truly multilingual. It is named CELLAR--Computing Environment for Linguistic, Literary, and Anthropological Research. After elaborating some of the key problems of multilingual computing (section 1), the paper gives a general introduction to the CELLAR system (section 2). CELLAR's approach to multilingualism is then described in terms of six facets of multilingual computing (section 3). The remaining sections of the paper describe details of how CELLAR supports multilingual data processing by presenting the conceptual models for the on-line definitions of multilingual resources."

The paper is only marginally relevant to SGML, but aligns itself philosophically with many of the central impulses of SGML. SGML is in fact used in the CELLAR Project in several ways (within encoding models), as will be more clearly illustrated in another paper by Simons (bibliographic reference; direct link).

Available on the SIL WWW server. See other information on CELLAR on the main CELLAR page.



Sirrine, Susan. "What is SGML?" InfoWorld 15/13 (March 29 1993) 76-[?].

"Abstract: The Standard Generalized Markup Language (SGML), a methodology for modeling document contents and identifying structural and content elements, has been established as an international standard. An SGML document consists of 3 main elements: (1) the SGML Declaration, a header that establishes the environment, (2) the Document Type Definition, which is like a template of tags identifying the document's structural and contextual elements and the relationship between the elements, and (3) the Document Instance, which is the actual marked-up text. The ultimate impact of SGML is far-reaching. The ability to store information centrally solves the problem of keeping it current. It also makes on-demand publishing possible. Both Frame and Interleaf market structured-document and SGML application-development software, and a PC version is expected from both companies by the end of 1993. WordPerfect has also just shipped Intellitag, its Unix version of a package that offers SGML conversion capabilities."



[CR: 19971125]

Skinner, Eric. "Making SGML Easier with Microdocument Databases." Page(s) 319 in SGML '97 Conference Proceedings. SGML Europe '97. "The Next Decade - Pushing the Envelope." Princesa Sofia Intercontinental, Barcelona, Spain. 11-15 May, 1997. Sponsored by Graphic Communications Association (GCA) and SGML Open. Conference Chair: Pamela L. Gennusa (Director, Database Publishing Systems Ltd). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 342 pages, CDROM. Author's affiliation: Senior Program Manager, OmniMark Technologies Corporation, Canada.

Abstract: "The abilities to deliver vast amounts of corporate information on-line in real time, with sophisticated hypertext navigation aids, and the accelerating system complexity of products and corporate processes have converged to drive a new paradigm: component-based documentation development. The microdocument architecture is a vendor-independent hybrid of SGML and RDBMS methodologies that enables the delivery of personalized virtual documents. Illustrations of successful virtual document implementations and an overview of business and project leader implementation issues are provided."

Note: The electronic conference proceedings in hypertext were produced by Inso Corporation (DynaText) and by High Text (EnLIGHTeN). Information about the SGML Europe '97 Conference may be found in the main database entry.



[CR: 19970212]

Skinner, Eric; LaSalle, Benoit. "Using Micro-documents and Hybrid Distributed DataBases for Building up Hypertext-rich Content On-line Servers." In: Proceedings of the 3rd Annual Conference on the Practical Use of SGML. "A Decade of Power." Third Annual [Belux] Conference on the Practical Use of SGML. Business Faculty, Sint-Lendriksborre 6, Brussels, Belgium. October 31, 1996. Sponsored by SGML Belux (Belgian-Luxembourg Chapter of the International SGML Users' Group). Leuven, Belgium: Belux, 1996. Author's affiliation: OmniMark Technologies.

Summary: "A presentation of the Hybrid Distributed DataBase : Modelling the information into information units (SGML micro-documents) in combination with RDBMS and Full-text retrieval engines."

See also Skinner and McFadden, "Microdocument Database Architectures," published in <TAG> 1996, with other references. For further information on the conference, see: (1) the description in the conference announcement and call for papers, and (2) the full program listing, or (3) the main conference entry in the SGML/XML Web Page.



[CR: 19961107]

Skinner, Eric; McFadden, John. "Microdocument Database Architectures." <TAG> 9/10 (October 1996) 1-7. ISSN: 1067-9197. Authors' affiliation: [Skinner]: Senior Program Manager, OmniMark Technologies Corporation; [McFadden]: CEO and Founder, OmniMark Technologies Corporation.

"The Microdocument Database (MDDB) is a conceptual model for a system that can deliver user-independent virtual documents. In MDDB, the strengths of SGML are combined with the proven flexibility of relational databases, creating a hybrid data structure. Narrative text is organized into independent information units called microdocuments. Related data objects and dependencies between microdocuments are expressed in the database schema. Inside a microdocument, SGML markup is used to encode the structure internal to the contained text." [extracted]

Apropos of the 'Microdocument Database (MDDB),' see on the OmniMark WWW server: (1) "OmniMark and the Hybrid Distributed Database Model", and (2) "OmniMark and the Automation of Internet Publishing".



[CR: 19950716]

Sklar, David. "Accelerating Conversion to SGML via the Rainbow Format." <TAG> 7/1 (January 1994) 4-5. ISSN: 1067-9197.

The article describes "up-translation" of data from proprietary formats produced by word-processor or desktop-publishing software to generic SGML encoding. A number of SGML vendors, including EBT (Electronic Book Technologies), have designed an SGML format that can be used as a target for "up-translation." From that format, data can be moved to other industry-standard (SGML) formats, or directly to SGML-compliant applications which can read Rainbow. See the entry for Rainbow in this database.



[CR: 19950716]

Sloan, D. "Aspects of Music Representation in HyTime/SMDL." Computer Music Journal 17/4 (Winter 1993) 51-59 (with 2 references). Author's affiliation: Department of Music, Ashland University, OH, USA.

"Abstract: In 1986, the American National Standards Institute (ANSI) authorized a working group, X3V1.8M, to study the development of a standard for the computer representation of musical information. The work of this group has led to two related standards: Hypermedia/Time-based Structuring Language, HyTime (S. R. Newcomb et al., 1991) and Standard Music Description Language, SMDL. HyTime is a standard for scheduling and addressing in any medium, music or otherwise, while SMDL covers those aspects specific to music. ANSI has proposed both HyTime and SMDL as ISO standards. HyTime has been approved and will shortly be published with the number ISO/IEC IS 10744:1992. SMDL is still in the committee draft stage and has been given the number ISO/IEC CD 10743. There has been much vigorous debate in the computer music community over the work of X3V1.8M. Some have argued that there are de facto standards already in use, obviating the need for a new language. Others have debated the design chosen by the ANSI committee. Still others do not believe that the music community will enjoy more benefit than harm from having a standard at this point in time."



Smit, G. de V.; Cowan, D. D. Manipulating Partial Documents in a Syntax-Directed Environment. Technical Report CS-90-02. Waterloo, Ontario: Department of Computer Science, University of Waterloo, Waterloo, Ontario, Canada, January, 1990.



Smith, Craig. "Beyond Document Structure - SGML as a Software Development Tool." Pages 139-144 (with 8 references) in PROTEXT IV. Proceedings of the Fourth International Conference on Text Processing Systems. International Conference on Text Processing Systems, Boston, MA, USA 20-22 October 1987. Sponsored by INCA - Institute for Numerical Computation and Analysis. Edited by John J. H. Miller. Dun Laoghaire, Ireland: Boole Press, Ltd., 1987. vii + 153 pages. ISBN: 0-906783-80-1 (hardback); 0-906783-79-8 (paperback). Author's affiliation: Gesellschaft für Mathematik und Datenverarbeitung, Berlin, West Germany.

Abstract: The paper shows how the document description standard SGML can be applied in software development. It is shown how this can be advantageous when building applications of SGML.



[CR: 19971202]

Smith, David A. "Textual Variation and Version Control in the TEI." Pages 131-136 in TEI 10: A Conference in Celebration of the Tenth Anniversary of the Text Encoding Initiative. Abstracts. TEI 10: Text Encoding Initiative, Tenth Anniversary User Conference, Brown University, Providence, Rhode Island. November 14-16, 1997. Sponsored by Martin Hensel Corporation, Kluwer Academic Publishers, and MIT Press. Hosted by Brown University Library, and Computing and Information Services. Providence, RI: Brown University, 1997. Author's affiliation: Perseus Project, Tufts University; Email: dasmith@perseus.tufts.edu.

Summary: "The Text Encoding Initiative Guidelines for encoding critical apparatus (Chapter 19) draw heavily on the text collation tradition and provide useful tools for basic text variation at the word and character level, but they fail to address the need for encoding variation in text structures other than, or larger than, the words and punctuation of a document. With software version control systems, the problem is often reversed: multiple variants within one line are represented as if they were one. The principles behind the design of software version control systems, nevertheless, can inform our work with tagging textual variants, and lead to some solutions for tagging larger structural variation. These problems with version control and textual variation presented themselves in my work for the Perseus Project, and Perseus texts will illustrate the principal issues. [...] The ease with which we can represent this sort of inter-variant communication makes SGML and the TEI Guidelines a good basis on which to build a textual variant system, which more closely meets the needs of the editors of variant literary texts than available version control systems. With some extensions, the TEI can be made to encode more sophisticated variant structures and to satisfy the requirements, though not the efficiency, of a full-fledged version control system." [extracted]

A Marlowe web site is "currently under construction at Tufts University as part of the Perseus Project, a digital library for the study of ancient Greece and Rome. This SGML-encoded edition of the complete works of Christopher Marlowe and his sources has been produced according to TEI standards."

The extended abstract for the document is available online: http://www.stg.brown.edu/webs/tei10/tei10.papers/smith.html; [local archive copy]. See the main database entry for additional information about the conference, or the Brown University web site.



[CR: 19961226]

Smith, Holly. "SGML Users' Groups...Who Needs 'Em Anyway." Pages 653-658 in SGML '96 Conference Proceedings. Celebrating a Decade of SGML. SGML '96 Conference, Boston, MA, November 18-21, 1996. Sponsored by The Graphic Communications Association (GCA). [Edited by] Conference Co-Chairs: B. Tommie Usdin and Deborah A. Lapeyre. Alexandria, VA: GCA, 1996. Extent: 711 pages. Author's affiliation: Lexicon Systems, Inc., 6165 Lehman Drive, Suite 204, Colorado Springs, Colorado 80918, USA; Tel: 719-593-8971; FAX: 719-593-9268; Email: hollyd@lexisys.com; WWW: http://www.lexisys.com.

Abstract: "As the SGML community continues to grow, users are seeking new support structures, new sources of information, new technology, and new ways of applying SGML. The result is a number of emerging SGML interest groups, not just around the U.S., but around the world. Just over a year ago, I helped revive the defunct Rocky Mountain SGML Users' Group in Colorado. The journey to a strong, productive users' group has been long, and not without hurdles. However, the benefits are many for everyone involved, and the learning experiences have been invaluable. This paper presents ten good reasons to start an SGML users' group, who should be involved in organizing a users' group, how to get started on the right foot, what people can expect to happen during different stages of users' group development, common problems that tend to crop up and how to deal with them effectively, and the dos and don'ts of managing a users' group."

Another paper discussing the role and operation of SGML user groups was presented at SGML '96 by Richard Barth.

Note: The above presentation was part of the "And More..." track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.



[CR: 19951113]

Smith, Joan M. "The Computer and Publishing: An Opportunity for New Methodology." Pages 107-113 in PROTEXT II. Proceedings of the Second International Conference on Text Processing Systems. International Conference on Text Processing Systems, Dublin, Ireland 23-25 October 1985. Edited by John J. H. Miller. Dublin, Ireland: Dún Laoghaire, Boole Press, Ltd., 1987. vii + 215 pages. ISBN: 0-906783-50-X (hardback); 0-906783-53-4 (paperback).

"Abstract: Computers and associated devices are used increasingly for the input of copy on a word processor or other text entry system; perhaps sending a copy to a referee who may return an edited form, possibly using a floppy disk; and maybe for this copy to have codes inserted in it before its publication. In general, these codes have related to specific typesetters: they are device-dependent. But generic codes could be inserted, giving increased flexibility. The changing face of publishing is examined, not only computer-assisted publishing and electronic publishing but above all database publishing. Its relevance to publishers in the more traditional sense and those involved with in-house publishing is considered. The Standard Generalized Markup Language (SGML) is presented as the solution."



[CR: 19970314]

Smith, Joan M. "A Report of the MarkUp '88 Events." SGML Users' Group Bulletin 3/2 (1988) 62-66. ISSN: 0269-2538. Author's affiliation: Independent Information Consultant.

The author reports on the highlights of the MarkUp conference sponsored by GCA. It was held in Ottawa, Ontario, on May 24-26, 1988. Another report of the conference is available in "The MarkUp '88 Conference", published in the SGML Users' Group Newsletter Number 9 (August 1988) 13-14.



[CR: 19961210]

Smith, Joan M. "Report on [International] MarkUp '89 [Conference]." SGML Users' Group Bulletin 4/1 (1989) 39-42. ISSN: 0269-2538. Author's affiliation: [Independent Consultant], 17 Tanza Road, Hampstead, London NW3 2UA, UK.

A detailed account of the MarkUp '89 Conference sponsored by GCA and the International SGML Users' Group, held in Gmunden, Austria, April 11-14, 1989.

Note: The volume editor for SGML Users' Group Bulletin 4/1 is David W. Penfold (Edgerton Publishing Services, Huddersfield, UK).



[CR: 19980126]

Smith, Joan M. SGML and Related Standards. Document Description and Processing Languages. Ellis Horwood Series in Computers and their Applications. New York/London: Ellis Horwood, 1992. xviii + 152 pages. ISBN: 0-13-806506-3.

The book supplies a valuable survey from the perspective of Joan Smith, who served as a leading SGML advocate in the UK for many years. Smith is an independent consultant, and founder of the International SGML Users' Group. See a publisher's description of the volume and the Table of Contents for a document overview. The volume is available for purchase through the International SGML Users' Group.

See also the book review by Simon Wickes in <TAG> magazine, May 1993.



Smith, Joan M. SGML Products and Services. CALS in Europe SIG, 1990- [various].

A document covering primarily CALS-SGML, produced by Joan Smith for the CALS in Europe SIG. Periodically updated. The cost is approximately 20 UK pounds. Contact: David Ardron, Secretary, CALS in Europe SIG; Ferranti Computer Systems Ltd.; Western Road, Bracknell, Berkshire RG12 1RA; UNITED KINGDOM; TEL: +44-344-483232.



Smith, Joan M. "The Standard Generalized Markup Language (SGML) for Humanities Publishing." Literary and Linguistic Computing 2/3 (1987) 171-175. ISSN: 0268-1145.

Abstract: A new methodology, the core of which is generic coding, has been developed within the International Organization for Standardization (ISO). This is known as the Standard Generalized Markup Language (SGML). Using SGML, the elements of a document are marked up as to their role, be it a paragraph, an abstract, a note, or whatever; the style of presentation is a separate issue and is not addressed by SGML. These elements can form part of a data base, which can be updated at will. So there is the notion of data base publishing. The Standard Generalized Markup Language is presented as a tool for full-text data base publishing, where the options for output are open, an example being given as a marked up document. Its value for all aspects of humanities publishing is addressed: whether for scholarly papers intended for a journal, books, specialist publications, dictionaries, or biographies, indeed whatever is input to an electronic medium with the intention of being imaged subsequently in some form; whether alone, in part, or in combination with other text. SGML represents an advance in publishing methodology, taking advantage of developing technology. It can be exploited as such in an academic environment to give an added dimension to research publications.



Smith, Joan M. "Standard Generalized Markup Language and Related Standards." Computer Communications 12/2 (April 1989) 80-84. ISSN: 0140-3664. CODEN: COCOD7.

Abstract: Projects developed by the International Organization for Standardization-International Electrotechnical Commission Joint Technical Committee 1-Subcommittee 18-Working Group 8 are described here, with the working group concentrating on the formulation of standards for text description and processing languages in the broader domain of text and office systems. Central to the work of WG 8 is ISO 8879 Standard Generalized Markup Language for the description of the information content of documents. Other standards and technical reports produced by the group support SGML in some way, either directly or indirectly. Their role in office publishing is described, and some information is given about office applications and the products that are available in the marketplace.

Joan Smith has contributed numerous articles covering (SGML) standards updates. E.g., see "Standards," Literary and Linguistic Computing 4/4 (1989) 294-296; "Standards," Literary and Linguistic Computing 4/1 (1989) 57-58; "Standards," Literary and Linguistic Computing 1/3 (1986) 191-192.



Smith, Joan M. The Standard Generalized Markup Language (SGML): Guidelines for Editors and Publishers. British National Bibliography Research Fund, 26. Boston Spa [UK]: British National Library, 1987. ISBN: 0-7123-3111-5. ISSN: 0264-2972.

The abstract for Smith's companion "Guidelines for Authors" volume (see the next entry) generally pertains to this document as well.



Smith, Joan M. The Standard Generalized Markup Language (SGML): Guidelines for Authors. British National Bibliography Research Fund, 27. Boston Spa [UK]: British National Library, 1987. ISBN: 0-7123-3112-3. ISSN: 0264-2972.

Abstract: These guidelines are for authors of scholarly publications who wish to prepare documents for a publisher on existing text entry devices, word processors and personal computers, adding markup to the text in accordance with the Standard Generalized Markup Language (SGML). A simple approach is adopted, based on the concept of a starter set of tags. An explanation of SGML is given and why markup should be used, and advice provided on what is to be done if the author has a publisher, has not yet got a publisher, or is his or her own publisher. As far as the preparation of the document is concerned, there is advice on keying conventions, when not to use stylistic and formatting characteristics of the system, and conditions under which its features and facilities may be used. The starter set of tags is explained, and how to deal with lists, tables, and figures. Cross referencing is addressed and the preparation of an index -- all with examples. Information is given on how to extend the starter set and how to cope with text the author may not be able to mark up for any reason. How to deal with characters for printing, that cannot be imaged on the text entry device, is explained, also how to use abbreviations for lengthy character strings of a repetitive nature. For all other issues, the author is referred to the publisher, to the companion 'Guidelines for Editors and Publishers', and to the standard itself.



[CR: 19951113]

Smith, Joan M. "The Use of SGML in the Information Market." Pages 63-74 in Protext III. Proceedings of the Third International Conference on Text Processing Systems. International Conference on Text Processing Systems. Trinity College, Dublin. 22-24 October, 1986. Edited by J. J. H. Miller. Dublin, Ireland: Dún Laoghaire, Co., Boole Press Ltd., January 1987. ISBN: 0-906783-55-0 (hardback); 0-906783-56-9 (paperback).

"Abstract: The Standard Generalized Markup Language (SGML) received the seal of approval of member bodies of the International Organization for Standardization (ISO) and its publication as an international standard is expected at the end of 1986. It is a standard for full-text data base publishing where this includes computer-assisted publishing and electronic publishing. The methodology is such that the marked up text may be exploited to produce a multiplicity of products from the same data base, the markup being such that the text can be printed or displayed at will in a variety of styles. Application of generic coding methods will give rise to greater freedom in publishing where there can be exploitation of a corporate data base. Information is given on the way some of the sectors in the information market are taking up SGML. How SGML may be applied by means of a starter document type is described, where this may readily be modified or extended dependent on the specific application."



Smith, Joan M.; Stutely, Robert S. SGML: The Users' Guide to ISO 8879. Chichester/New York: Ellis Horwood/Halsted, 1988. 173 pages. ISBN: 0-7458-0221-4 (Ellis Horwood). ISBN: 0-470-21126-1 (Halsted); LC CALL NO: QA76.73.S44 S44 1988.

The book's features are as follows: (1) it supplies a list of some 200 syntax productions, in numerical and alphabetical sequence; (2) it gives a combined abbreviation list; (3) it includes highly useful subject indices to ISO 8879 and its annexes; (4) it supplies graphic representations for the ISO 8879 character entities; (5) it lists SGML keywords and reserved names. A more complete overview of the book may be found in the SGML Users' Group Newsletter 9 (August 1988) 9.



Smith, MacKenzie. "DynaText: An Electronic Publishing System [Review of Electronic Book Technologies' DynaText program]." Computers and the Humanities 27/5-6 (1993-1994) 415-420. 10 references. Author affiliation: Chicago University, IL, USA/Harvard University.

Abstract: "DynaText is an electronic book publishing system that allows you to produce ready-to-ship books, or collections of books, on a variety of media such as diskette and CDROM. Several computer platforms are supported including UNIX (using X-windows), MS-Windows, and Macintoshes. The complete system consists of a compiler and indexer that allow a publisher to build an electronic book, and a browser that allows readers to display and navigate in the book, and perform searches in the text. It is one of the few publishing systems to take full advantage of SGML, while incorporating popular features of electronic books such as hypertext linking. With DynaText you can take ordinary text, vector and raster graphics, tables, equations, audio and video clips, and add several types of hypertext links, context-sensitive keyword search capabilities, or multiple views of a document. It also has the ability to launch other programs from inside a text and return the reader to the text at a later point. The DynaText publishing system is a complex and sophisticated tool for producing high quality electronic books on most of the major computer platforms. Its requirement of SGML compliant documents as input usually means a longer process before the book can be produced, but also means that you are not tied to the system in the future, since your texts can be ported easily to other platforms and systems. The ability of users to annotate texts and create their own hypertext links seems particularly valuable to humanities text publishers. DynaText's support of the full range of hypertext and windowing features makes it very easy for publishers to design and readers to use. For academics with large corpora to publish, this type of system, while expensive, is one of the few reasonable options."



Smith, Norman E. Managing WEB Documents With OmniMark. Paper presented at the 1994 OmniMark User's Group Meeting (OMUG) in Tyson's Corner, Virginia. Oak Ridge, TN: DOE, Office of Scientific and Technical Information, Science Applications International Corp., November 6, 1994. Author's affiliation: Norman E. Smith; Science Applications International Corp.; P.O. Box 2501; 301 Laboratory Road; Oak Ridge, TN 37831-2501; (615) 576-2276; Email: smithn@zeus.osti.gov.

"Abstract: The Department of Energy (DOE) Office of Scientific and Technical Information (OSTI) set its World Wide Web (WWW) Server up as a Standard Generalized Markup Language (SGML) application from the very beginning. SGML processing is built around OmniMark. Web HyperText Markup Language (HTML) documents are parsed with OmniMark and SGML syntax errors corrected before being loaded on the production Web Server. Automation of hypertext links is an absolute necessity as the number of documents on a server grows to prevent dangling hyperlinks. SGML provides the automation vehicle for the OSTI Web Server. Hypertext links are managed via SGML and the parsing process. Each document is given a logical name which is set up as an SGML entity reference. The SGML entity contains the Universal Resource Locator (URL) for the document. The OmniMark program substitutes the proper URL for the logical name reference automatically generating valid hyperlinks. The SGML approach has made possible several complete reorganizations of the file structure on the Web Server with minimal impact on either outside access or staff sanity. This paper examines using OmniMark in managing Web Servers from an SGML perspective. This document describes work performed at the DOE Office of Scientific and Technical Information under contract DE-ACO5-91MA40061."

The document is available online from DOE/OSTI, or in mirror copy here.



Smith, Norman E. Managing Web Documents with SGML. DOE/OSTI Research Report. Oak Ridge, TN: DOE, Office of Scientific and Technical Information, Science Applications International Corp., 1994 [1995?]. approximately 13 pages.

"Abstract: The DOE Office of Scientific and Technical Information (OSTI) set its World Wide Web (WWW) Server up as an SGML application from the very beginning. Web HyperText Markup Language (HTML) documents are parsed and SGML syntax errors corrected before being loaded on the production Web Server. Automation of hypertext links is an absolute necessity as the number of documents on a server grows to prevent dangling hyperlinks. SGML provides the automation vehicle for the OSTI Web Server. Hypertext links are managed via SGML and the parsing process. Each document is given a logical name which is set up as an SGML entity reference. The SGML entity contains the Universal Resource Locator (URL) for the document. The SGML parser substitutes the proper URL for the logical name reference automatically generating valid hyperlinks. The SGML approach has made possible several complete reorganizations of the file structure on the Web Server with minimal impact on either outside access or staff sanity. This paper examines the issues of managing Web Servers from an SGML perspective. This document describes work performed at the DOE Office of Scientific and Technical Information under contract DE-ACO5-91MA40061."

Available from the DOE/OSTI WWW server: Managing Web Documents With SGML, by Norman E. Smith [or in mirror copy here].



[CR: 19971229]

Smith, Norman E. Practical Guide to SGML Filters. Wordware's Advanced Book Series. : Wordware Computer Books, 1996. Extent: 450 pages. ISBN: 1-55622-511-3 ["$44.96, CN $69.95, AU $95.95"]. Author's affiliation: SAIC; Email: norman.e.smith@cpmx.mail.saic.com.

Abstract: "This book provides comprehensive coverage of this important language of the Internet programming environment including case studies and two disks which contain the OmniMark Sampler, a fully functional commercial SGML parser. Also included with the disks is the PC version of Electronic Book Technologies RTF to Rainbow SGML conversion. Norman Smith, CDP, is a senior systems analyst and programmer for Science Applications International, Corp. with over twenty years of experience with programming with SGML as a specialty. Book With Diskettes."

"This book provides coverage of SGML, including case studies and disks containing the OmniMark Sampler, a fully functional commercial SGML parser. The book is logically divided into three sections. The first covers background material on writing SGML/HTML file filters. The middle section is a chapter on each of five languages used in the case studies. These languages include AWK, C, OmniMark, Perl, and S-Engine (a Forth-based language). The language coverage is more than a 'quick reference', but less than a tutorial. The idea is to present enough of the language to give you a feel for it and to aid understanding of the code in the case studies. The final section is a group of case studies, with implementation in two or more of the five languages. The case studies are: Structured ASCII to SGML; SGML to HTML; SGML to TeX; SGML to SGML; ASCII to HTML; RTF to SGML; SGML to RTF. The disks with the book include a demo copy of OmniMark plus AWK, Perl, Rainbow DTD converter, and all of the code from the book."

A more detailed description of the book is available in an announcement posted to CTS. The accompanying diskettes are [January 17, 1997] available for download from the Wordware server; diskette #1, diskette #2. Also, see pre-publication information: Wordware: http://www.wordware.com/page3.html. Wordware: 1-800-229-4949



[CR: 19961018]

Smith, Philip N.; Brailsford, David F. "Towards Structured, Block-Based PDF." Pages 153-165 (with 23 references) in EP '96. Proceedings of the Sixth International Conference on Electronic Publishing, Document Manipulation and Typography. [ = Journal Special Issue: Electronic Publishing - Origination, Dissemination and Design (EPODD), June & September 1995, Volume 8, Issues 2-3.] Sixth International Conference on Electronic Publishing, Document Manipulation and Typography, Palo Alto, California. September 24-26, 1996. Sponsored by Adobe Systems Incorporated; School of Information Management and Systems, University of California at Berkeley; Xerox Corporation. [Proceedings Volume] Edited by Allen Brown, Anne Brüggemann-Klein, and An Feng; [Journal] Editors David F. Brailsford and Richard K. Furuta. Chichester/ New York: John Wiley & Sons, 1996. ISSN: 0894-3982.

Abstract: "The Portable Document Format (PDF), defined by Adobe Systems Inc. as the basis of its Acrobat product range, is discussed in some detail. Particular emphasis is given to its flexible object-oriented structure, which has yet to be fully exploited. It is currently used to represent not logical structure but simply a series of pages and associated resources."

"A definition of an Encapsulated PDF (EPDF) is presented, in which EPDF blocks carry with them their own resource requirements, together with geometrical and logical information. A block formatter called Juggler is described which can lay out EPDF blocks from various sources onto new pages. Future revisions of PDF supporting uniquely-named EPDF blocks tagged with semantic information would assist in composite-page makeup and could even lead to fully revisable PDF."

For other conference information, see the main conference entry for EP '96, or the brief history of the conference as sixth in a series since 1986. See the volume main bibliographic entry for a linked list of other EP '96 titles relevant to SGML and structured documents.



[CR: 19971125]

Smith, Tracy. "Intuitive SGML: Database Integration in SGML Authoring." Page(s) 119-120 in SGML '97 Conference Proceedings. SGML Europe '97. "The Next Decade - Pushing the Envelope." Princesa Sofia Intercontinental, Barcelona, Spain. 11-15 May, 1997. Sponsored by Graphic Communications Association (GCA) and SGML Open. Conference Chair: Pamela L. Gennusa (Director, Database Publishing Systems Ltd). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 342 pages, CDROM. Author's affiliation: Documentation Systems Consultant, Novell Inc.; Email: trsmith@novell.com.

Abstract: "Authoring in SGML is difficult and time consuming. Creating SGML documents is costly and complex. Although many of the SGML authoring tools available provide superior SGML functionality, many are not intuitive. This paper will discuss Novell's approach to creating structured hypertext documents intuitively and efficiently by integrating and customizing current database and SGML authoring technologies. The main goal of the system Novell developed is to optimize the author's ability to create and manage structured content.

"The focus of the presentation will be a demonstration of the tool Novell developed to solve many of these problems."

Note: The electronic conference proceedings in hypertext were produced by Inso Corporation (DynaText) and by High Text (EnLIGHTeN). Information about the SGML Europe '97 Conference may be found in the main database entry.



[CR: 19971227]

Smith, Walter. "OpenTag Initiative: Common Data Extraction and Abstraction Method for Translation and NLP Activities." Pages 113-132 in SGML/XML '97 Conference Proceedings. SGML/XML '97. "SGML is Alive, Growing, Evolving!" The Washington Sheraton Hotel, Washington, D.C., USA. December 7 - 12, 1997. Sponsored by the Graphic Communications Association (GCA) and Co-sponsored by SGML Open. Conference Chairs: Tommie Usdin (Chair, Mulberry Technologies), Debbie Lapeyre (Co-Chair, Mulberry Technologies); Michael Sperberg-McQueen (Co-Chair, University of Illinois). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 691 pages, CDROM; print volume contains author and title indexes, keyword and acronym lists. Author's affiliation: [Walter Smith]: International Language Engineering Corporation, 1600 Range Street, Boulder, CO 80301; Email: walters@ile.com.

Abstract: "The OpenTag format proposes to use the power of an open standard (XML) to access valuable information hidden away in private-format files. One of the primary benefits of using the OpenTag format to leverage information is that you don't have to change anything about the way you're currently working. Users of FrameMaker or Interleaf can continue to author and publish in their familiar environments, and still benefit without ever converting to a complete SGML/XML solution. Of course, certain tweaks to your development techniques can maximize your return on information investment. One of the biggest challenges is to efficiently access text when it's embedded within code and other non-textual data in a multitude of different formats, so using a standard method of marking up that extracted text can greatly boost the efficiency with which it can be consistently reused.

"The OpenTag Initiative is a working group in which both localization customers and their suppliers are defining a standard that will support open data encoding methods during the localization process, and permit robust data interchange between suppliers and customers. The OpenTag format is a single common markup format to encode text extracted from documents of varying and arbitrary formats. By abstracting a file's heterogeneous formatting information into OpenTag markup, you can produce homogeneously tagged text files, regardless of the original file format. Rather than converting information from 'format X' into the OpenTag format, data are extracted from 'format X', manipulated in an OpenTag environment, and later merged back into the 'format X' file."

This paper was delivered as part of the "User" track in the SGML/XML '97 Conference.

For more on OpenTag markup, see the dedicated database entry for the OpenTag Initiative, and its relationship to other early 'XML' applications.

Note: The SGML/XML '97 conference proceedings volume is available from the Graphic Communications Association, 100 Daingerfield Road, Alexandria, VA 22314-2888; Tel: +1 (703) 519-8160; FAX: +1 (703) 548-2867. Complete information about the conference (e.g., program listing, tutorials, show guide, DTDs, conference reports) is provided in the dedicated conference section of the SGML/XML Web Page and via the GCA Web server. The electronic proceedings on CDROM were produced courtesy of Jouve Data Management (Jouve PubUser).



[CR: 19950903]

SoftQuad, Inc. The SGML Primer. SoftQuad's Quick Reference Guide to the Essentials of the Standard: The SGML Needed for Reading a DTD and Marked-up Documents and Discussing them Reasonably. Version 3.0 = Correction and revision of Version 2.0, May 1991. Toronto: SoftQuad Inc., December, 1991. 36 pages.

This SGML Primer from SoftQuad provides a highly readable and even enjoyable introduction to the essential concepts and features of SGML. It may be one of the best brief treatments of SGML you can find -- something you can lend to colleagues without fear of having them turned off by the unavoidable complexity of SGML. The book consciously attempts a popular presentation, using clever illustrations, some surprising examples (structured events in the world of cuisine art, recipe for a biblical mythology), and a bare minimum of technical language. It is available from SoftQuad Inc.; 56 Aberfoyle Crescent, Suite 810; Toronto, Ontario; Canada M8X 2W4; TEL: +1 (416) 239-4801; FAX: +1 (416) 239-7105.

SoftQuad Inc. deserves our thanks for creating the [1995] online edition of The SGML PRIMER. The paper print version is probably still prettier, but a lot of work has been done using color graphics to make this online version a highly usable SGML introduction. When someone asks for an online crash course in SGML essentials (e.g., "before tomorrow morning at 8:00"), I recommend that you point them to the URLs below. See: (1) SGML Primer: Introduction, and (2) The SGML Primer: Main Text. Or local copy: introduction, main section.



SoftQuad, Inc. The SGML World Tour. Toronto, Ontario: SoftQuad, Inc., Spring, 1994. ISBN: 1-896172-01-6.

This publication is a large and valuable library of SGML resources on CDROM disk. It may be ordered for $24.00 US from SoftQuad. Tel: 1-800-387-2777 (1 416 239-7105). For more on SoftQuad's SGML products, see their WWW home page, and the SGML World Tour Features: A World of SGML Resources on CD-ROM [was/check: description of the SGML World Tour (mirrored here)].



[CR: 19961030]

Soutberg, Jeroen. "SGML and TeX at Elsevier Science Publishers." MAPS (Minutes and Appendices- Nederlandstalige TeX Gebruikersgroep) 5 (November 1990) 85-88.

[Reference is from the PREMIUM Project]



[CR: 19951113]

Southall, Richard. "Presentation Rules and Rules of Composition in the Formatting of Complex Text." Pages 275-290 (with 27 references) in EP [Electronic Publishing] 92: Proceedings of Electronic Publishing, 1992 International Conference on Electronic Publishing, Document Manipulation, and Typography. Swiss Federal Institute of Technology, Lausanne, Switzerland. April 7-10, 1992. Sponsored by the Swiss Federal Institute of Technology and the Swiss National Science Foundation. Edited by Christine Vanoirbeek and Giovanni Coray [EPF, Lausanne, Switzerland]. The Cambridge Series on Electronic Publishing. Cambridge: Cambridge University Press, 1992. ISBN: 0-521-43277-4. Author affiliation: Faculty of Design for Manufacture, London College of Art, UK.

"Abstract: The configuration of the actual document produced when a generically marked-up virtual document is formatted depends on rules of composition which govern the action of the formatting system, as well as on the presentation rules associated with the document. Rules of composition are of two kinds: spacing rules and rules of orthography. Statements of such rules in compositors' manuals from the era of metal-type composition are quoted, and their underlying rationales discussed. The application of rules of composition by present-day document formatting systems depends on the explicit delimitation of compositional environments in generically marked-up documents, and on the systems' ability to deal explicitly with visual structure."



[CR: 19950804]

Sperberg-McQueen, C. Michael. "Bare bones TEI: A very very small subset of the TEI Encoding Scheme." Electronic Texts and the Text Encoding Initiative [Special Issue] = TEXT Technology: The Journal of Computer Text Processing 5/3 (Autumn, 1995) 248-265. ISSN: 1053-900X. Author's affiliation: Senior Research Programmer, University of Illinois at Chicago; TEI editor.

"The volume concludes with a simple introduction to the bare bones of the TEI scheme intended to whet the appetite of the reader for a more detailed and thorough exposition. Written by my esteemed colleague and co-editor of the TEI Guidelines, Michael Sperberg-McQueen, it presents the bare essentials of the TEI encoding scheme, in a copiously illustrated and very accessible form, designed specifically for the novice text encoder." [from the issue Introduction, by Lou Burnard]

See the main entry for this special issue of TEXT Technology dedicated to the TEI, edited by Lou Burnard. See also the online version of this particular article.



[CR: 19950716]

Sperberg-McQueen, C. Michael. Bare Bones TEI: A Very Very Small Subset of the TEI Encoding Scheme. TEI Document No. TEI U6. 30 Aug 1994, rev. June 1995. Chicago, IL: University of Illinois at Chicago, June, 1995. Extent: approximately 26 pages. Author's affiliation: Computer Center, University of Illinois at Chicago.

"Bare Bones TEI: A Very Very Small Subset of the TEI Encoding Scheme (document no. TEI U6) describes a very small set of tags for users first learning the TEI encoding scheme. The tag set described is small enough to be non-threatening, but probably not large enough for serious work with real texts --- it's about the same size as the first versions of HTML. Available [July 1995] in three forms."

Availability: SGML form (using the TEI Lite DTD); HTML form in multiple small files (for faster retrieval); or HTML form in a single file (for easier printing).



[CR: 19971227]

Sperberg-McQueen, Michael. "Closing Keynote." Page 19 in SGML/XML '97 Conference Proceedings. SGML/XML '97. "SGML is Alive, Growing, Evolving!" The Washington Sheraton Hotel, Washington, D.C., USA. December 7 - 12, 1997. Sponsored by the Graphic Communications Association (GCA) and Co-sponsored by SGML Open. Conference Chairs: Tommie Usdin (Chair, Mulberry Technologies), Debbie Lapeyre (Co-Chair, Mulberry Technologies); Michael Sperberg-McQueen (Co-Chair, University of Illinois). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 691 pages, CDROM; print volume contains author and title indexes, keyword and acronym lists. Author's affiliation: [C. M. Sperberg-McQueen]: University of Illinois at Chicago; also Editor in Chief, Text Encoding Initiative, and Co-editor of the XML specification (with Tim Bray); Email: U35395@UICVM.UIC.EDU; WWW: http://www.uic.edu/~cmsmcq/.

Summary: "The major themes of the conference will be recapitulated with observations on the state of the SGML/XML world. Observations on important or telling events at the conference will be interspersed with opinions on their significance." [watch this space for a link to a published summary]

This presentation was delivered as the Closing Keynote Address at the SGML/XML '97 Conference.

Note: The SGML/XML '97 conference proceedings volume is available from the Graphic Communications Association, 100 Daingerfield Road, Alexandria, VA 22314-2888; Tel: +1 (703) 519-8160; FAX: +1 (703) 548-2867. Complete information about the conference (e.g., program listing, tutorials, show guide, DTDs, conference reports) is provided in the dedicated conference section of the SGML/XML Web Page and via the GCA Web server. The electronic proceedings on CDROM was produced courtesy of Jouve Data Management (Jouve PubUser).



Sperberg-McQueen, C. Michael. "Specifying Document Structure: Differences in LaTeX and TEI Markup." TUGboat [Proceedings of the 1991 Annual Meeting] 12/3 (December 1991) 415-421.

The article is available in a related version as a TEI document (TEI EDW22, June 9, 1991).



Sperberg-McQueen, C. Michael. "The Standard Generalized Markup Language (SGML): A Brief Introduction." Proceedings of the American Society for Information Science = Proceedings of the ASIS annual meeting [Proceedings of the 56th Annual Meeting of the American Society for Information Science, October 24-28, 1993, Columbus, OH] 30 (1993) 285. ISSN: 0044-7870.



Sperberg-McQueen, C. Michael. "The Text Encoding Initiative: Electronic Text Markup for Research." Pages 35-56 in Literary Texts in an Electronic Age: Scholarly Implications and Library Services. A Collection of the Papers Presented at the 1994 Clinic on Library Applications of Data Processing at the Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign. Clinic on Library Applications of Data Processing, Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign, April 10-12, 1994. Edited by Brett Sutton. University of Illinois, Urbana-Champaign: The Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign, 1994. ISBN: 0-87845096-3. ISSN: 0069-4789.

"Abstract: This paper describes the goals and work of the Text Encoding Initiative (TEI), an international cooperative project to develop and disseminate guidelines for the encoding and interchange of electronic text for research purposes. It begins by outlining some basic problems that arise in the attempt to represent textual material in computers and some problems that arise in the attempt to encourage the sharing and reuse of electronic textual resources. These problems provide the necessary background for a brief review of the origins and organization of the Text Encoding Initiative itself. Next, the paper describes the rationale for the decision of the TEI to use the Standard Generalized Markup Language (SGML) as the basis for its work. Finally, the work accomplished by the TEI is described in general terms, and some attempt is made to clarify what the project has and has not accomplished."

Another abstract for the article is available from ETEXTCTR Review #2 (Jerry Caswell).



Sperberg-McQueen, C. Michael. "Text in the Electronic Age: Textual Study and Text Encoding, with Examples from Medieval Texts." Literary and Linguistic Computing 6/1 (1991) 34-46. ISSN: 0268-1145.

Abstract: This paper discusses characteristic problems in designing methods of encoding texts in machine-readable form for textual study. Any electronic representation of a text embodies specific ideas of what is important in that text. A well-developed encoding scheme is thus in some sense a theory of the texts it is intended to mark up. This paper describes, with examples, the theory implicit in the Text Encoding Initiative (TEI), a project to develop guidelines for the encoding of machine-readable texts. Any machine-readable representation of texts must use markup, but no finite vocabulary of markup items can be complete, since neither the set of textual features worth marking nor the set of texts to be studied is finite. Any useful markup scheme must therefore be extensible. Additionally, a markup scheme must allow several discrete views of texts. Texts are both linguistic and physical objects. They have simultaneously a linear, a hierarchical and a directed-graph structure. They refer to objects in real or fictive universes. Texts, finally, are cultural and thus historical objects: a useful encoding scheme must be able to represent textual variation, parallel texts, and the gradual accretion of interpretation and commentary with which human culture adorns venerated texts.



[CR: 19960330]

Sperberg-McQueen, C. Michael. Textual Criticism and the Text Encoding Initiative. Presentation at MLA '94, San Diego, Session sponsored by Emerging Technologies Committee of MLA. Chicago, IL: Computer Center, University of Illinois at Chicago, December 1994. Extent: approximately 22 pages, 70K HTML file. Author's affiliation: [University of Illinois at Chicago, and TEI Editor].

"In this paper I want to discuss some of the more obvious issues raised by efforts to create electronic texts, and in particular electronic versions of scholarly editions. [Walter] Benjamin's essay ['Das Kunstwerk im Zeitalter seiner technischen Reproduzierbarkeit'] is particularly suggestive here, in the context of efforts to make literary (and non-literary) texts reproducible by new technological methods. I begin by making explicit some of my assumptions about the goals and requirements of electronic scholarly editions; in the second section I explain why my list of requirements says nothing about the choice of software for the preparation and use of scholarly editions. The third section will describe the work and results of the Text Encoding Initiative, a cooperative international project to develop and disseminate guidelines for the creation and interchange of electronic texts, and show how they relate to the requirements for electronic scholarly editions. In the concluding section, I will outline some of the implications of the TEI for electronic and printed scholarly editions, and some essential requirements for any future consensus on how to go about creating useful electronic scholarly editions." [from the document Introduction]

The document is available via the Internet in HTML and (TEI) SGML format. URLs: "Textual Criticism and the Text Encoding Initiative" [HTML]; SGML version; [mirror copy, HTML]. See also the host page, "Miscellaneous Talks and Papers," http://www.uic.edu:80/orgs/tei/misc/.



Sperberg-McQueen, C. Michael. Trip report: CETH Summer Seminar 1995. Posting to TEI-L, Text Encoding Initiative public discussion list. 10:04:52 CDT, Tue, 27 June, 1995. Author Affiliation: ACH/ACL/ALLC Text Encoding Initiative.

"The Center for Electronic Texts in the Humanities at Princeton and Rutgers Universities held its fourth summer seminar earlier this month under the title ELECTRONIC TEXTS IN THE HUMANITIES: METHODS AND TOOLS. . ."

See the text of the report in this database or in the TEI-L archives. See also the link to the seminar description.



Sperberg-McQueen, C. Michael. Trip Report, Coalition for Networked Information. Task Force Meeting, Washington, D.C. 10-11 April 1995, CNI/AAUP Joint Initiative Workshop, 11-12 April 1995. Posting submitted to TEI-L Mailing List [TEI-L@UICVM.BITNET], 21-April-1995. April, 1995. approximately 10 pages.

Several presentations in the sessions summarized uses of SGML within the academic/libraries communities. Included are reports on the DLI (Digital Libraries Initiative) Project and the Model Editions Partnership (using TEI-SGML). TEI-SGML and the TEI Header were featured in some of the talks. An online copy of the report is available from this WWW server as well as from the TEI-L archives.



Sperberg-McQueen, C. Michael. Trip Report: MLA '94, San Diego. Posting submitted to TEI-L Mailing List [TEI-L@UICVM.BITNET], 3-January-1995. December, 1994. approximately 8 pages.

The report treats several (TEI-)SGML matters, including mention of vendors marketing SGML-aware software. Topics: Chadwyck-Healey (English Poetry); Piers Plowman SGML edition; DOE Corpus; Canterbury Tales Project; TEI Guidelines. A copy of the report is available on this WWW server, as well as in the TEI-L archives at UICVM.



Sperberg-McQueen, C. Michael. Trip Report: Society for Technical Scholarship, New York City, 6-8 April 1995. Posting to TEI-L (TEI-L@UICVM.BITNET), "Subject: Trip Report: Society for Technical Scholarship," April 17, 1995.

An online copy of the report is available from the TEI-L archives and from this WWW server.



[CR: 19971018]

Sperberg-McQueen, C. M.; Bray, Tim. "Extensible Markup Language (XML)." Pages 160 - 163 in ACH-ALLC '97. The 1997 Joint International Conference of the Association for Computers and the Humanities and the Association for Literary and Linguistic Computing. Conference Abstracts. ACH-ALLC '97. Queen's University at Kingston, Ontario, Canada. June 3 - 7, 1997. Compiled by Greg Lessard and Michael Levison. Ontario, Canada: Queen's University, 1997. ISBN: 0-88911-760-8. Authors' affiliation: [Sperberg-McQueen]: University of Illinois at Chicago, Email: u35395@uicvm.uic.edu; [Bray]: Textuality, Email: tbray@textuality.com.

[Extract:] "Extensible Markup Language (XML for short) is being designed under the auspices of the World-Wide-Web Consortium (W3C); the larger goal of this effort is 'to enable future Web user agents to receive and process generic SGML in the way that they are now able to receive and process HTML. As in the case of HTML, the implementation of SGML on the Web will require attention not just to structure and content (the domain of SGML per se) but also to link semantics and display semantics.' (See http://www.w3.org/pub/WWW/MarkUp/SGML/Activity for the W3C's description of this activity.) As a subgoal, we are creating an SGML application profile, XML, that is designed to provide many of the benefits of SGML in a lightweight, easy-to-use, easy-to-implement dialect that omits many of the difficult or problematic features of the full standard. This paper is a report on the XML specification; if time allows, some information will also be provided on the progress of the work toward a typology of links and link behaviors. At the time this abstract is prepared, the XML specification has been made public, but is still officially a working draft."

Abstract available online in HTML format: "Extensible Markup Language (XML)", by C. M. Sperberg-McQueen and Tim Bray. Presentation at ACH/ALLC '97. [archive copy]. Further information on the Extensible Markup Language is available in the main XML page.

Additional information on the ACH-ALLC '97 Conference is available in the SGML/XML Web Page main conference entry, or [August 1997] via the Queen's University WWW server.



[CR: 19950823]

Sperberg-McQueen, C. M.; Burnard, Lou. "The Design of the TEI Encoding Scheme." The Text Encoding Initiative: Background and Contents, Guest Editors Nancy Ide and Jean Véronis = Computers and the Humanities 29/1 (1995) 17-39.

Abstract: "This paper discusses the basic design of the encoding scheme described by the Text Encoding Initiative's Guidelines for Electronic Text Encoding and Interchange (TEI document number TEI P3, hereafter simply P3 or the Guidelines). It first reviews the basic design goals of the TEI project and their development during the course of the project. Next, it outlines some basic notions relevant for the design of any markup language and uses those notions to describe the basic structure of the TEI encoding scheme. It also describes briefly the 'core' tag set defined in chapter 6 of P3, and the 'default text structure' defined in chapter 7 of that work. The final section of the paper attempts an evaluation of P3 in the light of its original design goals, and outlines areas in which further work is still needed."



Sperberg-McQueen, C. M.; Burnard, Lou. "The ODD System of Tag Set Documentation." Pages 221-222 [partial abstract] in Colloque International "Consensus ex Machina?". Abstracts International Joint Conference of the ALLC (Association for Literary and Linguistic Computing) and ACH (Association for Computers and the Humanities), Sorbonne, Paris, 19-23 avril 1994. Paris: Laboratoire "Lexicométrie et textes politiques" (INaLF, CNRS), and Ecole Normale Supérieure de Fontenay - Saint Cloud, 1994. 244 pages. Authors' Affiliation: [Sperberg-McQueen] University of Illinois at Chicago; [Burnard] Oxford University Computing Services.

The paper describes a system for the documentation of 'document type definitions' (DTDs) used in SGML. "ODD" stands for "One Document Does it all". The system was developed through research of the Text Encoding Initiative (TEI) and British National Corpus projects. A single "ODD" file is used to generate DTD files, reference documentation for all defined elements and entities, and full documentation of the tag set in running prose. [adapted from the abstract].



[CR: 19960206]

Sperberg-McQueen, C. Michael; Goldstein, R. F. "HTML to the Max: A Manifesto for Adding SGML Intelligence to the World-Wide Web." Computer Networks and ISDN Systems 28/1-2 (December 1995) 3-11 (with 4 references). Authors' affiliation: Computing Center, Illinois University, Chicago, IL, USA.

"Abstract: HTML demonstrates that SGML markup is useful for networked information. How can it be made even more useful? One way is to extend the tag set from HTML to HTML2, etc. We argue for a more radical approach: full SGML awareness in WWW. We believe the difficulties are small, the cost affordable, and the advantages overwhelming. SGML is a metalanguage for defining markup languages; HTML is just one instance of this infinite family. At present, documents in other SGML document types must be translated into HTML for display by a Mosaic client --- sometimes this imposes unacceptable information loss. World Wide Web (WWW) browsers could handle other SGML document types without translation by launching a general-purpose SGML browser to view them, as they now launch graphics viewers; a better solution overall would be to build SGML display into the WWW browsers themselves. Either way, display of an SGML document would be controlled by a style sheet using a small number of display primitives ("bold", "line break", etc.) to specify the rendition of each element type. For "well-known" document type definitions (DTDs) like HTML, style sheets could be distributed with the browser, or built in. For other DTDs, the browser would fetch a style sheet from the server. Using style sheets, browser software can also make it easy to customize document display. DTDs and style sheets can be designed to accommodate extensions, ensuring that authors can make small extensions to the tag set with no change whatsoever in the target browsers and virtually no performance penalty."

The paper is based upon a presentation delivered at the Second International World-Wide Web Conference: Mosaic and the Web, Chicago, IL, USA, 17-20 Oct. 1994.



Sperberg-McQueen, C. Michael; Goldstein, Robert F. "HTML to the Max: A Manifesto for Adding SGML Intelligence to the World-Wide Web." Presentation at WWW-2 '94. Chicago, IL. September 15, 1994. Authors' addresses: Michael Sperberg-McQueen: cmsmcq@uic.edu; Robert Goldstein: bobg@uic.edu.

"Abstract: HTML demonstrates that SGML markup is useful for networked information. How can it be made even more useful? One way is to extend the tag set from HTML to HTML2, etc. We argue here for a more radical approach: full SGML awareness in WWW. We believe the difficulties are small, the cost affordable, and the advantages overwhelming.

"SGML is a metalanguage for defining markup languages; HTML is just one instance of this infinite family. At present, documents in other SGML document types must be translated into HTML for display by a Mosaic client --- sometimes this imposes unacceptable information loss.

"WWW browsers could handle other SGML document types without translation by launching a general-purpose SGML browser to view them, as they now launch graphics viewers; a better solution overall would be to build SGML display into the WWW browsers themselves. Either way, display of an SGML document would be controlled by a style sheet using a small number of display primitives ('bold', 'line break', etc.) to specify the rendition of each element type. For 'well-known' document type definitions (DTDs) like HTML, style sheets could be distributed with the browser, or built in. For other DTDs, the browser would fetch a style sheet from the server. Using style sheets, browser software can also make it easy to customize document display.

"DTDs and style sheets can be designed to accommodate extensions, ensuring that authors can make small extensions to the tag set with no change whatsoever in the target browsers and virtually no performance penalty."

Link to the authoritative version of the document at UIC, or in the online conference electronic proceedings, or see a mirrored copy here.

[CR: 19990519]

Sperberg-McQueen, C. Michael; Usdin, B. Tommie. "Welcome to Markup Languages: Theory & Practice." Markup Languages: Theory & Practice 1/1 (Winter 1999) 1-6. ISSN: 1099-6622 [MIT Press]. Authors' affiliation: [Sperberg-McQueen:] Senior Research Programmer, University of Illinois at Chicago; Email: cmsmcq@uic.edu; [Usdin:] President, Mulberry Technologies Inc.; Email: btusdin@mulberrytech.com; WWW: http://www.mulberrytech.com.

Abstract: "In this introductory 'Commentary and Opinion' essay, the editors of the journal describe why they and publisher decided to start the journal, and what they hope to accomplish."

'Markup Languages: Theory & Practice is a peer-reviewed technical journal publishing papers on research, development, and practical applications of text markup for computer processing, management, manipulation, and/or display. The scope of the journal includes: 1) design and refinement of systems for text markup and document processing; 2) specific text markup languages; 3) theory of markup design and use; 4) applications of text markup; 5) languages for the manipulation of marked up text.'

"The scope of the journal is wide enough to include current and future markup applications but is designed to limit the subject scope sufficiently to make the journal coherent. As may be seen, the journal is not limited to SGML and XML and their applications, though we believe them to be markup languages of considerable interest. SGML was not the first, and XML is unlikely to be the last, language of their kind; we hope this journal will prove a useful forum for discussions of design and implementation issues relating to markup languages present, past, and future. We hope Markup Languages: Theory & Practice will be equally hospitable to articles on theory and articles on practice. In the field of markup languages, theoretical questions may have immediate and obvious practical implications, and practical problems often raise profound and important theoretical issues. The best theorists continually learn from practical experience; the best implementers realize that there is 'nothing so practical as a good theory'."

"Markup Languages: Theory & Practice will include material of a variety of categories, including: 1) articles: especially on theoretical and practical aspects of markup or markup usage; 2) announcements: describing events or activities, especially future events likely to be of interest to our readers; 3) commentary and opinion: essays, such as this one, consisting primarily of the authors' opinions; 4) practice notes: discussions of common practice, suggestions for improved standard practice, or comparisons of methods for achieving similar goals; 5) project reports: descriptions of a project or application; reviews: discussion and description of books, software, web sites, etc. that may take the form of essays, short narrative reviews, or annotated tables of contents; 6) squibs: short (from one to a few pages) statements of fact, descriptions of problems, or anecdotes; 7) standards reports: discussions of any of the ever growing set of standards relating to markup."

For other articles in this issue of MLTP, see the annotated Table of Contents.



[CR: 19950716]

Spivak, Jeffrey. The SGML Primer, First Edition. Boyd & Fraser, [forthcoming,] 1996. ISBN: 0-7895-0194-5. Author's affiliation: Datalogics, Inc.

Abstract: "An introduction to the SGML standard for document structure definition, this primer guides the user through new terminology and concepts via description and example. Until now, few texts provided information on SGML in an accessible way. Students can and will embrace this beginner's guide to SGML, which explains the difficult concepts behind this popular standard in a basic, easy-to-grasp fashion.

  • Easy to comprehend terminology; important terms concisely defined for beginner
  • Real-world examples show SGML in use in businesses
  • User-friendly SGML 'shortcuts' and their use makes coding less intensive and easier for any student
  • 'Good' vs. 'Bad' SGML discussion allows users to avoid common mistakes
  • Appendix lists SGML definitions as listed in the ISO standard" [extracted from the publisher's database]

See a fuller description of the book on the Thomson WWW searchable online catalog.



[CR: 19980425]

St. Laurent, Simon. XML: A Primer. Foster City, CA: MIS Press/IDG Books, [February] 1998. Extent: xx + 348 pages. ISBN: 1-5582-8592-X. Author's affiliation: Systems Integration and Support Services Inc., Greensboro, NC.

From the book's back cover: "XML, an important new technology being developed by the World Wide Web Consortium, promises to replace HTML with a stronger, more extensible architecture. A derivative of SGML, XML will give Web designers the power of SGML scripting without the complexity. Developers will be able to manage information with increased power and flexibility not before possible with HTML. This essential guide will show Web developers how to take advantage of this powerful new technology quickly and painlessly. Techniques for integrating XML with new Web technologies such as Dynamic HTML and Cascading Style Sheets are discussed. Readers will learn to create search tools, Document Type Definitions (DTDs), customized tags, and commercial Web solutions. The accompanying Web site (http://www.mispress.com/xml/) includes the latest updates and information to the world of XML, keeping serious developers abreast of evolving technology." See the volume information available from the publisher. Or: the Amazon.Com description, [local archive copy]. Also: "[Book Review of] XML: A Primer." By Dianne Kennedy. In XML Files: The XML Magazine Issue 7 (August 27, 1998).

As of May 1999, the Web site for the book was: http://www.simonstl.com/xmlprim/index.html. The book is also available in a Korean translation (ISBN 898160019-8) from Powerbook Publishing. As of April 1998, an update page for XML: A Primer had been set up by the author; see http://www.simonstl.com/xmlprim/xmlupdate/. It includes, for example, an errata list and an updated section covering the xml:lang and xml:space attributes. See also the author's essay on XML and Filesystems which supplements some of the information in Chapters 11 and 12 of the book.

Note: Simon St. Laurent is also author of Dynamic HTML: A Primer.



[CR: 19990712]

St. Laurent, Simon; Biggar, Robert. Inside XML DTDs: Scientific and Technical. New York, NY: McGraw-Hill, 1999. Extent: xii + 468 pages, CDROM. ISBN: 0-07-134621-X. Authors' affiliation: [St. Laurent:] Writer and technical reviewer of computer books for IDG Books and McGraw-Hill publishing companies. WWW: http://www.simonstl.com/; [Biggar:] Professional programmer, PhD in physics.

"Although HTML got its start as a tool for distributing scientific papers, scientists, mathematicians, and other members of that original target audience have received fairly little from HTML's more recent development. The Extensible Markup Language (XML) and a number of key supporting standards promise to improve this situation by giving scientists and technologists an even more powerful set of tools, however. XML allows the creation and standardization of domain-specific vocabularies (described in Document Type Definitions, or DTDs), making it easy to develop precisely-defined shared standards for exchanging information. Inside XML DTDs: Scientific and Technical provides a guide to XML with a sharp focus on scientific and technical applications of this new technology. In addition to XML itself, MathML, a core W3C standard that can be used in many fields, receives extended coverage. The second half of Inside XML DTDs: Scientific and Technical explores emerging XML standards and tools in a number of fields, including biology, chemistry, astronomy, library science, and meteorology. The conclusion explains what developers will need to do in order to create their own applications of XML, and provides a guide to integrating XML with current information architectures and practices."

[July 1999] Simon St. Laurent posted an announcement concerning the recent publication of Inside XML DTDs: Scientific and Technical. St. Laurent's book Inside XML DTDs: Scientific and Technical "provides a guide to XML with a sharp focus on scientific and technical applications of this new technology. In addition to XML itself, MathML, a core W3C standard that can be used in many fields, receives extended coverage. The second half of Inside XML DTDs: Scientific and Technical explores emerging XML standards and tools in a number of fields, including biology, chemistry, astronomy, library science, and meteorology. The conclusion explains what developers will need to do in order to create their own applications of XML, and provides a guide to integrating XML with current information architectures and practices."

See: http://www.simonstl.com/scitech/index.html.



[CR: 19990603]

St. Laurent, Simon; Cerami, Ethan. Building XML Applications. New York, NY: McGraw-Hill, [May] 1999. Extent: 512 pages, 150 illustrations. ISBN: 0-07-134116-1. Author's affiliation: [St. Laurent:] Writer and technical reviewer of computer books for IDG Books and McGraw-Hill publishing companies. WWW: http://www.simonstl.com/; Email: simonstl@simonstl.com; [Cerami:] New York University and Riptide Communications. WWW: http://cs.nyu.edu/ms_students/cera7013/index.html, Email: cerami@cs.nyu.edu.

"The book focuses on Java XML parsers, including Aelfred, SAX (Simple API for XML), and Microsoft MS-XML. Other topics include XML/database integration and dynamically generated XML via Java Servlets."

[Authors' description:] "XML promises to revolutionize the Web and the nature of distributed computing. XML holds enormous promise as the file format of choice for Web development, document interchange, and data interchange, and presents a new world of opportunities and challenges to programmers. What Java is doing for programming, XML may do for data. Combining the two, as is done throughout this book, makes it possible to build exciting (and useful!) applications and architectures. Building XML Applications provides developers with a solid introduction to XML and key programming tools for building robust, scalable XML applications in Java. After a thorough introduction to XML's place in the developer's toolkit and its syntax, Building XML Applications presents detailed coverage of parsers, a key tool for developers. Focusing on Java development, the sample applications use the Simple API for XML (SAX) to create parser-independent solutions that can fit in a wide variety of situations. Other XML tools, like style sheets, namespaces, linking, and the Document Object Model (DOM) are also explored, giving developers a friendly but approachable introduction to these revolutionary technologies." See the information page on St. Laurent's Web site; [local archive copy].

[July 26, 1999] [Simon says:] 'Minor updates to Building XML Applications.' "I've posted a new version of the prefs.java file from Chapter 20 of Building XML Applications that works with Technology Release 2 of Sun's ProjectX XML parsers. (The version in the book uses Early-Access 1.) This is a very simple class for managing preference files built with XML using the DOM. The constructor has changed slightly to accommodate changed methods for loading XML documents. Otherwise, it isn't a dramatic shift. Also, I've added pointers to some work I've done based on the examples in Chapter 19 that led to my XLinkFilter work. When a new draft of XLink appears, I'll be updating XLinkFilter and those examples yet again. These materials are available at: http://www.simonstl.com/buildxml/index.html#update"



[CR: 1995]

Stabler, Hugh R. Experiences with High-Volume, High-Accuracy Document Capture. Rank Xerox Technical Report. Mitcheldean, United Kingdom: Rank Xerox, 1995. Extent: approximately 10 pages. Author's affiliation: Rank Xerox, Document Technology Centre, Mitcheldean, United Kingdom; Email: Hugh@dtc.rankxerox.co.uk.

Abstract: "Rank Xerox have implemented an in-house high-volume data capture operation enabling 100% accurate capture of patent documents as SGML-encoded text plus embedded images. We describe our experiences with setting up and running this operation over the last 4 years."

The document is available online in HTML format: http://www.dtc.rankxerox.co.uk/Hrs_pape.html; [mirror copy]. The paper was presented earlier as part of the International Association for Pattern Recognition Workshop on "Document Analysis Systems" in October 1994, in Kaiserslautern, Germany. For other information on the conversion of EPO documents into SGML format, see: Paul Brewin, "SGML and Patent Document Processing. WIPO standard ST.32."



[CR: 19971125]

Stadler, Thomas. "Publishers Wanted, Authors Needed! The New Information Age is Waiting for Your Works." Page(s) 115-118 in SGML '97 Conference Proceedings. SGML Europe '97. "The Next Decade - Pushing the Envelope." Princesa Sofia Intercontinental, Barcelona, Spain. 11-15 May, 1997. Sponsored by Graphic Communications Association (GCA) and SGML Open. Conference Chair: Pamela L. Gennusa (Director, Database Publishing Systems Ltd). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 342 pages, CDROM. Author's affiliation: STEP Stürtz Electronic Publishing GmbH, Germany; Email: ths@step.de.

Abstract: "The new paradigm of information objects has recently emerged that replaces the old one of documents. The new view on information concentrates on smaller bits of information which may be connected in different contexts and that are linked and webbed together under multiple perspectives."

"This paper focuses on the techniques and applications that are available already to produce and maintain information webs. We discuss the fact that many authors and publishers are writing books as they have been doing for the last 500 years. Partly it seems to us to be the publishers and authors turn now to redefine their methods, their products and their markets. What are the new opportunities, what abilities and skills are needed, and what are the problems in the shift to a new way of writing and publishing?"

Note: The electronic conference proceedings in hypertext were produced by Inso Corporation (DynaText) and by High Text (EnLIGHTeN). Information about the SGML Europe '97 Conference may be found in the main database entry.



Stenerson, Jon. "A LaTeX Style File Generator and Editor." TUGboat: The Communication of the TeX Users Group [Proceedings of the 1994 Annual Meeting] 15/3 (September 1994) 247-254. 7 references. Author affiliation: TCI Software Research, Las Cruces, New Mexico; email: Jon_Stenerson@tcisoft.com.

"This article presents a program that facilitates the creation of customized LaTeX style files. The user provides a style specification and the style editor writes all the macros. Editing takes place in a graphical user interface composed of windows, menus, and dialog boxes. While the editor may be used in any LaTeX environment, it is intended primarily for use with TCI Software Research's word processor Scientific Word. The current style editor runs under any Windows 3.1 system. The performance is acceptable on a 386-based machine and naturally improves on 486's and Pentiums. As Scientific Word is ported to other systems so will the style editor be ported."



[CR: 19971205]

Sterken, James. "<Q> & <A>: James Sterken." <TAG> 10/11 (November 1997) 7-8. ISSN: 1067-9197. Author's affiliation: President, ArborText.

The article provides the text of an interview with James Sterken, co-founder and current President of ArborText. ArborText was created in 1982. Sterken sketches the historical interests and activities of the company, its current endeavors, and its plans to support XML.



[CR: 19950716]

Stern, D. "SGML Documents: A Better System for Communicating Knowledge." Special Libraries 86/2 (Spring 1995) 117-124 (with 4 references). Author's affiliation: Science Library & Information Services, Yale University, New Haven, CT, USA.

"Abstract: The use of SGML (Standard Generalized Markup Language) based documents and databases can provide enhanced access and display capabilities when compared to the files and indexes now available through most local or remote databases. These options are increased tremendously due to the structured nature of the SGML files. This article outlines some of the basic features of SGML and discusses their implications when compared to the utilities of other document and database types. It also identifies the areas needing further development in order to allow these SGML knowledge information systems to improve researchers' searching, display and manipulation of electronically stored data. Particular emphasis is placed upon possible enhancements to the currently limited print display imitation of most current electronic journals."

See a related article by the same author "Expert Systems: HTML, the WWW, and the librarian," Computers in Libraries 15/4 (April 1995) 56-58.



[CR: 19951229 MD: 19980606]

Stinchfield, Don. Using Catalogs and MIME to Exchange SGML Documents. MIMESGML Working Group, INTERNET-DRAFT. Providence, RI: EBT and MIMESGML Working Group, IETF, December 1, 1995. Author's Affiliation: EBT, Inc. [Electronic Book Technologies, Inc.; One Richmond Square; Providence, RI 02906; (401) 421-9550 x280; Email: des@ebt.com].

"This draft proposes a standard for exchanging SGML documents over the World Wide Web using catalogs and MIME. This draft extends SGML Open's definition of catalogs [10] by adding to it new keywords and storage object identifier (SOI) types. The new keywords identify SGML document objects (such as document type declarations and document entities), non-SGML document objects (such as stylesheets), and management information (such as base URL, character encoding, and character repertoire). The new SOI types include URIs and MIME Content-IDs. This document also describes a new MIME content type called Application/SGML-Catalog which identifies a MIME body part as a catalog."

Available online: The latest [December 1995] working copy can be fetched in text format: ftp://ftp.ebt.com/pub/nv/mimesgml/catalog2.txt [mirror copy, December 1995], or in Postscript format: [mirror copy, December 1995]. Don Stinchfield says "...look to have a new version in mid-january [1996]."

Older version(s): ftp://ds.internic.net/internet-drafts/draft-ietf-mimesgml-exch-00.txt [or mirror copy]. Also in Postscript format: ftp://ds.internic.net/internet-drafts/draft-ietf-mimesgml-exch-00.ps [mirror copy].

See now: XML Media/MIME Types.



[CR: 19950828]

Strehlow, Richard A.; Tallant, Thomas O.; Mason, James D.; Kienlen, Philip L.; Barry, Karen T. "Use of SGML for Retrieval of Chemical Data." Pages 138-145 (with 8 references) in Proceedings of the Symposium on Computerized Chemical Data Standards: Databases, Data Interchange, and Information Systems. Symposium on Computerized Chemical Data Standards: Databases, Data Interchange, and Information Systems, Atlanta, GA, USA. ASTM, Committee E-49 on Computerization of Material and Chemical Property Data. Edited by R. Lysakowski and C. E. Gragg. ASTM [American Society for Testing and Materials] Special Technical Publication 1214. Philadelphia, PA: American Society for Testing and Materials, October 1994. ISBN: 0803118767. ISSN: 0066-0558. Authors' affiliation: [?] TERMCO, Inc, Knoxville, TN, USA.

"The encoding of information within a document using Standard Generalized Markup Language (SGML) permits a novel approach to direct retrieval of data from documents. Although SGML is designed primarily for electronic interchange of texts, its features have been found to be useful in the management of data contained within a document. Encoding can include scientific and technical information, as well as associated and ancillary data, management data, and other metadata. This paper describes and gives examples of the use of the technique with special reference to chemical data. Examples of tags used in documents are shown. Retrieval of contained information is conventionally done by means of searches to retrieve a set of documents that have a probability of containing the desired information. The method described here uses a radically different approach to the information retrieval problem."



[CR: 19971227]

Streich, Robert. "Documents Are Software. A Focus on Reuse." Pages 391-400 in SGML/XML '97 Conference Proceedings. SGML/XML '97. "SGML is Alive, Growing, Evolving!" The Washington Sheraton Hotel, Washington, D.C., USA. December 7 - 12, 1997. Sponsored by the Graphic Communications Association (GCA) and Co-sponsored by SGML Open. Conference Chairs: Tommie Usdin (Chair, Mulberry Technologies), Debbie Lapeyre (Co-Chair, Mulberry Technologies); Michael Sperberg-McQueen (Co-Chair, University of Illinois). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 691 pages, CDROM; print volume contains author and title indexes, keyword and acronym lists. Author's affiliation: [Robert Streich]: Researcher and Project Engineer, Schlumberger Austin Research, 8311 N. FM 620, Austin, TX USA; Email: streich@slb.com.

Abstract: "There are many advantages to breaking up complete documents into small, relatively discrete chunks or 'text modules': multiple authors can more easily work on the same document, the text modules could be served up individually as part of an on-line help or performance support system, and the modules can be reused in other documents. But how can we reuse modules between different documents with some assurances that they fit the new context? How will we track the dependencies between modules? In short, how will we address the increased complexity of managing a library of text modules? In the spirit of reuse, this paper explores two fields of research in the software engineering community that might be able to provide some answers to these questions: module interconnection languages and faceted classification."

This paper was delivered as part of the "Expert" track in the SGML/XML '97 Conference.

Note: The SGML/XML '97 conference proceedings volume is available from the Graphic Communications Association, 100 Daingerfield Road, Alexandria, VA 22314-2888; Tel: +1 (703) 519-8160; FAX: +1 (703) 548-2867. Complete information about the conference (e.g., program listing, tutorials, show guide, DTDs, conference reports) is provided in the dedicated conference section of the SGML/XML Web Page and via the GCA Web server. The electronic proceedings on CDROM was produced courtesy of Jouve Data Management (Jouve PubUser).



[CR: 19970331]

Stribling, Dee; Hunter, Tim; Olszewski, Len; Corrigan, Anne; Mullis, Randy; Allen, Lloyd. "A Real World Conversion to SGML." Pages 75-86 in Conference Proceedings, SIGDOC '96. The 14th Annual International Conference on Computer Documentation. ["Marshalling New Technological Forces: Building a Corporate, Academic, and User-Oriented Triangle"]. SIGDOC '96: 14th Annual International Conference. Research Triangle Park, North Carolina, US. October 20-23, 1996. Sponsored by the Association for Computing Machinery Special Interest Group on Documentation (SIGDOC). New York, NY: Association for Computing Machinery, 1996. ISBN: 0-89791-799-5. Authors' affiliation: Publications Division, SAS Institute Inc., SAS Campus Drive, Cary, NC, 27513-2414 USA; Email: sasdes@unx.sas.com.

Abstract: "In 1994, our Publications Division at the SAS Institute began converting our in-house publishing system. The conversion involved evaluating, selecting and implementing a new publishing system that would take advantage of the SGML paradigm for content markup. Components of the system include an SGML-based editor, routines for one-time conversions of legacy text to SGML, filters for dynamic conversions of SGML text and of graphics to various output formats, a document management system, and customizations that tailor third-party components to fit our environment. Along with new tools, we had to implement the new processes which we designed as we analyzed our documents and workflow for the new system. This paper explores our experiences from the time we began deciding to implement a new publishing system to now, when we have successfully implemented a significant portion of the new SGML-based system with working tools and prototyped processes."

Several other articles in this proceedings volume are germane to SGML: Tom Banfalvi, et al., "Manufacturing Documentation in the Virtual Warehouse"; Betsy Brown, et al., "From Hardcopy to Online: Changes to the Editor's Role and Processes"; Paul Beam and Peter Goldsworthy, "Technical Writing on the Web-Distributed SGML-Based Learning"; Stephanie Copp, "Working with Academe"; Cindy Roposh, et al., "Developing Single-Source Documentation for Multiple Formats"; Paul Prescod, "Multiple Media Publishing in SGML"; Lin-Ju Yeh, et al., "SSQL: a Semi-Structured Query Language for SGML Document Retrievals".



[CR: 19970518]

Sullivan, Eamonn. "Designing Web Sites for Non-Human Audiences." PC Week 14/17 (April 28, 1997) 38-.

Abstract: "Web pages can be used not only as a direct end-user interface but to link one application with another. Future Web sites will be browsed by intelligent software agents, which provide automatic information retrieval, as much or more as by human beings. Such electronic conduits are sensible when there is a lot of information to retrieve or it changes frequently because Web pages can be generated on the fly and impose few compatibility issues. The inherent limitations of HTML, which can only represent certain types of data, are problematic, and overcoming the fact that HTML focuses almost exclusively on visual information is the focus of numerous development efforts. There are already several products that bring sophisticated parsing engines to the Web and can find and automatically recognize data in fast-changing pages. The upcoming Extensible Markup Language (XML) standard lets content providers make their intentions far more explicit."



[CR: 19970828]

Sullivan, Eamonn. "Developing a Card Catalog for the Expansive Web [Intranet Builder. Intersights]." PC Week 14/36 (August 25 1997) 34. ISSN: 0740-1604. Author's affiliation: [PC Week Staff].

"The emergence of XML in a more or less solid form earlier this year has provided a more comprehensive framework for metadata, prompting several organizations to propose solutions based on XML. The main proposals have been XML-Data from Microsoft (which is available at www.microsoft.com/standards/xml/xmldata.htm) and MCF (Meta Content Format) from Netscape (available at www.w3.org/TR/NOTE-MCF-XML/). Both proposals provide for a sophisticated method to describe the structure of information, such as properties about authorship and relationships between objects. This week [August 25, 1997], a working group under the auspices of the W3C organization will meet in Redmond, Wash., to begin hammering out a specification that will take the best parts of XML-Data, MCF and PICS. The resulting RDF [Resource Description Framework] specification, if used widely, will enable more efficient searches and exchanges of information between organizations." [Extract]

See more on the Resource Description Framework in the dedicated section. The article is available online: http://www.zdnet.com/pcweek/opinion/0825/25isigh.html; [archive copy].



[CR: 19970518]

Sullivan, Eamonn. "XML Will Take the Web to the Next Level. Labs Explore Enabling Technologies of Next-generation Markup Language." PC Week 14/17 (April 28, 1997) (pages: ).

Summary: "Many companies have jumped wholeheartedly into the Web, only to find that deploying a large Web site is as complex as developing a large application--and that HTML is not up to the task. It's akin to trying to develop an operating system in BASIC. The Extensible Markup Language, or XML, is the World Wide Web Consortium's answer to the limitations of HTML. It is an extremely flexible language that will enable organizations to deploy more sophisticated documents and exchange complex data via the Web. The XML specification was released at the Sixth International World Wide Web Conference in Santa Clara, Calif., earlier this month (see the story). Several software vendors, including Microsoft Corporation and Netscape Communications Corporation, have already endorsed it."

The article is available online in HTML format from ZDNET: see http://www8.zdnet.com/pcweek/reviews/0428/28xml.html; archive copy, text only.



[CR: 19960826]

Sullow, Klaus. "[AMPHORE-a movie documentation workbench] (Article in German)." Nachrichten für Dokumentation 47/2 (March-April 1996) 67-74 (with 13 references).

"Abstract: AMPHORE is a client server system for the documentation of moving image material. The server mainly is formed by a full text database with SGML capabilities while the clients are PC workstations equipped with software for documentation and retrieval of movies and/or movie parts. In AMPHORE, the complete film material is provided in digital form and thus can be used for content-oriented documentation and retrieval in a convenient way. This enables the documentor to build very detailed indexes allowing access by sequence or even by shot. The film descriptions are based upon a syntactical, thesaurus-controlled indexing which reflects the films' diverse action strings and levels."

See: GMD - IPSI, Darmstadt, Germany (SGML & Digitales Video in der Medienarchivierung); Email: suellow@darmstadt.gmd.de. Compare: "Hypermedia Browsing and the Online-Publishing Process", Proceedings of DAGS 95, online.



Sutton, Brett (editor). Literary Texts in an Electronic Age: Scholarly Implications and Library Services. A Collection of the Papers Presented at the 1994 Clinic on Library Applications of Data Processing at the Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign. Clinic on Library Applications of Data Processing, Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign, April 10-12, 1994. University of Illinois, Urbana-Champaign: The Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign, 1994. ISBN: 0-87845-096-3. ISSN: 0069-4789.

A number of articles in this collection address the use of SGML for information structuring within the library science and wider academic community. See, for example, papers by Susan Hockey, C. Michael Sperberg-McQueen, John Price-Wilkin, Mark Day, and Rebecca Guenther. Publisher's address: Publications Office, Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign, 501 E. Daniel Street, Champaign, IL 61820; FAX: 217.244.7329; Tel: 217.333.5218.



[CR: 19971227]

Svenberg, Stefan. "Intention-Based Input Specifications for Automated Document Generation." Pages 417-426 in SGML/XML '97 Conference Proceedings. SGML/XML '97. "SGML is Alive, Growing, Evolving!" The Washington Sheraton Hotel, Washington, D.C., USA. December 7 - 12, 1997. Sponsored by the Graphic Communications Association (GCA) and Co-sponsored by SGML Open. Conference Chairs: Tommie Usdin (Chair, Mulberry Technologies), Debbie Lapeyre (Co-Chair, Mulberry Technologies); Michael Sperberg-McQueen (Co-Chair, University of Illinois). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 691 pages, CDROM; print volume contains author and title indexes, keyword and acronym lists. Author's affiliation: [Stefan Svenberg]: ABB Corporate Research, Department R, Västerås 722 22 Sweden; Email: stsv@crckl8.secrc.abb.se; Phone: +46 21 323247; FAX: +46 21 142190, 323090.

Abstract: "We explore a new structure of input specifications for document generators based on the micro-document approach. The structure is based on the intentional properties of texts. We focus on the writers' intentions and readers' need to be informed, besides the actual content of the document. The generator processes the specification, and decides on the appropriate actions needed to create a document in accordance to the plan. The intentional properties can be marked up using SGML. Some examples are provided."

"[Conclusion]: We believe that the main benefits of using the intentional approach for document structuring in generation, consist in giving an increased awareness of the underlying nature of documentation. In any authoring activity, these matters are very important. If you are careless you will not get the message across, and the documentation will not be used. We have also made a point about distinguishing generic information from product specific information. It allows for a generalization of the generation problem and better opportunities for re-use."

This paper was delivered as part of the "Expert" track in the SGML/XML '97 Conference.

Note: The SGML/XML '97 conference proceedings volume is available from the Graphic Communications Association, 100 Daingerfield Road, Alexandria, VA 22314-2888; Tel: +1 (703) 519-8160; FAX: +1 (703) 548-2867. Complete information about the conference (e.g., program listing, tutorials, show guide, DTDs, conference reports) is provided in the dedicated conference section of the SGML/XML Web Page and via the GCA Web server. The electronic proceedings on CDROM was produced courtesy of Jouve Data Management (Jouve PubUser).



[CR: 19961226]

Swank, Renée. "Case Study: Maintaining and Developing a Dynamic SGML Environment at Ericsson." Pages 619-622 in SGML '96 Conference Proceedings. Celebrating a Decade of SGML. SGML '96 Conference, Boston, MA, November 18-21, 1996. Sponsored by The Graphic Communications Association (GCA). [Edited by] Conference Co-Chairs: B. Tommie Usdin and Deborah A. Lapeyre. Alexandria, VA: GCA, 1996. Extent: 711 pages. Author's affiliation: Applications Engineer, Isogen International Corp., 2200 N. Lamar Suite 230, Dallas, Texas 75202.

"In 1991, Ericsson Inc. began implementing Standard Generalized Markup Language (SGML) in their Customer Documentation Department in Richardson, Texas. An SGML working environment for procedural documentation was created first. The second SGML working environment was developed internally for descriptive documents and was based on the first. A user's guide working environment was developed in 1994 which was different than anything done in the past. A system was also put in place for maintaining these SGML environments. Customer Documentation's SGML expertise has enabled it to be in the forefront for SGML implementation in other company groups and also to sell its services in SGML document production."

Document available online from the ISOGEN server: "Case Study: Maintaining and Developing a Dynamic SGML Environment at Ericsson", SGML '96 presentation by Renée Swank.

Note: The above presentation was part of the "SGML Case Studies" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.



[CR: 19980205]

Swank, Renee; Pratt, Don. "Delivering Documentation to Customers in SGML: How It Works in the Telecommunications Industry." Pages [not abstracted] in SGML/XML '97 Conference Proceedings. SGML/XML '97. "SGML is Alive, Growing, Evolving!" The Washington Sheraton Hotel, Washington, D.C., USA. December 7 - 12, 1997. Sponsored by the Graphic Communications Association (GCA) and Co-sponsored by SGML Open. Conference Chairs: Tommie Usdin (Chair, Mulberry Technologies), Debbie Lapeyre (Co-Chair, Mulberry Technologies); Michael Sperberg-McQueen (Co-Chair, University of Illinois). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 691 pages, CDROM; print volume contains author and title indexes, keyword and acronym lists. Authors' affiliation: [Swank]: ISOGEN International Corporation; [Pratt]: Bellcore.

Abstract: "Many companies are required to deliver documentation to customers electronically. As a significant step in solving Electronic Document Delivery (EDD) issues, the telecommunications industry has developed an interchange DTD and a packaging guideline that provide a common 'language' for expressing document content and logical structure. Documents created on any system may be translated to this 'language' by document producers, and from this 'language' to any display or production system by document recipients. Although the interchange DTD and packaging guideline were designed by telecommunications industry, they are general enough to be directly used or slightly modified to meet EDD requirements in other industries as well."

Other information on TIM (Telecommunications [or Technical] Interchange Markup) and TEDD (Telecommunications Electronic Document Delivery Package Guideline) is available in the main database entry: TCIF/IPI (Telecommunications Industry Forum Information Products Interchange).

This presentation was delivered as part of the "Introductory Tutorials" track in the SGML/XML '97 Conference. The extended description is available online: "Delivering Documentation to Customers in SGML: How It Works in the Telecommunications Industry." By Renee Swank (ISOGEN) and Don Pratt (Bellcore); [local archive copy].

Note: The SGML/XML '97 conference proceedings volume is available from the Graphic Communications Association, 100 Daingerfield Road, Alexandria, VA 22314-2888; Tel: +1 (703) 519-8160; FAX: +1 (703) 548-2867. Complete information about the conference (e.g., program listing, tutorials, show guide, DTDs, conference reports) is provided in the dedicated conference section of the SGML/XML Web Page and via the GCA Web server. The electronic proceedings on CDROM was produced courtesy of Jouve Data Management (Jouve PubUser).



Swiss Federal Institutes of Technology [Lausanne and Zürich]. International Conference on Research and Trends in Document Preparation Systems. Abstracts of the Presented Papers. Conference on Research and Trends in Document Preparation Systems, Lausanne, Switzerland, February 27-28, 1981. Supported and organized by the [Swiss] Conseil des Ecoles Polytechniques Fédérales. J. D. Nicoud, Program Chair. Lausanne/Zürich: Swiss Federal Institutes of Technology, 1981. v + 130 pages.

This 1981 conference sponsored by EPFL, together with the ACM SIGPLAN/SIGOA Symposium on Text Manipulation, was one of the early influential conferences bringing together advocates of descriptive markup principles, creating a broader forum for discussion of the fundamental insights of formal markup languages. See, in this Lausanne Conference volume, important articles by Brian K. Reid and by Charles F. Goldfarb.



[CR: 19960202]

Szillat, Horst. "SGML and LaTeX." Baskerville [The Annals of the UK TeX Users' Group] 5/2 (March 1995). ISSN: 1354-5930. Author's affiliation: Email: szillat@berlin.snafu.de.

This issue of Baskerville makes available a number of papers presented at a joint meeting of the UK TeX Users' Group and BCS Electronic Publishing Specialist Group (January 19, 1995) [mirror copy]. See the link to Baskerville, or email: baskerville@tex.ac.uk. Issue 5/2 of Baskerville has other articles on SGML: "Portable Documents: Why use SGML?" (David Barron); "Formatting SGML Documents" (Jonathan Fine); "HTML & TeX: Making them sweat" (Peter Flynn); "The Inside Story of Life at Wiley with SGML, LaTeX and Acrobat" (Geeti Granger); "SGML and LaTeX" (Horst Szillat). See the special bibliography page for other articles on SGML and (La)TeX.



Szillat, Horst. SGML - Eine praktische Einführung. Bonn, Germany: International Thomson Publishing GmbH, 1995. 226 pages. ISBN: 3-929821-75-3. Author's address: szillat@berlin.snafu.de.

Abstract [supplied by the author] [English] This German SGML book gives an introduction to SGML. The material is discussed by means of examples. In the second part of the book the author explains his ideas of what formatting an SGML document means and shows that these ideas can be realized with LaTeX. [German] Dieses SGML-Buch gibt eine Einführung in SGML. Das Material wird an Hand von Beispielen diskutiert. Im zweiten Teil des Buches erklärt der Autor seine Idee, was Formatierung eines SGML-Dokumentes bedeutet und zeigt, daß diese Ideen mit LaTeX realisiert werden können.

Further description of the book is available on the following URL: Horst Szillat: Mein SGML-Buch. Email: szillat@berlin.snafu.de. Home Page: http://www.snafu.de/~szillat/.



[CR: 19980203]

[<TAG> Staff Writer]. "(SGML | XML!) at Slash '97. GCA Holds its Annual SGML Event." <TAG>: The SGML Newsletter 11/1 (January 1998) 4-7. ISSN: 1067-9197.

This article provides a summary of vendor news and other initiatives from the SGML/XML '97 Conference, "SGML is Alive, Growing, Evolving!" (December 7 - 12, 1997, The Washington Sheraton, Washington, D.C.). Brief product updates or news summaries are given for Adobe (XML in its FrameMaker product line); Microstar (XML support in Near & Far Designer 3.0); Poet Software's SGML/XML Repository; Microsoft and Xmlu.com (XML Xposed); Progressive Information Technologies (Target 2000); Enigma (Insight 4.0); AIS/Balise (new XML support in Balise; and the Balise HTML package); International Language Engineering (OpenTag version 1.0); OmniMark (Banff Internet application server tools).



[CR: 19961226]

Takahashi, Toru; Higashino, Jun'ichi; Hoshi, Yukio. "And Yet Another Approach for SGML Translation." Pages 381-388 in SGML '96 Conference Proceedings. Celebrating a Decade of SGML. SGML '96 Conference, Boston, MA, November 18-21, 1996. Sponsored by The Graphic Communications Association (GCA). [Edited by] Conference Co-Chairs: B. Tommie Usdin and Deborah A. Lapeyre. Alexandria, VA: GCA, 1996. Extent: 711 pages. Authors' affiliation: [Takahashi]: Senior Researcher, Hitachi, Ltd., Information Systems R&D Division; Email: t-takaha@isrd.hitachi.co.jp; [Higashino:] Hitachi, Ltd., Information Systems R&D Division; [Hoshi]: Hitachi, Ltd., Software Development Center.

Abstract: "In realizing an SGML-based document processing system, it is required to transform the document structure and/or the data representation, from a source document written in SGML, to data in the format required by the application. In real-world, there is a problem that this transformation often becomes very complex. To solve this problem of complexity, we designed a programming language for SGML transformation (down translation) and implemented its processor. (This language is currently called "Æsop.")

The Æsop processor works on a parsed tree structure (ESIS structure), which is the output of an SGML parser. The processor automatically traverses the ESIS tree structure in depth-first order, selects and executes a script for each node.

To realize the complex transformation with a simple and straightforward program, we designed Æsop as a language which has following features: (1) Ability to select a script for a node, according to any complex condition satisfied by the node. (2) A rich set of built-in functions which enables to modify the document structure itself. (3) Ability to construct a 'process pipeline.' A 'process' is a set of scripts applied to the document tree structure through one traversal action. With Æsop, programmers can divide a complex transformation program to a series of simple processes. A typical Æsop program consists of one or more tree conversion processes and one data output process.

With a prototype processor of Æsop, we succeeded to transform a complex SGML document (written according to a DTD which is very similar to the ISO/IEC TR 9573-11 DTD) to LaTeX. Through this work, we had confirmed the effectiveness of Æsop for transformation from SGML documents containing complex math expressions and tables."
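
[Illustration:] The traversal model described in the abstract above (a depth-first walk that selects one script per node, with several such walks chained as a pipeline) can be sketched roughly as follows. This is a minimal Python sketch, not Æsop syntax: the names Node, run_process, run_pipeline, and demo are hypothetical, and the tiny document is invented for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Node:
    """One node of a parsed document tree (a stand-in for an ESIS node)."""
    name: str
    text: str = ""
    children: List["Node"] = field(default_factory=list)


# A script is a (condition, action) pair; the first script whose
# condition holds for a node is the one executed for that node.
Script = Tuple[Callable[[Node], bool], Callable[[Node], None]]


def run_process(node: Node, scripts: List[Script]) -> None:
    """Apply one process: a single depth-first traversal of the tree."""
    for condition, action in scripts:
        if condition(node):
            action(node)
            break
    for child in list(node.children):  # copy, since actions may edit the tree
        run_process(child, scripts)


def run_pipeline(root: Node, processes: List[List[Script]]) -> None:
    """Run the processes in order, each as its own traversal."""
    for scripts in processes:
        run_process(root, scripts)


def demo() -> List[str]:
    doc = Node("doc", children=[Node("title", "Hello"), Node("para", "World")])
    out: List[str] = []

    def upper_title(n: Node) -> None:
        n.text = n.text.upper()

    # Process 1: a tree conversion step. Process 2: the data output step.
    convert = [(lambda n: n.name == "title", upper_title)]
    emit = [
        (lambda n: n.name == "title",
         lambda n: out.append("\\section{" + n.text + "}")),
        (lambda n: n.name == "para", lambda n: out.append(n.text)),
    ]
    run_pipeline(doc, [convert, emit])
    return out
```

Splitting the work into a conversion process and an output process mirrors the "one or more tree conversion processes and one data output process" shape that the abstract calls typical.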

Note: The above presentation was part of the "SGML Expert" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.



[CR: 19950804]

Tauber, James K. "Abandon all hope, ye who enter. A TEI novice recounts his experiences marking up [Dante's] La Divina Commedia and the [UBS] Greek New Testament." Electronic Texts and the Text Encoding Initiative [Special Issue] = TEXT Technology: The Journal of Computer Text Processing 5/3 (Autumn, 1995) 225-234. ISSN: 1053-900X. Author's affiliation: Centre for Linguistics, University of Western Australia.

See the main entry for this special issue of TEXT Technology dedicated to the TEI, edited by Lou Burnard.



[CR: 19961111]

Taylor, Conrad. "What Has WYSIWYG Done to Us? [WYSIWYG Desktop Publishing Has Duped Us]." The Seybold Report on Publishing Systems 26/2 (September 30, 1996) [1], 3-12. ISSN: 0736-7260. Author's affiliation: Information Design Association, email: conrad@ideograf.demon.co.uk.

Abstract: "I argue that vendors of desktop publishing software are selling us short on quality typography; we have been duped by the mere illusion of typographic control. . . This is a paper which I wrote to support a lecture given at a conference in February 1996. It points out that WYSIWYG was only one of five approaches to computerised typesetting under development in the 1980's, but has come to dominate the world of typesetting today. But is it perhaps time to re-examine the virtues of TEX (with its superior H&J algorithms) and SGML (with its ability to carry generic mark-up into different environments)? What would this mean for divisions of labour and responsibility in typesetting? And is there any way of getting the vendors of DTP software to improve the typography and H&J algorithms of their products?" The author also concludes (among other things): "Generic markup needs a comeback." [SGML is discussed along the way]

Online version: http://www.datatext.co.uk/ideography/library/seybold/WYSIWYG.html, [mirror copy, text only version]. Or: the PDF version of the document. The French TeX user group GUT (Groupe des Utilisateurs de TeX) also intends to translate it for publication in Les Cahiers GUTenberg. See also Conrad Taylor's Ideography page for related documents.



[CR: 19971018]

Tetreault, Ronald. "Electrifying Wordsworth -- A Progress Report." Pages 164 - 167 in ACH-ALLC '97. The 1997 Joint International Conference of the Association for Computers and the Humanities and the Association for Literary and Linguistic Computing. Conference Abstracts. ACH-ALLC '97. Queen's University at Kingston, Ontario, Canada. June 3 - 7, 1997. Compiled by Greg Lessard and Michael Levison. Ontario, Canada: Queen's University, 1997. ISBN: 0-88911-760-8. Author's affiliation: Dalhousie University, Email: tetro@is.dal.ca.

[Extract:] ". . . our copy-texts will be taken from the original editions themselves as held in libraries around the world, though of course our procedures will be informed by the findings of previous scholars, especially the editors of the Cornell Wordsworth series. Fourth, our e-texts will be "marked-up" or tagged using SGML (Standard Generalized Markup Language) in conformity with the principles of the Text Encoding Initiative (TEI). Fifth, we plan to link our transcribed e-texts to scanned images of the original printed editions in order to give the reader some sense of the look of the poems upon the page. Finally, this scholarly hypertext edition will be issued on CD-ROM in the first instance, with the intention of proceeding to network distribution as soon as it becomes practical."

Full abstract available online in HTML format: "Electrifying Wordsworth -- A Progress Report", by Ronald Tetreault; [archive copy]. See also a related description, "Electrifying Wordsworth".

Additional information on the ACH-ALLC '97 Conference is available in the SGML/XML Web Page main conference entry, or [August 1997] via the Queen's University WWW server.



[CR: 19971227 MD: 19971229]

Thompson, Henry S. "Element Type Hierarchies for Transparent Document Structure Definition." Pages 341-343 in SGML/XML '97 Conference Proceedings. SGML/XML '97. "SGML is Alive, Growing, Evolving!" The Washington Sheraton Hotel, Washington, D.C., USA. December 7 - 12, 1997. Sponsored by the Graphic Communications Association (GCA) and Co-sponsored by SGML Open. Conference Chairs: Tommie Usdin (Chair, Mulberry Technologies), Debbie Lapeyre (Co-Chair, Mulberry Technologies); Michael Sperberg-McQueen (Co-Chair, University of Illinois). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 691 pages, CDROM; print volume contains author and title indexes, keyword and acronym lists. Author's affiliation: [Henry S. Thompson]: University of Edinburgh, HCRC Language Technology Group, 2 Buccleuch Place, Edinburgh EH8 9LW Scotland; Email: ht@cogsci.ed.ac.uk; WWW: http://www.ltg.ed.ac.uk/~ht/ .

Abstract: "Two recent proposals for meta-applications of XML (XML-Data and MCF) have included DTD fragments for describing document structure, sometimes called 'schemata'. In this paper I describe the XML-Data schemata proposal, concentrating on the motivation for and nature of the provision of an element-type hierarchy, in which element types can inherit attribute declarations and positions in content models from ancestors in the hierarchy. I argue that this represents a major improvement over the use of parameter entities to structure and maintain DTDs."

"Complex document types require rich and complex structural markup. SGML provides powerful mechanisms for defining the grammar of such markup, with element type and attribute declarations in the document type definition (DTD). The structure of the DTD itself, however, finds no explicit expression in SGML. The fact that element types are related in a structured fashion can only be represented implicitly, e.g., through the use of parameter entities. There is a real need, for ease of understanding and ease of maintenance, to address this issue. [...] The only coherent development policy in my view is to introduce things into the schema DTD which we know how to translate into vanilla XML. Not only does this guarantee inter-operability in the limit, but the translation serves to define the semantics of each part of the schema DTD in a concrete and unequivocal way."

This paper was delivered as part of the "Expert" track in the SGML/XML '97 Conference.
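
[Illustration:] The idea of an element-type hierarchy in which types inherit attribute declarations from their ancestors can be sketched as below. This is a minimal Python sketch of the inheritance idea only; it does not reproduce XML-Data syntax, and the class ElementType, the function effective_attributes, and the element names block, para, and quote are hypothetical.

```python
from typing import Dict, Optional


class ElementType:
    """An element type with an optional parent in the type hierarchy."""

    def __init__(self, name: str,
                 parent: Optional["ElementType"] = None,
                 attributes: Optional[Dict[str, str]] = None) -> None:
        self.name = name
        self.parent = parent
        # Attribute declarations made directly on this type: name -> declared type.
        self.attributes = dict(attributes or {})


def effective_attributes(etype: ElementType) -> Dict[str, str]:
    """Collect declarations from the root ancestor down to etype.

    Declarations closer to etype override those inherited from ancestors.
    """
    chain = []
    current: Optional[ElementType] = etype
    while current is not None:
        chain.append(current)
        current = current.parent
    merged: Dict[str, str] = {}
    for ancestor in reversed(chain):
        merged.update(ancestor.attributes)
    return merged


# Hypothetical three-level hierarchy: block > para > quote.
block = ElementType("block", attributes={"id": "ID"})
para = ElementType("para", parent=block, attributes={"align": "CDATA"})
quote = ElementType("quote", parent=para, attributes={"cite": "CDATA"})
```

With parameter entities the same sharing has to be spelled out by textual substitution in each declaration; here the relationship between the types is stated once and resolved by walking the hierarchy.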

Henry S. Thompson is co-author of a paper "proposing a number of extensions to the XML document type declaration model, called XML-Data." Apropos of which: an early draft version of this SGML/XML '97 paper is available online in HTML format: "Why I demand Schemata: Element Type Hierarchies for Transparent Document Structure Definition." Dated: Oct 15 1997. [local archive copy].

Note: The SGML/XML '97 conference proceedings volume is available from the Graphic Communications Association, 100 Daingerfield Road, Alexandria, VA 22314-2888; Tel: +1 (703) 519-8160; FAX: +1 (703) 548-2867. Complete information about the conference (e.g., program listing, tutorials, show guide, DTDs, conference reports) is provided in the dedicated conference section of the SGML/XML Web Page and via the GCA Web server. The electronic proceedings on CDROM was produced courtesy of Jouve Data Management (Jouve PubUser).



[CR: 19951220]

Thompson, Henry S.; Finch, Steve; McKelvie, David. The Normalized SGML Library (NSL). HCRC Technical Report, Ref. No. HCRC/TR-74. [LRE Project 62-050 Multext Workpackage 2 Milestone C D NSL: SGML Tools]. Edinburgh, Scotland: Human Communication Research Centre, November 14 1995. Extent: 38 pages, 2 references. Authors' affiliation: Human Communication Research Centre, University of Edinburgh, 2 Buccleuch Place, Edinburgh, Scotland. Email: eucorp@cogsci.ed.ac.uk.

Abstract: "This document describes the Normalised SGML Library (NSL), which consists of a set of C programs for manipulating SGML files and a C application program interface (API) designed to ease the writing of C programs which manipulate SGML documents."

Summary: "In pursuit of a development environment for SGML-based corpus and document processing, with support for multiple versions and multiple levels of annotation, LTG have developed an integrated set of SGML tools and a developers tool-kit, including a C-based API. This software described here contains everything required to process a very wide range of conformant SGML documents. Its initial parsing module incorporates v0.4 of James Clark's SP software, arguably the broadest coverage SGML parser available anywhere, commercial or not.

"The basic architecture is one in which an arbitrary SGML document is processed on the way in, as it were, yielding two results: 1) An optimised representation of the information contained in the document's DOCTYPE; 2) A normalised version of the document instance, which can be piped through any tools built using our API for augmentation, extraction, etc. The use of the cached DOCTYPE together with the normalisation of the SGML to nSGML means that applications processing nSGML streams can be very efficient.

"This document assumes that the reader is familiar with SGML [Goldfarb 90] and the C programming language [Kernighan 88]. The structure of this document is as follows. The next section introduces the NSL system. The third and fourth sections describe the user-callable utility programs provided in the NSL system. We then give an overview of the data structures used to represent SGML structure in the API, followed by an annotated example of the use of the NSL API in a complete program. In section 7 we give a description of the NSL query language which provides a convenient way of referring to elements of an SGML document, followed by an annotated program showing the use of the query language. The final three sections give a detailed description of nSGML and the data structures and functions defined in the NSL API." [from the document Introduction]

ftp://scott.cogsci.ed.ac.uk/pub/HCRC-papers/tr-74.ps.gz, or mirror copy, December 1995.



[CR: 19980430]

Thompson, Henry S.; McKelvie, David. "Hyperlink Semantics for Standoff Markup of Read-Only Documents." Page(s) 227-229 in SGML '97 Conference Proceedings. SGML Europe '97. "The Next Decade - Pushing the Envelope." Princesa Sofia Intercontinental, Barcelona, Spain. 11-15 May, 1997. Sponsored by Graphic Communications Association (GCA) and SGML Open. Conference Chair: Pamela L. Gennusa (Director, Database Publishing Systems Ltd). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 342 pages, CDROM. Author's affiliation: [Thompson]: Reader, Department of Artificial Intelligence and the Centre for Cognitive Science, Language Technology Group, University of Edinburgh, Scotland; Email: ht@cogsci.ed.ac.uk; WWW: http://www.cogsci.ed.ac.uk/~ht/ also, WWW: http://www.ltg.ed.ac.uk/software/; [David McKelvie]: Research Fellow, Language Technology Group, Human Communication Research Centre, University of Edinburgh, Scotland; Email: David.McKelvie@cogsci.ed.ac.uk; WWW: http://www.cogsci.ed.ac.uk/~dmck/.

Abstract: "There are at least three reasons why separating markup from the material marked up ('standoff annotation') may be an attractive proposition: 1) The base material may be read-only and/or very large, so copying it to introduce markup may be unacceptable; 2) The markup may involve multiple overlapping hierarchies; 3) Distribution of the base document may be controlled, but the markup is intended to be freely available.

"In this paper, two kinds of semantics for hyperlinks are addressed to facilitate this type of annotation, and describe the LT NSL toolset that supports these semantics. The two kinds of hyperlink semantics that are described are (a) inclusion, where one includes a sequence of SGML elements from the base file; and (b) replacement, where one provides a replacement for material in the base file, incorporating everything else. The speakers address the issue of different kinds of (HyTime and TEI) addressing schemes by means of SGML identifiers, URLs, and character offsets into non-SGML data. We also address the issues of indexing large files to improve the speed of accessing SGML elements in the base files."

A version of this document is available online in HTML format: http://www.ltg.ed.ac.uk/~ht/sgmleu97.html; [local archive copy]. Alternately, abstract in GCA-paper markup: http://www.ltg.hcrc.ed.ac.uk/~dmck/sgml-europe-97.html; [local archive copy]. On the use of a hierarchical database to model (non-) hierarchical structures, see "SGML/XML and (Non-) Hierarchy."
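
[Illustration:] The standoff approach in the abstract above, where markup lives apart from a read-only base and points into it, can be sketched with character offsets. This is a minimal Python sketch under stated assumptions: it is not the LT NSL API, the names Link and include are hypothetical, and the base text and offsets are invented for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Link:
    """A standoff annotation: a tag plus a character range in the base text."""
    tag: str
    start: int  # inclusive offset into the base text
    end: int    # exclusive offset into the base text


def include(base: str, link: Link) -> str:
    """Resolve a link with inclusion semantics: pull the addressed span
    out of the base text and wrap it in the link's tag."""
    return "<{0}>{1}</{0}>".format(link.tag, base[link.start:link.end])


# The base text is never modified; all markup is held in the links.
BASE = "The cat sat on the mat."
LINKS: List[Link] = [
    Link("np", 0, 7),    # "The cat"
    Link("vp", 8, 22),   # "sat on the mat"
    Link("pp", 12, 22),  # "on the mat", overlapping the vp span
]
```

Because each link only refers to the base by position, several overlapping annotation layers can coexist without copying or editing the base, which is the first two of the motivations the abstract lists.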

Note: The electronic conference proceedings in hypertext were produced by Inso Corporation (DynaText) and by High Text (EnLIGHTeN). Information about the SGML Europe '97 Conference may be found in the main database entry.



[CR: 19961226]

Thompson, Marcy. "An Element is not a Tag - (and why you should care)." Pages 65-70 in SGML '96 Conference Proceedings. Celebrating a Decade of SGML. SGML '96 Conference, Boston, MA, November 18-21, 1996. Sponsored by The Graphic Communications Association (GCA). [Edited by] Conference Co-Chairs: B. Tommie Usdin and Deborah A. Lapeyre. Alexandria, VA: GCA, 1996. Extent: 711 pages. Author's affiliation: Passage Systems, Email: marcy@squirrel.com; WWW: http://www.squirrel.com.

Abstract: "Too many people say 'tag' when they mean 'element'. While this might seem to be just semantic quibbling, the difference is actually important. The power of SGML-based processing lies precisely in the fact that an element is more than a tag. By examining three systems that exploit the power of SGML to allow sophisticated actions on content, this talk shows that understanding an element as more than just the tags that delimit it is a critical part of exploiting the full power of SGML."

Note: The above presentation was part of the "SGML Newcomer" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.



[CR: 19971227]

Thompson, Marcy. "How to Make an Industry Standard DTD Work for You (without losing your mind, your marriage or your job)." Pages 71-76 in SGML/XML '97 Conference Proceedings. SGML/XML '97. "SGML is Alive, Growing, Evolving!" The Washington Sheraton Hotel, Washington, D.C., USA. December 7 - 12, 1997. Sponsored by the Graphic Communications Association (GCA) and Co-sponsored by SGML Open. Conference Chairs: Tommie Usdin (Chair, Mulberry Technologies), Debbie Lapeyre (Co-Chair, Mulberry Technologies); Michael Sperberg-McQueen (Co-Chair, University of Illinois). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 691 pages, CDROM; print volume contains author and title indexes, keyword and acronym lists. Author's affiliation: [Marcy Thompson]: CRI Inc., 3245 146th Place SE Suite 270, Bellevue, WA 98007; Phone: +1 425 643-7443 x3027; Email: marcy@squirrel.com.

Abstract: "Implementing SGML is a big task, and one of the obstacles to be overcome is the development of an appropriate DTD or suite of DTDs. In many industries, there are high-profile 'industry standard' DTDs (developed by an industry consortium or a formalized standards activity) which hold out the promise of DTD nirvana: all gain with no pain. To what extent can an industry standard DTD help you achieve your implementation goals? What pitfalls must you avoid in order to prevent this nirvana from becoming just another failed SGML implementation?"

This paper was delivered as part of the "Newcomer" track in the SGML/XML '97 Conference.

Note: The SGML/XML '97 conference proceedings volume is available from the Graphic Communications Association, 100 Daingerfield Road, Alexandria, VA 22314-2888; Tel: +1 (703) 519-8160; FAX: +1 (703) 548-2867. Complete information about the conference (e.g., program listing, tutorials, show guide, DTDs, conference reports) is provided in the dedicated conference section of the SGML/XML Web Page and via the GCA Web server. The electronic proceedings on CDROM was produced courtesy of Jouve Data Management (Jouve PubUser).



[CR: 19971227]

Tidwell, Doug. "TaskGuides(tm): An XML-Based System for Creating Wizard-Style Helps." Pages 663-668 in SGML/XML '97 Conference Proceedings. SGML/XML '97. "SGML is Alive, Growing, Evolving!" The Washington Sheraton Hotel, Washington, D.C., USA. December 7 - 12, 1997. Sponsored by the Graphic Communications Association (GCA) and Co-sponsored by SGML Open. Conference Chairs: Tommie Usdin (Chair, Mulberry Technologies), Debbie Lapeyre (Co-Chair, Mulberry Technologies); Michael Sperberg-McQueen (Co-Chair, University of Illinois). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 691 pages, CDROM; print volume contains author and title indexes, keyword and acronym lists. Author's affiliation: [Doug Tidwell]: Advisory Programmer, IBM Corporation E20D/500, P.O. Box 12195, Research Triangle Park, NC USA; Phone: +1 (919) 254-5128; FAX: +1 (919) 543-4118; Email: dtidwell@us.ibm.com.

Abstract: "IBM's TaskGuide technology gives Technical Writers and Human Factors professionals the ability to create wizards. Based on the premise that task analysis is the most difficult part of creating an effective wizard, our tools let you focus on design, not writing code.

"This paper discusses the basics of wizard technology, followed by a brief introduction to the XML-based system we have created. We cover some of the key design decisions we had to make, and introduce some of the unique features of our product. Finally, we demonstrate a recursive document, a wizard that creates another wizard."

This paper was delivered as part of the "Case Studies" track in the SGML/XML '97 Conference.

Note: The SGML/XML '97 conference proceedings volume is available from the Graphic Communications Association, 100 Daingerfield Road, Alexandria, VA 22314-2888; Tel: +1 (703) 519-8160; FAX: +1 (703) 548-2867. Complete information about the conference (e.g., program listing, tutorials, show guide, DTDs, conference reports) is provided in the dedicated conference section of the SGML/XML Web Page and via the GCA Web server. The electronic proceedings on CDROM was produced courtesy of Jouve Data Management (Jouve PubUser).



[CR: 19980924]

Tittel, Ed; Mikula, Norbert; Chandak, Ramesh. XML for Dummies. Foreword by Dan Connolly. [Series: For Dummies]. Foster City, CA: IDG Books Worldwide, Inc., 1998. Extent: xxviii + 367 pages, CDROM. ISBN: 0-7645-0360-X. Authors' affiliation: [Tittel]: Tivoli Systems, etittel@lanw.com; [Mikula]: Senior Software Engineer, Datachannel, Inc, norbert@datachannel.com; [Chandak]: rksoftware@worldnet.att.net.

Summary: "XML For Dummies takes you through a basic overview of XML -- its capabilities, syntax, and technologies -- before moving into useable information and step-by-step methods for designing, building, and using XML's extensible features. XML's special 'dialects' support advanced tools for using push technology, building dynamic interfaces, and managing or transmitting data across the Web. And freeware and trial software versions of XML software packages, tips for finding online XML resources, a cross-linked glossary, code examples from the book, and other cool features are included on the bonus CD-ROM that comes with this indispensable guidebook." [from the publisher]

A review of the book was published by Dianne Kennedy. ". . . Overall, I found XML for Dummies to be a good addition to my reference library. It clearly will have more value to those who are using HTML rather than SGML as their starting point. SGML folk will likely find many of the SGML-oriented discussions too simplistic. In addition, they may find Chapter 6, which is based on using XML schemas in place of DTDs, rather confusing. But the good discussion of how to read the XML specification and the excellent XML application DTDs makes this a book worth buying, no matter what your background is."

An overview of the book is presented on the LANWrights, Inc. Web site. See also the dedicated web site for the book, with detailed chapter summaries, URL collections, examples, and other resources.



[CR: 19971125]

Toche, Olivier; Melese, Bertrand. "Access to Cultural Heritage through an On-line Multimedia Data Service: Application to the Archive Folders of France's General Inventory of Monuments and Art Treasures." Page(s) 277-282 in SGML '97 Conference Proceedings. SGML Europe '97. "The Next Decade - Pushing the Envelope." Princesa Sofia Intercontinental, Barcelona, Spain. 11-15 May, 1997. Sponsored by Graphic Communications Association (GCA) and SGML Open. Conference Chair: Pamela L. Gennusa (Director, Database Publishing Systems Ltd). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 342 pages, CDROM. Author's affiliation: [Olivier Toche]: French Ministry of Culture, Heritage Management; WWW: http://aquarelle.inria.fr; [Bertrand Melese]: President and Founder, GRIF SA, France.

Abstract: "This document presents the European Aquarelle project and the missions and the documentation system of the General Inventory. It then examines one of the first applications of this research project with SGML tagging of a digital version of Inventory archive folders dealing with France's monuments and art treasures."

"The technical and documentary specifications and standards selected are TCP/IP for internal and external networks, HTML for pages of text, SGML (Standard Generalized Markup Language, ISO 8879) for digitised content folders and the Z39.50 request protocol for access to data bases, standards ISO 2788 and 5964 for drawing up monolingual and multilingual thesauri, and the CIMI (Consortium for the Computer Interchange of Museum Information) DTD and the Inventory DTD for applications respectively relating to museums/art galleries and monuments."

Note: The electronic conference proceedings in hypertext were produced by Inso Corporation (DynaText) and by High Text (EnLIGHTeN). Information about the SGML Europe '97 Conference may be found in the main database entry.



[CR: 19971018]

Tompa, Frank. "Capitalizing on Text Structures. [Keynote Address]." Pages 170 - 171 in ACH-ALLC '97. The 1997 Joint International Conference of the Association for Computers and the Humanities and the Association for Literary and Linguistic Computing. Conference Abstracts. ACH-ALLC '97. Queen's University at Kingston, Ontario, Canada. June 3 - 7, 1997. Compiled by Greg Lessard and Michael Levison. Ontario, Canada: Queen's University, 1997. ISBN: 0-88911-760-8. Author's affiliation: Department of Computer Science, University of Waterloo, Waterloo, Ontario; Email: fwtompa@uwaterloo.ca; WWW: Frank Wm. Tompa's Home Page.

[Extract:] "Scholarship increasingly depends on electronic document repositories and the growth of digital libraries. As in physical libraries, the documents to be housed in scholarly collections include historical documents, literary works, reference texts, and government publications. Even more apparent in computer-readable form are collections of business documents (from annual reports and customer literature to procedures manuals and internal communications) and linguistic corpora (collections of spoken and written communication assembled to reflect the uses of language). Gray literature, including technical reports, personal communications, and online help information, also constitutes a growing text resource. SGML provides a method to describe the structure of a complex document in which components, layout, or other chosen features of the text are indicated through markup. The TEI Guidelines use SGML to define a set of comprehensive conventions for representing documents, and thus they establish a basis for scholarly communications. HTML defines another set of tags to delineate text structures. Beyond text representation, however, communications support also requires mechanisms for querying and manipulating structured documents."

Abstract available online in HTML format: "Capitalizing on Text Structures. [Keynote address]", by Frank Tompa; [archive copy]

Additional information on the ACH-ALLC '97 Conference is available in the SGML/XML Web Page main conference entry, or [August 1997] via the Queen's University WWW server.



[CR: 19951113]

Tompa, Frank Wm. Experiences with the OED. University of Waterloo Centre for the New OED and Text Research, Technical Report. Waterloo, Ontario: University of Waterloo Centre for the New OED and Text Research, 1991. Extent: 9 pages, 11 references. Author's affiliation: University of Waterloo, Waterloo, Ontario, Canada N2L 3G1; Email: fwtompa@uwaterloo.ca; Tel: (519) 888-4675; FAX: (519) 885-1208.

"Abstract: According to the Oxford English Dictionary, a dictionary can be either 'a book dealing with the individual words of a language...' or 'a repository of knowledge, convenient for consultation.' An effective dictionary database must serve both roles simultaneously; that is, it must be capable of answering precise questions about the written dictionary text as well as the language described by that text.

An effective representation for the OED has been based on the recent text structuring technique known as 'descriptive markup,' which introduces tags into a text stream. Thus, dictionary components are explicitly identified and delimited, so that, for example, an entry is marked by <E>...</E>, an etymology by <ET>...</ET> , a usage label by <LB>...</LB> , and a cited work by <W>...</W>.

The most visibly successful aspect of our research is embodied in the flexible and efficient search and display software. LECTOR (TM) is a general purpose browser that takes as input a stream of tagged text and formats it to the screen using typography to illustrate its structure. It uses a specially-designed formatting, or display-specification, language to accomplish this, through which the choice of typographical strategies is user-selectable. As a complementary software component, efficient retrieval is provided by the PAT (TM) text search engine. Each entry in the search index designates a 'semi-infinite' string that starts at a critical point in the text (e.g., at a word start) and continues uninterruptedly to the end of the text. Text regions (e.g., those representing individual dictionary components) can be specified to limit the scope of material being searched or displayed. Used together, PAT and LECTOR form a powerful query facility for text databases.

Examples drawn from our experiences with researchers and casual visitors illustrate the application of these tools to exploring the OED."
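
[Illustration:] The indexing idea described for PAT, where each index entry designates a "semi-infinite" string starting at a word start and running to the end of the text, can be sketched with a sorted list of start positions. This is a minimal Python sketch and not PAT's actual data structure or interface: the functions build_index and search, the optional region argument, and the sample text are all assumptions made for the example.

```python
import bisect
import re
from typing import List, Optional, Tuple


def build_index(text: str) -> List[int]:
    """Return word-start offsets, ordered by the string that begins at
    each offset and continues to the end of the text."""
    starts = [m.start() for m in re.finditer(r"\b\w", text)]
    starts.sort(key=lambda i: text[i:])
    return starts


def search(text: str, index: List[int], prefix: str,
           region: Optional[Tuple[int, int]] = None) -> List[int]:
    """Find offsets whose semi-infinite string begins with prefix.

    If region is given as (start, end), only offsets with
    start <= offset < end are reported.
    """
    # Truncating sorted strings to a fixed length keeps them in order,
    # so binary search on the truncated keys is valid.
    keys = [text[i:i + len(prefix)] for i in index]
    lo = bisect.bisect_left(keys, prefix)
    hi = bisect.bisect_right(keys, prefix)
    hits = sorted(index[lo:hi])
    if region is not None:
        hits = [i for i in hits if region[0] <= i < region[1]]
    return hits


TEXT = "the cat sat on the catalogue"
INDEX = build_index(TEXT)
```

For instance, search(TEXT, INDEX, "cat") finds both "cat" and "catalogue", and passing region=(0, 10) limits the result to the first occurrence, loosely analogous to restricting a query to one dictionary component.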

The document is available in Postscript format on the Internet: http://daisy.uwaterloo.ca/~fwtompa/.papers/hist.dict.ps [mirrored copy, November 1995]. The document was also (?) published under the title "An Overview of Waterloo's Database Software for the OED", as pages 123-143 in Proceedings of the Symposium on Historical Dictionary Databases and Data Retrieval Requirements, Toronto, October, 1991 [= CCH [Toronto Centre for Computing in the Humanities] Working Papers 2, 1992].



[CR: 19951113]

Tompa, Frank Wm. "Not Just Another Database Project: Developments at UW [University of Waterloo Centre for the NOED]." Pages 82-89 in Reflections on the Future of Text. Proceedings of the Tenth Annual Conference of University of Waterloo Centre for the New OED and Text Research. University of Waterloo NOED Conference, Waterloo, Ontario. October 20-21, 1994. Waterloo, Ontario: University of Waterloo Centre for the NOED and Text Research, 1994. Author's affiliation: University of Waterloo Centre for the New OED and Text Research, Ontario.

Available in Postscript format on the Internet: http://daisy.uwaterloo.ca/~fwtompa/.papers/oed94.ps, mirrored copy.



[CR: 19951110]

Tompa, Frank W. "What is (Tagged) Text?" [Volume] 2:81-93 in Dictionaries in the Electronic Age: Proceedings of the Fifth Annual Conference of the UW Centre for the New Oxford English Dictionary (St. Catherine's College, Oxford, 18-19 September 1989.) Waterloo, Ontario: UW Centre for the New OED, 1989.

"Abstract: In working on the New OED project, we, like many other researchers, have wrestled with large, intricate bodies of text. Based on this exposure, we have begun to investigate the similarities and differences between managing conventional business data and managing reference text data.

The paper begins with the claim that text can support complex models of the real world that cannot be captured more formally. Thus important information resources must be held as text, but the very absence of a formal model makes it difficult to identify the structures present in a text.

A common text structuring technique is descriptive markup, which introduces tags into a text stream. We present three views of tagged text: one based on tags as text, one on arbitrarily interleaved tags with text, and one on constrained tag placement in the text. Throughout the discussion, examples are drawn from our experience with the OED."

Available on the Internet in Postscript format: [mirror copy]. For further details on the work of the Waterloo Centre for the New OED and Text Research, including SGML research, see the extended overview for a publication by Gaston Gonnet.



[CR: 19951113]

Tompa, Frank Wm.; Raymond, Darrell R. "Database Design for a Dynamic Dictionary." Pages 257-272 (with 12 references) in Research in Humanities Computing I: Selected Papers from the 1989 ALLC/ACH Conference, Toronto. Association for Literary and Linguistic Computing, 16th International Conference; International Conference on Computers in the Humanities, 9th. Toronto, Ontario. June, 1989. Sponsored by ACH/ALLC. Guest edited by Ian Lancashire; Series editors: Susan Hockey and Nancy Ide. Oxford: Clarendon [Oxford University] Press, 1991. ISBN: . Author's affiliation: University of Waterloo Centre for the New OED and Text Research, Ontario.

The article supplies an overview of the NOED project in broad scope, including some discussion of SGML's limitations with respect to the data modeling goals of the UWaterloo researchers.

Available via the Internet in Postscript format: http://daisy.uwaterloo.ca/~fwtompa/.papers/dynamic.ps [mirrored copy, November 1995].



[CR: 19960312]

Travis, Deni C. "Marmalade [Tribute to Yuri Rubinsky]." <TAG> 9/2 (February 1996) 3. ISSN: 1067-9197.

This tribute is printed in a special issue of <TAG> dedicated to the memory of Yuri Rubinsky. See also the main eulogy collection.



[CR: 19951220]

Travis, Deni. "Rocky Mountain SGML UG." <TAG> 8/12 (December 1995) 12. ISSN: 1067-9197.

Travis reports on the Rocky Mountain SGML Users' Group meeting, November 1995. Richard Pasewark (Adobe) gave a presentation on FrameMaker+SGML, and Eric Severson (Interleaf) presented a paper "How SGML and HTML Really Fit Together." Contact for the UG: Beth Hayes, bethh@lexisys.com.



[CR: 19960310]

Travis, Brian E. Activate OmniMark. Net-Virtual Location in Cyberspace [probably Denver, Colorado or Rochester, New York]: The SGML University Press, [forthcoming, second-quarter, 1996]. ISBN: 0-9649602-1-4. Author's affiliation: The SGML University.

Abstract: "This book, due second-quarter, 1996, is for the real-world SGML programmer who is using Exoterica's OmniMark translation utility. The author has been using OmniMark since it first came out. This is a collection of tips and techniques for using OmniMark in actual implementations to convert data into SGML, and to translate SGML documents into something else for delivery. A "quick-start" chapter is included to get you up-to-speed right away. Activate OmniMark includes up-to-date information on the newest version of OmniMark."

See further information via the SGML University Press WWW Page.



[CR: 19960325]

Travis, Brian E. "Cal Poly Offers University-level 'Electronic Publishing' Focus." <TAG>: The SGML Newsletter 9/3 (March 1996) 1, 5. ISSN: 1067-9197. Author's affiliation: The SGML University.

Note on a decision by Cal Poly to offer courses with a concentration in "Electronic Publishing and Imaging." The first course will be taught in July-August, 1996.



[CR: 19961029]

Travis, Brian E. "My Summer at Cal Poly [Editorial]." <TAG> 9/9 (September 1996) 1, 10. ISSN: 1067-9197. Author's affiliation: President, Information Architects Inc., and Managing Editor of <TAG>.

The author summarizes a summer teaching experience at Cal Poly, San Luis Obispo, California. The school has always had a strong program in printing (industry) arts, and is now developing a concentration in electronic publishing. Travis relates his story about the introduction of SGML into the training classes.



[CR: 19960828]

Travis, Brian E. "Classification of SGML Industry Professionals [Editorial]." <TAG> 9/8 (August 1996) 1, 4. ISSN: 1067-9197. Author's affiliation: President, Information Architects Inc.

The author notes that certification of professionals offering SGML services has made little progress, but some steps are being taken to form classification labels.



[CR: 19980508]

Travis, Brian E. "It's Conference Season." <TAG>: The SGML Newsletter 11/4 (April 1998) 4. ISSN: 1067-9197. Author's affiliation: President, Information Architects; Managing Editor, <TAG>.

The author provides an overview of several recent conferences on XML, SGML, and document management. Spring 1998: XML: The Conference 1998; Documation '98 West; and Seybold Seminars New York / Publishing '98, including the first XML Xposed conference. For the XML Xposed conference, some transcripts are available online; see the database entry.



[CR: 19960716]

Travis, Brian E. "Documation '96: SGML Hangs In There." <TAG>: The SGML Newsletter 9/4 (April 1996) 8-11. ISSN: 1067-9197. Author's affiliation: President, Information Architects, Inc.

The author reports on the Documation '96 conference, with highlights on SGML developments. Featured in the summary are: XSoft (Astoria, object-oriented SGML database); Exoterica (OmniMark Version 3); Folio-InContext (The SGML Journal Publisher); and Texcel (Texcel Information Manager).



[CR: 19951208]

Travis, Brian E. "Don't Deliver SGML [Editorial]." <TAG>: The SGML Newsletter 8/11 (November 1995) 1, 8. ISSN: 1067-9197. Author's affiliation: President, Information Architects, Inc.

[The author concludes:] "The moral of this story is that you don't need to assume that, since your data is in SGML, you need to use an 'SGML-smart' delivery platform. Doing so will limit your choices, and could have an adverse impact on your data, your propriety, and even on the operation of your company."



[CR: 19980612]

Travis, Brian E. "Don't Use .xml File Extensions." <TAG>: The SGML Newsletter 11/5 (May 1998) 1, 12. ISSN: 1067-9197. Author's affiliation: President, Information Architects, and Managing Editor of <TAG>.

Having observed the increased frequency of Net files with the filename extension .xml, the author reminds <TAG> readers that XML is a metalanguage and not a language, and says: "using '.xml' as an extension doesn't tell us what kind of a file it is . . ."

See the database section XML Media/MIME Types for 'File extension(s): .xml' and related discussion.



[CR: 19971205]

Travis, Brian E. "Flux [Editorial]." <TAG> 10/11 (November 1997) 1, 6. ISSN: 1067-9197. Author's affiliation: President, Information Architects.

Reflections on XML, its rapid rise in popularity, and its relative instability - giving it a lot of promise in the midst of flux, but making it "dangerous" for a developer to tie a project to the emerging specification in terms of details.



[CR: 19950716]

Travis, Brian E. "HTML is Not SGML [Editorial]." <TAG> 8/6 (June 1995) 1, 6. ISSN: 1067-9197.

The article expresses skepticism about the effort to make HTML a substitute for SGML: "HTML is just another output format. The IETF needs to treat it as such and stop pretending it is SGML."



[CR: 19980719]

Travis, Brian E. "Latin Ergo SGML? [Editorial]" <TAG>: The SGML Newsletter 11/7 (July 1998) 1, 3. ISSN: 1067-9197. Author's affiliation: President, Information Architects.

Travis compares SGML to Latin, insofar as Latin, a "dead language," is also healthy as a legacy language. "To someone who asks me if they should use SGML or XML for their document management system, I find it difficult to recommend SGML, except in some very distinct cases: 1) They need to interface with someone else's SGML; 2) They need a particular tool that is not now XML-enabled; 3) Their company is already using SGML in another implementation. [. . . SGML ] is still a great technology for describing the structure of your information, and there are many companies that use SGML to do so. Just don't teach it to your kids."



[CR: 19980413]

Travis, Brian E. "Leaders and Followers [Editorial]." <TAG>: The SGML Newsletter 11/3 (March 1998) 1, 3. ISSN: 1067-9197. Author's affiliation: President, Information Architects.

The author comments on the "self-appointed keepers of the purity of SGML" in the context of broader discussion of XML's need for balance (academic and/versus commercial influence) and openness (not having the standards process "co-opted").



[CR: 19981007]

Travis, Brian. "The New <TAG>." <TAG>: The SGML Newsletter 11/9 (September 1998) 1, 3. ISSN: 1067-9197. Author's affiliation: Architag.

As publisher and editor of <TAG>, Travis outlines a new plan for the newsletter publication, beginning in Fall 1998. Issue 11/8 will not be published. New services will be offered to subscribers, and the newsletter will be available electronically. See the new URL, http://www.tagnewsletter.com.



[CR: 19970826]

Travis, Brian E. OmniMark At Work Volume 1: Getting Started. Englewood, CO: SGML University Press, 1997. Extent: xiii + 503 pages, CD-ROM disc. ISBN: 0-9649602-1-4. Author's affiliation: Information Architects; The SGML University.

Summary: "The book is targeted at programmers who are new to the language, and to experienced programmers who are new to version 3. OmniMark At Work starts with a chapter called "OmniMark for the Impatient". This chapter is designed for the person who needs to understand the concepts of OmniMark, and who wants to get a feeling for the language without spending time drudging through reference manuals to get started. This chapter has several programs that are highly documented, to show you what is happening at every step. There are plenty of tips throughout the book, and lots of code that you can integrate into your OmniMark programs. There are routines for converting RTF to SGML, translating SGML to HTML, examples of internal and external functions, transforming one SGML structure to another, creating "well-formed documents" for XML, and many more. . ." [from the author]

See the online description of the book from the SGML University Press; [archive copy].



[CR: 19970207]

Travis, Brian E. "Predictions for 1997 [Editorial]." <TAG>: The SGML Newsletter 10/1 (January 1997) 1, 5-6. ISSN: 1067-9197. Author's affiliation: President, Information Architects, Inc.

The author reviews predictions relating to SGML for 1996, and offers new predictions for 1997. For 1997: XML [slow acceptance], the Web, DSSSL ['high-profile applications'], HyTime [lethargic acceptance], and 'Niche Books'.



[CR: 19980203]

Travis, Brian E. "Predictions for 1998 [Editorial]." <TAG>: The SGML Newsletter 11/1 (January 1998) 1, 8-9. ISSN: 1067-9197. Author's affiliation: President, Information Architects Inc.; Managing Editor, <TAG>.

Brian Travis reviews his predictions for the calendar year 1997 (published in <TAG>), and offers further predictions for 1998. His top picks: XML big-time, XSL advances, SGML remains stable, name changes away from S-G-M-L, and more books on XML.



[CR: 19970824]

Travis, Brian E. "The Role of the Application in Book Production." <TAG>: The SGML Newsletter 10/8 (August 1997) 1-8. ISSN: 1067-9197. Author's affiliation: President, Information Architects; Managing Editor, <TAG>.

The article is an extract from the author's book OmniMark at Work, Volume 1: Getting Started [SGML University Press, 1997]. The article discusses the techniques "used to create, edit, and print the book [OmniMark at Work], along with some code samples from the book processing."



[CR: 19961113]

Travis, Brian E. "SGML Asia/Pacific '96." <TAG> 9/10 (October 1996) 10-12. ISSN: 1067-9197. Author's affiliation: President, Information Architects Inc.

A summary of the closing keynote address given by Brian Travis at the SGML Asia/Pacific '96 Conference, which attracted more than 160 participants. Observations on the important trends: SGML-based databases (Chrystal Software - Astoria, Texcel, XyVision, OmniMark); document management; virtual documents; mainstream SGML and W3C (XML) SGML; SGML and the Web.

See the conference entry for other information.



[CR: 19971229]

Travis, Brian E. "SGML (Alone) is Not the Answer." <TAG> 10/11 (November 1997) 1-6. ISSN: 1067-9197. Author's affiliation: President, Information Architects.

An earlier draft title "SGML is Not the Solution" gave way to the present title, developed along the following lines: "SGML works best when it is applied properly to certain solutions, along with other tools and technologies." Travis discusses the use of SGML in conjunction with databases. He provides references to case studies from the aircraft industry, legal publishing, and newspaper publishing.

Also presented at the SGML/XML '97 Conference.



[CR: 19971227 MD: 19971229]

Travis, Brian. "SGML (Alone) is Not the Solution." Pages 519-526 in SGML/XML '97 Conference Proceedings. SGML/XML '97. "SGML is Alive, Growing, Evolving!" The Washington Sheraton Hotel, Washington, D.C., USA. December 7 - 12, 1997. Sponsored by the Graphic Communications Association (GCA) and Co-sponsored by SGML Open. Conference Chairs: Tommie Usdin (Chair, Mulberry Technologies), Debbie Lapeyre (Co-Chair, Mulberry Technologies); Michael Sperberg-McQueen (Co-Chair, University of Illinois). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 691 pages, CDROM; print volume contains author and title indexes, keyword and acronym lists. Author's affiliation: [Brian Travis]: President, Information Architects, Inc., 6989 S. Jordan Road, Suite 5, Englewood, CO 80112; Phone: +1 303-766-1336; FAX: +1 303-699-8331; Email: btravis@sgml.com; WWW: http://www.sgml.com.

Abstract: "SGML is a great technology. It has attracted the attention of some pretty influential companies, which have found that they can save money, get to market faster, and increase the accuracy of their documentation by using SGML.

"However, SGML by itself is not the answer. SGML can only work if it is part of an intelligent document management environment that utilizes other appropriate technologies.

"This talk is about the mixing of SGML and other technologies, like relational and object-oriented databases, internet and intranet servers, email, voice mail, and external protocol servers, and other new and old technologies. It ends with a methodology, called 'microdocument architectures', that can pull all of these technologies together to create an intelligent document management environment.

"You will leave this session with a better understanding of where SGML can fit, and where it might not necessarily be the best solution. You will have the ammunition to convince your company that SGML should be part of an intelligent document management system and how you might go about integrating SGML with other technologies."

This paper was delivered as part of the "Business Management" track in the SGML/XML '97 Conference.

A version of this presentation is available in the November 1997 issue of <TAG>; see the bibliographic entry.

Note: The SGML/XML '97 conference proceedings volume is available from the Graphic Communications Association, 100 Daingerfield Road, Alexandria, VA 22314-2888; Tel: +1 (703) 519-8160; FAX: +1 (703) 548-2867. Complete information about the conference (e.g., program listing, tutorials, show guide, DTDs, conference reports) is provided in the dedicated conference section of the SGML/XML Web Page and via the GCA Web server. The electronic proceedings on CDROM was produced courtesy of Jouve Data Management (Jouve PubUser).



[CR: 19971227]

[Travis, Brian]. "SGML and the Desktop. SGML Tools on Low-end Publishing Systems Explored at Seybold Conference." <TAG> 6/5 (May 1993) 1, 4. ISSN: 1067-9197. Author's affiliation: President, Information Architects Inc.

The author supplies a report on the Seybold Publishing Conference held in April 1993, and particularly, on a special panel session chaired by Yuri Rubinsky (SoftQuad). The panel speakers addressed the role of SGML in desktop publishing systems.



[CR: 19961113]

Travis, Brian E. "SGML for the Masses? [Editorial]." <TAG> 9/10 (October 1996) 1, 13-14. ISSN: 1067-9197. Author's affiliation: President, Information Architects Inc.

The author addresses the question of whether SGML is "too hard" to implement, and the vendor-initiated concept of "Mainstream SGML." These vendors "do not want SGML reduced to a weekend project . . .[but] want to convince mainstream users that SGML is not really that difficult to adopt in an organization." The author also discusses the current XML effort sponsored by the W3C, and expresses some doubts about the viability of the endeavor: "Mainstream SGML and XML both address technical issues that have already been solved, and do nothing to enlighten the publishing community as to the real advantages of the SGML philosophy."

For "Mainstream SGML," see the Microstar WWW server, or Microstar White Paper; [mirror copy], or a short description of the effort "to make it simple for authors to create and maintain SGML-savvy documents". For the XML activity, see the main XML entry in this database.



[CR: 19960716]

Travis, Brian E. "SGML and Metadata [Editorial]." <TAG>: The SGML Newsletter 9/6 (June 1996) 1, 6. ISSN: 1067-9197. Author's affiliation: President of Information Architects, Inc.

The author muses on the important notion of "meta-data," by which he means "the ability to respect or ignore certain [SGML] elements based upon their descriptive markup."



[CR: 19970620]

Travis, Brian E. "SGML in the Pacific Rim." <TAG>: The SGML Newsletter 10/5 (May 1997) 14. ISSN: 1067-9197. Author's affiliation: President of Information Architects, Inc.

Notes on a series of conferences in Sydney and Tokyo. Hot topics were CALS, XML, and multi-byte character set support in SGML software tools.



[CR: 19971106]

Travis, Brian E. "[Editorial] 'SGML: The Philosophy' Just Got Another Name." <TAG>: The SGML Newsletter 10/10 (October 1997) 1, 3. ISSN: 1067-9197. Authors' affiliation: .

The author reflects upon the rise of XML (as evidenced by the Seybold SF '97 Conference) and what it means for SGML.



[CR: 19960716]

Travis, Brian E. "SGML in the Summertime [Editorial]." <TAG>: The SGML Newsletter 9/6 (July 1996) 1, 12. ISSN: 1067-9197. Author's affiliation: President of Information Architects, Inc.

A note offering some suggestions for appropriate "summertime" SGML projects.



[CR: 19960206]

Travis, Brian E. "SGML Predictions [1995 and 1996; editorial]." <TAG>: The SGML Newsletter 9/1 (January 1996) 1, 7-8. ISSN: 1067-9197. Author's affiliation: Brian Travis is President of Information Architects, Inc, and Managing Editor of <TAG>.

Travis provides an overview of SGML's progress in 1995 and nominates several topics as "hot" for 1996: DSSSL implementations; increased use of SGML in Asia; specialized applications integrating SGML components into their import and export facilities; more books on SGML.



[CR: 19950716]

Travis, Brian E. "Tables in SGML. A proposal for intelligent handling of tabular data." <TAG> 6/6 (June 1993) 1-5. ISSN: 1067-9197.

Part I of a two-part article. Describes "how the SGML NOTATION function can be used to process tables".



[CR: 19950716]

Travis, Brian E. "Tables in SGML. A proposal for intelligent handling of tabular data, Part II." <TAG> 6/7 (July 1993) 1-5. ISSN: 1067-9197.

Part II of a two-part article. Describes how the theoretical model (Part I) would be implemented in a "real life" system.



[CR: 19970620]

Travis, Brian E. "Ten Years of <TAG>: The SGML Newsletter [Editorial]." <TAG>: The SGML Newsletter 10/5 (May 1997) 1, 3. ISSN: 1067-9197. Author's affiliation: President of Information Architects, Inc.

A retrospective on ten years of publishing <TAG>: The SGML Newsletter, which was founded by Sharon Adler, William Davis, and Dale Waldt. The publishers now offer online copies of articles 18 months and older: http://tag.sgml.com/.



[CR: 19971230]

Travis, Brian E. "Use Care in Selecting Your Consultant [Editorial]." <TAG>: The SGML Newsletter 10/12 (December 1997) 1, 3-4. ISSN: 1067-9197. Author's affiliation: President, Information Architects Inc.

Based upon personal experience and years of observation, the author constructs principles to guide clients in the process of selecting a consultant, or a consulting team.



[CR: 19950716]

Travis, Brian. "Using SGMLS and Awk as an Inexpensive Translator [SGML Tips & Techniques]." <TAG> 7/7 (July 1995) 10-11. ISSN: 1067-9197.

The author shows how a simple AWK script can be used to transform SGMLS output into a more useful notation. AWK, however, is slow and subject to line-length limitations.



[CR: 19960325]

Travis, Brian E. "The World's Cheapest SGML Database Management System." <TAG>: The SGML Newsletter 9/3 (March 1996) 1-5. ISSN: 1067-9197. Author's affiliation: The SGML University.

"Information Architects and SGML University proudly announce the availability of the World's Cheapest SGML Database Management System (TWCSDBMS). The product, version 1.0d1, is available for download now. This is a developmental release, and might not ever be updated from its current state. The download file is 3.4MBytes because of the overhead that Visual Basic requires. It runs on Windows 95 or Windows NT. The product is designed as a learning tool to be used to understand the nature of SGML database management. Because of its hierarchical nature, SGML requires a hierarchical database schema in order to express the relationships between the elements." [from the server] The tool is available at this URL: see the description.

The <TAG> article is available online in HTML format from the SGML University WWW server: http://www.sgml.com/tag/9030101.htm [mirror copy, text only].



[CR: 19970207]

Travis, Brian E. "XML: Evil or Necessary?" <TAG>: The SGML Newsletter 9/12 (December 1996) 1, 10. ISSN: 1067-9197. Author's affiliation: President, Information Architects, Inc.

The author discusses the cautious misgivings expressed in an earlier editorial, and concludes that the XML (Extensible Markup Language) is a healthy necessity as part of the SGML revision process.



[CR: 19971014]

Travis, Brian E. "XML: SGML Without the Installed Base [Editorial]." <TAG>: The SGML Newsletter 10/9 (September 1997) 1, 5-6. ISSN: 1067-9197. Author's affiliation: .

The author's editorial discusses the trade-offs he sees in the XML effort (vis-à-vis SGML), concluding that the two are not in competition. The HTML community (for whom XML is designed) may be seen as the "installed base" for the new markup language.



[CR: 19950716]

Travis, Brian E.; Travis, Deni C. "SGML Europe '95: Back to Gmunden." <TAG> 8/6 (June 1995) 1-5. ISSN: 1067-9197.

A report on the SGML Europe '95 conference held in Gmunden, Austria. Product announcements/demonstrations discussed are: Panorama Free, Synex Viewport, GRIF HyTime engine, EBT DynaText support for DSSSL, DRUID-Author, Stilo (SGML-smart editor), etc. See further the database entry for this conference.



[CR: 19980612]

Travis, Brian E.; Hahn, Michael. "HTML, SGML, PDF, XML: What's the Difference?" <TAG>: The SGML Newsletter 11/5 (May 1998) 1-4. ISSN: 1067-9197. Authors' affiliation: [Travis]: President, Information Architects, and Managing Editor of <TAG>; [Hahn]: Senior Consultant, Information Architects.

An introduction to four related document computing technologies (HTML, SGML, PDF, XML), published in the form of a white paper. The goals, strengths, and weaknesses of each are presented.



Travis, Brian E.; Waldt, Dale C. "Case Study: How Our Book Was Produced." <TAG>: The SGML Newsletter 8/5 (May 1995) 1-5. ISSN: 1067-9197. Authors' affiliation: Brian Travis is President of Information Architects, Inc, and Managing Editor of <TAG>; Dale Waldt is the co-founder and publisher of <TAG>, and Data Development Manager with the Research Institute of America.

"This article is excerpted from The SGML Implementation Guide, by Brian Travis and Dale Waldt. The authors included this case study, among the other case studies in the book, as an example of the kind of document database that could be built using a small amount of money and a little knowledge about the tools that are available to the SGML implementor. The book was produced using the concepts discussed within the book, and this case study outlines some of the methods that were used."

See the full bibliographic entry for further details about the book. The book's table of contents and sample chapter are available online from the authors' WWW site, or (in part) via mirror copy here.



[CR: 19960312]

Travis, Brian; Waldt, Dale. "In Memorium: Yuri Rubinsky, 1952-1996 [Remembering Yuri]." <TAG> 9/2 (February 1996) 1-2. ISSN: 1067-9197.

This tribute is printed in a special issue of <TAG> dedicated to the memory of Yuri Rubinsky. See also the main eulogy collection.



Travis, Brian E. "Using the Application to Render Tables." <TAG>: The SGML Newsletter 8/2 (February 1995) 6-9. ISSN: 1067-9197. Author affiliation: Brian Travis is President of Information Architects, Inc, and Managing Editor of <TAG>.



Travis, Brian E. "Tables, Tables, Tables." <TAG>: The SGML Newsletter 8/2 (February 1995) 1, 8. ISSN: 1067-9197. Author affiliation: Brian Travis is President of Information Architects, Inc, and Managing Editor of <TAG>.



Travis, Brian E. "The SGML Environment." <TAG>: The SGML Newsletter 8/1 (January 1995) 1-8. ISSN: 1067-9197. Author affiliation: Brian Travis is President of Information Architects, Inc, and Managing Editor of <TAG>.

Summary: "The SGML standard defines several pieces, each of which has a very well-defined purpose and structure. Some of these pieces are optional for an implementation, some are absolutely mandatory, and some are just nice to have. This article reviews the parts defined in the standard, and provides guidelines for selecting tools for your implementation."



[CR: 19980508]

Travis, Brian E. "XML: Enabling Technology or Silver Bullet? [Editorial]." <TAG>: The SGML Newsletter 11/4 (April 1998) 1, 3. ISSN: 1067-9197. Author's affiliation: President, Information Architects; Managing Editor, <TAG>.

The author tells an interesting story about some students who came to learn about XML at a recent Documation Conference (bringing with them certain assumptions about what XML was): they stayed for the first hour of a tutorial, which ended with an introduction to the DTD and its role in XML validity; some students disappeared during the break and didn't return. This episode, Travis says, "brings up an interesting issue, and shows that XML might have some of the same barriers to adoption that SGML has had. The main barrier is that darn DTD." The lesson, according to Travis: "Don't oversell. There has been a lot of hype about XML. Along with the hype comes the tendency to overpromise. . . Referring to XML as 'SGML without the tyranny of the DTD' can lead to problems for people who know what SGML is. For people who don't know what SGML. . . requires people to further investigate the real cost of implementing XML in their environment."



[CR: 19950716]

Travis, Brian E.; Waldt, Dale; Laplante, Mary. "It's [SGML] Conference Season." <TAG>: The SGML Newsletter 8/4 (April 1995) 1-8, 11. ISSN: 1067-9197.

The multi-part article reports on highlights of three 1995 conferences in terms of SGML news: Documation '95, Folio's Infobase '95, and Seybold Boston.



[CR: 19960716]

Travis, Brian E.; Waldt, Dale C. "SGML Europe '96." <TAG>: The SGML Newsletter 9/6 (June 1996) 9-11. ISSN: 1067-9197. Authors' affiliation: [Travis] President of Information Architects, Inc.; [Waldt] Vice President of Product Systems for the Research Institute of America Group.

A report on the SGML Europe '96 conference held in Munich, attended by some 700 people. High points, according to the authors: (a) the concept of "micro-documents", coined by John McFadden of Exoterica, in reference to "a small document that is not a fragment of a larger document or DTD, but rather is a self-contained unit of information that can be managed independently...in a database..."; Synex announced support for multi-byte Japanese characters; Exoterica revealed more about OmniMark Version 3 (NOX); InContext and Folio demonstrated SGML Journal Publisher; SoftQuad demonstrated HoTMetaL 3.0, which has several new features.



Travis, Brian E. "The SGML Implementation Guide Released [Editorial]." <TAG>: The SGML Newsletter 8/5 (May 1995) 1, 10. ISSN: 1067-9197.

The note discusses the purpose of the book review (The SGML Implementation Guide) presented in <TAG> 8/5; see the bibliographic entry.



[CR: 19980229]

Travis, Brian E. "Why Do We Need SGML? [Editorial]" <TAG>: The SGML Newsletter 11/2 (February 1998) 1, 3-4. ISSN: 1067-9197. Author's affiliation: President, Information Architects Inc., and Managing Editor of <TAG>.

Brian Travis discusses the "XML" and "SGML" names in terms of politics and marketplace, referencing companies that are warm/cool to the name "SGML" at the present time.



[CR: 19951015]

Travis, Brian E.; Waldt, Dale C. The SGML Implementation Guide: A Blueprint for SGML Migration. Berlin/New York: Springer-Verlag, 1995. Extent: Approximately 350 pages. ISBN: 0-387-57730-0; 3-540-57730-0.

Author's abstract: This is the book the authors needed when they were first implementing SGML. At that time, and up until now, there has not been a complete source of information for the SGML implementor. We had to perform major research at every single phase of our implementation process using time-honored systems analysis techniques. While this approach worked, we would have gladly embraced any help we could have found.

The philosophy behind this book is to provide a pragmatic working knowledge of SGML and related disciplines and techniques needed to actually achieve a successful implementation.

The book is not a review of products, but it does contain mention of some products as an example of what is available. It is not an executive briefing offering a high-level view of the advantages of implementing a structured approach to data, nor is it a nuts-and-bolts description of how to write SGML applications. Rather, it strikes a ground between those two extremes, offering to the people who must make the decision to implement, then the implementors, enough information to get well down the road to SGML.

See the [provisional] Table of Contents for further overview, and more authoritatively in an updated announcement. The full outline and sample chapters (Chapter 1 and Appendix 6) are accessible via the Web: point your HTTP client at http://www.sgml.com/SGMLImplementationGuide/. Further, see two articles introducing The SGML Implementation Guide: the book as a case study and an editorial note in <TAG>. Dianne Kennedy reviewed the book in <TAG> 8/10 (October 1995) 5-6.



[CR: 19950804]

Triggs, Jeffery. "Varieties of electronic experience: what should an electronic text be like?" Electronic Texts and the Text Encoding Initiative [Special Issue] = TEXT Technology: The Journal of Computer Text Processing 5/3 (Autumn, 1995) 179-189. ISSN: 1053-900X. Author's affiliation: Director of the North American Reading Program, Oxford English Dictionary.

"Triggs shows by a series of examples how easily an electronic text can fall short of realizing the full potential offered by its new medium. He calls to our attention the ways in which preconceived ideas of electronic text as a substitute for printed page can obstruct the goal of multi-purpose plasticity which so attracted us to texts in electronic form in the first place. He also warns us of the dangers of locking away the results of our hard editorial endeavours within a proprietary format, thus limiting its use to particular software systems." [from the issue Introduction, by Lou Burnard]

See the main entry for this special issue of TEXT Technology dedicated to the TEI, edited by Lou Burnard.



[CR: 19950716]

Tritt, Graham. "Starting a[n SGML] User Group." SGML Users' Group Newsletter 29 (November 1994) 5-6. ISSN: 0952-8008. Author's affiliation: Swiss Federal Office of Information Technology; Information and Documentation Center, Steigerhubelstr. 3, 3003 Berne, Switzerland. Tel: +41 31 325 9836; fax: -9767. Email: Graham.Tritt@ste3.bfi.admin.ch.

Based upon experiences with the Swiss SGML Users' Group since 1989, Tritt offers advice to others who may wish to learn from the group's accumulated wisdom.



Tucker, Hugh A.; Bogh, Torkil. SGML & ODA. Standards for Document Processing and Interchange. DS/INF 14, [1989]. Dansk Standardiseringsrad, 1989.

The publication is a book form of the technical report SGML/ODA: Standards for Document Processing and Interchange. See a summary and review of the work in "New Book on SGML and ODA Published. SGML & ODA. Standards for Document Processing and Interchange. DS/INF 14, 1989," <TAG> 12 (December 1989) 17-18.



[CR: 19971227 MD: 19971230]

Tucker, Hugh; Harvey, Betty. "SGML Documentation Objects within the STEP Environment." Pages 205-211 in SGML/XML '97 Conference Proceedings. SGML/XML '97. "SGML is Alive, Growing, Evolving!" The Washington Sheraton Hotel, Washington, D.C., USA. December 7 - 12, 1997. Sponsored by the Graphic Communications Association (GCA) and Co-sponsored by SGML Open. Conference Chairs: Tommie Usdin (Chair, Mulberry Technologies), Debbie Lapeyre (Co-Chair, Mulberry Technologies); Michael Sperberg-McQueen (Co-Chair, University of Illinois). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 691 pages, CDROM; print volume contains author and title indexes, keyword and acronym lists. Author's affiliation: [Hugh Tucker]: Documenta, ApS, Hellerup, Denmark; Phone: +45 39 46 19 05; FAX: +45 39 46 19 08; Email: hugh@documenta.dk; [Betty Harvey]: Electronic Commerce Connection, Inc., Germantown, Maryland USA 20874; Phone: +1 (301) 540-8251; FAX: +1 (301) 540-4268; Email: harvey@eccnet.com; WWW: http://www.eccnet.com.

Abstract: "ISO 10303, Standard Exchange for Product Data (STEP), is being developed by a broad range of industries to provide extensive support for modelling, automated storage schema generation, life-cycle support, plus many more data management facilities. ISO 8879, Standard Generalized Markup Language (SGML), and the SGML family of standards, including HyTime and DSSSL, is used for modelling and encoding the documentation of industrial products, many of which are produced using STEP.

"There are technical differences between the STEP and SGML as well as differences in their application and spheres of enterprise. For example, STEP is used during the early stages of product development, e.g., design, testing, whereas SGML is more commonly applied during the latter processes of a product's life cycle.

"This paper discusses the technical differences and problems between the two technologies and outlines some of the identified requirements needed to harmonize the two types of data. An approach based on information objects is presented showing how SGML product documentation information can be incorporated and stored together with STEP information. Using an information object methodology could allow textual data such as designer's and testing notes, method annotations, comments, etc. produced during the beginning of the product development cycle to be associated and archived with the actual design models.

"The definition of an information object is discussed and the distinction is drawn between a perceptual documentation object type and the conceptual information object type needed in modelling STEP data. Implementation suggestions are made along with the practical requirements needed to make information objects effective and useful.

"The STEP standard task group, Product Documentation (ISO 184/SC4/WG3/T14) is currently tasked with the responsibility for creating a methodology for the cooperation of the STEP and SGML standards. Information will be provided about how current corporate initiatives could impact and provide pertinent input in the T14 Working Group."

This paper was delivered as part of the "User" track in the SGML/XML '97 Conference.

For more information on STEP, see the dedicated database entry SGML and STEP (ISO 10303 Standard for the Exchange of Product Data), and the STEP/SGML reference page from ECCNet.

Note: The SGML/XML '97 conference proceedings volume is available from the Graphic Communications Association, 100 Daingerfield Road, Alexandria, VA 22314-2888; Tel: +1 (703) 519-8160; FAX: +1 (703) 548-2867. Complete information about the conference (e.g., program listing, tutorials, show guide, DTDs, conference reports) is provided in the dedicated conference section of the SGML/XML Web Page and via the GCA Web server. The electronic proceedings on CDROM was produced courtesy of Jouve Data Management (Jouve PubUser).



[CR: 19971125]

Tucker, Hugh; Harvey, Betty. "STEP/SGML Standards Working Together." Page(s) 39-42 in SGML '97 Conference Proceedings. SGML Europe '97. "The Next Decade - Pushing the Envelope." Princesa Sofia Intercontinental, Barcelona, Spain. 11-15 May, 1997. Sponsored by Graphic Communications Association (GCA) and SGML Open. Conference Chair: Pamela L. Gennusa (Director, Database Publishing Systems Ltd). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 342 pages, CDROM. Author's affiliation: [Hugh Tucker]: Director, Documenta ApS, Hellerup, Denmark; Email: hugh@documenta.dk; [Betty Harvey]: President, Electronic Commerce Connection, Inc., USA; Email: harvey@eccnet.com; WWW: http://www.eccnet.com.

Abstract: "ISO 10303, Standard Exchange for Product Data (STEP), is being developed by a broad range of industries to provide extensive support for modeling, automated storage schema generation, life-cycle support, plus many more data management facilities. ISO 8879, Standard Generalized Markup Language (SGML), and the SGML family of standards, including HyTime (Hypermedia-Time-based Structuring Language, ISO 10744) and DSSSL (Document Style Semantics and Specification Language), is used for the documentation of products. These two standards, STEP and SGML, are used in the same industries and companies. STEP is used during product development and manufacturing, whereas SGML products are usually created during the final processes of product development.

"This paper will discuss current initiatives in industry and government organizations for incorporating SGML product information during the beginning of the product development cycle. Several different initiatives from various corporations will be discussed. The benefits of each of the different methodologies will be discussed and analyzed.

"The STEP standard task group, Product Documentation (ISO 184/SC4/WG3/T14) is currently tasked with the responsibility for creating a methodology for the cooperation of the STEP and SGML standards. Information will be provided about how current corporate initiatives could impact and provide pertinent input in the T14 Working Group."

Note: The electronic conference proceedings in hypertext were produced by Inso Corporation (DynaText) and by High Text (EnLIGHTeN). Information about the SGML Europe '97 Conference may be found in the main database entry.



[CR: 19980906]

Tuong Dao. "An Indexing Model for Structured Documents to Support Queries on Content, Structure and Attributes." Pages 88-97 (with 18 references) in Proceedings of the IEEE International Forum on Research and Technology Advances in Digital Libraries - ADL 1998. [Fifth] Forum on Research and Technology Advances in Digital Libraries - ADL'98. Santa Barbara, CA. April 22-24, 1998. Sponsored by IEEE Computer Society Technical Committee on Digital Libraries, Library of Congress, Alexandria Digital Library, NASA Goddard Space Flight Center, National Library of Medicine, IBM, etc. Los Alamitos, California: IEEE Computer Society Press, 1998. ISBN: 0818684666. Author's affiliation: Department of Computer Science, Royal Melbourne Institute of Technology (RMIT), Melbourne, Australia; Email: tuong@kbs.citri.edu.au.

Abstract: "The complex internal structure of documents can be described and captured by documentation representation standards such as SGML and SGML related standards like HTML and XML. The hierarchical structure of documents and the attributes of documents as well as attributes of document components at all levels of the document hierarchy can be encoded with markup tags. In traditional text database systems, only queries on content are supported. The rich structural information contained in documents and the attributes of document components are not captured in these systems, and queries on structure and attributes are not supported. We propose a text model, a query language and an indexing scheme which can support queries on content, structure, and attributes of documents as well as attributes of text elements within documents. This model is schema-independent, and query evaluation time is at worst linear. We show that our indexing scheme can efficiently support a wide range of queries in a database for highly heterogeneous collections of structured documents. We provide query examples to show how all the information encoded in documents marked up according to the TEI Guidelines, an encoding standard adopted by the humanities disciplines, can be indexed and queried in our indexing model."

IEEE Computer Society Press Order Number PR08464.

Related papers: "Indexing Structured Text for Queries on Containment Relationships", ACSC '96, Nineteenth Australasian Computer Science Conference, Melbourne, January/February 1996. Or: "Indexing Documents for Queries on Structure, Content, and Attributes," By Ron Sacks-Davis, Tuong Dao, James Thom and Justin Zobel; (RMIT) Friday, November 28, 1997, International Symposium on Digital Media Information Base (DMIB'97)



[CR: 19960402]

Tuong Dao; Sacks-Davis, Ron; Thom, James. "Indexing Structured Text for Queries on Containment Relationships." Australian Computer Science Communications 18/2 (1996) 82-91 (with 12 references). ISSN: [?]. Author's affiliation: Department of Computer Science, RMIT.

"Abstract: Documents consist of logical components such as titles and paragraphs. The complexity of the structure of documents is captured by document representation standards such as SGML. The GCL (Generalized Concordance Lists) query language has been proposed for collections of structured documents such as SGML documents. It uses containment relationships to provide a simple and effective way to formulate traditional boolean queries as well as queries specifying document structure and provides the flexibility to access, within the same database, documents which conform to multiple hierarchical structures and have different markup schemas. GCL also allows the retrieval of structural elements at any level of the document structure. The flexibility allowed by the language and its implementation comes with a significant restriction: no recursive structures are allowed. However such structures are present in many SGML documents where components are defined recursively. The paper proposes to extend GCL to allow recursive structures. An implementation framework based on an interval indexing scheme is provided to demonstrate that only small extensions are required to support recursive structures."

Based upon a paper presented at ADC '96. Seventh Australasian Database Conference, Melbourne, Victoria, Australia, 29-30 January 1996
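
The interval indexing idea mentioned in the abstract above can be illustrated with a minimal sketch. This is not GCL and not the authors' implementation; the class name `IntervalIndex`, the method `containing`, and the sample positions are hypothetical. It assumes each element occurrence is recorded as a (start, end) pair of word positions, so that containment reduces to interval nesting, which also covers recursive structures such as a section inside a section.

```python
from collections import defaultdict


class IntervalIndex:
    """Hypothetical index: element name -> list of (start, end) word positions."""

    def __init__(self):
        self.extents = defaultdict(list)

    def add(self, name, start, end):
        """Record one occurrence of element `name` spanning [start, end]."""
        self.extents[name].append((start, end))

    def containing(self, outer, inner):
        """Return extents of `outer` that strictly contain some extent of `inner`."""
        inner_extents = self.extents[inner]
        return [
            (s, e)
            for (s, e) in sorted(self.extents[outer])
            if any(
                s <= i_s and i_e <= e and (s, e) != (i_s, i_e)
                for (i_s, i_e) in inner_extents
            )
        ]


# Illustrative data: a section nested inside another section (a recursive structure).
index = IntervalIndex()
index.add("section", 0, 99)
index.add("section", 10, 40)
index.add("title", 1, 3)
index.add("title", 12, 15)

sections_with_titles = index.containing("section", "title")
sections_with_subsections = index.containing("section", "section")
```

Because nesting is decided purely from positions, the same query form works whether or not the inner and outer elements share a name.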

See alternate entry.



[CR: 19960125]

Tuong Dao; Sacks-Davis, Ron; Thom, James. Indexing Structured Text for Queries on Containment Relationships. Paper to be presented at the 7th Australasian Database Conference in January [29-30] 1996. Melbourne, Australia: Department of Computer Science, RMIT, 1996. Extent: 10 pages, 12 references. Authors' affiliation: Department of Computer Science, RMIT.

"Abstract: Documents consist of logical components such as titles and paragraphs. The complexity of the structure of documents is captured by document representation standards such as SGML. The GCL (Generalized Concordance Lists) query language has been proposed for collections of structured documents such as SGML documents. It uses containment relationships to provide a simple and effective way to formulate traditional boolean queries as well as queries specifying document structure and provides the flexibility to access, within the same database, documents which conform to multiple hierarchical structures and have different markup schemas. GCL also allows the retrieval of structural elements at any level of the document structure."

"The flexibility allowed by the language and its implementation comes with a significant restriction: no recursive structures are allowed. However such structures are present in many SGML documents where components are defined recursively. This paper proposes to extend GCL to allow recursive structures. An implementation framework, based on an interval indexing scheme, is provided to demonstrate that only small extensions are required to support recursive structures." [abstract supplied by the authors]

Available in Postscript format: ftp://phobos.kbs.citri.edu.au/pub/tuong/adcpaper.ps.Z [mirror copy]. The paper will appear in the Proceedings of the 7th Australasian Database Conference, Melbourne, Australia. See alternate entry.



[CR: 19951220]

Turner, Linda. How People are Approaching Business Cases for SGML. Avalanche White Paper. Boulder, CO: Avalanche Development Company/Interleaf Inc., 1993. Extent: approximately 5 pages.

"The implementation of SGML is a strategic move that gives companies a competitive advantage, because they are able to take control of their critical information resources. Considering this factor alone, companies seem to feel that the long-term, intangible benefits outweigh the dollars spent on its implementation in the front-end. Studies of SGML implementors have taught us this: Organizations who are already implementing SGML today feel that they are ahead of the game, and that other organizations who value their information resources will sooner or later turn to SGML, unless they want to fall out of the competition." [extracted]

Available online in HTML format from the Avalanche WWW server; [mirror copy].



Turner, Ron; Douglass, Tim; Turner, Audrey. README FIRST: SGML for Writers and Editors. Charles F. Goldfarb Series On Open Information Management. Englewood Cliffs, NJ: PTR Prentice Hall, [forthcoming May] 1995. ISBN: 0-13-432717-9.

Summary: "This is a non-technical introduction to SGML for writers and editors who need to work in an SGML environment. The focus is not on the technical details of the standard but rather on how writers and editors can benefit from and work effectively with SGML. Included with the book is a diskette that contains SGMLAB, a DOS-based SGML application that includes a parser and browser and numerous sample SGML documents. Using SGMLAB, readers can view on-line both the structure and output of SGML documents, and validate those documents". [publisher's pre-publication description]

See a review of the book in <TAG> by Simon Wickes. See also the review in Seybold Report on Publishing Systems 25/9 (January 29, 1996) 42, and a review by Lynne Price. Also by Ron Burk. A fuller description from the publisher is also online. See also the "Prentice-Hall SGML Series" web page.



[CR: 19961112]

[Seybold Staff.] "Read It If You Must; Avoid It If You Can. [Seybold Report Review of] Turner, Ronald, README FIRST: SGML for Writers and Editors." Seybold Report on Publishing Systems 25/9 (January 29, 1996) 42. ISSN: 0736-7260.

The review of Turner's book README FIRST: SGML for Writers and Editors is generally unfavorable, at least with respect to prospective SRPS readership: ". . . this disappointing collaborative effort might be more appropriately subtitled 'Teaching Yourself to Read SGML'." See the review article on the Seybold WWW server.



United States Department of Energy. Office of Administration and Management. Office of Information Resources Management. Office of Scientific and Technical Information. Electronic Exchange of Scientific and Technical Information (STI) Strategic Plan DOE/OSTI Report. Oak Ridge, TN: OSTI, Scientific and Technical Information Services Division, January, 1993. approximately 16 pages.

"On August 28, 1991, a memo from R. S. Barrow, Director of the Office of IRM Policy, Plans, and Oversight (AD-24), announced that the Standard Generalized Markup Language (SGML) as defined in Federal Information Processing Standard (FIPS) 152 is adopted as the DOE standard for accomplishing this electronic exchange. The Office of Scientific and Technical Information (OSTI) (AD-21) was given the overall responsibility for managing the adoption and transition to the use of SGML for scientific and technical documents. SGML, along with other standards such as the Government Open Systems Interconnection Profile (GOSIP), will provide a common standard for electronic document processing and exchange. . . This initiative will emphasize and enhance full life-cycle management of STI. Electronic exchange of STI will benefit users, generators, and managers of information. The full realization of the implementation of SGML will facilitate interchange among the members of the scientific and technical community by providing increased versatility of information (new ways to use information), encouraging multiple uses of information, stimulating increased use of STI, and providing more flexible access to many types of information. The potential for global electronic interchange of STI and the focus on content rather than format is expected to revolutionize the use of information." [from the document Introduction]

Available from the DOE/OSTI WWW server as ELECTRONIC EXCHANGE OF SCIENTIFIC AND TECHNICAL INFORMATION (STI) [or in mirror copy here].



United States Department of Energy. Office of Scientific and Technical Information. Guide for Transmitting Standard Generalized Markup Language (SGML) Encoded Bibliographic Records DOE/OSTI 11865. Oak Ridge, TN: DOE/OSTI, March 1995.

"This document provides the guidance necessary to transmit to the Department of Energy Office of Scientific and Technical Information (OSTI) an encoded bibliographic record that conforms to International Standard ISO 8879, Information Processing - Text and office systems - Standard Generalized Markup Language (SGML). Included in this document are element and attribute tag definitions, sample bibliographic records, the bibliographic document type definition, and instructions on how to transmit a bibliographic record electronically to OSTI." [from the Introduction]

Apparently still under development [June 1995]. The DTD and documentation are online. Available as a series of HTML documents from the DOE/OSTI WWW Server. See also the main bibliography page.



[CR: 19971227]

Usdin, B. Tommie. "View from the Chair." Pages 7-10 in SGML/XML '97 Conference Proceedings. SGML/XML '97. "SGML is Alive, Growing, Evolving!" The Washington Sheraton Hotel, Washington, D.C., USA. December 7 - 12, 1997. Sponsored by the Graphic Communications Association (GCA) and Co-sponsored by SGML Open. Conference Chairs: Tommie Usdin (Chair, Mulberry Technologies), Debbie Lapeyre (Co-Chair, Mulberry Technologies); Michael Sperberg-McQueen (Co-Chair, University of Illinois). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 691 pages, CDROM; print volume contains author and title indexes, keyword and acronym lists. Author's affiliation: [B. Tommie Usdin]: President, Mulberry Technologies, Inc., 17 West Jefferson Street, Suite 207, Rockville, Maryland 20850 USA; Phone: +1 301/315-9634; FAX: +1 301/315-8285; Email: btusdin@mulberrytech.com; WWW: http://www.mulberrytech.com/

Summary: Usdin's presentation was in the form of a "Welcome to SGML/XML'97." Usdin reflected on the previous SGML conferences, where both HTML and XML have been important: "We have been discussing HTML at SGML conferences since 1994. XML was publicly introduced at SGML'96, and is in the name of the conference (SGML/XML '97) in 1997."

[Excerpted:] "I'm not sure how we as a community will feel about XML by the end of this week, but coming into SGML/XML '97 I detect an attitude as different from our 1994 attitude on HTML as day is from night. We don't despise XML, we worship it. We aren't worried about the threat XML poses to SGML, we worry about the threat SGML poses to XML. We remove the word SGML from our marketing materials, our web sites, and our products. The dirty little secret is that underneath our new XML toys lies an SGML core. Shhhh. Don't tell anyone.

"There are vendors who want to be associated with XML but not SGML. They want to sponsor XML events but to avoid having their names sullied by an association with SGML. The revisionist historians are claiming five-year old SGML projects as XML experience. We've gone completely nuts over XML. At least, some of the noisiest of us have. Well, we certainly are a moody group, aren't we. It seems to me that the 1994 hysteria was unreasonable; HTML did not destroy SGML. It seems to me that the 1997 hysteria is equally unreasonable; SGML will not destroy XML.

"The relationship between SGML and XML is deep and complex. XML is SGML. And it can be because SGML has grown quickly and significantly in order to accommodate the requirements of XML. In the process, SGML has been improved for all applications, not just for XML. SGML has given XML a rational structure and discipline within which to grow and an installed base of users and tools; XML has given SGML momentum and visibility."

This paper was delivered as part of the "Introductions" track in the SGML/XML '97 Conference.

Note: The SGML/XML '97 conference proceedings volume is available from the Graphic Communications Association, 100 Daingerfield Road, Alexandria, VA 22314-2888; Tel: +1 (703) 519-8160; FAX: +1 (703) 548-2867. Complete information about the conference (e.g., program listing, tutorials, show guide, DTDs, conference reports) is provided in the dedicated conference section of the SGML/XML Web Page and via the GCA Web server. The electronic proceedings on CDROM was produced courtesy of Jouve Data Management (Jouve PubUser).



[CR: 19950716]

Usdin, Tommie; Rubinsky, Yuri. "The SGML Year in Review - 1993." <TAG> 7/1 (January 1994) 6-13. ISSN: 1067-9197.

Detailed report on SGML events in 1993. See the pointers to online copies of the report and print copies in other publications.



[CR: 19950716]

Usdin, Tommie; Rubinsky, Yuri. "The SGML Year in Review 1994." SGML Users' Group Newsletter 29 (November 1994) 3-5. ISSN: 0952-8008.

Detailed report on SGML events in 1994. See the pointers to online copies of the report and print copies in other publications.



[CR: 19970909]

Vacca, Dick. "Planning for Document Management - How to Get Started." The Gilbane Report on Open Information & Document Systems 4/1 (March - April 1996) 1-23. ISSN: 1067-8719. Author's affiliation: University of Wisconsin.

A comprehensive article on issues and concerns in planning a document management implementation. SGML is discussed at several points, including DTD design (or its equivalent) and document conversion.



[CR: 19971202]

van Dam, Andy. "Looking Back Thirty Years and Forward Three: Critical Themes in the Development of the Electronic Book [Opening Keynote Address]." Pages [?] in TEI 10: A Conference in Celebration of the Tenth Anniversary of the Text Encoding Initiative. Abstracts. TEI 10: Text Encoding Initiative, Tenth Anniversary User Conference, Brown University, Providence, Rhode Island. November 14-16, 1997. Sponsored by Martin Hensel Corporation, Kluwer Academic Publishers, and MIT Press. Hosted by Brown University Library, and Computing and Information Services. Providence, RI: Brown University, 1997. Author's affiliation: Brown University.

See the main database entry for additional information about the conference, or the Brown University web site.



van Dam, L.; van Loenen, E. A Programmer's Interface to SGML. Technical Report. Geneva: CERN, 1989.



[CR: 19971202]

van den Hout, Erik. "Independent Links - A Maintenance Advantage?" Pages 79-83 in TEI 10: A Conference in Celebration of the Tenth Anniversary of the Text Encoding Initiative. Abstracts. TEI 10: Text Encoding Initiative, Tenth Anniversary User Conference, Brown University, Providence, Rhode Island. November 14-16, 1997. Sponsored by Martin Hensel Corporation, Kluwer Academic Publishers, and MIT Press. Hosted by Brown University Library, and Computing and Information Services. Providence, RI: Brown University, 1997. Author's affiliation: Groningen University; Email: E.H.M.van.den.Hout@Let.RuG.NL; WWW: http://thok.let.rug.nl/evdh/.

Summary: "This paper will focus on so-called independent links (ilinks). These are links whose definitions are located separately from the document in which their link-ends reside. Use of independent links is attractive, because they might provide solutions to current maintenance problems. Additionally, some of their problems will be described. And finally, a solution for maintaining reliable links in complex, evolving documents might be found in independent links. The crucial issues related to maintenance difficulties will be the focus. The central thesis of this paper is that independent links can improve the maintenance of a hypertext, and therefore its reliability over time."
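
The maintenance argument in the summary above can be sketched in a few lines. This is an illustrative assumption, not the paper's model or HyTime's ilink syntax: the names `LinkEnd`, `IndependentLink`, `dangling_ends`, and the sample file names are hypothetical. Because the link definitions live apart from the documents, a single pass over the link set can report every link end whose target ID has disappeared after an edit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinkEnd:
    document: str    # hypothetical document name
    element_id: str  # ID of the anchor element inside that document


@dataclass(frozen=True)
class IndependentLink:
    ends: tuple  # LinkEnd objects; the link itself is stored outside the documents


def dangling_ends(links, documents):
    """Return link ends whose target ID is missing from its document.

    `documents` maps a document name to the set of element IDs it currently holds.
    """
    return [
        end
        for link in links
        for end in link.ends
        if end.element_id not in documents.get(end.document, set())
    ]


# Illustrative data: "p3" was deleted from a.sgm after the link was defined.
documents = {"a.sgm": {"p1", "p2"}, "b.sgm": {"note7"}}
links = [
    IndependentLink((LinkEnd("a.sgm", "p1"), LinkEnd("b.sgm", "note7"))),
    IndependentLink((LinkEnd("a.sgm", "p3"), LinkEnd("b.sgm", "note7"))),
]
broken = dangling_ends(links, documents)
```

The check never opens or modifies the documents themselves, which is the property that makes separately stored links attractive for maintenance.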

The extended abstract for the document is available online: http://www.stg.brown.edu/webs/tei10/tei10.papers/vandenhout.html; [local archive copy]. See the main database entry for additional information about the conference, or the Brown University web site.



[CR: 19950716]

van Kirk, Doug. "Getting a Grip on Unstructured Data." InfoWorld 17/21 (May 22, 1995) 51, 54.

The author discusses the role of SGML in structuring data as the notion of "corporate document" (as a central metaphor for information locus) becomes stronger within the business community.



Vanoirbeek, Christine. "Formatting Structured Tables." Pages 291-309 (with 21 references) in EP [Electronic Publishing] 92: Proceedings of Electronic Publishing, 1992 International Conference on Electronic Publishing, Document Manipulation, and Typography. Swiss Federal Institute of Technology, Lausanne, Switzerland. April 7-10, 1992. Sponsored by the Swiss Federal Institute of Technology and the Swiss National Science Foundation. Edited by Christine Vanoirbeek and Giovanni Coray [EPF, Lausanne, Switzerland]. The Cambridge Series on Electronic Publishing. Cambridge: Cambridge University Press, 1992. ISBN: 0-521-43277-4. Author affiliation: Swiss Federal Institute of Technology, Lausanne, Switzerland.

Abstract: The objective of this paper is to analyse the problem of integrating tables with structured documents. After specific problems related to both editing and formatting activities have been analysed, an overview of different existing approaches is given. The paper emphasizes some shortcomings of the usual table representations. It describes a new approach based on the distinction of logical and physical structure and argues for a multi-dimensional representation that properly integrates tables in a logical document structure. It concludes with the description of a prototype that implements these ideas.



[CR: 19970826]

Veen, Jeffrey. "XML: Metadata for the Rest of Us [Part 1]; XML: Roll Your Own Markup [Part 2]." Wired News [Technology] [?]/[?] (July 8 and July 14, 1997).

A two-part article on XML consisting of an interview with Tim Bray, one of the chief architects of the XML (Extensible Markup Language) standard.

Summary: [Part 1]: "What if you could merge the simplicity of HTML with the flexibility of Standard Generalized Markup Language (SGML)?" [Part 2]: "This week, we talk about some of the underlying workings of XML and take a look at some practical applications."

Available in HTML format: Part 1 online, and Part 2 online. [part 1 archive copy], [part 2 archive copy]



[CR: 19971227]

Vercio, LtCol Carl F. "Implementing SGML in the Office of the Secretary of Defense (OSD)." Pages 581-584 in SGML/XML '97 Conference Proceedings. SGML/XML '97. "SGML is Alive, Growing, Evolving!" The Washington Sheraton Hotel, Washington, D.C., USA. December 7 - 12, 1997. Sponsored by the Graphic Communications Association (GCA) and Co-sponsored by SGML Open. Conference Chairs: Tommie Usdin (Chair, Mulberry Technologies), Debbie Lapeyre (Co-Chair, Mulberry Technologies); Michael Sperberg-McQueen (Co-Chair, University of Illinois). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 691 pages, CDROM; print volume contains author and title indexes, keyword and acronym lists. Author's affiliation: [LtCol Carl F. Vercio]: Directorate Information Technology Officer, Directorate for Correspondence and Directives, Washington Headquarters Services, Pentagon, Washington, DC 20310; Phone: +1 (703) 697-9285; DSN 227-9285; FAX: +1 (703) 695-1219; DSN 225-1219; Email: vercio@osd.pentagon.mil.

Abstract: "The Office of the Secretary of Defense (OSD) recently adopted SGML as the standard to create a non-proprietary publishing database to produce policy and procedure documents for dissemination on the World Wide Web (WWW).

"From April 1995 to November 1996, a complete SGML subsystem was developed from a detailed library analysis through the development of DTDs and style sheets, to active production of new and revised documents. After conducting a market survey in June 1996, SoftQuad's Panorama PRO was selected to post the documents to the WWW because no conversion to HTML was necessary. Style sheets were developed and in November 1996 the first SGML-tagged DoD issuances were placed on the WWW.

"A multi-site production process has grown from these humble beginnings and is being enhanced to include electronic coordination with digital signatures and integration with the DoD electronic forms library. Along the way, many lessons were learned, that can be shared with newcomers to SGML, to make the transition to SGML easier for those who contemplate starting an SGML project."

This paper was delivered as part of the "Case Studies" track in the SGML/XML '97 Conference.

Note: The SGML/XML '97 conference proceedings volume is available from the Graphic Communications Association, 100 Daingerfield Road, Alexandria, VA 22314-2888; Tel: +1 (703) 519-8160; FAX: +1 (703) 548-2867. Complete information about the conference (e.g., program listing, tutorials, show guide, DTDs, conference reports) is provided in the dedicated conference section of the SGML/XML Web Page and via the GCA Web server. The electronic proceedings on CDROM was produced courtesy of Jouve Data Management (Jouve PubUser).



[CR: 19980931]

Vercoustre, Anne-Marie; Paradis, François. A Descriptive Language for Information Object Reuse through Virtual Documents. Paper presented at the 4th International Conference on Object-Oriented Information Systems (OOIS '97). Victoria, Australia, [November] 1997. Authors' affiliation: Commonwealth Scientific and Industrial Research Organisation (CSIRO) Mathematical and Information Systems, Australia.

Abstract: "The importance of reuse is well recognised for electronic document writing. However, it is rarely achieved satisfactorily because of the complexity of the task: integrating different formats, handling updates of information, addressing document author's need for intuitiveness and simplicity, etc. In this paper, we present a language for information reuse that allows users to write virtual documents, where dynamic information objects can be retrieved from various sources, transformed, and included along with static information in SGML documents. The language uses a treelike structure for the representation of information objects, and allows querying without a complete knowledge of the structure or the types of information. The data structures and the syntax of the language are presented through an example application. A major strength of our approach is to treat the document as a non-monolithic set of reusable information objects."

The document is online: "A Descriptive Language for Information Object Reuse through Virtual Documents." [check re: archive copy]

See: "Reuse of Linked Documents through Virtual Document Prescriptions." By Anne-Marie Vercoustre and François Paradis [INRIA (France) and CSIRO (Australia)]. Pages 499-512 in Electronic Publishing, Artistic Imaging, and Digital Typography. Proceedings of the 7th International Conference on Electronic Publishing (EP '98), Held Jointly with the 4th International Conference on Raster Imaging and Digital Typography (RIDT '98). Saint Malo, France, March 30 - April 3, 1998. Edited by Roger D. Hersch, Jacques André, and Heather Brown. New York/Berlin/Heidelberg: Springer-Verlag, 1998. See also "A Virtual Document Approach for Reusing SGML/XML Information Objects," by François Paradis, Anne-Marie Vercoustre, and Brendan Hills; Paper presented at the SGML/XML Asia Pacific, Sydney, Australia, 22-24 September, 1997. See also: Publications related to RIO - "Reuse of Information Objects through virtual documents", and the RIO Home Page.



[CR: 19980907]

Vercoustre, Anne-Marie; Paradis, François. "Reuse of Linked Documents through Virtual Document Prescriptions." Pages 499-512 (with 24 references) in Electronic Publishing, Artistic Imaging, and Digital Typography. Proceedings of the 7th International Conference on Electronic Publishing (EP '98), Held Jointly with the 4th International Conference on Raster Imaging and Digital Typography (RIDT '98). EP '98 and RIDT '98, Saint Malo, France. March 30 - April 3, 1998. Edited by Roger D. Hersch, Jacques André, and Heather Brown. Lecture Notes in Computer Science Series, Number 1375. New York/Berlin/Heidelberg: Springer-Verlag, 1998. ISBN: 3-540-64298-6. Authors' affiliation: Commonwealth Scientific and Industrial Research Organisation (CSIRO), Australia and Institut national de recherche en informatique et en automatique (INRIA), Le Chesnay, France.

Abstract: "As the WWW becomes a major source of information, a lot of interest has arisen, not only for searching for information, but for reusing this information in new pages, or directly within applications. Unfortunately HTML tags do not provide a significant level of structure for identifying and extracting information, since they are mostly used for presentation issues. Moreover the simple link mechanism of the Web does not support the controlled traversal of links to related pages. Particularly promising is the proposal for a new standard, XML, which could bring the power of SGML to the Web while keeping the simplicity of HTML. We present a system and a language that allow reusing of information from various sources, including databases and SGML-like documents, by combining it dynamically to produce a virtual document. The language uses a tree like structure for the representation of the information objects as well as link objects. The paper focuses on the selection and the traversal of XML links to extract information from linked pages. The strength of our approach is to be an SGML-compliant solution, which makes it ready to take full advantage of XML for reusing information from the Web as soon as it is widely used."

"Reusing information contained in electronic documents is becoming a major issue, whether it is proprietary information or information available from the Internet. In this paper we have presented a language for reusing information objects from heterogeneous sources, including SGML, XML, and HTML documents. Our approach is to use a middleware format to integrate the results of queries from the various sources and to map them into a new (virtual) document. We have demonstrated more specifically how to use the language for following XML links and to control the traversal of links using their types and properties. Since our solution is generic and fully SGML compatible we are ready to benefit from the intelligence that XML, or any HTML extension, will bring to the Web for supporting the extraction and reuse of information. The approach is currently being implemented using Java for the interpreter and the database server; our prototypal application is a virtual document prescription for generating activity reports that reuse information from our Intranet, an SQL database of staff and an OO database of documents. Other potential applications are: flexible and manageable generation of large documentation, or configuration of Intranet servers. Further extensions to the language would include control instructions to make the virtual document more adaptable to the actual results of queries, and explicit instructions for building a set of related pages."

Send email to François Paradis to request a paper version or electronic copy. See also: Slides, "Reuse of Linked Documents through Virtual Document Prescriptions." Or: the online presentation abstract, and the full text in PDF; [local archive copy]



[CR: 19961012]

Vercoustre, Anne-Marie; Quint, Vincent; Paoli, Jean; Vatton, Irène. Turning an Authoring Tool Wired to the Web into a Browser. Paper presented at the AUUG '95 & Asia-Pacific WWW '95 Conference, September 18-21, 1995 [Proceedings, 95-104]. INRIA Rocquencourt: [copyright by] AUUG95 and APWWW95, October 1995. Extent: approximately 13 pages, 15 references. ISBN: 1-875781-43-9. Authors' affiliation: Grif/INRIA. WWW: Anne-Marie Vercoustre Home Page.

Abstract: "The success of the WWW came with browsers such as NCSA Mosaic that provide a user-friendly graphic interface for accessing information on the Internet. For displaying documents, all browsers parse the HTML files they receive from servers and interpret the HTML tags they contain. Accessing documents is done through link activation or by typing URL's. These features are at the core of any browser. Symposia is an SGML-based WYSIWYG authoring system that has been the first editor wired to the Web. We study in this paper how Symposia has been turned into a browser by taking advantage of its generic structured approach and its extensibility through the GATE API. More advanced features for browsing are also discussed and suggested."

Available on the Internet: http://www.csu.edu.au/special/conference/apwww95/papers95/avercous/avercous.html; [mirror copy].



[CR: 19960907]

Vercoustre, Anne-Marie; Lindley, Craig A. Information Retrieval and Link Authoring in an SGML-Based Editor. INRIA Report RR-2591. Rocquencourt: INRIA, June 1995. Extent: approximately 14 pages. ISSN: 0249-6399. Authors' affiliation: INRIA, Domaine de Voluceau-Rocquencourt, B.P.105 78153 Le Chesnay Cedex. Email: Anne-Marie.Vercoustre@inria.fr. WWW: Anne-Marie Vercoustre Home Page.

Abstract: "This document describes the integration of Grif, an SGML editor developed at INRIA and marketed by Grif S.A., with Sigma, a text retrieval tool developed by CSIRO. The integration provides Grif with flexible search and dynamic hypertext linking functions, and enhances the Sigma system to support search and display of SGML documents using a structured editor. The integration also clarifies the requirements for more generic facilities for document search, linking, and indexing for the respective systems as modular components of an open systems environment."

Available in Postscript format via the Internet: http://pauillac.inria.fr/~vercous/DOCS/Grif-Sigma.ps; [mirror copy].



Vignaud, Dominique. L'édition structurée des documents: SGML application à l'édition française. Paris: Éditions du Cercle de la Librairie, 1989. ISBN: 2-7654-0420-8.

This volume was prepared to assist French publishers with application of the SGML standard. It supplies a basic DTD, and additional materials are available (including electronic files) for extending the DTD. The book is said to be the first volume in a series L'édition structurée des documents, published by Éditions du Cercle de la Librairie. For availability, contact the Syndicat national de l'édition (SNE) or: Éditions du Cercle de la Librairie, 35 rue Grégoire-de-Tours, 75006 Paris, France. Additional details: see "SGML: application à l'édition française," SGML Users' Group Newsletter 13 (August 1989) 9; Yuri Rubinsky's brief review, "Can Imaginative Objects Have Intentions?" <TAG> 10 (July 1989) 11; or "French Book DTD Available," <TAG> 9 (March/April 1989) 15. The book is similar in purpose to the American (EPSIG/AAP) volume "Standard for Electronic Manuscript Preparation and Markup" published by NISO, and to the British volumes written by Joan Smith: Smith and Smith. Whereas the EPSIG/AAP standard for electronic publishing defined some 220 tags, Vignaud's DTD deliberately defines only 60 tags.



[CR: 19971227 MD: 19971229]

Vijghen, Philippe. "Experience of EDI for Documents: The Role of SGML." Pages 213-218 in SGML/XML '97 Conference Proceedings. SGML/XML '97. "SGML is Alive, Growing, Evolving!" The Washington Sheraton Hotel, Washington, D.C., USA. December 7 - 12, 1997. Sponsored by the Graphic Communications Association (GCA) and Co-sponsored by SGML Open. Conference Chairs: Tommie Usdin (Chair, Mulberry Technologies), Debbie Lapeyre (Co-Chair, Mulberry Technologies); Michael Sperberg-McQueen (Co-Chair, University of Illinois). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 691 pages, CDROM; print volume contains author and title indexes, keyword and acronym lists. Author's affiliation: [Philippe Vijghen]: Project Manager, SGML Technologies Group, ACSE sa/nv, Boulevard Général Wahis, 29, B-1030 Brussels, Belgium; Phone: +32 (2) 705.70.21; FAX: +32 (2) 705.81.01; Email: phv@acse.be WWW: http://www.sgmltech.com.

Abstract: "This paper describes the use of SGML in the EDIDOC project for the European Space Agency. The project involved the creation of a flexible framework for exchanging different types of documents, being a gateway for workflow, document conversions, security, and communication. It is used for calls for tenders, working documents, and press releases, and also covers WWW publication.

"SGML was used for many aspects including attaching the different envelopes of the messages exchanged and as a technology for defining workflow scenarios. Benefits and challenges of using SGML or XML at different levels are highlighted, both technically and organizationally."

"[Conclusion:] This paper has demonstrated where SGML is invaluable for system development and integration, based on the EDIDOC experience. XML can be considered as a candidate of choice for structuring data in EDI applications, when no EDIFACT message fulfills the needs. It is also very useful as a way to structure interprocess communications when integrating distributed applications. Full-blown SGML toolkits are a must for data processing. They are particularly useful for implementing data convertors, just by using standard features such as OMITTAG, SHORTREF, LINK, and CONCUR. Finally, SGML and related development tools offer a nice way for addressing, at the same time, the definition, documentation, and implementation of workflow scenarios."

This paper was delivered as part of the "User" track in the SGML/XML '97 Conference.

A version of the document is available online in HTML format: "Experience of EDI for Documents: The Role of SGML"; [local archive copy]. See also the white paper on EDIDOC from the SGML Technologies Group: "Electronic Data Interchange for Documents".

Note: The SGML/XML '97 conference proceedings volume is available from the Graphic Communications Association, 100 Daingerfield Road, Alexandria, VA 22314-2888; Tel: +1 (703) 519-8160; FAX: +1 (703) 548-2867. Complete information about the conference (e.g., program listing, tutorials, show guide, DTDs, conference reports) is provided in the dedicated conference section of the SGML/XML Web Page and via the GCA Web server. The electronic proceedings on CDROM was produced courtesy of Jouve Data Management (Jouve PubUser).



[CR: 19950716]

Villeval, Jean. "SGML Users' Group in Switzerland." SGML Users' Group Newsletter 28 (August 1994) 19. ISSN: 0952-8008. Author's affiliation: Orga Consult.

A report on the activities of the Swiss group (SGML Users' Group Switzerland, SUGS) including an April conference "Hypertext and the Future of Text." The SUGS Newsletter is now edited in SGML and published with EBT's DynaText. Contact: (1) Jean Villeval, Tel/FAX: 41-1-241-83-68. Email: villeval@orga-consult.ch, or (2) Pierre Deruaz, Tel +41 31 308 2626, fax 308 2627.



[CR: 19990904]

Vint, Danny R. SGML at Work. [A Start-to-Finish Real-World Guide to Implementing SGML/XML Systems and Strategies.] Prentice Hall Professional Technical Reference Series. Upper Saddle River, NJ: Prentice Hall PTR, [July] 1998. Extent: xvi + 848 pages, CD-ROM and SGML Reference Card. ISBN: 0-13-636572-8. Author's affiliation: Lexis Law Publishing; Email: Dan.Vint@lexis-nexis.com, or dvint@slip.net [Charlottesville, VA].

Summary: "SGML at Work provides a start-to-finish real-world guide to implementing SGML/XML systems and strategies. SGML at Work covers the nitty-gritty of building an SGML publishing system: developing DTDs as efficiently as possible; introducing a quality control cycle that works; using and adapting conversion tools; and more. You'll find extensive code and software configurations you can use today. The book answers key questions like: 1) Where do I start?; 2) Should I implement my own DTDs or use DTDs that already exist?; 3) How can I convert my legacy documents?; 4) Which kinds of documents should I begin with?; 5) How do I build an effective SGML-based document management system?; 6) Which tools are available, and how do I make them work together? The CDROM contains a library of commercial SGML trialware, and configurations to support the immediate use of Corel Ventura Publisher 7 and WordPerfect 8; Grif; InContext SGML editors; Arbortext ADEPT*Editor and Document*Architect; the OmniMark programming language for document conversion and manipulation; the Texcel Information Manager; and all these delivery tools: INSO DynaText and DynaWeb; and SoftQuad Panorama Pro. You'll also find a complete working sample implementation that takes a FrameMaker document through DTD development, SGML conversion, cleanup, editing, presentation and document management. This sample is used to develop programs and demonstrate the SGML industry's best commercial shareware and freeware tools." [from the author/publisher]

See the online volume description and Table of Contents; [local archive copy] Also the CTS posting with TOC.



[CR: 19960429]

Vittal, Chiradeep. An Object-Oriented Multimedia Database for a News-On-Demand Application. Technical Report TR 95-06. Edmonton, Alberta, Canada: Department of Computing Science, University of Alberta, June 8 1995. Extent: 67 pages.

Abstract: "Multimedia applications need support from an underlying multimedia storage system to store and retrieve multimedia objects. The presence of spatio-temporal and composition relationships between the objects, their large volume and their inherent distribution pose interesting modeling and implementation requirements. These requirements cannot be fully met by conventional means such as file servers and relational databases. The design and implementation of a multimedia database to satisfy these requirements is described in this thesis. The design is targeted to a News-on-Demand application. The features of this work are the use of object-oriented database technology, and the use of document standards to represent multimedia documents.

"News-on-demand is a distributed multimedia application that uses broadband network services to deliver news articles to subscribers in the form of multimedia documents. A type system is developed to model the individual media components (monomedia) of the documents. To capture the composition and spatio-temporal relationships between the monomedia objects in the news articles, the SGML and HyTime document standards are used. This is done by designing an SGML/HyTime document type declaration (DTD) for multimedia news articles. This DTD is mapped to a type system. An annotation scheme ensures efficient storage of the text component of the document. These type systems are implemented on an object-oriented database system and fully satisfy the requirements of the news-on-demand application."

The document is available online in Postscript format: "An Object-Oriented Multimedia Database for a News-On-Demand Application" - ftp://ftp.cs.ualberta.ca/pub/TechReports/TR95-14/TR95-14.ps.Z [mirror copy]



[CR: 19950716]

Vizard, Michael. "Adobe to Offer SGML Translation Tool." Computerworld 27/16 (April 19 1993) 14-.

"Abstract: Adobe Systems Inc. recently announced that it will provide a conversion software package that will translate Adobe file formats into the Standard Generalized Markup Language (SGML) format. Avalanche Development Corporation will provide software to convert documents using Adobe's Portable Document Format (PDF) into an SGML-compliant format. However, Adobe's primary solution for document interchange will remain PDF, which is the cornerstone of Adobe's forthcoming Acrobat products."

As of mid 1995, this effort is still underway [I think], though perhaps not with the help of Avalanche.



Vizard, Michael. "Mobil Refines Specs for Oil Facilities." PC Week 11/26 (July 4, 1994) 33, 40.

Abstract: Mobil Oil Corp spent $7 million to develop an online documentation system that contains all the specifications required for constructing the company's oil and gas refineries throughout the world. Such a documentation system was necessary to ensure that Mobil did not incur unnecessary construction costs from projects that varied from company-imposed specifications. Mobil used ArborText's Adept software for creating its Standard Generalized Markup Language (SGML)-based online system. SGML was initially developed by the DOD and is a useful tool for creating and tracking compound documents. Users can download and annotate individual copies of a specification, but they cannot add online changes to core specifications. Mobil also spent $2 million on hardware costs for the new system including purchasing Unix servers, workstations and a variety of PCs.

Describes the decision of Mobil to spend $7 million on an SGML-based online documentation system. Mobil is using SGML software from ArborText, Inc.



[CR: 19950828]

Vizine-Goetz, Diane; Godby, Jean; Bendig, Mark. "Spectrum: A Web-based Tool for Describing Electronic Resources." Computer Networks and ISDN Systems. [Third International World-Wide Web Conference, Darmstadt, Germany, 10-14 April 1995]. 27/6 (April 1995) 985-1001 (with 13 references). Authors' affiliation: Office of Research, Online Computer Library Center [OCLC], Dublin, OH, USA. Email: vizine@oclc.org; godby@oclc.org; bendig@oclc.org.

"Abstract: Substantial efforts to establish standards for encoding and accessing electronic resources have occurred over the past five years. We have designed a Web-based tool, called Spectrum, to enable individuals without specialized knowledge of library cataloging or markup to create records for describing and accessing networked electronic resources of various types. System users may create descriptions of electronic resources and view them as formatted USMARC bibliographic records, TEI headers and URCs. Because we anticipate continued volatility in the definition of data element standards, the Spectrum system is designed to allow maximum flexibility in the design of the input formats." [Keywords: USMARC format, Text Encoding Initiative (TEI) header, Uniform Resource Citation (URC), CGI script, SGML, bibliographic data, library cataloging, text retrieval]

The document is available online as part of the WWW '95 conference proceedings; [mirror copy, partial only]



Vizine-Goetz, Diane. "Cataloging Productivity Tools. I. Spectrum: A Web-Tool for Describing Internet Resources." In Part 1: OCLC Project Reports, Annual Review of OCLC Research, 1994. Dublin, OH: OCLC Online Computer Library Center, 1995. approximately 7 pages, 3 references. Author's Affiliation: OCLC, Consulting Research Scientist.

"Abstract: Over the past five years, librarians, humanities computing researchers, and computer scientists have been working to establish standards for encoding and accessing local and networked electronic information resources. These standards are just now being put into practice by their corresponding user communities, and their application provides opportunities for exploring synergies among the various approaches used. The OCLC Office of Research Cataloging Internet Resources project is investigating the relationship among three of these: the Machine-Readable Cataloging (MARC) format used by librarians, the Text Encoding Initiative (TEI) Header developed by humanities computing researchers, and the emerging Uniform Resource Citation (URC) standard for accessing materials on the World Wide Web. One result of our analysis is a prototype Web-based tool called Spectrum that enables individuals without specialized knowledge of library cataloging or markup to create records describing the bibliographic and location elements of networked electronic resources of various types."

A version of this paper will appear in the forthcoming proceedings of the 1994 Chicago World Wide Web Conference to be published as part of a special issue in Elsevier's Scientific Computer Networks and ISDN Systems. The draft version is available online via the OCLC WWW server [mirror copy w/ partial links only].



[CR: 19951113]

Vliet, J. C. [Hans] van (editor). Text Processing and Document Manipulation. Proceedings of the International Conference, University of Nottingham, 14-16 April 1986. The British Computer Society Workshop Series (P. Hammersley, general editor). Cambridge: Cambridge University Press [on behalf of the British Computer Society], 1986. Extent: viii + 277 pages, bibliography [260-275], index. ISBN: 0-521-32592-7. Editor's affiliation: Centrum voor Faculteit Wiskunde et Informatica, Department of Mathematics and Computer Science, Vrije Universiteit, Amsterdam.

The Nottingham conference of 1986, following in the footsteps of the Lausanne and Portland conferences in 1981, and of the Rennes conference in 1983, brought together a number of disciplines to lay solid theoretical foundations for further work on structured documents and interactive editing. A number of papers in this proceedings volume are relevant to SGML.



[CR: 19971216]

Volk, Martin. "Markup of a Test Suite with SGML." Pages 59-76 in Linguistic Databases. [Conference on] Linguistic Databases. Centre for Language and Cognition and Centre for Behavioral and Cognitive Neuroscience, University of Groningen, Groningen, The Netherlands. March 23-24, 1995. Sponsored by the Dutch National Science Foundation (NWO), Royal Dutch Academy of Science (KNAW), et al. Edited by John Nerbonne (Computational Linguistics, and Humanities Computing, University of Groningen). CSLI Lecture Notes, Number 77. Stanford, CA: Center for the Study of Language and Information, 1998. ISBN: 1-57586-093-7 (hardback), 1-57586-092-9 (paper). Author's affiliation: Computational Linguistics, University of Zurich, Zurich, Switzerland; Email: volk@ifi.unizh.ch; WWW: http://www.ifi.unizh.ch/staff/volk.htm.

Summary: The paper discusses the development of a test suite covering the syntactic phenomena of a natural language; the materials may be used for testing NLP software.



Volz, Marc; Aberer, Karl; Böhm, Klemens. A Flexible Approach to Combine IR Semantics and Database Technology and Its Application to Structured Document Handling. GMD-IPSI Arbeitspapier, Nr. 891. Sankt Augustin: GMD-IPSI, January, 1995. 22 pages, 30 references.

"Abstract: In the field of hypermedia-document management DBMSs, which are particularly suited to handle structured information in multi-user environments, complement IR systems providing content-oriented retrieval capabilities. We have integrated the IR system INQUERY with the object-oriented DBMS VODAK. It is shown that combining structural queries with IRS queries leads to non-trivial questions with regard to retrieval semantics and query processing. Further, VODAK supports the management of user-definable typed document structures according to SGML/HyTime. We describe a sample application of the coupling for administering such documents. As document types are arbitrary, the coupling has to provide for high flexibility in the mapping of database objects to IR entities."

Available in Postscript format as P-95-01.ps.Z from the GMD-IPSI FTP server.



[CR: 19970909]

von Hagen, William. SGML for Dummies. For Dummies, Computer Book Series from IDG. Foster City, CA / Chicago, IL / Southlake, TX: IDG Books Worldwide, 1997. Extent: xxiv + 386 pages, CDROM. ISBN: 0-7645-0175-5. Author's affiliation: Get Hip, Inc. Email: wvh@gethip.com.

Summary: "Author Bill von Hagen provides practical, easy-to-understand coverage of topics such as: (1) What SGML is, where it came from, and where it's heading; (2) How SGML can solve otherwise intractable documentation problems; (3) Whether using SGML makes sense for your business or organization; (4) Whether you should create your own SGML document types or try to work with established ones; (5) How to convert your existing documents into SGML efficiently; (6) How to get the best out of both SGML and desktop publishing." [publisher's description]

The CDROM disc contains: (1) A 90-day demo version of Corel WordPerfect; (2) demo version of Digitome's IDM Personal Edition; (3) sample SGML applications from SGML Systems Engineering - SGMLC; (4) James Clark's SP parser for Win 95/NT; (5) a 45-day demo version of the HyBrowse Browser; (6) sample DTDs.

See the book description from the publisher at http://www.dummies.com/, or local archive copy.



[CR: 19961226]

Vooren, Ludo Van. "Conversion to Industry Standard ATA DTDs." Pages 615-618 in SGML '96 Conference Proceedings. Celebrating a Decade of SGML. SGML '96 Conference, Boston, MA, November 18-21, 1996. Sponsored by The Graphic Communications Association (GCA). [Edited by] Conference Co-Chairs: B. Tommie Usdin and Deborah A. Lapeyre. Alexandria, VA: GCA, 1996. Extent: 711 pages. Author's affiliation: Senior Marketing Manager, Jeppesen Sanderson, 55 Inverness Avenue East, Englewood, Colorado 80112, USA; Tel: +1 303-784-4693; FAX: +1 303-784-4113; Email: ludo@jeppesen.com; WWW: http://www.jeppesen.com/.

Abstract: "In this presentation I would like to share my years of experience in SGML conversion, by reviewing several strategies for converting Legacy documents to ATA standard DTD. In particular, I would like to review practical applications of such strategies in Jeppesen's Maintenance Information Services daily operation.

"Among the subjects covered will be input analysis, interchange DTD versus publishing DTD, manual clean-up versus automatic conversion, the 'divide and conquer' approach, CALS table conversion and paper conversion. With about 1,000,000 pages of SGML converted so far, I believe we have faced most of the obstacles in this domain."

See the SGML/XML Web Page main entry for ATA (Air Transport Association) for more information on the ATA DTD and its usage.

Note: The above presentation was part of the "SGML Case Studies" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.



Vooren, Ludo van. "On the Road to SGML." Whitepaper from Avalanche Development Corporation. 1994. approximately 8 pages.

"This document describes a sensible migration to SGML. It shows a way to start implementing the SGML approach by working on your current documents not on your current system. Once your document value is increased, the system can evolve painlessly to take advantage of the new information available. The benefits will increase along the way, and eventually you will be able to operate the ultimate SGML system to fully leverage your information investment." [from the concluding section]

Available in electronic format from the Avalanche/Interleaf server, or here in mirror copy (May 1995).



[CR: 19970621]

Vooren, Ludo van; Severson, Eric C. "SGML Architectural Forms." <TAG> 5/2 (February 1992) 1-3.

"A concept emerging from the HyTime Committee, called 'SGML Architectural Forms,' provides SGML users with a new tool for describing document semantics. The essence of the Architectural Forms idea is that it allows users to extend the attribute set for an element without doing violence to the basic processing, parsing and integrity of the DTD or associated document instances. Extending the attribute set allows users to express and preserve information that would otherwise require use of external files. The attraction of the approach is that it does not require use of new structures and processes; it uses the SGML parser and an extended form of the DTD to convey the desired information. . . for hard-wired SGML applications that work with only one DTD, the SGML architecture approach provides a simple, low-cost way to connect to other SGML applications. The idea behind SGML architectural forms is to directly code the relationship between the SGML elements and target applications semantics in the DTD of the document instance to be converted. . . An immediate application could address an area that has been overlooked by developers: SGML searches across multiple document types. For example, the user might want to find the chapter titles but does not know how these are tagged in the various DTDs. A search engine could look for architectural forms instead of tag names." (extract)
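The cross-DTD search described at the end of this extract can be illustrated with a minimal sketch. It assumes each document type marks its title elements with a common attribute; the attribute name `docform`, the sample documents, and both function names are hypothetical and do not come from the article. In real SGML the attribute would normally be declared with a fixed value in the DTD rather than written in the instance; it appears in the instance here only because Python's standard XML parser does not apply DTD defaults.

```python
import xml.etree.ElementTree as ET

# Hypothetical attribute that names the architectural form of an element.
FORM_ATTR = "docform"

# Two sample documents from different (invented) document types. Their
# title elements use different tag names but share the same form.
DOC_A = (
    '<book><chapter><heading docform="title">Getting Started</heading>'
    "<para>Text.</para></chapter></book>"
)
DOC_B = (
    '<manual><section><sectitle docform="title">Installation</sectitle>'
    "<p>Text.</p></section></manual>"
)


def find_by_form(xml_text, form):
    """Return (tag, text) pairs for elements carrying the given form."""
    root = ET.fromstring(xml_text)
    return [
        (el.tag, (el.text or "").strip())
        for el in root.iter()
        if el.get(FORM_ATTR) == form
    ]


def search_all(documents, form):
    """Search several documents by architectural form, not by tag name."""
    hits = []
    for doc in documents:
        hits.extend(find_by_form(doc, form))
    return hits


if __name__ == "__main__":
    for tag, text in search_all([DOC_A, DOC_B], "title"):
        print(tag, "->", text)
```

The search never names `heading` or `sectitle`; it asks only for the form, which is the point the extract makes about searching across multiple document types.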

See more on architectural forms in the main section. See also now: the full text of the article online, from SGML Associates, Inc.; [mirror copy, text and partial links only].



Vooren, Ludo van. "Implementing SGML: Where Do You Start?" <TAG> 13 (February 1990) 5-7.

This contribution proposes implementing SGML in several stages: Document Analysis, Process Design, Document Type Declaration Writing, Document Preparation. It is published in similar format in SGML Users' Group Newsletter 17 (August 1990) 5-7.



[CR: 19971125]

Vooren, Ludo Van. "XML and Legacy Data Conversion: Introducing 'Consumable Documents'." Page(s) 185-187 in SGML '97 Conference Proceedings. SGML Europe '97. "The Next Decade - Pushing the Envelope." Princesa Sofia Intercontinental, Barcelona, Spain. 11-15 May, 1997. Sponsored by Graphic Communications Association (GCA) and SGML Open. Conference Chair: Pamela L. Gennusa (Director, Database Publishing Systems Ltd). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 342 pages, CDROM. Author's affiliation: Senior Marketing Manager, Jeppesen Sanderson, USA; WWW: http://www.jeppesen.com; Email: ludo@jeppesen.com.

Abstract: "This presentation reviews the advantages of using the Extensible Markup Language (XML) in the context of legacy data conversion. This exciting application of SGML solves numerous conversion problems. By reviewing the advantages of XML in converting legacy data, this presentation shows a never before possible migration strategy towards valid SGML information."

"Converting legacy documents to XML is the most economical way to add intelligence to your documents and make them immediately 'consumable'. It also allows you to implement an SGML system for any future document and to use the 'hybrid' technology to slowly convert the legacy data to SGML."

Note: The electronic conference proceedings in hypertext were produced by Inso Corporation (DynaText) and by High Text (EnLIGHTeN). Information about the SGML Europe '97 Conference may be found in the main database entry.



[CR: 19971125]

Vulpe, Michel. "SGML - Made SIMPLE." Page(s) 127-132 in SGML '97 Conference Proceedings. SGML Europe '97. "The Next Decade - Pushing the Envelope." Princesa Sofia Intercontinental, Barcelona, Spain. 11-15 May, 1997. Sponsored by Graphic Communications Association (GCA) and SGML Open. Conference Chair: Pamela L. Gennusa (Director, Database Publishing Systems Ltd). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 342 pages, CDROM. Author's affiliation: Founder and CEO, Infrastructures for Information (I4I); Email: mvulpe@i4i.org.

Abstract: "In spite of its name, SGML (Standard Generalized Markup Language) is, at its core, a data schema language, not just a markup language.

"Its role as a markup language for text presentation is well understood. As such, it is one of the pillars of the WWW (World Wide Web) phenomenon. To limit SGML to text markup, however, is to do a disservice to its power. Textual presentation schemas, while important, constitute but one domain in which SGML can be applied. SGML can be used to specify schemas for many types of data and even for behaviours.

"SGML has two separate and distinct roles. The first role is for document (that is, text document) interchange. The second, more interesting role, is as a data schema language that supports structural semantics: that is, how do objects relate to other objects. It is this use of SGML that allows it to play a fundamental role in managing complex systems.

"In this paper we will consider: (1) SGML as text encoding technology; (2) SGML as a data schema language; (3) The future of SGML; (4) Attaining the future."

A version of this paper is available online in HTML and PDF formats: http://www.i4i.org/simpl.htm; [local archive copy]. Other papers are available from the I4I Resource Center [to become: www.i4.com].

Note: The electronic conference proceedings in hypertext were produced by Inso Corporation (DynaText) and by High Text (EnLIGHTeN). Information about the SGML Europe '97 Conference may be found in the main database entry.



[CR: 19971107]

Vulpe, Michel. Show Me Which Road I'm On. Paper presented at Museums and the Web: An International Conference, Los Angeles, CA, March 16 - 19, 1997. Toronto, Ontario, Canada: Infrastructures for Information Inc., 1997. Extent: (approximately) 8 pages. Author's affiliation: Founder and CEO, Infrastructures for Information Inc.

Abstract: "The promise of SGML is that if you separate form from content a miracle occurs. Data integrity is preserved across platforms, applications, and soup recipes. The reality is a return to the bad old days of proprietary hardwired solutions. Appreciating what SGML brings to the table requires an understanding both of its origins and of its characteristics. SGML is not what it appears to be at first glance."

"In this paper, we'll look at: (1) where SGML came from; (2) how a business case can (or cannot) be made for SGML; (3) what alternatives exist today; (4) attempts to 'fix' SGML, to make it address today's problems; (5) how SGML can be rescued from obscurity by redefining its role."

Available online in HTML format: http://www.i4i.org/paper1.htm or http://www.archimuse.com/mw97/speak/vulpe.htm; also in PDF; [local archive copy]



[CR: 19971227]

Vulpe, Michel. "Overthrowing the Tyrant." Pages 345-353 in SGML/XML '97 Conference Proceedings. SGML/XML '97. "SGML is Alive, Growing, Evolving!" The Washington Sheraton Hotel, Washington, D.C., USA. December 7 - 12, 1997. Sponsored by the Graphic Communications Association (GCA) and Co-sponsored by SGML Open. Conference Chairs: Tommie Usdin (Chair, Mulberry Technologies), Debbie Lapeyre (Co-Chair, Mulberry Technologies); Michael Sperberg-McQueen (Co-Chair, University of Illinois). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 691 pages, CDROM; print volume contains author and title indexes, keyword and acronym lists. Author's affiliation: [Michel Vulpe]: Chairman and CEO, Infrastructures for Information Inc., 116 Spadina Avenue, Fifth Floor, Toronto, Ontario M5V 2K6 Canada; Phone: +1 416.504.0141; FAX: +1 416.504.1785 Email: mvulpe@i-4-i.com; WWW: http://www.i-4-i.com.

Abstract: "SGML promises to free information from the stranglehold of proprietary application file formats and codes, and to make information reusable, repurposable, and restructurable. However, because of the way in which SGML markup is traditionally implemented, it frequently accomplishes no more than the replacement of the proprietary codes with generic codes that still inhibit reuse, repurposing, and restructuring.

"S4 technology, from Infrastructures for Information, fulfills the promise of SGML by making the information truly free and independent of the markup in a way that traditional SGML cannot.

"In addition, by making the power of SGML available to all sorts of non-document applications, S4 addresses the issues of working with the non-text information objects inside an application, allowing SGML to manage information regardless of the source or the file format used to store it."

This paper was delivered as part of the "Expert" track in the SGML/XML '97 Conference.

Note: The SGML/XML '97 conference proceedings volume is available from the Graphic Communications Association, 100 Daingerfield Road, Alexandria, VA 22314-2888; Tel: +1 (703) 519-8160; FAX: +1 (703) 548-2867. Complete information about the conference (e.g., program listing, tutorials, show guide, DTDs, conference reports) is provided in the dedicated conference section of the SGML/XML Web Page and via the GCA Web server. The electronic proceedings on CDROM was produced courtesy of Jouve Data Management (Jouve PubUser).



[CR: 19950716]

Waldt, Dale C. "Adobe Buys Frame [Editorial]." <TAG> 7/7 (July 1995) 1, 8-9. ISSN: 1067-9197.

The co-founder and publisher of <TAG> interprets Adobe's acquisition of Frame to be an admission that "the [document] structure must be retained throughout the [document creation/generation] process if it is to be delivered in the end result." He refers to the fact that Adobe ceased selling Acrobat "as an SGML tool" once it became apparent that transducing structure from PDF and PostScript is practically impossible. [Reviewer's note: the SGML community has not found it surprising that Adobe is experiencing difficulty coming to grips with "structure" and "information" within documents (evidenced by their problems in supporting ICADD from PDF files), given that Adobe proposed the design of a document interchange language based upon a font standard.]



[CR: 19950716]

Waldt, Dale. "Aircraft Service Manuals on the Web." <TAG> 8/6 (June 1995) 9, 11. ISSN: 1067-9197.

Explains how the Douglas Aircraft Company is using down-translation of SGML documents to create online information for its customers via the Web. HTML is just one of the forms in which the SGML-structured data is delivered.



[CR: 19971227]

Waldt, Dale. "Desktop Publishing and Professional Publishing Systems. Productivity vs. Flexibility - Which Tool is Right for the Job." <TAG> 6/5 (May 1993) 5-7. ISSN: 1067-9197. Author's affiliation: Co-founder and Publisher of <TAG>; SGML Implementor with Thomson Professional Publishing Company.

The author provides reflections on the use/role of SGML in desktop publishing systems, based upon the Seybold Publishing Conference held in April 1993.



Waldt, Dale. "Highlights from Seybold. Electronic Publishing Conference Offers Segments on Pressing Issues." <TAG> 7/4 (April 1994) 1-6, 11. ISSN: 1067-9197.

Waldt offers a full report on Seybold Seminars Boston (March 22-25, 1994) from the perspective of the SGML industry.



[CR: 19950716]

Waldt, Dale. "The Inclusion and Exclusion Confusion, Revisited." <TAG> 6/9 (September 1993) 1-5. ISSN: 1067-9197.

The article contains reworked information from an article that first appeared in <TAG> in 1989. Waldt concludes that great care must be taken when using inclusion and exclusion exceptions.



Waldt, Dale. "Kodak Marketing Technical Support Case Study: A Technical Information Documentation and Distribution System." <TAG> 8/3 (March 1995) 1-4. ISSN: 1067-9197.

This case study represents an example of a small-scale implementation of SGML that was completed on a budget that just about any company can afford. This implementation resulted in reductions in processing time, cost, and complexity, and served as a test of the technology to determine a future course for document processing within the group. Kodak has begun the second phase of this project, including expansion into other document types, delivery formats, and system functionality, and will eventually expand this system so that it can serve as a centralized collection and distribution repository for marketing information.



[CR: 19950716]

Waldt, Dale. "The Power of SGML Databases: Using Enabling Information Management and Publishing Technologies for Maximum Benefit." <TAG> 6/8 (August 1993) 5-10. ISSN: 1067-9197.



[CR: 19971106]

Waldt, Dale. "SGML at Seybold? Can it be True?" <TAG>: The SGML Newsletter 10/10 (October 1997) 4-6. ISSN: 1067-9197. Author's affiliation: Research Institute of America Group.

The article provides a personalized report on the Seybold SF '97 Conference, with extensive commentary on the significance of the support given to XML by Microsoft.



[CR: 19951208]

Waldt, Dale C. "SGML Asia Pacific '95." <TAG>: The SGML Newsletter 8/11 (November 1995) 1-7. ISSN: 1067-9197. Author's affiliation: Director, Data Development, Research Institute of America.

Waldt supplies a very detailed review of the events at the conference, and assesses their industry significance.



[CR: 19950716]

Waldt, Dale C. "SGML '93: SGML is Growing Up." <TAG> 7/1 (January 1994) 1-3. ISSN: 1067-9197.

Waldt reports on SGML '93, held in Boston and attended by 460 people. Highlights and trends are summarized. See further the main entry for this conference (other reports).



Waldt, Dale C. "SGML & Looseleaf Page Information: Managing Concurrent Structures in an SGML Looseleaf Database." <TAG>: The SGML Newsletter 7/2 (February 1994) 8-10. ISSN: 1067-9197.



[CR: 19951220]

Waldt, Dale C. "SGML Conversion Planning: Tipniques and Pratfalls." <TAG> 8/12 (December 1995) 1-8. ISSN: 1067-9197. Author's affiliation: Director, Data Development, Research Institute of America.



[CR: 19961029]

Waldt, Dale. "If SGML is the Chicken, Then Workflow is the Egg." <TAG> 9/9 (September 1996) 1-5. ISSN: 1067-9197. Author's affiliation: Vice President for Product Development, Research Institute of America (Thomson Professional Publishing).

The author summarizes the role played by SGML in improving certain workflow processes.



[CR: 19970620]

Waldt, Dale C. "Why It Took Me Ten Years To Really Understand that SGML is Not a Religion." <TAG>: The SGML Newsletter 10/5 (May 1997) 1-2. ISSN: 1067-9197. Author's affiliation: Vice President for Product Development, Research Institute of America (Thomson Professional Publishing).

The author supplies a ten-year retrospective on SGML via the experience of publishing <TAG>: The SGML Newsletter, concluding that religious hype comes and goes, but the "business case" is what gives SGML staying power.



[CR: 19980719]

Waldt, Dale C. "Why XML Is More Exciting Than SGML: Part II. Or, 'How I Stopped Worrying and Learned to Love XML'." <TAG>: The SGML Newsletter 11/7 (July 1998) 4-6. ISSN: 1067-9197. Author's affiliation: Vice President of Technology, Research Institute of America, Thomson Professional Publishing.

"Summary: This is the second part of a two-part article on my personal opinions as to why XML is sexier than SGML. This article describes a system that utilizes XML for interchange between commercial software over the Web. It was created by a single programmer in our labs in a matter of a few weeks. This prototype was created to illustrate the potential of several tools and is not an indication of any applications, product offerings, or promises from any party mentioned in this article. The prototype was first viewed publicly at the GCA XML conference in Seattle, Washington, in March of 1998."



[CR: 19970207]

Waldt, Dale C. "XML: Is Less More?" <TAG>: The SGML Newsletter 9/12 (December 1996) 11-12. ISSN: 1067-9197. Author's affiliation: Vice President for Product Development, Research Institute of America Group.

The author discusses the relationship between XML and SGML, evaluating the political and technical issues that have surfaced with the XML design effort. He raises the issue of 'topless' XML documents (XML documents without DTDs) and their relation to validation.



[CR: 19980719]

Waldt, Dale C. "XML is More Exciting Than SGML. Part 1." <TAG>: The SGML Newsletter 11/6 (June 1998) 1-4. ISSN: 1067-9197. Author's affiliation: Vice President of Technology, Research Institute of America, Thomson Professional Publishing.

"The truth is that the really compelling reasons for using SGML did not exist for the mainstream until very recently with the wide acceptance of the Web and the dramatic price decreases and functionality increases for document processing tools. . . If I were starting from scratch to convert our information resources, I would obviously go straight to XML and bypass all the nonessential pieces of SGML that were left out of XML." [Waldt introduces his 2-part article as beginning with 'some highly subjective revisionist observations about XML and why it should prove to be important and more widely adopted than SGML'.]



[CR: 19960206]

Waldt, Dale C.; Travis, Brian E. "SGML '95: Great Expectations." <TAG>: The SGML Newsletter 9/1 (January 1996) 1-6. ISSN: 1067-9197. Authors' affiliation: Brian Travis is President of Information Architects, Inc, and Managing Editor of <TAG>; Dale Waldt is the co-founder and publisher of <TAG>, and Data Development Manager with the Research Institute of America.

The authors provide a detailed report on the highlights of SGML '95, the annual conference sponsored by the GCA. Conference attendance was about 800, and 1,250 people attended the conference or trade show.

Additional reporting for this article was supplied by Sharon Adler, Tanya Bosse, Rick Egdorf, Eric Freese, Todd Thalimer, Deni Travis, Simon Wickes, and Ron Wilhelm.



[CR: 19971202]

Walker, Derek. "Taking Snapshots of the Web with a TEI Camera." Pages 137-140 in TEI 10: A Conference in Celebration of the Tenth Anniversary of the Text Encoding Initiative. Abstracts. TEI 10: Text Encoding Initiative, Tenth Anniversary User Conference, Brown University, Providence, Rhode Island. November 14-16, 1997. Sponsored by Martin Hensel Corporation, Kluwer Academic Publishers, and MIT Press. Hosted by Brown University Library, and Computing and Information Services. Providence, RI: Brown University, 1997. Author's affiliation: Queen's University, Kingston, Ontario; Email: walker@qucis.queensu.ca.

Summary: "The goal of the Snapshot project is to observe and capture the linguistic development of a technological culture over the short period of several months. The culture in question is the World Wide Web and the linguistic variations in this culture are obtained by taking snapshots, pseudo-random samples of web documents. The documents are captured at regular intervals and are added to an open corpus of previously retrieved Web documents. The corpus then serves as the raw material for qualitative and quantitative linguistic analysis of the arguably unique nature of this form of electronic communication. In order to facilitate this knowledge base, documents must be both encoded in a standardized way, and amenable to data retrieval by the researcher. The TEI serves as the encoding standard for this document system. The details of the encoding method are introduced here as a novel method for storing documents in an open and growing corpus. Some of the problems inherent with automatic document retrieval and encoding are also explored."

The extended abstract for the document is available online: http://www.stg.brown.edu/webs/tei10/tei10.papers/walker.html; [local archive copy]. See the main database entry for additional information about the conference, or the Brown University web site.



Walker, Janet H. "Supporting Document Development with Concordia." IEEE Computer 21/1 (January 1988) 48-59. ISSN: 0018-9162.

Abstract: A development environment has been designed and implemented for technical writers. This environment, called Concordia, is an extension of Genera, the software development environment provided on Symbolics computers. It applies object-oriented techniques to creating, publishing, and maintaining complex documentation. The discussion covers the goals and design of Concordia, creating and editing documents, viewing and reviewing documents, and production. The Concordia approach is evaluated.



Walker, Janet H.; Bryan, Richard L. "An editor for structured technical documents." Pages 145-150 (with 6 references) in PROTEXT IV. Proceedings of the Fourth International Conference on Text Processing Systems. International Conference on Text Processing Systems, Boston, MA, USA 20-22 October 1987. Sponsored by INCA - Institute for Numerical Computation and Analysis. Edited by John J. H. Miller. Dun Laoghaire, Ireland: Boole Press, Ltd., 1987. vii + 153 pages. ISBN: 0-906783-80-1 (hardback); 0-906783-79-8 (paperback). Authors' affiliation: Symbolics Inc., 11 Cambridge Center, Cambridge, MA, USA 02142.

Abstract: The authors describe an editor for technical writers who create very large, complex document sets with long development cycles. It is part of a document support environment emphasizing development instead of final production. The system has two novel aspects, the document structure and the editing paradigm. Documents are directed graph structures composed of semantic building blocks instead of linear streams of text. The editor that maintains the modular structure of these documents provides semblance editing, combining concepts from "what you see is what you get" (WYSIWYG) editing with a full generic markup language.

The paper describes the text editing component of Concordia, a system for supporting document development.



[CR: 19971107]

Walsh, Norman. "A Guide to XML." Pages 97-107 in XML: Principles, Tools, and Techniques. Guest Edited by Dan Connolly. World Wide Web Journal [edited by Rohit Khare] Volume 2, Issue 4. Sebastopol, CA: O'Reilly & Associates, Fall 1997. Extent: xxii + 248 pages. ISBN: 1-56592-349-9. ISSN: 1085-2301. Author's affiliation: ArborText, Inc.

Abstract: "This article provides a technical introduction to XML with an eye towards guiding the reader to appropriate sections of the XML specification when greater technical detail is desired. This introduction is geared towards a reader with some HTML or SGML experience, although that experience is not absolutely necessary. The XML Link and XML Style specifications are also briefly outlined."

A version of this document is available online in HTML format: http://www.berkshire.net/~norm/articles/xml/, or alternately http://www.arbortext.com/nwalsh.html; [local archive copy].



[CR: 19990616]

Walsh, Norman; Muellner, Leonard. DocBook: The Definitive Guide. Sebastopol, California: O'Reilly & Associates, [forthcoming] August 1999. Extent: 704 pages [ca.]. ISBN: 1-56592-580-7. Authors' affiliation: [Walsh:] Arbortext, Inc.

Summary: "DocBook is a Document Type Definition (DTD) for use with XML (the Extensible Markup Language) and SGML (the Standard Generalized Markup Language). DocBook lets authors in technical groups exchange and reuse technical information. This book contains an introduction to SGML, XML, and the DocBook DTD, plus the complete reference information for DocBook."

See the official home page for DocBook: The Definitive Guide. On DocBook, see: the DocBook Home Page.



[CR: 19961112]

Walter, Mark. "Applied Physics Letters Online: A Case Study in Online Journal Publishing [Physics Society Takes Its Journal Online]." Seybold Report on Publishing Systems 25/8 (December 31, 1995) 12-21. ISSN: 0736-7260.

[Provisional note: this is a case study of major importance, and deserves widespread reading. Together with cooperation from OCLC (e.g., Guidon software), the American Institute of Physics (AIP) is making an increasing number of publications available online using SGML technologies.] See the AIP WWW server for some details on the results of the project, including an article by Tim Ingoldsby, Director, New Product Development, AIP [Reprinted from Computers in Physics, Vol.8, No. 4, pp. 398-401, July/August 1994; (mirror copy, text only)]. See the online article via the Seybold WWW server [see the section "Fuji Photo Film U.S.A." - the article beginning is slightly mis-tagged, November 12, 1996].



[CR: 19970726]

[Walter, Mark (SRIP editor)]. "Comments from our Readers. On XML." Seybold Report on Internet Publishing 1/6 (February 1997) 2. ISSN: 1090-4808. Author's affiliation: Seybold Publications.

Two letters from readers (Bob de Jeu, Kluwer Academic Publishers; Judith Riddell Messimer, Asta Productions) are published on the topic of XML (Extensible Markup Language). These letters are in response to two previous articles in Seybold Report on Internet Publishing -- about and from Netscape, on XML.



Walter, Mark. "Delivery Wars: Silicon Graphics and Novell Side with SGML." The Seybold Report on Publishing Systems 22/1 (September 7, 1992) [1,] 3-8. ISSN: 0736-7260.

"Silicon Graphics (SGI) and Novell recently gave the Standard Generalized Markup Language (SGML) a tremendous boost, revealing strategic plans to use SGML for delivering technical reference information in electronic form to their customers. The adoption of SGML by two leading general computer suppliers reinforces the embracing of SGML already under way among leading tech-doc and publishing-system suppliers and signals a change among computer vendors in their approach to electronic delivery." [from page 3]. The article reviews the suppliers' decision to use Electronic Book Technologies' (EBT) DynaText SGML hypertext system as a delivery vehicle for online documentation and overviews the DynaText product. SGI's viewer "Iris Insight" is an enhanced and customized version of EBT's DynaText 1.5 SGML viewer. The article includes a brief introduction "SGI, Novell Pick SGML for Documents" (page 1).



[CR: 19970207]

Walter, Mark. "Document Management Embraces the Intranet: Report from Documation '96 [LiveLink Intranet Lights Up Documation]." The Seybold Report on Publishing Systems 25/14 (April 23, 1996) [1], 11-17. ISSN: 0736-7260.

The author supplies a detailed report on the Documation '96 conference. Featured especially is Open Text's LiveLink technology, which delivers information to Web clients from a central library. Also reviewed are Documentum (Accelera), Day & Zimmerman (SGML-based IETM toolkit, and ZDIS document database development system), Digitome (IDM SGML toolkit), and InfoAccess (Guide Professional Publisher). See the article online: http://www.seyboldseminars.com/seybold_report/reports/P2514000.HTM.



[CR: 19980421]

Walter, Mark E. Jr. "Introducing XML.com." The Seybold Report on Internet Publishing 2/8 (April 1998) 2. ISSN: 1090-4808. Author's affiliation: Seybold Publications; Editor, The Seybold Report on Internet Publishing.

The article describes and provides rationale for Seybold Publications' partnership in the creation of a Web site XML.com, together with the publisher O'Reilly & Associates Inc.

"Seybold Publications has been covering SGML since the standard's inception in 1986, and generic markup since we began publishing in 1970. For all those years the primary groups concerned with generic markup have come from professional publishers and their suppliers - groups that are part of the core Seybold constituency. [. . .] That's why we teamed up with O'Reilly to create a new Web site (www.xml.com) devoted to XML education. We plan to continue to cover SGML and XML applications specific to publishing. But we also wanted to integrate that focus with developments coming from the larger business community."



[CR: 19961202]

Walter, Mark E. Jr. "The ISO Font Standard: Looking Beyond the Desktop. [Glyphs Versus Characters and Character Sets]." The Seybold Report on Desktop Publishing 4/6 (February 5, 1990) 24-30. ISSN: 0736-7260.

Walter's article includes discussion of ISO 9541 and its relation to ODA and DSSSL committees. Contacts: General editor of ISO 9541; Ed Smura, Xerox, 701 South Aviation Boulevard, El Segundo, CA 90245; phone (213) 333-4642; or President of AFII, Al Griffee, IBM GPD, 6300 Diagonal Highway, 51E/025D, Boulder, CO 80301; Tel: (303) 924-7670; FAX: 924-5935.



Walter, Mark. "A Look Inside JCALS: Integrated Approach to Technical Manuals." The Seybold Report on Publishing Systems 21/11 (February 29, 1992) [1,] 13-26. ISSN: 0736-7260.

The author describes in detail the goals, scope and technology strategies involved in the JCALS project, including the role played by SGML and HyTime. "The Joint CALS (JCALS) project is one of the most ambitious paper-reduction efforts undertaken. . . it encompasses hundreds of sites, thousands of workstations and millions of pages that will eventually be put into digital form." Included is a brief introduction "JCALS: Databases and System Integration" (page 1).



[CR: 19961209]

Walter, Mark. "Making an Internet Newspaper: SGML, HTML and Signposts From Wyoming. [Internet Publishing: A Newspaper Case Study (Casper Star-Tribune)]." Seybold Report on Publishing Systems 24/10 (January 30, 1995) 1, 3-6. ISSN: 0736-7260.

Mark Walter describes the use of SGML by the Star-Tribune in Casper, Wyoming. A subset of information prepared for the Star-Tribune (in SGML format) is filtered down to HTML and made available on the Internet (WWW). SGML also plays a key role in the newspaper's database management system: archiving of the electronic data makes use of the SGML-tagged text for indexing and retrieval. The author's conclusion: "It has become clear to us that SGML makes it easier to reprocess information for different delivery media. What we found interesting about the Tribune's experience is that it is no doubt a forerunner of many online postings from newspapers. If the UTF lives up to its promise, the new format will make it much easier for papers, both large and small, to create these kinds of products in the future."

Available online: "Making an Internet Newspaper: HTML and Signposts from Wyoming", by Mark Walter, from Seybold Report on Publishing Systems Vol 24, No 10, pages 1, 3-6.



[CR: 19970620]

Walter, Mark. "Netscape Back On Track. Change of Heart [on XML]." Seybold Report on Internet Publishing 1/8 (April 1997) 2. ISSN: 1090-4808. Author's affiliation: Seybold Publications.

A brief note on Netscape Corporation's decision to seriously examine the XML (Extensible Markup Language) specification -- after an original posture of saying XML was unnecessary for its customers.



[CR: 19970718]

Walter, Mark. "Netscape Brings XML and Metadata Together [From the Editor]." Seybold Report on Internet Publishing 1/11 (July 1997) 1, 2. ISSN: 1090-4808. Author's affiliation: Seybold Publications.

Mark Walter (editor, SRIP) reports on the proposed MCF (Metadata Content Framework) standard backed by Netscape Corporation, and discusses the significance of metadata as a core asset of corporations that need to deliver information online. See links to relevant supporting documents in the XML section: Meta Content Framework Using XML.



[CR: 19961202]

Walter, Mark E. Jr. "New CALS products: A Shift Toward Databases and Commercial Applications. [Computer-aided Acquisition and Logistics Support Initiative Launched by the US Department of Defense]." The Seybold Report on Publishing Systems 19/9 (January, 1990) 15-24. ISSN: 0736-7260.

The article discusses the IBM CALS support, and CALS SGML database applications more broadly.



[CR: 19960808]

Walter, Mark. "ODA (Office Document Architecture): What Is It? What Is It Good For?" The Seybold Report on Publishing Systems 19/7 (December 18, 1989) [1,] 3-20. ISSN: 0736-7260.

Abstract: "In this article, we present an overview of the Office Document Architecture (ODA). Completed in draft form in 1988 and published earlier this year, ODA is ISO 8613, an international standard for exchanging compound documents. Products that support the standard are just beginning to appear on the market. This article serves as background for future discussions of ODA products as they appear."

Another abstract, from a bibliography compiled by Steve Gants. "Document Interchange - the ability to exchange documents among different users and systems - is a subject we've covered since this Report's inception 18 years ago. But the need for standard methods of interchange have changed dramatically since 1970, when we were wrestling with how to get word processing files into a typesetting system. There are some for whom document interchange is not a pressing issue (ad typographers and newspapers, for example). But for many vendors and publishers, interchanging documents is becoming more than a nuisance; it's a critical need, now that many of us have composition systems on our desks and document exchange means more than just translation of text. This article examines ODA, a new standard for document interchange that arose out of the office automation industry. Of what use is an Office Document Architecture in the graphic arts market? How important is it, and how does it relate to other standards, such as SGML? This article attempts to clarify these difficult questions from a publishing perspective. Our conclusions, which may surprise many readers, are based on the assumption that publishers today want free exchange of the document they currently produce, as well as ones they produce in the future. The shortcomings of ODA point out why the publishing community needs to participate in the shaping of standards that meet our needs."

The article provides an overview for each of the seven parts of the ODA standard. It also supplies a comparison (page 5) between "ODA and SGML". It includes a brief introduction "ODA From a Publishing Perspective", page 1.



[CR: 19970726]

Walter, Mark. "Online Journals: Print Publishers Move from Pilot to Full Rollout." Seybold Report on Internet Publishing 1/6 (February 1997) 10-20. ISSN: 1090-4808. Author's affiliation: Seybold Publications; Editor, Seybold Report on Internet Publishing.

In this feature article, the author provides a detailed survey of the trends and developments in online journal publishing. See a previous article by Mark Walter on the topic: Applied Physics Letters Online: A Case Study in Online Journal Publishing [Physics Society Takes Its Journal Online], published in Seybold Report on Publishing Systems 25/8 (December 31, 1995) 12-21. This sequel is an important, "must read" article. The author concludes: "And surprisingly, an overwhelming majority of the major journal publishers with whom we spoke have some sort of SGML project going, and most have fully committed to migrate from composition files to SGML as their master archive format for revisable-form text. Although most everyone seems to be making PDF first; significant SGML conversion efforts have been made at other well-known publishers, including the American Medical Association, American Chemical Society, IEEE and others." Some other excerpts are provided below.

[Summary:] "After years of trial programs with select publications, leading STM (science, technical, medical) publishers are in the midst of rolling out online versions of all of their journals. Initially positioned as complements to their print counterparts, online journals are the foundation for a new generation of STM digital libraries, and a potential new source of revenue for their publishers. . .behind the scenes we found a very different picture at many STM publishers. We found strong evidence of the use of SGML, both for bibliographic headers and for full text. We also found tremendous acceptance of PDF, again, as a format for both archiving and delivering print journals."

"Academic Press publishes 175 STM journals. . . It decided several years ago that SGML made sense for its master source files. The generic tagging would enable detailed markup, including marking up citations and cross-references that would eventually become hyperlinks on the Web. At the same time, converting to SGML would put the text into a neutral form that could easily be converted to whatever format AP might need in the future. . .The master archive, called International Digital Electronic Archive Library (IDEAL), is stored in Fujitsu's ODBII database. The header information is stored in SGML and fed out as html. The articles themselves are in PDF format."

"Elsevier is the world's largest scientific publisher, and publisher of roughly 4% of the world's paid printed journals . . .Throughout 1995 and 1996, Elsevier's online journals consisted of bitmapped tiff images of printed pages. The page images were supplemented by an SGML-encoded header, used for field searching, and untagged ASCII files of the text, used for full-text searches. This format had already been tested for five years by Elsevier and nine universities under the TULIP (The University Licensing Program) project."

"In Germany, Springer is in the midst of bringing 160 journals online as part of a service called Link. . .For some time Springer has been preparing the content of its journals for future online use. The header information is being encoded in SGML and stored in an Oracle database. This is linked to the complete text of journal articles, which are being compiled into specific online libraries, each indexed according to its subject and stored in the file system."

"ACM...publishes 17 journals in computer science. Its method has been to offer all of the subscribers to its print journals full access to the online versions. Even though its present method of posting is to make PDF versions of articles composed in its Xyvision and Quark systems, ACM is in the midst of converting all of its journals to an SGML up-front approach, rather than relying on back-end conversion."

"In mid-1996, after launching its first online journal, the American Institute of Physics (AIP) launched Titles in Physics, a service that lists articles and page ranges from AIP journals and those of its member societies. Using its Applied Physics Letters Online as the prototype journal, AIP is converting all of its journals to SGML, and then automatically making HTML versions from those source files."



Walter, Mark. "OSU's Chameleon Architecture: A Grammatical Approach to Translation and DTDs." The Seybold Report on Publishing Systems 20/7 (December 24, 1990) [1,] 17-23. ISSN: 0736-7260.

"This article describes part of an ongoing research project [at Ohio State University] involving language translation and grammar generators. It is fairly technical and will be of most interest to vendors or large users investigating SGML, particularly the writing of document type definitions and the development of text translators to and from SGML. Some of the [ICA] software was demonstrated at SGML '90, as noted in the previous article of this issue."

The article includes a brief introduction "Ohio State's Toolset for DTDs and Translators", page 1. See more on Chameleon sub Mamrak and O'Connell.



Walter, Mark. "SGML '94: Upbeat Wrap-up To an Eventful Year." The Seybold Report on Publishing Systems 24/6 (November 30, 1994) [1,] 3-15. ISSN: 0736-7260.

A report on SGML '94, held November 7-11 at Tyson's Corner, VA. The conference drew more than 700 people, nearly double the attendance at SGML '93. The article opens ("SGML for wire services") with a discussion of the use of SGML by the news agencies (International Press Telecommunications Council [IPTC] and the Newspaper Association of America [NAA]) in the development of an SGML DTD for "Universal Text Format" (UTF). The article also includes (pages 4-5) highlights from the "SGML Year in Review -- 1994", a presentation given by Tommie Usdin and Yuri Rubinsky at SGML '94. Based upon software exhibited at the conference, SGML product reviews are given for ArborText, Texcel, Datalogics, Grif, InContext, Interleaf, Timelux, WordPerfect, XSoft, OpenText, Penta. The article includes a brief introduction "SGML '94: Rising Tide Arrives In Tyson's Corner" (page 1).



[CR: 19960808]

Walter, Mark. "SGML For Journals: Toward Electronic Delivery." Seybold Report on Publishing Systems 21/18 (15 June 1992) 1, 3-13, 16-19.

Abstract: The Standard Generalized Markup Language (SGML) can be applied to commercial publishing. The author presents four case studies that illustrate different approaches to using SGML for journals: The IEEE moved to electronic editing and composition in one fell swoop, using ArborText's The Publisher to create journals for print and to archive in SGML. J.B. Lippincott has copyeditors tag in XyWrite as they copyedit manuscripts that are sent on disk to outside compositors. Generic coding has reduced production costs while facilitating electronic products, such as CD-ROM. The American Chemical Society is conducting several experiments with SGML, including reverse-engineering SGML from Xyvision files and collaborating on a prototype electronic library of online journals. The publisher of Science magazine is about to launch an electronic peer-reviewed journal.

Another abstract, from a bibliography compiled by Steve Gants. "For years, we've known that the Standard Generalized Markup Language (SGML) could be applied to commercial publishing, but there were few real-world examples to cite as proof. In this article, our first feature study of the use of SGML since December 1990, we present ways in which it really is being done. We've collected four case studies that illustrate different approaches to using SGML for journals: The IEEE moved to electronic editing and composition in one fell swoop, using ArborText's Publisher to create journals for print and to archive in SGML. J.B. Lippincott has copyeditors tag in XYWrite as they copyedit manuscripts that are sent on disk to outside compositors. Generic coding has reduced production costs while facilitating electronic products, such as CD-ROM. The American Chemical Society is conducting several experiments with SGML, including reverse-engineering SGML from Xyvision files and collaborating on a prototype electronic library of online journals. The publisher of Science magazine is about to launch an all-electronic peer-reviewed journal. Driven by increasing pressure to deliver electronically, these journal publishers are turning in increasing numbers to SGML as the means to accommodate both print and electronic delivery. Although their implementations are different, these publishers share a common objective: the desire to store their content electronically in a form that will facilitate future requirements, whatever they may be. Although the experiences related here are of particular interest to the journal community, many of the lessons are equally relevant to any publisher debating how to deliver information in electronic form."



[CR: 19961202]

Walter, Mark E. Jr. "TechDoc '89 Vendor Demonstrations. [Trade Show on Technical Documentation]." The Seybold Report on Publishing Systems 19/2 (September 25, 1989) 16-23. ISSN: 0736-7260.

A review of many software and hardware products supporting SGML. Companies included Datalogics Inc.; Intergraph Corp.; Software Exoterica; Kurzweil; U.S. Lynx; Xerox Corporation; Xerox Imaging Systems; Xyvision; Yard Software. "Most significant (for those not in the TechDoc market) were several SGML-related products. There were SGML translators from Software Exoterica and US Lynx. Yard Software, a subsidiary of the Sema Group, announced a PC structured editor to accompany the Sobemap parser; it is being offered to both OEMS and end users. These products demonstrate one of the most positive effects of CALS: the sudden emergence of SGML-related software." Kurzweil integrated a special version of Avalanche Development's Visual Recognition Engine into the K5100 software (for SGML support).



[CR: 19961202]

Walter, Mark E. Jr. "Technical Documentation: Part Four." The Seybold Report on Publishing Systems 17/12 (March 14, 1988) 21-22. ISSN: 0736-7260.

The author provides a summary of the primary editing and publishing systems, with comment on SGML's importance in the group of standards.



Walter, Mark. "Update on the 'Quiet Revolution': Report from SGML '92." The Seybold Report on Publishing Systems 22/7 (December 21, 1992) [1,] 13-17. ISSN: 0736-7260.

A report on the highlights of the SGML '92 conference, attended by some 275 people. The high interest at this conference mirrored the rising interest in SGML evidenced at the earlier Seybold Seminar in San Francisco. Topics treated in the review include DTD design, HyTime, SGML for tables and math, SGML query languages, and new SGML software products. The article includes a brief introduction "SGML: The 'Quiet Revolution' Continues" (page 1).



[CR: 19961209]

Walter, Mark. "W3C Publishes Draft of Simplified SGML. At Last a Sensible Way to Extend HTML." Seybold Report on Internet Publishing 1/4 (December 1996) 3-5. ISSN: [1090-4808]. Author's affiliation: Seybold Publications.

Summary: "On the tenth anniversary of the adoption of SGML as an ISO standard, a band of SGML experts announced they have drafted a simplified subset of the language they hope will be adopted as a standard method of extending HTML to accommodate user-defined tags and attributes. The new language, Extensible Markup Language, or XML, was prepared by a World Wide Web Consortium SGML working group and announced at the GCA SGML '96 conference held last month in Boston. The draft XML specification is the culmination of an intense eleven-week collaboration by a working group of 80 SGML experts, representing vendors, users and consultants. The group was led by Jon Bosak of Sun, who is also working on an online variation of DSSSL, the style sheet language for SGML documents." [extracted]

This article was Seybold Publications' "Story of the Week" ['Up Front'] in December [8-14], 1996. An online HTML version of the feature article by Mark Walter is available from the Seybold WWW server; also available in PDF format. See the database entry on XML for more information on the Extensible Markup Language, and the Seybold entry for Seybold contact addresses.



[CR: 19971227]

Walter, Mark. "W3C Smiles on Multimedia. Proposes Spec[ification] for Synchronizing Time-based Media with Web Pages." Seybold Report on Internet Publishing 2/4 (December 1997) 23. ISSN: 1090-4808. Author's affiliation: Seybold Publications, and Editor of The Seybold Report on Internet Publishing.

The article introduces SRIP readers to the Synchronized Multimedia Integration Language (SMIL), produced by the W3C Working Group on Synchronized Multimedia (SYMM). The language "aims to create a straightforward way to control time-based media in the context of HTML documents." XML and BNF notations are provided for the language. See the W3C server for more information.



[CR: 19961202]

Walter, Mark E. Jr. "What Applications are Good for ODA [Office Document Architecture]?" The Seybold Report on Publishing Systems 19/7 (December 19, 1989) 15-19. ISSN: 0736-7260.



[CR: 19961202]

Walter, Mark E. Jr. "Xyvision's Parlance: Port to an Open Architecture." The Seybold Report on Publishing Systems 19/2 (September 25, 1989) 3-11. ISSN: 0736-7260.

The article discusses XyVision's SGML support, together with SoftQuad's Author/Editor product.



Walter, Mark; Alexander, George A. "Status Report on SGML: Notes from SGML '93." The Seybold Report on Publishing Systems 23/9 (January 3, 1994) [1,] 3-13. ISSN: 0736-7260.

The article surveys the highlights from the SGML '93 Conference in Boston, and reports on products from Microstar, Datalogics, Frame, Texcel, BL/AIS, EBT, Exoterica, Tachyon, and Zandar. "As the brochure advertised, the volume is cranking up on the quiet revolution. The Standard Generalized Markup Language (SGML) has not yet become a mass-market phenomenon, but within some publishing circles, SGML has already become the standard of primary importance, creating a ground swell that both vendors and users in other publishing circles can no longer ignore" [from page 3]. The article includes a brief summary "SGML '93: Progress on All Fronts" (page 1).



[CR: 19960206]

Walter, Mark; Karsh, Arlene. "SGML Crosses Technology-Adoption Chasm into the Bowling Alley [SGML Crosses the Chasm]." Seybold Report on Publishing Systems 25/9 (January 29, 1996) 1, 3-15. ISSN: 0736-7260.

The authors provide in-depth coverage of the SGML '95 Conference, including assessment of the progress of SGML in the preceding year. The review article is one in a continuing series from Seybold Report on Publishing Systems. From Seybold's vantage point, SGML has now "bridged the gap between early, visionary adopters and early, practical implementers." Walter and Karsh interpret the technology trends surrounding SGML in terms of the technology life-cycle model used by Geoffrey Moore in Crossing the Chasm.

The conference report features an appraisal of the emerging and stable SGML technologies in four major sections: (a) Introduction [Including "DSSSL Gets Real"]; (b) SGML Authoring Tools ["Interleaf 6 SGML"; "Stilo Makes its U.S. Debut"; table of major products on page 8]; (c) SGML Document Management tools [Passage, Texcel, EBT, XyVision; table of major products on page 11]; (d) Electronic Delivery using SGML ["EBT Previews DynaText 3.0, Matterhorn"; "Inforium Opens SGML DBMS to Web"].



[CR: 19970726]

Walter, Mark; Ryan, Susan. "Linking and Accessibility on the Agenda at the WWW6 Conference. New Initiatives, Coupled with XML, Extend the Web in Several Directions." Seybold Report on Internet Publishing 1/9 (May 1997) 40. ISSN: 1090-4808. Authors' affiliation: Seybold Publications.

The article is a report on XML highlights at the Sixth International World Wide Web conference, 1997.

[Summary:] "As expected, the W3C unveiled a new spec for hyperlinking at the Sixth International World Wide Web conference held last month in Santa Clara, CA. If adopted, the draft specification, which makes use of the open tagging conventions of the Extensible Markup Language (XML), will bring the Web much closer to the functionality of some of the closed hypertext systems of the past decade. . ."

"The new hyperlinking spec addresses many of the deficiencies in the current HTML linking scheme. . . Emphasizing the theme of the event - making the Web accessible - the W3C also launched an international Web accessibility initiative, which was endorsed by the Yuri Rubinsky Insight Foundation and will be supported by SoftQuad and other vendors in the development of tools to make it easier for people with disabilities to make use of the Web."



[CR: 19961017]

Wang, Xinxin. Tabular Abstraction, Editing, and Formatting. PhD Thesis presented to the Department of Computer Science, University of Waterloo. Waterloo, Ontario, Canada: Department of Computer Science, University of Waterloo, 1996. Extent: xiv + 184 pages.

The thesis is available in Postscript format on the Internet: ftp://ftp.cs.ust.hk/pub/dwood/xinxin/thesis.ps.gz; [mirror copy].



[CR: 19961017]

Wang, Xinxin; Wood, Derick. "An Abstract Model for Tables." TUGboat 14/3 (1993) 231-237 (with 9 references).

Abstract: "We present a tabular model that abstracts a wide range of tables. The model is presentation independent, we abstract the logical structure of tables, rather than their presentational form. The model can be used to guide the design and implementation of tabular editors and formatters. In addition, the model is formatter independent; it can be used to direct the formatting of tables in many typesetting systems."

The document is available (preliminary version) in Postscript format via the Internet: ftp://ftp.cs.ust.hk/pub/dwood/xinxin/tr357.ps.gz; [mirror copy].



Wang, Xinxin; Wood, Derick. An Abstract Model for Tables. Technical Report UWO 357. London, Ontario: University of Western Ontario, Department of Computer Science, May 1, 1993. 9 pages, 9 references. Authors' affiliation: [Wang] Department of Computer Science, University of Waterloo, Waterloo, Ontario N2L 3G1, Canada; email: xwang@watdragon.uwaterloo.ca; [Wood] Department of Computer Science, University of Western Ontario, London, Ontario N6A 5B7, Canada; email: dwood@csd.uwo.ca.

Abstract: "We present a tabular model that abstracts a wide range of tables. The model is presentation independent, we abstract the logical structure of tables, rather than their presentational form. The model can be used to guide the design and implementation of tabular editors and formatters. In addition, the model is formatter independent; it can be used to direct the formatting of tables in many typesetting systems."

The paper was submitted for publication; see now TUGboat [reference above]. A draft version of the document is available via anonymous FTP (ftp.csd.uwo.ca/pub/csd-technical-reports/357).



[CR: 19971206]

Wang, Xinxin; Wood, Derick. "Tabular Formatting Problems." Pages 171-181 (with 6 references) in Principles of Document Processing. Proceedings of the Third International Workshop. PODP '96, Third International Workshop. Palo Alto, California. September 23, 1996. Edited by Charles Nicholas (Department of Computer Science and Electrical Engineering, UMBC, Baltimore, MD) and Derick Wood (Department of Computer Science, HKUST, Clear Water Bay, Kowloon, HONG KONG). Lecture notes in artificial intelligence. Lecture notes in computer science, 1293. Berlin / London: Springer-Verlag, 1997. ISBN: 354063620X. Author's affiliation: [Wang]: NorTel, Ottawa, Ontario, Canada.

Abstract: "Tabular formatting determines the physical dimensions of tables according to size constraints. Many factors contribute to the complexity of the formatting process so we analyze the computational complexity of tabular formatting with respect to different restrictions. We also present an algorithm for tabular formatting that we have implemented in a prototype system. It supports automatic line breaking and size constraints expressed as linear equalities or inequalities. This algorithm determines in polynomial time the physical dimensions for many tables although it takes exponential-time in the worst case. Indeed, we have shown elsewhere that the formatting problem it solves is NP-complete."



[CR: 19961018]

Wang, Xinxin; Wood, Derick. Tabular Formatting Problems. Technical Report HKUST-CS96-28. June 1996. Authors' affiliation: [Wang] NorTel, P.O. Box 3511, Station C, Ottawa, Ontario K1Y 4H7 Canada; [Wood]: Department of Computer Science Hong Kong University of Science and Technology HKUST, Clear Water Bay, Kowloon Hong Kong. Tel. +852.2358.6988; Fax +852.2358.1477; E-Mail dwood@cs.ust.hk. WWW Home Page.

Abstract: "Tabular formatting determines the physical dimensions of tables according to size constraints. Many factors contribute to the complexity of the formatting process, so we analyze the computational complexity of tabular formatting with respect to different restrictions. We also present an algorithm for tabular formatting that we have implemented in a prototype system. It supports automatic line breaking and size constraints expressed as linear equalities or inequalities. This algorithm determines in polynomial time the physical dimensions for many tables although it takes exponential-time in the worst case. Indeed, we have shown elsewhere that the formatting problem it solves is NP-complete."

A preliminary version of the paper is available as Technical Report HKUST-CS96-28, URL: ftp://ftp.cs.ust.hk/pub/techreport/96/tr96-28.ps.gz; [mirror copy]. It will be published in the Proceedings of Principles of Document Processing 1996 (PODP 96). The research was supported by grants from the Natural Sciences and Engineering Research Council of Canada, from the Information Technology Research Centre of Ontario, and from the Research Grants Committee of Hong Kong.



[CR: 19961019]

Wang, Xinxin; Wood, Derick. Tabular Abstraction for Tabular Editing and Formatting. Paper presented at the Third International Conference for Young Computer Scientists, 1993. Waterloo, Ontario: University of Waterloo, 1993. Extent: 11 pages, 9 references. Authors' affiliation: [Wang]: Department of Computer Science, University of Waterloo; [Wood]: Department of Computer Science, University of Western Ontario, London, Ontario.

Abstract: "This paper presents a tabular model that abstracts a wide range of concrete tables. The model is presentation independent, because we abstract the logical structure of tables, rather than their presentational format. It is also representation independent, because we specify tables with well-understood mathematical notions, rather than with special representational structures. The model can be used to guide the design and implementation of tabular editors and formatters."

Key Words and Phrases: abstraction, tabular abstraction, tabular editing, tabular formatting, document processing.

Available in Postscript format: ftp://ftp.cs.ust.hk/pub/dwood/xinxin/table.ps.gz; [mirror copy].



[CR: 19961018]

Wang, Xinxin; Wood, Derick. "XTABLE - A Tabular Editor and Formatter." Pages 167-179 (with 13 references) in EP '96. Proceedings of the Sixth International Conference on Electronic Publishing, Document Manipulation and Typography. [ = Journal Special Issue: Electronic Publishing - Origination, Dissemination and Design (EPODD), June & September 1995, Volume 8, Issues 2-3.] Sixth International Conference on Electronic Publishing, Document Manipulation and Typography, Palo Alto, California. September 24-26, 1996. Sponsored by Adobe Systems Incorporated; School of Information Management and Systems, University of California at Berkeley; Xerox Corporation. [Proceedings Volume] Edited by Allen Brown, Anne Brüggemann-Klein, and An Feng; [Journal] Editors David F. Brailsford and Richard K. Furuta. Chichester/ New York: John Wiley & Sons, 1996. ISSN: 0894-3982. Authors' affiliation: [Wang]: Nortel, Ottawa, Ontario, Canada. Email: xinxin@nortel.ca; [Wood]: Department of Computer Science Hong Kong University of Science and Technology HKUST, Clear Water Bay, Kowloon Hong Kong. Tel. +852.2358.6988; Fax +852.2358.1477; E-Mail dwood@cs.ust.hk. WWW Home Page.

Abstract: "XTABLE is a prototype interactive tabular editor and formatter for the design of high-quality tables and for the exploration of tabular data from different viewpoints. It abstracts the multidimensional logical structure of a table and provides a mechanism to map an abstract table into different two-dimensional presentations. We present XTABLE from four aspects: abstract model, presentational model, system structure, and user interface. We also discuss the merits and limitations of XTABLE."

A preliminary version of the document is available as Technical Report HKUST-CS96-29; URL: ftp://ftp.cs.ust.hk/pub/techreport/96/tr96-29.ps.gz; [mirror copy]. For other conference information, see the main conference entry for EP '96, or the brief history of the conference as sixth in a series since 1986. See the volume main bibliographic entry for a linked list of other EP '96 titles relevant to SGML and structured documents.



Warmer, Jos; Van Vliet, Hans. "Processing SGML Documents." Electronic Publishing: Origination, Dissemination and Design (EPOdd) 4/1 (March 1991) 3-26. ISSN: 0894-3982. Authors' affiliation: [Warmer] PTT Research, Dr Neher Laboratories, Leidschendam, Netherlands; [Van Vliet] Faculteit Wiskunde en Informatica, Department of Mathematics and Computer Science, Vrije Universiteit, Amsterdam.

Abstract: SGML (Standard Generalized Markup Language) is an ISO standard that specifies a language for document representation. The main idea behind SGML is to strictly separate the structure and contents of a document from the processing of that document. This results in application-independent and thus reusable documents. To gain the full benefit of this approach, tools are needed to support a wide range of applications. The ISO Standard itself does not define how to specify the processing of SGML documents. Many existing SGML systems allow for a simple translation of an SGML document, which exhibits a 1-1 correspondence between elements in the SGML document and its translation. For many applications this does not suffice. In other systems, the processing can be expressed in a special-purpose programming language. In this paper the various approaches to processing SGML documents are assessed. We also discuss a novel approach, taken in the Amsterdam SGML Parser. In this approach, processing actions are embedded in the grammar rules that specify the document structure, much like processing actions are embedded in grammars of programming languages that are input to a parser generator. The Appendix contains an extended example of the use of this approach. [check]

Dates: Received 10-January-1990, revised 18-October-1990.



Warmer, Jos; Egmond, Sylvia van. "The Implementation of the Amsterdam SGML Parser." Electronic Publishing: Origination, Dissemination and Design (EPOdd) 2/2 (July 1989) 65-90. ISSN: 0894-3982.

Abstract: The Standard Generalized Markup Language (SGML) is an ISO Standard that specifies a language for document representation. This paper gives a short introduction to SGML and describes the (Vrije Universiteit) Amsterdam SGML Parser and the problems we encountered in implementing the Standard. These problems include the interpretation of the Standard in places where it is ambiguous and the technical problems in parsing SGML documents. Note: the "Amsterdam parser" is available electronically via Internet anonymous-FTP.



[CR: 19970726]

Warnock, John E. [interviewee]. "<Q> & <A>: Dr. John E. Warnock [Interviewed by <TAG>]." <TAG>: The SGML Newsletter 10/7 (July 1997) 1, 7-8. ISSN: 1067-9197. Author's affiliation: Adobe Systems, Inc.

The CEO of Adobe Systems, Inc. answers questions about the Adobe products which support SGML, might support XML, and how these relate to PDF and the Internet.



Warren, P. T. "SGML and Style Sheets: the Implications for Electronic Document Preparation." University Computing 9/2 (June 1987) 81-86. ISSN: 0265-4385. Author's affiliation: Leicester University, England.

Abstract: Standards have been a long time coming in the field of text processing and the recent publication of the standard generalized markup language starter set has attracted some interest. This is a generic mark-up system for the structural, as opposed to the presentational, features of documents. It can then be implemented on a variety of output devices according to the facilities available. Style sheets help enforce uniformity of style throughout a document, and across documents from different authors, by allowing the author to write without attention to formatting. The paper shows how the style sheet feature of a proprietary word processor may be configured to simulate most of the features of the SGML starter set.



Watson, Bradley C. "Converting ACM Authors' Articles to SGML." In Part 1: OCLC Project Reports, Annual Review of OCLC Research, 1994. Dublin, OH: OCLC Online Computer Library Center, 1995. approximately 5 pages. OCLC, Research Scientist.

"Abstract: The Association for Computing Machinery (ACM) has contracted with OCLC to provide an end-to-end electronic publishing system for their journal publications. This report focuses on the article conversion component, which converts accepted articles from the original format used by the author to Standard Generalized Markup Language (SGML), based on ACM's SGML Document Type Definition (DTD). The conversion component can work with documents in several word-processing and text-editing formats, including WordPerfect, MS Word, Framemaker, and LaTex, that originate on a variety of platforms, including DOS/Windows, Macintosh, OS/2, and Unix. The key transformations in the process are: (1) from the native format to Rich Text Format (RTF), and (2) from RTF to SGML. Exoterica Corporation's OMNIMARK programming language is used for the second step, while each word processor creates the RTF files."

Available online via the OCLC WWW server [or in mirror copy, text only].



Watson, Bradley C.; Davis, Robert J. "ODA and SGML: An Assessment of Co-existence Possibilities." Computer Standards and Interfaces 11 (1990-1991) 169-176. (8) references. ISSN: 0920-5489. Authors' affiliation: Online Computer Library Center [OCLC], Dublin, Ohio.



[CR: 19960808]

"Web Consortium Proposes Style Sheets." Seybold Report on Desktop Publishing 10/7 (March 25, 1996) 12.

"In a move that could help restore some of the separation of form and content in Web documents, the World Wide Web Consortium (W3C) has published a draft specification for attaching vendor-neutral style sheets to HTML documents. Style sheets will enable publishers to attach formatting hints to their documents. All of the major Web browser suppliers have endorsed the W3C style-sheet effort."

"The specification, called Cascading Style Sheets (CSS), is rather basic for those accustomed to working with today's full-featured desktop publishing programs. Nevertheless, it represents an important first step toward interchange of robust style sheets, an objective that has proved elusive in the SGML community. (An ISO committee worked for years to develop the DSSSL standard, but vendor support has been slow to materialize.)" [extracted]



[CR: 19960816]

Weibel, Stuart. "The Changing Landscape of Network Resource Description." Library Hi Tech 14/1 (1996) 7-10 (with 1 reference). Author's affiliation: OCLC.

Abstract: "The author discusses the status of the Dublin Core DTD, HTTP, HTML (Hypertext Markup Language) and URNs (Uniform Resource Names). He considers the PICS (Platform for Internet Content Selection) industry consortium effort directed at providing technical means to support content selection on the World Wide Web."

See other SGML publications of Stuart Weibel referenced in this collection and in the author's home pages.



Weibel, Stuart. "The CORE Project: Technical Shakedown Phase and Preliminary User Studies." OCLC Systems and Services 10/2-3 (Summer-Fall 1994) 99-102. Author affiliation: OCLC, Dublin, Ohio, USA.

Abstract: "The CORE project is an electronic library prototype that provides networked access to the full text and graphics content of American Chemical Society journals and associated Chemical Abstracts Service indexing since 1980 (some 250 journal years of data). The database is coded in SGML ("Standard Generalized Markup Language", translated from original typography codes) which captures the structural richness of the original document and provides flexibility for indexing, searching, and display. The prototype provides a full-scale laboratory environment in which to explore issues of database structure, user interface capabilities, and information retrieval questions on a large, real-world scholarly electronic journal database. The complete database, representing more than 600000 pages of full text and graphics, will be the largest electronic corpus of its kind. Scheduled for availability at Cornell in late 1993, this database will be available for use by the Cornell Chemistry Department faculty and students on a local area network (although the architecture of the CORE system is extensible to wide area networks as well)."



Weibel, Stuart L. "The Design and Implementation of XSCEPTER, an X-Windows Graphical User Interface to the CORE Project." In Part 2: External and Collaborative Research, pages 40-45, Annual Review of OCLC Research, 1994. Dublin, OH: OCLC Online Computer Library Center, 1995. approximately 14 pages; 3 references, 8 figures. Author's affiliation: OCLC, Consulting Research Scientist.

"Abstract: The CORE project is an electronic library prototype that provides networked access to the full text and graphics content of American Chemical Society journals and associated Chemical Abstracts Service indexing since 1991. This project provides a full scale laboratory environment in which to explore issues of database structure, user interface capabilities, and information retrieval questions on a large, real-world scholarly electronic journal database. The magnitude of the CORE project, along with the complexities of searching and navigating large full-text collections require novel capabilities in user interface design. This report discusses key design issues and the capabilities of OCLC's XSCEPTER interface to the CORE database. It describes strategies for providing cross-platform interoperability, searching and browsing capabilities, and the formatting and display of complex SGML data."

Available in HTML format on the OCLC WWW Server.



[CR: 19970817]

Weibel, Stuart. "In Memoriam: A Tribute to Yuri Rubinsky, August 2, 1952 -- January 21, 1996." Page 583 in Structured Information/Standards for Document Architectures. Edited by Elisabeth Logan and Marvin Pollard. = Journal of the American Society for Information Science, Special Issue. Volume 48, Number 7 (July 1997). New York: John Wiley & Sons Inc., 1997. ISSN: 0002-8231. Author's affiliation: Senior Research Scientist, OCLC Office of Research, 6565 Frantz Road, Dublin, OH 43017-3395; Email: weibel@oclc.com; WWW: http://purl.oclc.org/net/weibel/.

A related version of Weibel's tribute to Yuri Rubinsky is available online. See also the larger collection of tributes to Yuri Rubinsky in the SGML/XML Web Page, and the Yuri Rubinsky Insight Foundation, "dedicated to commemorating the genius of the late Yuri Rubinsky by bringing together workers from a broad spectrum of disciplines to stimulate research and development of technologies which will enhance access to information of all kinds."

See the main document entry for the complete list of articles and contributors, as well as other bibliographic information.



[CR: 1995]

Weibel, Stuart. "Metadata: The Foundations of Resource Description." D-Lib Magazine (July 1995). Author's affiliation: Office of Research, OCLC Online Computer Library Center, Inc. Email: weibel@oclc.org.

Abstract: "This paper is an abbreviated version of the Summary Report of the OCLC/NCSA Metadata Workshop. It sets forth a proposal for the content of a simple resource description record (the Dublin Core Metadata Element Set) and outlines a series of further steps to advance the standards for the description of networked information resources."

Available online in HTML format: http://www.cnri.reston.va.us/home/dlib/July95/07weibel.html [mirror copy, text only]. See also the author's publication (with Lorcan Dempsey) "The Warwick Metadata Workshop: A framework for the deployment of resource description," D-Lib Magazine (July/August, 1996) (http://www.dlib.org/dlib/july96/07weibel.html).



[CR: 19950716]

Weibel, Stuart L. "Project ADAPT: Automated Document Architecture Processing and Tagging." EPSIG News 2/3 (September 1989) 1-2. ISSN: 1042-3737.

The article describes work done by OCLC based upon funding from AITRC. OCLC is attempting to automate the process of character capture (OCR) and document structuring for retrieval, interchange, display and archiving. SGML is used with OCLC's experimental Graph-Text system to structure documents for these purposes.



[CR: 19950716]

Weibel, Stuart L. "Project ADAPT Studies Electronic Conversion of Documents." OCLC Newsletter 178 (March/April 1989) 10-11.



Weibel, Stuart L. "Scholarly Publishing on the World Wide Web." In Part 1: OCLC Project Reports, Annual Review of OCLC Research, 1994. Dublin, OH: OCLC Online Computer Library Center, 1995. approximately 11 pages, 6 references. OCLC, Consulting Research Scientist.

"Abstract: The explosive growth of the World Wide Web (WWW) is due in part to the ease with which information can be made available to Web users. The simplicity of HTML and HTTP servers lowers the barriers to network publishing."

"Publishers are increasingly turning to Standard Generalized Markup Language (SGML) as the lingua franca for electronic representation of their products. SGML allows the representation of the logical structure of a document and is sufficiently flexible to support arbitrary rendering models, either paper-based or electronic. HTML (HyperText Markup Language), a simple application of SGML-like markup, is the standard method for expressing document structure in the WWW (Berners-Lee, 1993). Its simplicity has contributed to its popularity and made Web publishing more accessible, but that same simplicity makes it difficult to express the full richness of conventionally published scholarly documents..." [extracted]

Available via the Internet on the OCLC WWW server [or in mirror copy, text only]. See also immediately below.



[CR: 19950925]

Weibel, Stuart L. "The World Wide Web and Emerging Internet Resource Discovery Standards for Scholarly Literature." Library Trends 43/4 (Spring, 1995) 627-644 (with 21 references). Author's affiliation: OCLC Office of Research, Dublin, Ohio.

"Abstract: The World Wide Web (WWW) has become an important medium for the dissemination of scholarly information. This article discusses the technology of the Web and why it is likely to have a lasting impact on the dissemination of published scholarship. The role of the display and indexing of structured text is discussed, particularly the relationship of HyperText Markup Language (HTML) and Standard Generalized Markup Language (SGML), as well as problems associated with matching the needs of session based document retrieval and the "stateless" architecture of the Web. The relationship of existing bibliographic description standards to emerging standards for the description of networked information resources is described."



Weibel, Stuart; Godby, Jean; Miller, Eric; Daniel, Ron. OCLC/NCSA Metadata Workshop Report. OCLC Conference Report. Dublin, OH: Office of Research, OCLC Online Computer Library Center, Inc., [March], 1995. Extent: approximately 30 pages; includes DTD for "the Dublin Core". Authors' affiliation: [Weibel, Godby, Miller] Office of Research, OCLC Online Computer Library Center, Inc.; [Daniel] Advanced Computing Lab, Los Alamos National Laboratory.

"Executive Summary: The March 1995 Metadata Workshop, sponsored by the Online Computer Library Center (OCLC) and the National Center for Supercomputing Applications (NCSA), convened 52 selected researchers and professionals from librarianship, computer science, [SGML] text encoding, and related areas, to advance the state of the art in the development of resource description (or metadata) records for networked electronic information objects."

Available online from the OCLC WWW server: Workshop Report, by Stuart Weibel [mirror copy of report, text only, July 1995].



Weibel, Stuart; Miller, Eric; Godby, Jean; LeVan, Ralph. An Architecture for Scholarly Publishing on the World Wide Web. Dublin, OH: Office of Research, OCLC Online Computer Library Center, Inc., 1995 [1994?]. Extent: approximately 12 pages; 5 references. Authors' affiliation: OCLC Office of Research.

"The explosive growth of the World Wide Web (WWW) is due in part to the ease with which information can be made available to Web users. The simplicity of HTML and HTTP servers lowers the barriers to network publishing.

The high-quality rendering of HTML in WWW browsers such as Mosaic raises the aesthetic appeal of information and makes it more useful by virtue of enhanced readability. But the simplicity that makes WWW technology so appealing also makes it difficult to represent the complex markup and typography necessary for scholarly publishing. The need for extensive character sets and more effective interface facilities for inter- and intra-document navigation stretch the limits of the current standards that underlie the Web and its clients. In addition, the stateless nature of WWW client-server interactions presents certain challenges to the effective implementation of search and retrieval functionality so important to effective document retrieval systems.

OCLC distributes several scholarly journals under its Electronic Journals Online program, acting, in effect, as an `electronic printer' for scholarly publishers. As part of this effort, OCLC is prototyping a WWW-accessible version of these journals.

This presentation will describe the problems encountered, detail some of the short-term solutions, and highlight changes to existing standards that will enhance the use of the Web for scholarly electronic publishing."

Available on the Internet via WWW. See also immediately above. [mirror copy, June 1995, text only]



[CR: 19950914]

Weise, John. Retrieving Images from Structured Documents: Realizing the Potential of SGML. Technical Report LS 605: The Making of Digital Libraries. Ann Arbor, MI: The University of Michigan, School of Information and Library Studies, Fall [December 9], 1994. Author's affiliation: The University of Michigan, School of Information and Library Studies. Contact by email: jweise@umich.edu.

Summary: "Images present unique information retrieval problems that must be approached with enthusiasm and caution as new publishing standards such as SGML (Standard Generalized Markup Language) spread throughout the industry. . ."

Available online: Retrieving Images from Structured Documents: Realizing the Potential of SGML; mirror copy here, September 1995.



Weiss, E. H. "Of Document Databases, SGML, and Rhetorical Neutrality." IEEE Transactions on Professional Communications 36/2 (June 1993) 58-61. 9 references.

Abstract: New technology has enabled the audience to shape a writer's message. Today, publishing technical information often consists of letting the receivers search the files, extract what they judge relevant, sequence and organize it any way they wish, and even print or display it to their own specifications. Often, the writer is not creating deliberately worded and presented messages but rather, feeding molecular articles to rhetorically neutral databases, from which readers may extract what they wish. Such technologies as SGML even further limit writers and deprive them of such basic presentation devices as deciding where pages will begin and end. The rhetorical implications of technology that empowers readers and enfeebles writers are reviewed.



Weitzman, Louis; Wittenburg, Kent. "Automatic Presentation of Multimedia Documents Using Relational Grammars." [To appear as pages] xx-xx in Proceedings of ACM Multimedia '94 (San Francisco, CA, October 15-20, 1994). New York: ACM, 1995.

Abstract: This paper describes an approach to the automatic presentation of multimedia documents based on parsing and syntax-directed translation using Relational Grammars. This translation is followed by a constraint solving mechanism to create the final layout. Grammatical rules provide the mechanism for mapping from a representation of the content of a presentation to forms that specify the media objects to be realized. These realization forms include sets of spatial and temporal constraints between elements of the presentation. Individual grammars encapsulate the "look and feel" of a presentation and can be used as generators of that style. By making the grammars sensitive to the requirements of the output medium, parsing can introduce flexibility into the information realization process.



Wesley, T.; Tobin, C. "CAPS Copyright Communication and Access to Information for Persons with Special Needs." SIGCAPH Newsletter 48 (September 1993) 4-8. Authors' affiliation: Department of Computing, Bradford University, UK.

Abstract: The SGML community is perfectly aware of the benefits International Standards bring to the interchange of digital information, especially when that information is textual in form. What the same community may be less aware of is the enormous group of individuals who have no access to such information because of visual disability. This paper describes the efforts of a European Community funded project CAPS (communication and access to information for people with special needs), to use SGML and its associated standards to fulfil the belief "that advancing computer based publishing and through adaptive computer technology for persons with disabilities offers the potential to make printed information accessible simultaneously and at no greater cost than the able bodied community enjoys." The paper also describes how SGML has been used to define a DTD for the European Interchange Format (EIF) to enable print disabled access to newspapers.



[CR: 19950716]

Wesley, Tom; Tobin, Christopher. "ISOs and the Print Disabled." New Library World 94/1110 (1993) 26-28. Authors' affiliation: EC and University of Bradford, West Yorkshire, UK.

"Abstract: Some of the work of the European Community-funded project - Communication and Access to Information for People with Special Needs (CAPS) in the Programme Technology Initiative for Disabled and Elderly - is described. Standard generalized markup language (SGML) is proving to be of great value within the CAPS project. The use of SGML in the distribution of electronic daily newspapers for the blind is examined. One of the aims of the CAPS project was to harmonize incompatible electronic information formats, paving the way for a truly pan-European potential for the distribution of electronic newspapers for the print disabled. The harmonized format was named the European Interchange Format (EIF). The pilot phase of CAPS included the development of a PC-based workstation capable of receiving newspapers in the EIF and making them accessible to the print disabled. The EIF, encoded in SGML, has so far proved very successful."



[CR: 19971227]

Wheedleton, Chris. "Metadata and SGML: How to Use Both to Your Advantage." Pages 269-280 in SGML/XML '97 Conference Proceedings. SGML/XML '97. "SGML is Alive, Growing, Evolving!" The Washington Sheraton Hotel, Washington, D.C., USA. December 7 - 12, 1997. Sponsored by the Graphic Communications Association (GCA) and Co-sponsored by SGML Open. Conference Chairs: Tommie Usdin (Chair, Mulberry Technologies), Debbie Lapeyre (Co-Chair, Mulberry Technologies); Michael Sperberg-McQueen (Co-Chair, University of Illinois). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 691 pages, CDROM; print volume contains author and title indexes, keyword and acronym lists. Author's affiliation: [Chris Wheedleton]: Information Technology Engineer, Science Applications International Corporation (SAIC), 1710 Goodridge Drive, McLean, Virginia 22102 USA; Phone: +1 703.821.4475; FAX: +1 703.883.9042; Email: christopher.c.wheedleton@cpmx.saic.com; WWW: http://www.sgml.saic.com/.

Abstract: "Many of our customers have recently focused their attention on the descriptive data about content objects which can be found in and around document processing systems. This 'meta' data plays a key role in providing descriptive information that drives the processing of SGML content (i.e., printing, searching or filtering) while also providing behind-the-scenes information about authors or changes to the content. This data can be used for descriptive content or as the content itself, adding additional layers of usefulness that must be managed and tracked when processed. Recent advances in document management technology have introduced a new set of metadata that provide object attribution at the database layer of the system. This metadata can and should be used in concert with SGML to provide a more robust solution that can be exported outside of the corporate enterprise. Metadata is also being used on the web in many creative ways. The introduction of XML into the web environment will only enhance the effectiveness that metadata can play in the processing and communication of corporate assets. This paper will introduce these concepts and describe some examples of how a properly designed system can meet internal and external production requirements."

This paper was delivered as part of the "User" track in the SGML/XML '97 Conference.

Note: The SGML/XML '97 conference proceedings volume is available from the Graphic Communications Association, 100 Daingerfield Road, Alexandria, VA 22314-2888; Tel: +1 (703) 519-8160; FAX: +1 (703) 548-2867. Complete information about the conference (e.g., program listing, tutorials, show guide, DTDs, conference reports) is provided in the dedicated conference section of the SGML/XML Web Page and via the GCA Web server. The electronic proceedings on CDROM was produced courtesy of Jouve Data Management (Jouve PubUser).



[CR: 19971229]

Wheedleton, Chris C. "The Power of Using Content Tagging and Attributes with Your Data." Pages 71-76 in SGML '96 Conference Proceedings. Celebrating a Decade of SGML. SGML '96 Conference, Boston, MA, November 18-21, 1996. Sponsored by The Graphic Communications Association (GCA). [Edited by] Conference Co-Chairs: B. Tommie Usdin and Deborah A. Lapeyre. Alexandria, VA: GCA, 1996. Extent: 711 pages. Author's affiliation: Information Technology Engineer, Science Applications International Corporation (SAIC), 1710 Goodridge Drive (M/S T2-5-1), McLean, VA 22102, USA; Tel: +1 703-821-4475; FAX: +1 703-883-9042; Email: CHRISTOPHER.C.WHEEDLETON@cpmx.saic.com.

Abstract: "The use of SGML attributes to represent complex tabular data can help authors create and maintain large volumes of data. Smart use of attributes combined with the functionality of today's SGML processing tools can make the management and distribution of this type of data simple, effective, and more usable. SAIC has recently implemented attributes in some unique SGML applications. We consider SGML attributes as a useful extension of the "content tagging" approach that is being commonly implemented with SGML elements. This paper will describe one such application that effectively used attributes to store up to 250 pages of tabular records each with up to 70 repetitive content descriptors. The application will be described and the rational for selecting an attribute solution will be described."

Note: The above presentation was part of the "SGML Newcomer" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.

A version of the document is available online in HTML format; [local archive copy, text only]



Wickes, Simon. "Practical SGML by Eric Van Herwijnen. The Second Edition of this Popular Book is Reviewed." <TAG> 7/7 (July 1994) 7-8. ISSN: 1067-9197. Author affiliation: Simon Wickes is an SGML Analyst employed by InfoDesign Corporation.



[CR: 19951220]

Wickes, Simon. "[Review of] README.1ST: SGML for Writers and Editors, Written by Ronald C. Turner, Timothy A. Douglass, Audrey J. Turner." <TAG> 8/12 (December 1995) 13-14. ISSN: 1067-9197. Author's affiliation: Information Architects, Inc.

See the bibliographic entry for README.1ST: SGML for Writers and Editors.



[CR: 19971227]

Wickes, Simon. "SGML and Related Standards, by Joan Smith." <TAG> 6/5 (May 1993) 9-10. ISSN: 1067-9197. Author's affiliation: InfoDesign, Toronto, Ontario.

The author provides a review of Joan Smith's book SGML and Related Standards. Document Description and Processing Languages, published by Ellis Horwood (1992) in the series 'Ellis Horwood Series in Computers and their Applications.' Wickes "finds much to commend" the book and gives it a warm recommendation. See the bibliographic entry for other details, including an online Table of Contents. The volume is available for purchase through the International SGML Users' Group.



[CR: 19970808]

Wickes, Simon. What's the DIS on DSSSL? Information Architects Technical Report. Aurora CO: Information Architects, December 12 1995. Extent: approximately 5 pages. Author's affiliation: Information Architects.

Abstract: "This paper summarizes the current state of DSSSL as of the date above. All of the information was gathered at SGML '95 in Boston. It contains the results of informal interviews and captures the general feeling of the SGML industry toward the future of DSSSL."

Available online: http://www.sgmlu.com/documents\iai\swickes.htm.



[CR: 19970312]

Wiesener, Stephan; Kowarschick, Wolfgang; Vogel, Pavel; Bayer, Rudolf. "Semantic Hypermedia Retrieval in Digital Libraries." Pages 115-129 in Digital libraries: research and technology advances. ADL '95 Forum. Selected Papers. Forum on Research and Technology Advances in Digital Libraries, ADL '95. McLean, Virginia, USA, May 15-17, 1995. Sponsored by NASA. Edited by Adam, Nabil R.; Bhargava, Bharat K.; Halem, Milton; Yesha, Yelena. Lecture Notes in Computer Science, volume 1082. Berlin/Heidelberg, Germany: Springer-Verlag, 1996. ISBN: 3-540-61410-9. ISSN: 0302-9743. Authors' affiliation: [Wiesener:] Bayerisches Forschungszentrum für Wissensbasierte Sys., Germany; [Kowarschick, Vogel, Bayer:] Technische Universität München.

The paper's discussion of the OMNIS system and of document query languages covers SGML and other methods for addressing metadata and structural subcomponents in documents.



Wilhelm, Ronald E. "An Introduction to the World Wide Web and SGML." <TAG> 8/3 (March 1995) 7-9. ISSN: 1067-9197.



[CR: 19970314]

Williams, Ian. "SGML to Hypertext - Conversion Tools Used in the Idex Document Database." SGML Users' Group Bulletin 3/2 (1988) 37-38. ISSN: 0269-2538. Author's affiliation: Product Manager, Owl International Inc.

Abstract: "The benefits of markup in text processing systems are well understood. The SGML standard is now established, and is increasingly used in technical documentation environments. Authors using such systems can separate the logical structure of their documents from software and device specific formatting requirements, making it possible to support a variety of delivery formats with no change to the source data files. Today, most SGML-based systems are concerned with the production of paper documentation. This paper describes the interpretation of markup in the creation of hypertext documents for computer screen display in the Idex Document Management Systems, and looks at some future prospects for SGML and hypertext."

The article is based upon a paper presented at TechDoc 12, San Diego, CA, 26 August 1988.



[CR: 19971227]

Williams, Jason P.; Toback, Michael; Sujansky, Walter. "SGML and the Electronic Health Record." Pages 611-617 in SGML/XML '97 Conference Proceedings. SGML/XML '97. "SGML is Alive, Growing, Evolving!" The Washington Sheraton Hotel, Washington, D.C., USA. December 7 - 12, 1997. Sponsored by the Graphic Communications Association (GCA) and Co-sponsored by SGML Open. Conference Chairs: Tommie Usdin (Chair, Mulberry Technologies), Debbie Lapeyre (Co-Chair, Mulberry Technologies); Michael Sperberg-McQueen (Co-Chair, University of Illinois). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 691 pages, CDROM; print volume contains author and title indexes, keyword and acronym lists. Authors' affiliation: [Jason P. Williams]: Oceania, Incorporated, 3145 Porter Drive, Suite 103, Palo Alto, California 94304; Email: jwilliams@oceania.com; WWW: http://www.oceania.com; [Michael Toback]: Oceania, Incorporated; Email: mtoback@oceania.com; [Walter Sujansky]: Oceania, Incorporated; Email: wsujansky@oceania.com.

Abstract: "This paper describes the steps taken by Oceania Inc., creator of the WAVE EHR (Electronic Health Record), to develop SGML solutions to better manage healthcare information. WAVE allows clinicians to create structured documents which then become a part of the patient's medical record. Oceania has developed a Document Type Definition so that these documents may be encoded using SGML. The combination of relational database technology with an SGML document repository within WAVE will provide maximum access to the information for retrieval purposes for data reporting and analysis purposes. For our customers, SGML encoding will greatly enhance the portability of documents created by WAVE, especially as major healthcare standards bodies are rapidly adopting SGML as one solution to facilitate data exchange."

This paper was delivered as part of the "Case Studies" track in the SGML/XML '97 Conference.

Note: The SGML/XML '97 conference proceedings volume is available from the Graphic Communications Association, 100 Daingerfield Road, Alexandria, VA 22314-2888; Tel: +1 (703) 519-8160; FAX: +1 (703) 548-2867. Complete information about the conference (e.g., program listing, tutorials, show guide, DTDs, conference reports) is provided in the dedicated conference section of the SGML/XML Web Page and via the GCA Web server. The electronic proceedings on CDROM was produced courtesy of Jouve Data Management (Jouve PubUser).



[CR: 19971227]

Willner, Eli. "Vertical Idiosyncracies: How Different Industries View SGML. For Presentation at SGML/XML '97, December 10, 1997." Pages 561-564 in SGML/XML '97 Conference Proceedings. SGML/XML '97. "SGML is Alive, Growing, Evolving!" The Washington Sheraton Hotel, Washington, D.C., USA. December 7 - 12, 1997. Sponsored by the Graphic Communications Association (GCA) and Co-sponsored by SGML Open. Conference Chairs: Tommie Usdin (Chair, Mulberry Technologies), Debbie Lapeyre (Co-Chair, Mulberry Technologies); Michael Sperberg-McQueen (Co-Chair, University of Illinois). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 691 pages, CDROM; print volume contains author and title indexes, keyword and acronym lists. Author's affiliation: [Eli Willner]: Director of Marketing, Data Conversion Laboratory; Phone: +1 718-357-8700; FAX: +1 718-357-8776; Email: ewillner@dclab.com; WWW: http://www.dclab.com.

Abstract: "SGML is like the proverbial elephant being examined by the blind men; it means different things to different people. Some folks are concerned about structure, others about media independence. Some want platform portability, others are fixated on the Web. Some are purists, some just want to save money. Some are experienced SGML pros and some don't know their entities from their attributes. Whether you're a vendor, a service provider or just a colleague, it's useful - and maybe profitable - to know where your fellow SGML travelers are coming from when you talk shop with them. There aren't any hard-and-fast rules, but there are patterns within industries that will assist you in determining SGML perspective. This presentation examines those patterns."

This paper was delivered as part of the "Business Management" track in the SGML/XML '97 Conference.

Note: The SGML/XML '97 conference proceedings volume is available from the Graphic Communications Association, 100 Daingerfield Road, Alexandria, VA 22314-2888; Tel: +1 (703) 519-8160; FAX: +1 (703) 548-2867. Complete information about the conference (e.g., program listing, tutorials, show guide, DTDs, conference reports) is provided in the dedicated conference section of the SGML/XML Web Page and via the GCA Web server. The electronic proceedings on CDROM was produced courtesy of Jouve Data Management (Jouve PubUser).



[CR: 19950911]

Wilmott, Sam. "Distinguishing Intelligence from Formatting." <TAG> 20 (December, 1991) 6-10.

"Far from being a technique limited to old-style, large mainframe computer, batch-oriented methods of processing text, the use of markup languages is the major tool at our disposal for embedding intelligence in documents. Text markup languages provide the tools for economical capture of the highly detailed information required by the new information storage and access technologies." [from the article "Conclusion"]

Wilson, Eve. "A comparison of interfaces: computer, designer, and user." Pages 326-331 (with 10 references) in DEXA 92. Database and Expert Systems Applications. Proceedings of the International Conference, Valencia, Spain. (Valencia, Spain, 2-4 September 1992). Edited by A. Min Tjoa and Isidro Ramos. Wien, Austria/New York: Springer Verlag, 1992. xii + 546 pages. ISBN: 3211824006 (Wien); 0387824006 (New York). Author affiliation: Comput. Lab., Kent University, Canterbury, UK.

Abstract: The paper compares three interfaces: the first is a traditional, information retrieval system handling SGML tagged text; the second, a menu-driven interface with multi-font output; the third, a hypertext interface where some graphical capability is sacrificed for improved information structure, reduced information load, and enhanced user control. The hypertext interface is innovative because it is a hybrid system: a hypertext front-end and post-processor are fully integrated with the original information retrieval package and the SGML text. The retrieved text is converted to hypertext format just before presentation to the user, who sees a consistent hypertext interface for query formulation, input, and browsing.



[CR: 19971018]

Wilson, Eve; Shepton, Peter D. "SGML as a vehicle for porting hypertext applications between systems." Pages 175 - 176 in ACH-ALLC '97. The 1997 Joint International Conference of the Association for Computers and the Humanities and the Association for Literary and Linguistic Computing. Conference Abstracts. ACH-ALLC '97. Queen's University at Kingston, Ontario, Canada. June 3 - 7, 1997. Compiled by Greg Lessard and Michael Levison. Ontario, Canada: Queen's University, 1997. ISBN: 0-88911-760-8. Authors' affiliation: [Eve Wilson:] University of Kent at Canterbury, Email: E.Wilson@ukc.ac.uk; [Peter D. Shepton:] University of Kent at Canterbury.

[Extract:] "The markup of EPS information was an evolutionary process i.e., it was not clear at the start what features should be tagged and continual modification of the DTD was required to ensure that the same definition could be used effectively for all three hypertext formats and the question of the optimum level of mark up for portability is still not wholly resolved. An SGML document ensures that there is an accurate description of the logical structure and components, but to achieve maximum functionality from the target hypertext system, modifications were frequently desirable and much vital and system dependent work had to be done during the parsing stage. Data portability is still greatly constrained by the requirements of the target system."

Abstract available online in HTML format: "SGML as a vehicle for porting hypertext applications between systems", by Eve Wilson and Peter D. Shepton. Presentation at ACH/ALLC '97. [archive copy. Abstract for a paper delivered at ACH/ALLC '97.]

Additional information on the ACH-ALLC '97 Conference is available in the SGML/XML Web Page main conference entry, or [August 1997] via the Queen's University WWW server.



Wohler, Wayne L. "The DTD May Not Be Enough: SGML Declarations." <TAG> 5/10 (October 1992) 6-9. Author affiliation: Wayne L. Wohler is an Advisory Engineer with Publishing Solutions, IBM Corporation [Boulder, Colorado], and represents IBM's SGML interests in various working groups.

Part one of a three-part serialized article in <TAG>'s occasional tutorial series. This first article covers: Introduction, Document Character Sets, Defining Character Sets, System Character Set, Using Character Sets from the Far East, Conclusion. The full text of this tutorial is available online. See Part 2 and Part 3. See the SGML Declaration main entry for other information.



Wohler, Wayne L. "The DTD May Not Be Enough: SGML Declarations." <TAG> 6/1 (January 1993) 1-7. Author affiliation: Wayne L. Wohler is an Advisory Engineer with Publishing Solutions, IBM Corporation [Boulder, Colorado], and represents IBM's SGML interests in various working groups.

Part two of a three-part serialized article in <TAG>'s occasional tutorial series. This second article covers: Declaration of a Concrete Syntax, How is a Syntax Defined?, Defining the Concrete Syntax, Conclusion. The full text of this tutorial is available online. See Part 1 and Part 3. See the SGML Declaration main entry for other information.



Wohler, Wayne L. "The DTD May Not Be Enough: SGML Declarations." <TAG> 6/2 (February 1993) 1-6. Author affiliation: Wayne L. Wohler is an Advisory Engineer with Publishing Solutions, IBM Corporation [Boulder, Colorado], and represents IBM's SGML interests in various working groups.

Part three of a three-part serialized article in <TAG>'s occasional tutorial series. This third article covers: Feature Usage Declaration, Application Specific Information, Using the Concrete Syntax Scope, Capacity Sets, Reference Capacity Set, A Few Final Notes, Putting It All Together. The full text of this tutorial is available online. See Part 1 and Part 2. See the SGML Declaration main entry for other information.



Wolfsthal, Y. "Style control in the Quill document editing system." Software - Practice and Experience 21/6 (June 1991) 625-638. 14 references. Author affiliation: IBM Palo Alto Science Center, CA.

Abstract: A critical problem in the design of editors for structured documents is that of style control, i.e. mapping the logical elements of the documents to their physical appearance on pages. This paper presents a novel approach to style control, used in the Quill document editing system that has been prototyped at the IBM Almaden Research Center. The style control mechanism is an integral part of the editing system and consistent with the overall system architecture, in both its inner structure and its user interface. Properties that specify the formatting process, together with action routines for specifying complex semantics, are the basic style control primitives in the proposed approach. For more on Quill, see also Chamberlin 1988.





[CR: 19980220]

Wonneberger, Reinhard. "Tex in an Industrial Environment." Electronic Publishing: Origination, Dissemination and Design (EPODD) 7/1 (March 1994) 3-19. With: 80 references. ISSN: 0894-3982. Author's affiliation: EDS Electronic Data Systems, Deutschland GmbH, Russelsheim, Germany.

Abstract: "During its first decade, TEX has been at home mainly in the academic world. Therefore it comes as a surprise to find that it has been spreading into industry during the last few years, and we try to outline some highlights of this development first. Then criteria for an industrial environment application area and reasons for using the structured document processing approach are discussed. It is shown what role TEX can play in an integrated document processing environment, and this role is exemplified by a case study from an application at EDS."

For more on SGML/XML and TeX, see the dedicated database entry and the topical bibliography listing.



Wonneberger, Reinhard; Mittelbach, Frank. "SGML - Questions and Answers." TUGboat 13/2 (July 1992) 221-223.



[CR: 19971125]

Wood, Chris; Gallagher, James A. "Tornado F3 Conversion of Publications Data to AECMA (Association Européenne des Constructeurs de Matériel Aérospatial) 1000D - A Case Study." Page(s) 27-30 in SGML '97 Conference Proceedings. SGML Europe '97. "The Next Decade - Pushing the Envelope." Princesa Sofia Intercontinental, Barcelona, Spain. 11-15 May, 1997. Sponsored by Graphic Communications Association (GCA) and SGML Open. Conference Chair: Pamela L. Gennusa (Director, Database Publishing Systems Ltd). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 342 pages, CDROM. Authors' affiliation: [Wood]: Group Leader, Technical Publishing Systems Group, British Aerospace Defence Ltd., United Kingdom; Email: gbbaee8f@ibmmail.com; [Gallagher]: Ministry of Defence, Air Technical Publications (RAF), Scotland.

Abstract: "The RAF (Royal Air Force) have traditionally supported their in-service fleet exclusively with either hard-copy publications or microfiche. This is changing. Recently placed contracts for the Attack Helicopter, EuroFighter 2000 and the Replacement Maritime Patrol Aircraft mandate electronic delivery of Descriptive, Maintenance, Parts Catalogue and Training publications data. This data is destined for delivery to LITS (Logistics Information Technology Strategy, the RAF's Logistic IT System). LITS is being developed by the RAF and IBM to receive and electronically distribute this data in an SGML-based Data Module form as defined by AECMA 1000D and the UK (United Kingdom) Def-Stan 00-60.

"In order to prove the capability of LITS and also to prove the contractors capability to deliver coherent Modular data, the RAF is sponsoring a series of "Proof of Concept" initiatives. One such initiative is the Tornado F3 Conversion Project

"This project is scoped to include over 60,000 pages of Technical Manuals (including Descriptive, Procedural and Parts Catalogue data) for conversion from its current production methods and specifications to SGML-based Data Module production and delivery.

"The project will challenge Consultants, Applications Developers, Authors, Editors and Document Conversion specialists to deliver this data in a new form and structure whilst continuing to support the operation of this front line Fighter Aircraft."

Note: The electronic conference proceedings in hypertext were produced by Inso Corporation (DynaText) and by High Text (EnLIGHTeN). Information about the SGML Europe '97 Conference may be found in the main database entry.



[CR: 19970523]

Wood, Derick. "Standard Generalized Markup Language: Mathematical and Philosophical Issues." Pages 344-365 (with 33 references) in Computer Science Today. Recent Trends and Developments. Edited by Jan van Leeuwen, Utrecht University. Lecture Notes in Computer Science, 1000. Berlin / Heidelberg/ New York, NY: Springer-Verlag, 1995. ISBN: 3-540-60105-8. Author's affiliation: Department of Computer Science, Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong. WWW: Derick Wood Home Page.

Abstract: "The Standard Generalized Markup Language (SGML), an ISO standard, has become the accepted method of defining markup conventions for text files. SGML is a metalanguage for defining grammars for textual markup in much the same way that Backus-Naur Form is a metalanguage for defining programming-language grammars. Indeed, HTML, the method of marking up hypertext documents for the World Wide Web, is an SGML grammar. The underlying assumptions of the SGML initiative are that a logical structure of a document can be identified and that it can be indicated by the insertion of labeled matching brackets (start and end tags). Moreover, it is assumed that the nesting relationships of these tags can be described with an extended context-free grammar (the right-hand sides of productions are regular expressions).

"In this survey of some of the issues raised by the SGML initiative, I reexamine the underlying assumptions and address some of the theoretical questions that SGML raises. In particular, I respond to two kinds of questions. The first kind are technical: Can we decide whether tag minimization is possible? Can we decide whether a proposed content model is legal? Can we remove exceptions in a structure-preserving manner? Can we decide whether two SGML grammars are equivalent?

"The second kind are philosophical and foundational: What is a logical structure? What logical structures may a document have? Can logical structures always be captured by context-free nesting?"

Published in a preliminary version as Technical Report HKUST-CS95-37, July 1995. The preliminary version is available as ftp://ftp.cs.ust.hk/pub/techreport/95/tr95-37.ps.gz; [mirror copy].



[CR: 19961226]

Wood, Eileen M. "Retro-Fitting DTDs for Near and Far Library: An Approach to using a Central DTD Repository." Pages 279-288 in SGML '96 Conference Proceedings. Celebrating a Decade of SGML. SGML '96 Conference, Boston, MA, November 18-21, 1996. Sponsored by The Graphic Communications Association (GCA). [Edited by] Conference Co-Chairs: B. Tommie Usdin and Deborah A. Lapeyre. Alexandria, VA: GCA, 1996. Extent: 711 pages. Author's affiliation: Information Manager, Research Institute of America, 1 Publishers Parkway, Webster, NY, 14580, USA; Tel: (716) 671-7780 ext. 4256; FAX: (716) 671-9426; Email: emwood@riag.com.

Abstract: "Objectives [of the paper are]:

  • To provide an overview of the information and steps necessary to convert and load manually authored DTDs into Near and Far Library.
  • Provide a list of problems found when loading manually authored DTDs into Near and Far Library.
  • Provide a summary of costs and benefits of using Near and Far Library for manually authored DTDs.

"In the Fall of '95, three Thomson Companies: RIA (Research Institute of America), WG&L (Warren, Gorham and Lamont) and Thomson Legal Publishing, Alexandria, VA were merged into one company RIAG (RIA Group). In order to share SGML DTDs and SGML data more effectively across all three companies and a variety of geographic areas, we needed a common storage, maintenance and documentation method. The three companies had various technology departments using a variety of hardware, operating systems and applications. Microstar's Near and Far Designer (previously referred to as Near and Far) and Near and Far Library (previously referred to as CADE) products were the best and only choice for this purpose.

"NFD (Near and Far Designer) is a graphical editor for SGML DTDs. NFL (Near and Far Library, previously referred to as CADE Groupware) is a template for a Lotus Notes Database, and is a repository for storing definitions and descriptions of all information objects, e.g., elements, attributes, etc. in a DTD. NFD has an interface to NFL for storing and retrieving DTDs from the Lotus Notes database. Because Lotus Notes is available for a variety of platforms and works well across many geographic locations, this was a good solution to a common storage medium. A central repository for SGML DTDs was needed to provide a method for standard definitions, usage and documentation of information objects across multiple DTDs.

"Because NFL is designed to protect the integrity of the DTD database, the definitions and descriptions of the elements, attributes and entities had to be consistent within a programmatic algorithm (i.e., byte-for-byte identical values). Because RIAG had 43 DTDs, an automated solution was needed for this conversion. Several problems found during this process and the solutions that RIAG devised will be presented. A summary of the costs and benefits found during this project will also be presented. The presentation will also cover the unexpected benefits, and organizational impact of having a central repository for DTDs."

Note: The above presentation was part of the "SGML User" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.



[CR: 19951220]

Wood, Eileen M. "So What is SGML?" Journal of Systems Management 46/5 (September-October 1995) 24-29 (with 2 references). Author's affiliation: Thomson Professional Publishing.

"Abstract: The Standard Generalized Markup Language (SGML) is an international standard for text management and identification. It provides a method for separately identifying information content from style or media type and provides easy access to information via an open framework independent from proprietary formats or platforms. SGML offers facilities for defining the character set to be used, document structure, text used more than once, external information to be included, special techniques for marking up text and the manner of text processing."



[CR: 19971125]

Wood, Lauren. "Getting to XML from HTML." Page(s) 189-192 in SGML '97 Conference Proceedings. SGML Europe '97. "The Next Decade - Pushing the Envelope." Princesa Sofia Intercontinental, Barcelona, Spain. 11-15 May, 1997. Sponsored by Graphic Communications Association (GCA) and SGML Open. Conference Chair: Pamela L. Gennusa (Director, Database Publishing Systems Ltd). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 342 pages, CDROM. Author's affiliation: Technical Product Manager, SoftQuad, Inc.; Email: lauren@sqwest.bc.ca.

Abstract: "Many of those who use HTML are realising that they need the added flexibility of XML for their applications. This talk discusses how to get your data and systems from HTML to XML, including conversion and authoring."

"Interest in XML is growing, particularly now that major browser vendors are showing some interest in XML. There is an opportunity for people to add the richness they need to their documents, getting away from the restrictions of HTML. The best methods to do this will depend on the systems you currently have in place, as well as what you want to do with the documents.

"In general, you don't need to convert your SGML DTDs to use XML syntax. Many applications will only need the document, not the DTD. The advantage is that you can use the more complex SGML syntax that may be in your system, such as marked sections. What matters is that the document coming out is XML-compliant, not that the system that produced it is XML-compliant. Even if you need to provide a DTD, because the processing application on the other side of the Web uses the DTD, it may be possible to provide an XML-compliant DTD that matches the documents, but isn't the DTD you use for authoring." [extracted]

Note: The electronic conference proceedings in hypertext were produced by Inso Corporation (DynaText) and by High Text (EnLIGHTeN). Information about the SGML Europe '97 Conference may be found in the main database entry.



[CR: 19961226]

Wood, Lauren. "Learning from HTML - Lessons for DTD Authors." Pages 231-234 in SGML '96 Conference Proceedings. Celebrating a Decade of SGML. SGML '96 Conference, Boston, MA, November 18-21, 1996. Sponsored by The Graphic Communications Association (GCA). [Edited by] Conference Co-Chairs: B. Tommie Usdin and Deborah A. Lapeyre. Alexandria, VA: GCA, 1996. Extent: 711 pages. Author's affiliation: Technical Product Manager, SoftQuad, Inc., 10070 King George Hwy, Suite 108, Surrey, British Columbia V3T 2W4, Canada; Email: lauren@sq.com; WWW: http://www.sq.com.

Abstract: "This talk is not concerned with document analysis or ways in which to turn requirements such as database connectivity into SGML. It is concerned with discussing some features of the HTML DTD and what authors of other DTDs can learn from them.

"HTML has been probably the single largest experiment in structured document construction that there has been, in terms of the numbers of participants. DTD authors should consider some of the results of this experiment when writing their DTDs, as at least some of the lessons to be learned may be valid for any given application of SGML. Authors of HTML documents choose to use specific elements and features of the language. Knowledge of possible reasons for these choices is also important for the design of DTDs."

Note: The above presentation was part of the "SGML User" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.



[CR: 19990519]

Wood, Lauren. "Programming Marked-Up Documents." Markup Languages: Theory & Practice 1/1 (Winter 1999) 91-100. ISSN: 1099-6622 [MIT Press]. Author's affiliation: Technical Product Manager, SoftQuad, Inc.; Email: lauren@sqwest.bc.ca; WWW: www.softquad.com.

Abstract: "The Document Object Model is a programming interface to HTML and XML documents. The level 1 DOM specification enables application writers to access, navigate, and manipulate the content and structure of HTML and XML documents. The paper describes the motivation behind the work on the DOM, as well as the rationale behind some of the design decisions. A precis of future work is given."

Revision: Received 29 June 1998, Revised 8 September 1998.

For other articles in this issue of MLTP, see the annotated Table of Contents.



[CR: 19971227]

Wood, Lauren. "The Web Document API." Pages 445-448 in SGML/XML '97 Conference Proceedings. SGML/XML '97. "SGML is Alive, Growing, Evolving!" The Washington Sheraton Hotel, Washington, D.C., USA. December 7 - 12, 1997. Sponsored by the Graphic Communications Association (GCA) and Co-sponsored by SGML Open. Conference Chairs: Tommie Usdin (Chair, Mulberry Technologies), Debbie Lapeyre (Co-Chair, Mulberry Technologies); Michael Sperberg-McQueen (Co-Chair, University of Illinois). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 691 pages, CDROM; print volume contains author and title indexes, keyword and acronym lists. Author's affiliation: [Lauren Wood]: Technical Product Manager, SoftQuad, Inc., 108-10070 King George Hwy, Surrey, British Columbia, Canada V3T 2W4; Phone: +1-604-585 8394; FAX: +1-604-585 1926; Email: lauren@softquad.com; WWW: http://www.softquad.com/.

Abstract: "'The Document Object Model is a platform- and language-neutral interface that will allow programs and scripts to dynamically access and update the content, structure and style of HTML and XML documents. The document can be further processed and the results of that processing can be incorporated back into the presented page'."

"This sentence is taken from the pages at the W3C (World Wide Web Consortium) site that discuss the work being done by the DOM (Document Object Model) Working Group. This group is working hard to standardize the various ways of accessing HTML (Hypertext Markup Language) and XML (eXtensible/Extensible Markup Language) documents that exist, from JavaScript and applets to the various vendor-dependent command language interfaces. The group consists of representatives from many of the companies one would expect, from both the HTML and SGML / XML communities. This talk will present an overview of the current specifications, what has been done, and what yet remains to be specified. The latest specification of the DOM [The DOM Book] will always be found on the W3C site."

This paper was delivered as part of the "Expert" track in the SGML/XML '97 Conference.

Note: The SGML/XML '97 conference proceedings volume is available from the Graphic Communications Association, 100 Daingerfield Road, Alexandria, VA 22314-2888; Tel: +1 (703) 519-8160; FAX: +1 (703) 548-2867. Complete information about the conference (e.g., program listing, tutorials, show guide, DTDs, conference reports) is provided in the dedicated conference section of the SGML/XML Web Page and via the GCA Web server. The electronic proceedings on CDROM was produced courtesy of Jouve Data Management (Jouve PubUser).



[CR: 19971106]

Wood, Lauren; Sorensen, Jared. "Document Object Model Requirements." Pages 91-94 in XML: Principles, Tools, and Techniques. Guest Edited by Dan Connolly. World Wide Web Journal [edited by Rohit Khare] Volume 2, Issue 4. Sebastopol, CA: O'Reilly & Associates, Fall 1997. Extent: xxii + 248 pages. ISBN: 1-56592-349-9. ISSN: 1085-2301. Author's affiliation: [Wood]: Technical Product Manager, SoftQuad, Inc., and Chair of the W3C DOM WG; [Sorensen]: Manager, Documentation Projects Group, Novell Inc.

Abstract: "This document defines the high-level requirements for the Document Object Model (DOM). References to XML and HTML documents generally denote the physical files that contain structural markup. Some requirements are not implemented in DOM Level 1 [. . .]."

Note that the Document Object Model Specification (W3C Working Draft 09-Oct-1997) "defines the Document Object Model, a platform- and language-neutral interface that will allow programs and scripts to dynamically access and update the content, structure and style of documents. The Document Object Model provides a standard model of how the objects in an XML or HTML document are put together and a standard interface for accessing and manipulating these objects and their inter-relationships. Vendors can support the DOM as an interface to their proprietary data structures and APIs, and content authors can write to the standard DOM interfaces rather than product-specific APIs, thus increasing interoperability on the Web."

A later version of this document is available online in HTML format: Document Object Model Requirements, W3C Working Draft 09-October-97.



WordPerfect Corporation. "WordPerfect(R) 6.1 for Windows(TM) SGML Edition." White Paper. WordPerfect Corporation, [1994]. 29K (computer file); ca 9 pages.

The document is available on WordPerfect's server or here (copy mirrored on March 17, 1995).



Wright, Haviland. "SGML Frees Information. Escape a World Where There is Too Much Data and Go to a Place Where You Can Access the Information Hidden Within It." Byte Magazine 17/6 (June 1992) 279-286. ISSN: 0360-5280. Author's affiliation: Haviland Wright is President of Avalanche Development Corporation, Boulder, CO; Email: haviland@avalanche.com.

The article appears in a Byte special section "Managing Infoglut: How to Add Value to Your Data." Other articles in this special issue also discuss SGML.



[CR: 19961210]

Wu, Gilbert. "The SGML Research Experience at Hatfield Polytechnic." SGML Users' Group Bulletin 4/1 (1989) 24-34. ISSN: 0269-2538. Author's affiliation: British Library Research Associate, Hatfield Polytechnic. Address: ERDC, Hatfield Polytechnic, Hatfield AL10 9AB, UK.

Abstract: "Hatfield Polytechnic is a member of Project Quartet which is a collaborative project undertaken by four research organizations (Birmingham University, Loughborough University, University College London, and Hatfield Polytechnic) to investigate new developments in information technology. The entire project is sponsored by the British Library. The aim of the SGML work at Hatfield is to investigate applications and tools for SGML document production. As we are in collaboration with the Adonis Consortium, part of our work is to mark up Adonis journals using SGML. Adonis is a trial document delivery service that supplies 219 biomedical journals published in 1987 and 1988 on CD-ROM. Text and graphics are both scanned and stored in bit-map format. Each CD-ROM contains index information to allow users to search information by journals, articles and pages."

"In the course of our work, we have developed several document type definitions (DTDs) for the following classes of documents: Adonis Biomedical Journals, a Project Quartet Technical Report, and letters for correspondence. We have also investigated several possible routes for SGML document publishing. Our recent research has concentrated on investigating different input routes for SGML documents; this has involved evaluation of the commercial products and services that exist for this purpose. We are now researching different rendering routes for completed SGML documents using TeX, Troff, and PostScript etc. This paper also includes a discussion on the experience and difficulties in DTD design and a comparison between SGML and the Adonis system by using a marked-up Adonis journal as an example."

Note: The volume editor for SGML Users' Group Bulletin 4/1 is David W. Penfold (Edgerton Publishing Services, Huddersfield, UK).



Wu, Gilbert. SGML Theory and Practice. British Library Research Paper, 68. Boston Spa, UK: British Library Research and Development Department, 1989. 93 pages. ISBN: 0-7123-3211-1. ISSN: 0269-9257.



Wu, Gilbert S. K.; Robinson, Brian. SGML Support for Secure Document Systems. British Library. Research and Development Report, 6158. Boston Spa, UK: British Library Research and Development Department, 1994. 59 pages, with bibliography (55-59). ISSN: 0308-2385.



[CR: 19980413]

[XML Files Staff]. "Book Review: SGML CD." XML Files: The XML Magazine Issue 02 (October 20, 1997) [??].

A review of SGML CD: A Complete SGML Toolkit, by Bob DuCharme.

Summary: "This book is the answer for anyone who has ever struggled to use SGML freeware. It takes the reader beyond the often cryptic readme files with its step-by-step descriptions of installation and program use. The screen clips, keystroke documentation, and sample scripts make using free SGML software a straightforward and comfortable experience. The software and documentation in Bob DuCharme's SGML CD also provide valuable and inexpensive training for those attempting to learn more about SGML technology. This book is a must for those technologists who need to learn more about SGML and SGML systems as well as for those trying to implement SGML on a tight budget."

The review is available online.



[CR: 19980413]

[XML Files Staff]. "Book Review: SGML Buyers Guide." XML Files: The XML Magazine Issue 03 (December 8, 1997) [??].

A review of The SGML Buyers Guide. A Unique Guide to Determining Your Requirements and Choosing the Right SGML and XML Products and Services, by Charles F. Goldfarb, Steve Pepper, Chet Ensign, Linda Burman, et al.

Summary: "The SGML Buyers Guide is a must for anyone who is currently 'shopping' for an SGML publishing solution. It provides a clear roadmap to the classes of tools by function and the tools on the market at the point of publication. In addition, the HARP analysis methodology presented in the book will be a valuable tool to consultants and system designers who are in the process of modeling SGML publishing systems."

The review is available online.



[CR: 19980413]

[XML Files Staff]. "Book Review: Presenting XML." XML Files: The XML Magazine Issue 04 (March 17, 1998) 3-4.

Review of Presenting XML, by Richard Light.

The conclusion: "Presenting XML is a good first publication on XML. It provides fundamentals of XML from several points of view including the HTML view, the Systems/Web Master view, and the SGML view. It gives readers practical examples of how a transition to XML can be made and provides some insights into the development that XML will see in the upcoming year. If you want to learn more about the new technology of XML, this is a good title to add to your professional library."

The review is available online: http://www.gca.org/memonly/xmlfiles/issue4/book.htm.



[CR: 19980413]

[XML Files Staff]. "Book Review: SGML on the Web." XML Files: The XML Magazine Issue 01 (September 20, 1997) [??].

Review of SGML on the Web: Small Steps Beyond HTML, by Yuri Rubinsky and Murray Maloney.

Summary: "... this book lays the foundation upon which XML was originally built. In the preface, Yuri Rubinsky calls for extending HTML within a 'stable framework' in a manner so that Web software knows 'what to expect'. He focuses on creating markup that is rich, yet easy to use. And he proposes that 'inventing new markup makes sense' if a method can be devised so the new markup holds 'no surprises.' Those who have been following the development of XML as an alternate Web language may quite easily be convinced that this has come about, in part, through Yuri's final guidance."

The review article is available online.



Yamada, M. "Conversion method from document image to logically structured document based on open-document architecture (ODA)." Systems and Computers in Japan 25/13 (1 November, 1994) 47-61. Contains: 14 references. Author's affiliation: Research and Development Laboratory, Kokusai Denshin Denwa Co. Ltd., Kamifukuoka, Japan.

Abstract: "A large number of studies have been made on document image processing, focusing mainly on the media conversion in character recognition and the layout structure analysis of document image. On the other hand, studies as well as efforts have been made toward a standardization of the document architecture, which emphasizes the editing aspect of the document and maintains the logical structure independent of the layout, as in the cases of open-document architecture (ODA) and Standard Generalized Markup Language (SGML). This paper considers the application based on the logical structure, and proposes a method which converts several continuous pages of document image to the logically structured document with the multistage structure of chapter/section/paragraph."



[CR: 19971227]

Yang, Jennifer. "Applying SGML to Graphic Arts and Multimedia Cataloguing." Pages 621-624 in SGML/XML '97 Conference Proceedings. SGML/XML '97. "SGML is Alive, Growing, Evolving!" The Washington Sheraton Hotel, Washington, D.C., USA. December 7 - 12, 1997. Sponsored by the Graphic Communications Association (GCA) and Co-sponsored by SGML Open. Conference Chairs: Tommie Usdin (Chair, Mulberry Technologies), Debbie Lapeyre (Co-Chair, Mulberry Technologies); Michael Sperberg-McQueen (Co-Chair, University of Illinois). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 691 pages, CDROM; print volume contains author and title indexes, keyword and acronym lists. Author's affiliation: [Jennifer Yang]: Digital Graffiti, 42 Mary St., Hamilton, Ontario Canada L8R 3M9; Phone: +1 (905) 308-9634; FAX: (905) 529-5532; Email: graffiti@idirect.com; WWW: http://www.digitalgraffiti.com.

Abstract: "Will describe a commercial project to create software which manages and tracks both physical and software collections based upon the 'Anglo-American Cataloguing Rules' (AACR) and SGML. How the software takes advantage of these standards, particularly in its management of computer files, will be the focus. Although the software's primary market is graphic artists, the principles behind the design will be of particular interest to those managing any collection."

This paper was delivered as part of the "Case Studies" track in the SGML/XML '97 Conference.

Note: The SGML/XML '97 conference proceedings volume is available from the Graphic Communications Association, 100 Daingerfield Road, Alexandria, VA 22314-2888; Tel: +1 (703) 519-8160; FAX: +1 (703) 548-2867. Complete information about the conference (e.g., program listing, tutorials, show guide, DTDs, conference reports) is provided in the dedicated conference section of the SGML/XML Web Page and via the GCA Web server. The electronic proceedings on CDROM was produced courtesy of Jouve Data Management (Jouve PubUser).



[CR: 19970331]

Yeh, Lin-Ju; Yao, Hsiu-Hsen; Chen, Yuan-Kuo. "SSQL: a Semi-Structured Query Language for SGML Document Retrievals." Pages 221-228 (with 15 references) in Conference Proceedings, SIGDOC '96. The 14th Annual International Conference on Computer Documentation. ["Marshalling New Technological Forces: Building a Corporate, Academic, and User-Oriented Triangle"]. SIGDOC '96: 14th Annual International Conference. Research Triangle Park, North Carolina, US. October 20-23, 1996. Sponsored by the Association for Computing Machinery Special Interest Group on Documentation (SIGDOC). New York, NY: Association for Computing Machinery, 1996. ISBN: 0-89791-799-5. Authors' affiliation: Center for Humanities and Sciences, Ibaraki Prefectural University of Health Sciences, Japan; Yuan-Ze Institute of Technology, Taiwan.

Abstract: Four structures can be found in SGML documents. They are, (i) sequential (free formatted) text structure, (ii) primary element hierarchy, (iii) flat attribute table in each element (type), and (iv) network structure generated by reference links among elements. Based on the development of the SGML document format, a study on document retrievals via these structures has been done. Six query paths are discussed in our research. In further theoretic exploration, a new data model, named the multigraph data model (MGDM), has been developed. Each document (elements) and each reference among the elements are modeled as nodes and links, respectively, in the multi-graphs. A semi-structured query language (SSQL) which supports integrated document retrievals has been developed. The syntax of SSQL is similar to ordinary object-oriented structured query languages, and the functions of SSQL fully support all six retrieval paths. The purpose of developing SSQL, and the major goal of our research, is to integrate various retrieval paths and to provide an effective query language for SGML document access. Both conventional structured tables and mark-up documents can be retrieved via SSQL.

Several other articles in this proceedings volume are germane to SGML: Tom Banfalvi, et al., "Manufacturing Documentation in the Virtual Warehouse"; Betsy Brown, et al., "From Hardcopy to Online: Changes to the Editor's Role and Processes"; Paul Beam and Peter Goldsworthy, "Technical Writing on the Web-Distributed SGML-Based Learning"; Stephanie Copp, "Working with Academe"; Cindy Roposh, et al., "Developing Single-Source Documentation for Multiple Formats"; Paul Prescod, "Multiple Media Publishing in SGML"; Dee Stribling, et al., "A Real World Conversion to SGML".



[CR: 19960818]

Yoshikawa, Masatoshi; Ichikawa, Osamu; Uemura, Shunsuke. "Amalgamating SGML Documents and Databases." Pages 259-274 (with 12 references) in Proceedings of the Fifth International Conference on Extending Database Technology. Advances in Database Technology - EDBT '96. Fifth International Conference on Extending Database Technology. Avignon, France, March 25-29, 1996. Edited by Peter M. S. Apers, Mokrane Bouzeghoub, and Georges Gardarin. Lecture Notes in Computer Science, Vol. 1057. Berlin: Springer-Verlag, 1996. ISBN: 3-540-61057-X. Authors' affiliation: Graduate School of Information Science, Nara Institute of Science & Technology, Japan. E-mail: yosikawa@is.aist-nara.ac.jp.

"Abstract: The paper proposes a uniform and flexible mechanism to make reference links from SGML documents to database objects. In addition to typical document logical structures such as sections and paragraphs, our mechanism allows arbitrary character strings in documents as source of these links. By using this mechanism, SGML attributes and their values of marked-up words can be transparently stored as database attributes, and we can establish hyperlinks between keywords in documents, which reflect relationships between the corresponding database objects. Also, we present a query language to retrieve SGML documents which are coupled with databases in this manner. The query language does not assume a particular database schema; instead, it utilizes DTD (document type definition) graphs, representing element structures of DTDs, as virtual schemas."

A preliminary version of this paper is available in Postscript format as Information Science Technical Report NAIST-IS-TR95007 [possibly a corrupt file, however], Graduate School of Information Science, Nara Institute of Science and Technology, February 27, 1995. Further: recent papers by Masatoshi Yoshikawa, or the author's home page.



[CR: 19971227]

Young, Russ. "Electronic Information Commerce." Pages 223-226 in SGML/XML '97 Conference Proceedings. SGML/XML '97. "SGML is Alive, Growing, Evolving!" The Washington Sheraton Hotel, Washington, D.C., USA. December 7 - 12, 1997. Sponsored by the Graphic Communications Association (GCA) and Co-sponsored by SGML Open. Conference Chairs: Tommie Usdin (Chair, Mulberry Technologies), Debbie Lapeyre (Co-Chair, Mulberry Technologies); Michael Sperberg-McQueen (Co-Chair, University of Illinois). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 691 pages, CDROM; print volume contains author and title indexes, keyword and acronym lists. Author's affiliation: [Russ Young]: Director of Tools Development, Folio Division of Open Market, 5072 North 300 West, Provo, Utah 84604 USA; Phone: +1 (801) 229-6541; FAX: +1 (801) 229-6786; Email: ryoung@folio.com; WWW: http://www.folio.com.

Abstract: "Hard goods like books and CD-ROMs are not the only things being sold over the internet anymore. There is now a secure way to sell information over the internet, and this 'Information Marketplace' is changing the face of commercial publishing. The internet is providing incredible access to vast amounts of information; the real challenge for users is knowing where to look for the information and then how to access it. The obvious challenge to publishers is how to tap the internet market potential while still finding a way to generate revenue, provide secure transactions and increase advantages over their competition. We will explore both the publisher and user issues involved in electronic information commerce and show examples of working information commerce sites. We will also discuss how information commerce will drive more and more published content to be created, stored and managed in SGML and XML."

This paper was delivered as part of the "User" track in the SGML/XML '97 Conference.

Note: The SGML/XML '97 conference proceedings volume is available from the Graphic Communications Association, 100 Daingerfield Road, Alexandria, VA 22314-2888; Tel: +1 (703) 519-8160; FAX: +1 (703) 548-2867. Complete information about the conference (e.g., program listing, tutorials, show guide, DTDs, conference reports) is provided in the dedicated conference section of the SGML/XML Web Page and via the GCA Web server. The electronic proceedings on CDROM was produced courtesy of Jouve Data Management (Jouve PubUser).



[CR: 19971125]

Young, Russell W. "Technology Driving the SGML Marketplace Driving Technology." Page(s) 295 in SGML '97 Conference Proceedings. SGML Europe '97. "The Next Decade - Pushing the Envelope." Princesa Sofia Intercontinental, Barcelona, Spain. 11-15 May, 1997. Sponsored by Graphic Communications Association (GCA) and SGML Open. Conference Chair: Pamela L. Gennusa (Director, Database Publishing Systems Ltd). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 342 pages, CDROM. Author's affiliation: Folio Division of Open Market.

[Abstract unavailable.]

Note: The electronic conference proceedings in hypertext were produced by Inso Corporation (DynaText) and by High Text (EnLIGHTeN). Information about the SGML Europe '97 Conference may be found in the main database entry.



[CR: 19971227]

Young, Russ. "XML-Based Document Image Analysis." Pages 355-364 in SGML/XML '97 Conference Proceedings. SGML/XML '97. "SGML is Alive, Growing, Evolving!" The Washington Sheraton Hotel, Washington, D.C., USA. December 7 - 12, 1997. Sponsored by the Graphic Communications Association (GCA) and Co-sponsored by SGML Open. Conference Chairs: Tommie Usdin (Chair, Mulberry Technologies), Debbie Lapeyre (Co-Chair, Mulberry Technologies), Michael Sperberg-McQueen (Co-Chair, University of Illinois). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 691 pages, CDROM; print volume contains author and title indexes, keyword and acronym lists. Author's affiliation: [Russ Young]: Director of Tools Development, Folio Division of Open Market, 5072 North 300 West, Provo, Utah 84604 USA; Phone: +1 (801) 229-6541; FAX: +1 (801) 229-6786; Email: ryoung@folio.com; WWW: http://www.folio.com.

Abstract: "Document image analysis is not a new field of study, as there are several different methods discussed in the research. Because of the newness of the XML standard, however, integrating XML-related technology with some of these classical algorithms has not yet been discussed. This will not only improve the image processing results, but also provide a standard method for representing various structured document types. XML will open the door for richer and more direct access of document images on the Internet. These advantages will be demonstrated in a Java-based application that analyzes document images, classifies them according to their type, and converts them to a tagged XML file. The resulting document will be ready for indexing by a structure-aware text search engine and for electronic delivery on the Internet."

This paper was delivered as part of the "Expert" track in the SGML/XML '97 Conference.

Note: The SGML/XML '97 conference proceedings volume is available from the Graphic Communications Association, 100 Daingerfield Road, Alexandria, VA 22314-2888; Tel: +1 (703) 519-8160; FAX: +1 (703) 548-2867. Complete information about the conference (e.g., program listing, tutorials, show guide, DTDs, conference reports) is provided in the dedicated conference section of the SGML/XML Web Page and via the GCA Web server. The electronic proceedings on CDROM were produced courtesy of Jouve Data Management (Jouve PubUser).



[CR: 19961210]

Zadow, Jerome L. "The Production Efficiencies of SGML: Case Studies in the Provision of Information." SGML Users' Group Bulletin 4/1 (1989) 15-17. ISSN: 0269-2538. Author's affiliation: AGFA Compugraphic Division, Agfa Corporation.

The author discusses the use of the CAPS production system for technical documentation. Companies profiled include Dataquest (a subsidiary of Dun & Bradstreet) and Salomon Brothers. Studies of database production time and operator time demonstrate that SGML is yielding substantial economic savings for users of the CAPS document systems.

Note: The volume editor for SGML Users' Group Bulletin 4/1 is David W. Penfold (Edgerton Publishing Services, Huddersfield, UK).



Zheng, M.; Rada, R. "A Standard-based Approach to Text and Hypertext Mutual Conversion and Interchange." Pages 116-121 (with 10 references) in IPCC 93 Proceedings. The New Face of Technical Communication: People, Processes, Products. IEEE International Professional Communication Conference (IPCC'93), Philadelphia, PA, USA, 5-8 October, 1993. New York, NY: IEEE, 1993. Authors' affiliation: Inst. of Comput. Sci. & Technol., Peking Univ., Beijing, China.

"Abstract: Hypertext interchange and bidirectional conversion between text and hypertext are largely considered two different applications. In the authors' research, a system (SGML-MUCH) for importing and exporting SGML documents into and from a collaborative hypermedia system - MUCH (Many Using and Creating Hypermedia) system - is developed. The SGML-MUCH system is able to support both hypertext interchange and text-hypertext bidirectional conversion. The SGML-MUCH system is developed following the SHyD model, which uses international standards SGML and HyTime as intermediate formats. An SGML DTD based on a widely accepted hypertext reference model- the Dexter model, is defined to represent both text and hypertext documents. Text and hypertext documents in different formats can be imported into the MUCH system by going through the SGML representation."



[CR: 19961226]

Ziener, Christopher. "State of the Industry: GCA SGML Survey Results." Pages 559-562 in SGML '96 Conference Proceedings. Celebrating a Decade of SGML. SGML '96 Conference, Boston, MA, November 18-21, 1996. Sponsored by The Graphic Communications Association (GCA). [Edited by] Conference Co-Chairs: B. Tommie Usdin and Deborah A. Lapeyre. Alexandria, VA: GCA, 1996. Extent: 711 pages. Author's affiliation: Graphic Communications Association, GCA, Research Institute, Electronic Publishing Special Interest Group; Email: cziener@gca.org.

Abstract: "In order to determine how people are really using SGML, GCA has polled attendees at GCA conferences over the last year and conducted a mail survey of our extensive database of people interested in SGML. Results will be discussed by conference, in order to give regional perspective, as well as for the information collection as a whole. Survey topics included: current uses of SGML, user skill levels, document formats, and investment in SGML technologies. From this survey we can begin to get a more accurate picture of the markets that SGML has reached and what attracted current SGML users."

Note: The above presentation was part of the "SGML Case Studies" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.



[CR: 19971106]



Document URI: http://xml.coverpages.org/bib-sz.html
Robin Cover, Editor: robin@oasis-open.org