The OASIS Cover Pages: The Online Resource for Markup Language Technologies
Last modified: January 17, 2001
SGML/XML Bibliography Part 5, M - N

[CR: 19950716]

Maasdam, Jan. "SGML Design Issues." SGML Users' Group Newsletter 30 (March 1995) 6-7. ISSN: 0952-8008. Author's affiliation: Intermedia bv, and the Dutch SGML Users' Group (Chairman).

Announcement for a new SIG ("Special Internet Group") on the subject "SGML Design Issues Related to Applications." Topics include: CONCUR, LINK, content model ambiguity, etc. The foundational meeting for the new SIG is May 10, 1995. Contact: Arjan Loeffen at sgmltw@let.ruu.nl, or Tel. +31-30-536417, or Arthur van Horck, Tel. +31-13-662232.



[CR: 19951113]

Mabrouk, Mbark. Modele d'hyperdocument base sur le standard ISO 8613 ODA. Lyon: INSA, 1992. Extent: 195 pages.

Apparently based upon or being a thesis in the area of applied information science. The work apparently discusses SGML and ODA.



Mabrouk, M.; Dykiel, R.; Henry, J.; Pinon, J. M. "A Hyperdocument Model Based upon the ODA Standard." Pages 245-263 (with 15 references) in Intelligent Text and Image Handling: Proceedings of a Conference on Intelligent Text and Image Handling, "RIAO91" Barcelona, Spain, 2-5 April, 1991 [Conference organized by the Centre de Hautes Etudes Internationales d'Informatique Documentaire (CID), Center for the Advanced Study of Information Systems, Inc. (CASIS). Sponsored by the Commission of the European Communities, Minister of Education and Sciences, Spain; Minister of "Industrie en Aménagement du Territoire", France; et al.] Edited by André Lichnerowicz [Collège de France, Académie des Sciences de Paris]. Amsterdam/London/New York/Tokyo: Elsevier, 1991. xiii + 999 pages. ISBN: 0-444-89361-X. Authors' affiliation: Bull S.A. and INSA Lyon.



[CR: 19980305]

Mace, Scott; Flohr, Udo; Dobson, Rick; Graham, Tony. "Weaving a Better Web. [Reinventing the Web: XML and DHTML to Bring Order to the Chaos]." Byte Magazine 23/3 (March 1998) 58-68. ISSN: 0360-5280.

Abstract: "HTML 4.0 has barely been released, but to some of us it is dead on delivery. We're already looking past it to XML, the eXtensible Markup Language, which promises to add much more power, flexibility, and reliability to the web. This article serves as a great introduction to XML and, to a lesser degree, Dynamic HTML (DHTML). The online version of the article links you through to some of the essential documents on XML. If you are interested in the future of the web, listen up. As the authors of this article put it: 'Although it will require developers and users to retool, the migration to XML must begin. The future of the Web depends on it'." [from (c) '-- RT' in Current Cites 9(2) (February 1998) ISSN: 1060-2356.]

Another abstract: "We have a love/hate relationship with HTML. We love its easy learning curve and universality, but we hate its easily broken links and limited formatting. We love its simple and compact syntax, but we hate its rigid formatting and inflexibility. To keep what we love and jettison what we hate, we've scripted it, styled it, tabled it, and framed it. Yet, after more face lifts and tummy tucks than an aging Hollywood star, today's HTML is still just HTML. The broken links and formatting problems are just warts and cellulite that won't go away. It's time to find some new, fresh talent. A few new stars are about to break onto the scene with names like Extensible Markup Language (XML), cascading style sheets (CSS), and Dynamic HTML (DHTML). Each works on a slightly different set of HTML 3.2's problems: XML on helping organize and find data, CSS on Web page inheritance and presentation, and DHTML on dynamic presentation of Web content. Aided by the recent HTML 4.0 refresh, these new technologies will beat back HTML's legacy of too many dead links, slow searches, and static pages on today's Internet and intranets." [from authors? check -9804]

This article is the "Cover Story" in the March 1998 issue of Byte Magazine. Now online: "The features that made HTML so popular are causing the Web to fall apart. What's next?"



[CR: 19950716]

Maclean, G. E. "Setting Office Systems Standards." Words 14/2 (August-September 1985) 32-33. ISSN: 0164-4742; CODEN: WRDSDR.

"Syntopican XIII served as the host site for the American National Standards Institute's (ANSI) meeting of the American National Standards Committee (ANSC) for Office Systems. ANSC is the advisory group to the international committee on office standards for information processing systems and text and office systems. New standards are needed since office documents are becoming more complex, incorporating practices that were initially developed for printing and publishing applications. Six task groups were formed to address the issues of user requirements, document architecture, procedures for text interchange, content architectures (including requirements for character sets and coding, videotex in office systems and text interchange via magnetic media), text processing languages and user/system interfaces and symbols. ANSC is working to make itself known to users who feel alone with their incompatibility problems. One of its accomplishments has been to facilitate development of Standard Generalized Markup Language (SGML) for those in the printing and publishing industries to edit and mark copy, which is being considered a standard at the international and national levels."



MacLeod, Douglas. "Building Buildings: An Analogy for a Language to Build Architectures." <TAG> 14 (May, 1990) 7-10. Author affiliation: Barton/Meyers Associates, Los Angeles, CA.

The article is a revision of a presentation given at the GCA-sponsored conference SGML '89 (Atlanta, November 1989).



Macleod, Ian A. "Extending the Command Language Interface to Handle Marked-up Documents." Pages 192-196 (with 14 references) in Information in the Year 2000, From Research to Applications. ASIS '90. Proceedings of the 53rd Annual Meeting of the American Society for Information Science [Toronto, Ontario, Canada, 4-8 November 1990]. Edited by Diane Henderson. American Society for Information Science Proceedings of the Annual Meeting, 27. Medford, NJ, USA: American Society for Information Science [published for ASIS by Learned Information, Inc], 1990. ISBN: 0938734482. ISSN: 0044-7870. Author affiliation: Department of Computer and Information Science, Queen's University at Kingston, Kingston, Ontario, Canada.

Abstract: Two important international standards relating to text have emerged. One of these, SGML, describes a framework for descriptive markup. The other, and more recent, deals with a command language interface for full text retrieval. The two standards have been developed in isolation from one another and the command language can handle only the conventional view of text and not the relatively complex structures implicit in descriptive markup. It is shown how a relatively simple syntactic extension to the command language enables it to be applied to SGML databases. Some implementation issues are also discussed.



Macleod, Ian A. A Query Language for Retrieving Information from Hierarchic Text Structures. Technical Report 89-263. Kingston, Ontario: Queen's University Department of Computing and Information Science, August, 1989. 26 pages.

Abstract: Descriptive markup languages provide a mechanism for specifying the structure of a document. The basic premise of the work described here is that structure is an important characteristic of a document and is something more than a layout specification. For this reason, it appears important that retrieval tools should be developed which can take advantage of structural knowledge. In this paper, a query language is described which provides such a capability. The underlying implementation strategy is also discussed. [Funding: Supported by the Natural Sciences and Engineering Research Council of Canada.]



Macleod, Ian A. "A Query Language for Retrieving Information from Hierarchic Text Structures." Computer Journal 34/3 (June 1991) 254-264 (with 24 references). ISSN: 0010-4620. Author affiliation: Department of Computer and Information Science, Queen's University at Kingston, Kingston, Ontario, Canada.

Abstract: "Descriptive markup languages provide a mechanism for specifying the structure of a document. The basic premise of the work described here is that structure is an important characteristic of a document and is something more than a layout specification. For this reason, it appears important that retrieval tools should be developed which can take advantage of structural knowledge. A query language is described which provides such a capability. The underlying implementation strategy is also discussed."

See a previous version of the document in the Queen's University technical report.



[CR: 19960408]

Macleod, Ian A. "Guest Editorial: SGML into the Nineties." Computer Standards & Interfaces 18/1 (January 1996) 1-2 (as Special Issue Preface). ISSN: 0920-5489. Author's affiliation: Queen's University.

This introductory article was published in an SGML special issue of Computer Standards & Interfaces [The International Journal on the Development and Application of Standards for Computers, Data Communications and Interfaces], under the issue title SGML Into the Nineties. It was edited by Ian A. Macleod, of Queen's University.



[CR: 19960408]

Macleod, Ian A (special issue guest editor). SGML Special Issue: Computer Standards & Interfaces [The International Journal on the Development and Application of Standards for Computers, Data Communications and Interfaces]. Amsterdam: Elsevier Science Publishers B.V./North-Holland, 1995. ISSN: 0920-5489.

The journal Computer Standards & Interfaces is an Elsevier/North-Holland (Amsterdam) publication characterized as "The International Journal on the Development and Application of Standards for Computers, Data Communications and Interfaces." In mid-1995 it sponsored a special SGML issue, edited by Ian A. Macleod. The special issue title was: SGML Into the Nineties.

Articles include: David T. Barnard, Lou Burnard, and C. Michael Sperberg-McQueen, "Lessons from Using SGML in the Text Encoding Initiative"; Bart Bauwens, Filip Evenepoel, and Jan Engelen, "SGML as an Enabling Technology for Access to Digital Information by Print Disabled Readers"; Franz Burger and Sigfried Reich, "Design and Implementation of an Abstract SGML Interface in Smalltalk"; Patricia Francois, "Generalized SGML Repositories: Requirements and Modelling"; Matthew Fuchs, "The User Interface as Document: SGML and Distributed Applications"; Edward Levinson, "Exchanging SGML Documents Using Internet Mail and MIME"; Ian A. Macleod, "SGML into the Nineties"; Hans Holger Rath and Hans-Peter Wiedling, "Making SGML Work: Introducing SGML Into an Enterprise and Using its Possibilities in Advanced Applications"; Darrell R. Raymond, Frank Wm. Tompa, and Derick Wood, "From Data Representation to Data Model: Meta-Semantic Issues in the Evolution of SGML".

The 'Call for Papers' read as follows, in part: "SGML (the Standardized General Markup Language) is an international standard whose importance is rapidly growing. It is fair to say that the era of electronic text has finally arrived. A large number of potential text applications are seeking solutions, and there is significant industrial interest in the technologies being developed in the SGML context. In view of the high importance of SGML, Computer Standards and Interfaces is planning a special issue on this topic to be published in mid 1995. The goal is to collect papers incorporating important advances in the field. Topics of interest include, but are not limited to, the following: Novel applications of SGML; SGML databases and information retrieval; Languages for accessing and manipulating SGML structures; Hypertext/Hypermedia; Entity management; Visualisation and SGML; Related standards and SGML; Converting legacy databases to SGML; Tools for developing and using DTDs." See the full announcement for other publication details.



Macleod, Ian A. "Storage and Retrieval of Structured Documents." Information Processing and Management 26/2 (1990) 197-208. Author affiliation: Department of Computer and Information Science, Queen's University at Kingston, Kingston, Ontario, Canada.

Abstract: There have been a number of important related activities which suggest the need for a new model for text. ISO standards for document description have been recently developed. These standards view documents as hierarchical objects and it is likely that languages such as SGML will become widely used in the near future for document markup. As structured documents become available, so there will be a need to evolve tools to take advantage of structural knowledge. The goal of the work described here is to develop such tools. A conceptual model for bibliographic data has been designed. The model is known as Maestro (Management Environment for Structured Text Retrieval and Organization). It supports structured documents and provides a query language to retrieve and link information contained in these structures. In this paper an overview of Maestro is presented together with an outline of the basic implementation strategy.



Macleod, Ian A.; Barnard, David T.; Hamilton, D.; Levison, M. "SGML Documents and Non-linear Text Retrieval." Pages 226-244 (with 17 references) in Intelligent Text and Image Handling: Proceedings of a Conference on Intelligent Text and Image Handling, "RIAO91" Barcelona, Spain, 2-5 April, 1991 [Conference organized by the Centre de Hautes Etudes Internationales d'Informatique Documentaire (CID), Center for the Advanced Study of Information Systems, Inc. (CASIS). Sponsored by the Commission of the European Communities, Minister of Education and Sciences, Spain; Minister of "Industrie en Aménagement du Territoire", France; et al.] Edited by André Lichnerowicz [Collège de France, Académie des Sciences de Paris]. Amsterdam/London/New York/Tokyo: Elsevier, 1991. xiii + 999 pages. ISBN: 0-444-89361-X.

Abstract: Standard Generalized Markup Language (SGML) is an international standard for markup languages. Descriptive markup is a means whereby the logical structure of a document can be explicitly encoded. Such markup can subsequently be processed to provide an appropriate physical layout of the actual document content. Additionally, the logical structure provides the information necessary for highly context sensitive retrieval. In this paper the authors describe an SGML application which can process encoded documents into a format suitable for storage and retrieval by an appropriately powerful retrieval system.

It is also possible to encode links between documents using SGML. One technique is that suggested by the Text Encoding Initiative (TEI), a co-operative international venture to promote guidelines for the encoding and interchange of machine-readable texts. This paper also describes how such links can be processed to produce the equivalent structures in a document database.



[CR: 1995]

Macleod, Ian A.; Narine, D. "A Depository for Structured Text Objects." Pages 272-282 (with 18 references) in Database and Expert Systems Applications. Proceedings of the Sixth International Conference, DEXA 95. DEXA '95: Database and Expert Systems Applications, London, UK. September 4-8, 1995. Edited by Norman Revell and A. Min Tjoa. Lecture notes in computer science, Number 978. Berlin/New York: Springer-Verlag, 1995. ISBN: 3540603034. ISSN: 0302-9743. Authors' affiliation: Department of Computing and Information Science, Queen's University, Kingston, Ontario, Canada.

"Abstract: This paper describes Godot (Generalized Object Depository Oriented to Text), a depository for structured text. The work is heavily influenced by international standards relating to text. The physical storage model, which is built on top of an implementation of the ISO DFR (Document Filing and Retrieval) standard, is described. It is shown how structured SGML documents can be incorporated within the storage model. An overview of the underlying object-oriented implementation is given and the basic access operations described. Examples of structural queries are provided."



Macleod, Ian A.; Nordin, Brent; Barnard, David T.; Hamilton, Doug. "A Framework for Developing SGML Applications." Pages 53-63 in EP [Electronic Publishing] 92: Proceedings of Electronic Publishing, 1992 International Conference on Electronic Publishing, Document Manipulation, and Typography. Swiss Federal Institute of Technology, Lausanne, Switzerland. April 7-10, 1992. Sponsored by the Swiss Federal Institute of Technology and the Swiss National Science Foundation. Edited by Christine Vanoirbeek and Giovanni Coray [EPF, Lausanne, Switzerland]. The Cambridge Series on Electronic Publishing. Cambridge: Cambridge University Press, 1992. ISBN: 0-521-43277-4. Authors' affiliation: Queen's University, Kingston, Ontario, K7L 3N6 CANADA.

Abstract: SGML [C. F. Goldfarb, editor, The Standard Generalized Markup Language (ISO 8879)] is a passive standard. That is, it provides mechanisms through which descriptive markup can be applied to documents, but says nothing about how these documents are to be processed. The SGML standard refers frequently to the "application" but includes no clean mechanism for attaching applications to SGML parsers. The purpose of this paper is to present one such mechanism while maintaining compatibility with the current SGML standard.



Macleod, Ian; Reuber, A. R. "The Array Model: A Conceptual Modeling Approach to Document Retrieval." Journal of the American Society for Information Science 38/3 (1987) 162-170.

Abstract: Recent research has sought to build document-retrieval systems on top of relational database management systems (DBMS) in order to increase the power of document retrieval. While the use of DBMS shows a more flexible approach to designing search strategies, the underlying representation of the information is inflexible and does not correspond to either the structure or the meaning of the real-world objects. This limitation can be overcome through the use of conceptual modelling techniques. The array model presented here is based on these techniques and has been designed specifically for application in document retrieval.



[CR: 19950925]

MacNee, C. A.; Behrendt, W.; Kalmus, J. R.; Jeffery, K. G. "Presenting Dynamically Expandable Hypermedia." Information and Software Technology 37/7 (July 1995) 339-50 (with 16 references). Authors' affiliation: Rutherford Appleton Laboratory, Chilton, UK.

"Abstract: The Multimedia Information Presentation System (MIPS) allows end-users to browse multimedia information presented in a user-friendly and consistent manner. In its most powerful configuration, it will allow the end-user to formulate queries which are interpreted, analysed, and despatched by the system to heterogeneous distributed external data sources, and to view a coherent and customized presentation of the data retrieved as answers. Data are stored in, or referenced from, a set of hyperdocuments conforming to the ISO standards HyTime and SGML. The hyperdocuments constitute an information web which may be dynamically expanded to accommodate retrieved data. The web navigation structure, structure of information nodes, specification of presentation mechanisms, specification of presentation tools, and data are separable and potentially reusable for different applications, different activities within an application, or different environments. The authors outline the intended functionality and the design of MIPS, with particular reference to the structure and function of the hypermedia web and the role of the knowledge base system module in its dynamic expansion."



[CR: 19971024]

Madigan, Chris; Silber, Michael K.; Wilson, Suzanne. "Lessons Learned Prototyping an SGML-based Computerized Document Management System." IEEE Transactions on Professional Communication 40/2 (June 1997) 139-143. ISSN: 0361-1434. Authors' affiliation: [ ].

Abstract: "In developing new ways to publish vast amounts of information, many technical communication teams face problems that go far beyond the challenges of one book, a series of books, or even a series of CD-ROMs. Technical communicators begin to face a constellation of problems that are more like those that have plagued software development since it became a distinct profession in the 1960s. At first a project appears promising. Then, as the work begins and progresses, we become enmeshed in interlocking problems of management, purchasing, staffing, training, installation, integration, and vision. This article summarizes the lessons learned from a major effort to use the Standard Generalized Markup Language (SGML) to pull together into a single, accessible, electronic "publication" large amounts of very complicated information."

Note: This article is part of a special issue of IEEE Transactions on Professional Communication (with an introduction by Jonathan Price): "Structuring Complex Information for Electronic Publication."



[CR: 19971125]

Mäkelä, Riku; Sundquist, Risto; Vendelin, Timo. "Keep it Simple - Interactive Electronic Applications with SGML." Page(s) 273-276 in SGML '97 Conference Proceedings. SGML Europe '97. "The Next Decade - Pushing the Envelope." Princesa Sofia Intercontinental, Barcelona, Spain. 11-15 May, 1997. Sponsored by Graphic Communications Association (GCA) and SGML Open. Conference Chair: Pamela L. Gennusa (Director, Database Publishing Systems Ltd). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 342 pages, CDROM. Author's affiliation: Information Standards-based Multimedia System Projects, Remtec Systems, Ltd, Espoo, Finland; Email: Riku.Makela@remtec.fi.

Abstract: "The SGML (Standard Generalized Markup Language) world has concentrated on solving the problems of textual documentation. SGML and other information standards are rather complex to take into wide use. SGML alone is not enough to implement working solutions. There are a large number of methods, models and naming conventions developed for different application areas: microdocuments, components that contain bigger element hierarchy, etc.

"This paper describes a keep-it-simple model as a base for interactive electronic applications. The model keeps the data in life-cycle safe format (SGML), but still gives the end-user any possible view to the data and interaction with it. One design goal of the model was to separate information, functionality and user interface from each other.

"Information is managed in SGML, HyTime (Hypermedia-Time-based Structuring Language, ISO 10744), and DSSSL (Document Style Semantics and Specification Language) formats. The information packages, that travel between client and server (and between applications), are modeled with information standards. Functionality is achieved with engines on client and/or server side. The user interface language is HTML (HyperText Markup Language) and Java applets.

"XML (Extensible Markup Language) brings lightness and data format independence when HTML provides a common user interface description language. Java applets are a modular solution for interface functionality and platform independence. HyTime is the way to link the information chunks together in a standard way. Data is stored in databases that are part of the application or part of the information infrastructure.

"This presentation contains models for putting SGML and other information standards to work for wide range of interactive electronic multimedia applications."

Note: The electronic conference proceedings in hypertext were produced by Inso Corporation (DynaText) and by High Text (EnLIGHTeN). Information about the SGML Europe '97 Conference may be found in the main database entry.



[CR: 19990610]

Magnusson Sjöberg, Cecilia. Critical Factors in Legal Document Management: A Study of Standardised Markup Languages. Stockholm: Jure AB, 1998. Extent: 458 pages. ISBN: 91-7223-045-2. Author's affiliation: Doctor of Law (LL.D.) and Associate Professor of Law and Informatics. Project Manager, Corpus Legis Project. The Swedish Law & Informatics Research Institute. Email: cecilia.magnussonsjoberg@juridicum.su.se; WWW: http://www.juridicum.su.se/iri/cems/.

Abstract: "This book is meant as a guide to modern handling of legal information with the aid of standardized markup languages, in response to the well-known need for sharpened tools for managing the rapidly growing amount of legal information in combination with transborder data flows, especially on the Internet. The SGML and XML international standards for document description are becoming increasingly important for the legal domain in these respects.

"The content is based on empirical results reached in the Corpus Legis Project. This interdisciplinary research programme began in 1994 at the Faculty of Law, Stockholm University and it has led to three different IT-applications, which may be categorised according to the following profiles: (1) hypertext based systems, (2) advanced information retrieval systems, and (3) general electronic document and management systems.

"Experiences from this practical work are described in the book. Major activities associated with the development of an SGML system, e.g. document analysis, DTD-design (Document Type Definition), and markup, are described from a legal point of view. The study comprises document types originating from different national legal systems, written in various languages, and covering a broad time perspective. The book can thus be seen as a checklist of critical factors in legal document management." [from the online book description]

A related work is represented by the publication The Comparative Part of the Corpus Legis Project - Using SGML for Intelligent Information Retrieval of Legal Documents. Authors: Haider, Georg, Magnusson Sjöberg, Cecilia, Quirchmayr, Gerald, Sebald, Verena, EXPERSYS-96, Artificial Intelligence Applications. J Zarka, E. Mercier-Laurent, D.L. Crabtree, M. Narasipuram. In: Technology Transfer Series. pp. 181-186. Editor: A. Niku-Lari. Other publications by the author are listed in the Corpus Legis Final Project Documentation. On Corpus Legis, see "The Corpus Legis Project." See also the author's list of publications.

The Table of Contents for the book is available online; [local archive copy]. See also the book announcement/review at: http://www.sub.su.se/juridik/subiura/1999-1.htm.

This book may be ordered from JURE Law Books, Artillerigatan 67, SE-114 45 Stockholm, Sweden. Phone: +46-8-662 00 80; Fax: +46-8-662 0086; Email: order@jure.se.



[CR: 19960312]

Maguire, Mark. "Secure SGML - A Proposal to the Information Community." Journal of Scholarly Publishing [Downsview, Ontario, Canada] 25/3 (April 1994) 146-156. ISSN: 0036-634X. Author's affiliation: University of Toronto, Faculty of Library and Information Science, Toronto M5S 1A1, Ontario, Canada.



[CR: 19960716]

Mah, Carole E. "An Exploration of Problems Unique to Descriptive Markup." <TAG>: The SGML Newsletter 9/6 (June 1996) 1-6. ISSN: 1067-9197. Authors' affiliation: Brown University Women Writers Project.

Abstract from the online version of the paper (co-authored with Julia H. Flanders): "This paper presents two groups of text encoding problems encountered by the Brown University Women Writers Project (WWP). The WWP is creating a full-text database of transcriptions of pre-1830 printed books written by women in English. For encoding our texts we use Standard Generalized Markup Language (SGML), following the Text Encoding Initiative's Guidelines for Electronic Text Encoding and Interchange. SGML is a powerful text encoding system for describing complex textual features, but a full expression of these may require very complex encoding, and careful thought about the intended purpose of the encoded text. We present here several possible approaches to these encoding problems, and analyse the issues they raise."

The TAG article focuses upon various strategies for using the TEI's so-called mirror tags for the encoding of simple variants, such as an apparent error and a correction. For example: <abbr expan=""></abbr> (storing the expansion for an abbreviation in an attribute value and the abbreviation in the content) versus the mirror, <expan abbr=""></expan> (storing the abbreviation's expansion in the content). See the TEI P3 Guidelines (1994) Chapter 6, pages 163-170. The author illustrates how nesting such elements can lead to perplexing logic, depending upon how processing is assumed to take place. The article is based upon a longer discussion published in a scholarly journal as "Some Problems of TEI Markup and Early Printed Books." See a version of the document on the STG Web server: http://dynaweb.stg.brown.edu/wwp_books/DL/1.toc. For the specific section on dual emendation: http://dynaweb.stg.brown.edu/wwp_books/DL/57; [mirror copy, partial links only].
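In the flat, non-nested case the two mirror encodings carry the same pair of strings, so one form can be rewritten as the other mechanically. The following is a minimal Python sketch of that rewrite, not a method taken from the article: it assumes well-formed XML input rather than full SGML, the helper name `to_expan_form` is hypothetical, and the sample abbreviation is invented for illustration.

```python
import xml.etree.ElementTree as ET


def to_expan_form(abbr_elem):
    """Rewrite <abbr expan="X">Y</abbr> as its mirror <expan abbr="Y">X</expan>.

    Hypothetical helper for illustration only; it handles flat content
    (plain character data) and ignores any child elements.
    """
    # The abbreviation moves from element content into the attribute value.
    mirror = ET.Element("expan", {"abbr": abbr_elem.text or ""})
    # The expansion moves from the attribute value into element content.
    mirror.text = abbr_elem.get("expan", "")
    return mirror


# Invented sample: the abbreviation "wch" with the expansion "which".
source = ET.fromstring('<abbr expan="which">wch</abbr>')
mirrored = to_expan_form(source)
print(ET.tostring(mirrored, encoding="unicode"))
```

The sketch deliberately stops at flat content. An attribute value can hold only character data, so content that itself contains markup cannot be moved into the attribute without losing the inner elements, and the rewrite is no longer a simple swap once such elements nest.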



[CR: 19980123]

Mah, Carole; Flanders, Julia; Lavagnino, John. "Some Problems of TEI Markup and Early Printed Books." Computers and the Humanities (CHUM) 31/1 (1997) 31-46 (with 9 references). ISSN: 0010-4817. Authors' affiliation: Brown University Women Writers Project.

Abstract: "This paper presents two groups of text encoding problems encountered by the Brown University Women Writers Project (WWP). The WWP is creating a full-text database of transcriptions of pre-1830 printed books written by women in English. For encoding our texts we use Standard Generalized Markup Language (SGML), following the Text Encoding Initiative's Guidelines for Text Encoding and Interchange. SGML is a powerful text encoding system for describing complex textual features, but a full expression of these may require very complex encoding, and careful thought about the intended purpose of the encoded text. We present here several possible approaches to these encoding problems, and analyze the issues they raise."

Also apparently published as: "Some Problems of TEI Markup and Early Printed Books," Carole Mah, Julia Flanders and John Lavagnino, [forthcoming in] Revue Informatique et Statistique dans les Sciences Humaines. Université de Liège, 32.1-4 (1996). And: Julia Flanders, Some Problems of TEI Markup and Early Printed Books, Paper presented at Digital Libraries Workshop 1996, Organized by Nancy Ide and Judith Klavans, Held in conjunction with the First ACM International Conference on Digital Libraries, Bethesda, Maryland, 1996.

See the main database entry: The Brown University Women Writers Project.



[CR: 19981109]

Maler, Eve; El Andaloussi, Jeanne. Developing SGML DTDs: From Text to Model to Markup. Upper Saddle River, NJ: Prentice Hall PTR, 1996. Extent: 560 pages. ISBN: 0-13-309881-8. Author contacts: [Eve Maler] elm@arbortext.com (Burlington, MA); [El Andaloussi] Berger-Levrault, Aubervilliers, France; email: jela@berger-levrault.fr.

Summary: Every SGML document must conform to some specified Document Type Definition (DTD). Maler and El Andaloussi explain the basics of DTD design, then present a methodology and series of techniques to help information professionals design, implement and document DTDs. [publisher's pre-publication description]

The book offers detailed treatment of the SGML "Document Type Definition (DTD) -- specifications that form the foundation for every document based on the SGML language. Therefore DTD quality is too important to be left to chance. This guide shows how to develop DTDs that work, based on a proven methodology and techniques. KEY TOPICS: The book explains how DTD development benefits from the same rigorous treatment as software development: Articulate project goals, analyze requirements, write specifications, design and implement readable and maintainable code using good programming style, perform thorough testing, and document the work along the way. MARKET: The book is appropriate for writers, editors, and other subject matter experts; software developers and other DTD implementors; and publishing managers."

Additional information on the book is accessible via the Prentice Hall WWW server: http://www.prenhall.com/allbooks/ptr_0133098818.html - [mirror copy], [formerly].

Chet Ensign published a review of this book in "Structure Rules! Why DTDs Matter After All" (Markup Languages: Theory & Practice Volume 1, Number 1 [Winter 1999]). See the abstract in the issue summary, and the expanded/annotated Table of Contents in Deborah A. Lapeyre's complementary review article.



[CR: 19950903]

Mallery, Mary. "A Report on the 1994 CETH Summer Seminar: Electronic Texts in the Humanities: Methods and Tools." CETH Newsletter 2/2 (Fall, 1994) 8-10. ISSN: 1071-7692. Author's affiliation: MLS Candidate at the School of Communication, Information and Library Studies, Rutgers University.

Abstract: "The third CETH Summer Seminar, co-sponsored by the Centre for Computing in the Humanities, University of Toronto, was held at Princeton University in the final two weeks of June. The thirty participants hailed from seven countries including the United States: Spain, Sweden, Canada, Australia, New Zealand, and Hong Kong. Participants also came from a variety of disciplines: humanities scholarship, computing, publishing, and the library communities. For two weeks we shared the facilities at Princeton University and tried to speak each other's language and see electronic texts from one another's point of view."

Also available via the Internet: CETH Newsletter, Fall 1994/1994 CETH Summer Seminar Report [or mirror copy]. For further details on the seminar, see the main entry.



Mallery, Mary. Report on the ACRL Electronic Text Centers Discussion Group at the American Library Association Mid-Winter Meeting, February 4, 1995: "Markup and Access Techniques for Electronic Texts: TEI and SGML". Posting to ETEXTCTR Discussion List, March 16, 1995, "Subject: Report on the ACRL E-Text Centers Discussion Group 2/95". March, 1995.

The report summarizes major presentations by John Price-Wilkin (University of Michigan, HTI) and Gregory Murphy (CETH). Announcements about the CORE Project and SGML Open are also given. A copy of the report is available here.



[CR: 19951015]

Mallery, Mary. Report on the ACRL E-Text Center Discussion Group Meeting at the American Library Association Meeting in Chicago, IL, June 24, 2-4 PM: "Putting E-Texts on the Net: Three Perspectives". Posting to ETEXTCTR Discussion List, October 17, 1995, "Report on the ACRL E-Text Center Discussion Group, June '95". October, 1995.

Mallery supplies a detailed report from the session, chaired by Marianne Gaunt. Major presentations -- all dealing with SGML in some detail -- were given by David Seaman, Coordinator of the University of Virginia's Electronic Text Center; Mark Day, Co-Director of LETRS (Library Electronic Text Resources Services) at Indiana University, with Perry Willett, the Coordinator of Collection Development of LETRS and Librarian for English and American Literature, Indiana University Libraries; and Gregory Murphy, Text Systems Manager at CETH (Center for Electronic Texts in the Humanities). An online version of the report is available.



[CR: 19950926]

Mamrak, Sandra A. Benefits of Using an Integrated Architecture for Data Translation. Technical Research Report OSU-CISRC-TR25-1991. Columbus, Ohio: The Ohio State University, Department of Computer and Information Science, 1991.

Abstract: "The Chameleon Research Project has demonstrated that for one particular class of encoding schemes the problem of data translation can be solved in a general, elegant way. A translation technology, the Integrated Chameleon Architecture, unique in its combined functionality and integration, automates the generation of translation code. The technology eliminates coding errors, reduces iterations to achieve correct translations, and increases productivity. In this manuscript we describe how the architecture is used to specify translators and we report on our experience using the technology to develop translators for a document of type 'book' and for bibliographic databases." [Keywords: code generation, data translation, intermediate form, SGML.]



Mamrak, Sandra A. An Overview of the MANDEN Project: A Computerized System to Support Scholarly Writing. Technical Research Report OSU-CISRC-6/89-TR23. Columbus, Ohio: The Ohio State University, Department of Computer and Information Science, June, 1989. iv + 17 pages.

Existing computer systems to support scholarly writing are inadequate to meet the needs of authors. This paper presents a new model of scholarly writing, merging elements from several models of a scholar as a composing author. The new model identifies the activities that encompass the authoring task, arranged into three stages. The middle stage, composition, is bracketed by stages that support activities peripheral to the primary writing endeavor of forming a coherent sequence out of a set of ideas, notes, figures, and so on. A computer system that implements this model would eliminate inadequacies of existing support systems. The MAnuscript Development ENvironment, or MANDEN, project is building a prototype software architecture to instantiate the model. This paper describes the motivation for and the details of the writing model and identifies components of the model that are currently under development. Companion papers describe these components in more detail.



[CR: 19950926]

Mamrak, Sandra A.; Barnes, Julie. Comparing Tools and Techniques for Data Translation. Technical Research Report OSU-CISRC-TR08-1994. Columbus, Ohio: The Ohio State University, Department of Computer and Information Science, February 28, 1994.

Abstract: "A translation often is required from one specific electronic encoding of a document to another. For example, an author may wish to translate an article encoded with LATEX to the same article encoded using Scribe, or using a macro version of the troff family. Different techniques and tools exist for achieving such translations. Techniques include using a pairwise or intermediate-form approach to the translation. Tools include programming languages, code-generating tools, and integrated, code-generating toolsets.

A person faced with a translation problem must choose among the various combinations of techniques and tools, with little guidance as to the comparative effort or quality that is achievable with different approaches. In this paper we discuss the complexity of comparing the various approaches. We describe an experiment that we have undertaken to begin to generate some comparative data. And, we discuss the potential significance of the experimental data. (From the Introduction).

Keywords and categories: Software Engineering; Data Storage Representations; Text Processing; Computers in Other Systems; format and notation, publishing; data translation, pairwise translation, intermediate-form translation."

Available online in PostScript format from OSU ([mirror copy of the text]). See also the appendices in files: ftp.cis.ohio-state.edu/pub/tech-report/1994/TR08-DIR/appendix1.gz, ...appendix2.gz, ...appendix3.gz, ...appendix4.gz, ...appendix5.gz.



Mamrak, Sandra A.; Barnes, Julie. Guidelines for the Preparation of SGML Document Type Definitions. Technical Report OSU-CISRC-7/90-TR16. Columbus, Ohio: The Ohio State University, Department of Computer and Information Science, July, 1990. 21 pages.

Abstract: The Standard Generalized Markup Language, SGML, is being adopted by various international organizations as the medium for exchange of electronically encoded documents. An exchange is accomplished by way of a Document Type Definition, DTD, that describes the content of documents targeted for an exchange. In this paper we suggest guidelines for the designers of SGML DTDs. The guidelines emphasize uniformity and simplicity without sacrificing expressive power.



Mamrak, Sandra A.; Barnes, Julie. Guidelines for Specifying Data Representations in SGML Document Databases. Technical Report, OSU. Submitted for publication in: Electronic Publishing: Origination, Dissemination and Design. Columbus, Ohio: The Ohio State University, Department of Computer and Information Science, October 12, 1989. 26 pages.

There exists a huge store of electronically encoded data, comprising a broad and varied collection of document databases. Examples of such electronic stores are various corpora, dictionaries, thesauri, and databases holding legal documents, abstracts of scientific manuscripts, and catalog card information. A primary goal of creating electronic stores of these data is to make them accessible to a wide audience, for a wide variety of activities such as queries, online data delivery and display, data exchange, and data analysis. An obstacle to achieving the goal of developing a full, rich set of software tools for data access is the often undue complexity of the underlying data representations. In the past, database designers have typically chosen their own, idiosyncratic representations, leaving software developers with the task of recreating scanners and parsers, and other components common to many software tools, from scratch for each database. The situation has greatly improved with the advent of standardized data representations of document databases, as encouraged by the Standard Generalized Markup Language, SGML, for example. However, these standard representations can themselves be utilized in such a way as to leave the final data representation difficult to access by humans and machines alike. In this paper we suggest guidelines for the designer of SGML document databases that emphasize uniformity and simplicity in the data representation, with little or no necessary loss of expressive power or functionality.



Mamrak, Sandra A.; Barnes, Julie; Bushek, Joan; Nicholas, Charles K. Translation between Content-Oriented Text Formatters: Scribe, LaTeX and Troff. Technical Report OSU-CISRC-8/88-TR23. Columbus, Ohio: The Ohio State University, Department of Computer and Information Science, August, 1988. vi + 32 + 62 pages, with 8 appendices.

There is a need for widespread exchange of electronic documents in domains as diverse as book publishing, automated offices, factories and research laboratories. The variety of data representations, and the subsequent need for data translation, is a major obstacle to this exchange. This article describes our experiences in developing translators among three specific text formatters: Scribe, LaTeX and Troff. We used a standard form approach in developing the translation capability. We chose a Standard Generalized Markup Language (SGML) data type definition (DTD) for the manuscript type article, developed by the Association of American Publishers, as the basis for our work. We describe the difficulties that we encountered in developing the translators and present guidelines for future definers of SGML DTD's and future developers of translators for these DTD's (sic!).



Mamrak, Sandra A.; Barnes, Julie; Hong, H.; Joseph, C.; Kaelbling, Michael; Nicholas, Charles; O'Connell, Conleth; Share, M. "Descriptive Markup - The Best Approach?" Communications of the Association for Computing Machinery 31/7 (July 1988) 810-811. Authors' affiliation: The Chameleon Research Group, Department of Computer and Information Science, Ohio State University.

This is a reply to the CACM article of Coombs/DeRose/Renear. The authors argue that a few of the claims are overstated, and that some of the difficulties in the use of descriptive markup (e.g., document portability) are trivialized.



Mamrak, Sandra A.; Kaelbling, Michael J.; Nicholas, C. K.; Share, M. "Chameleon: A System for Solving the Data-Translation Problem." IEEE Transactions on Software Engineering 15/9 (September 1989) 1090-1108. ISSN: 0098-5589. Contact: Sandra A. Mamrak, Department of Computer and Information Science, The Ohio State University, 2036 Neil Ave., Columbus, OH USA 43210-1277. Email (Internet): mamrak@cis.ohio-state.edu OR mamrak@oboe.cis.ohio-state.edu.

Abstract: "There is a need for widespread exchange of electronic documents in domains as diverse as book publishing, automated offices, factories, and research laboratories. The variety of data representations, and the subsequent need for data translation, is a major obstacle to this exchange. This paper describes a comprehensive data translation system with the following characteristics: 1) it is derived from a formal model of the translation task; 2) it supports the building of translation tools; 3) it supports the use of translation tools; and 4) it is accessible to its targeted end-users. A software architecture to achieve the translation capability is fully implemented. Translators have been generated using the architecture, both by the original software developers and by industrial associates who have installed the architecture at their own sites."



[CR: 19961121]

Mamrak, Sandra A.; Kaelbling, Michael J.; Nicholas, C. K.; Share, M. "A Software Architecture for Supporting the Exchange of Electronic Manuscripts." Communications of the ACM 30/5 (May, 1987) 404-418 (with 15 references). ISSN: 0001-0782. Authors' affiliation: AITRC, Department of Computer and Information Science, Ohio State University.

Abstract: "As electronic manuscript exchange becomes more prevalent, problems arise in translating among the wide variety of electronic representations. The optimum solution is a system that can support both the use and the creation of translation tools."

Review: "This is a somewhat long-winded yet readable paper about how to organize a collection of tools to help translate between a standard manuscript representation (based on ISO SGML) and the myriad existing document representations. The authors point out that, owing to the high-level structure of SGML, translating to other representations is straightforward, but translating in the other direction inherently requires human assistance. Unfortunately, they have not yet implemented much of what they propose. The paper would have been more valuable if it had (1) left out the general discussion of the merits of standardization, and (2) emphasized more clearly the general translation paradigm, since it also applies to other applications, such as translating between musical scores and soundtracks." [review by David Alex Lamb, in the CACM database]



[CR: 19950914]

Mancino, Piero. Can the Open Document Architecture (ODA) Standard Change the World of Information Technology? A study of the Documentation Standard Open Document Architecture (ODA, ISO 8613) for Information Technology. Ericsson Telecom Technical Report, and ODA Master's Thesis. Rijen, The Netherlands/Stockholm, Sweden: Ericsson Telecom, September, 1994. Extent: 880K PostScript file, 80 pages. Author's affiliation: Ericsson Telecom AB.

Available in PostScript format or in .gz-compressed PostScript format. Mirror copy here (September 1995). Email: Pietro Mancino, piero@stdoca.ericsson.se



Marchal, Benoît. An Introduction to SGML. Tech Report. [privately published], 1995. Extent: 39K HTML file. Author's affiliation: Assurnet sc (Belgium).

Abstract: "This work introduces to the Standard Generalized Markup Language (SGML, formally ISO 8879). SGML is an international standard for electronic document exchange. SGML is the basis of the highly popular HyperText Markup Language (HTML) which, together with URLs (Uniform Resource Locators) and HTTP (HyperText Transfer Protocol), is one of the foundations of the World Wide Web initiative (WWW also known as Web)."

Available online: HTML format: An Introduction to SGML by ben marchal, or text format. Author contact: ben@brainlink.com



Marchionini, Gary; Crane, Gregory. "Evaluating Hypermedia and Learning: Methods and Results from the Perseus Project." ACM Transactions on Information Systems 12/1 (January, 1994) 5 (30 pages total).

Abstract: The Perseus Project is a large-scale hypermedia research effort based on the premise that multifaceted evaluations of interactions between such emerging technologies as hypermedia and such complex human processes as learning guide the development of specific systems and illuminate human performance in electronic environments. Perseus is intended to provide an environment that lets people work more effectively with various primary source materials, including visual and textual materials, than is possible in print. Students may learn less quantitative information than in a course that runs through a fixed and linear curriculum, but they will develop an attitude of disciplined and respectful skepticism toward published interpretations. Perseus combines texts, images, and programs comprising a set of HyperCard stacks and data files; structured data include SGML texts, relational tables for catalogs and encyclopedia, PostScript drawings, LandSat images, and 35mm slides.



[CR: 19970715]

Marcoux, Yves. Place de SGML parmi les nouvelles architectures documentaires. Paper presented at the conference "Technologie SGML 1996", Ottawa, March 27, 1996. Montreal: Université de Montréal, March 1996. Extent: approximately 8 pages. Author's affiliation: <GRDS> - EBSI, Université de Montréal. WWW: Marcoux Home Page.

The paper is also published in Actes de la conférence "Technologie SGML 1996" organisée par le Centre de recherche en droit public de l'Université de Montréal, CRDP, 1996, pages 1-13. See also the conference entry, or possibly the site page. An online version is available in HTML format; [mirror copy, text only]; or in Word format.



[CR: 19961030]

Marcoux, Yves. "Les formats normalisés de documents électroniques." ICO Québec 6/1-2 (printemps 1994) 56-65. Author's affiliation: <GRDS> - EBSI, Université de Montréal.

See the main entry for EBSI-GRDS.



[CR: 19970531]

Marcoux, Yves et Martin Sévigny. "Querying hierarchical text and acyclic hypertext with generalized context-free grammars." Accepté pour publication, RIAO 1997.

See: RIAO'97 CONFERENCE, conference entry.



[CR: 19970531]

Marcoux, Yves; Sévigny, Martin. "Why SGML? Why now?" Journal of the American Society for Information Science [?]/[?] ([forthcoming 1996]) ??. ISSN: 0002-8231. Authors' affiliation: <GRDS> - EBSI, Université de Montréal.

A version of the document from 1995 (see "Pourquoi SGML? Pourquoi maintenant?" - below) is available online: http://www.droit.umontreal.ca/crdp/fr/equipes/technologie/conferences/sgmlquebec/12.html; [mirror copy]

The JASIS version of the article is to appear in a special journal issue dedicated to structured information and standards for document architectures. See the main entry for EBSI-GRDS. See also (pending more complete bibliographic work): (1) MARCOUX, Yves. "Pourquoi SGML? Pourquoi maintenant?" Actes de la conférence "SGML et inforoutes; pour la diffusion optimale de l'information gouvernementale et juridique" organisée par le Centre de recherche en droit public de l'Université de Montréal et l'EBSI, CRDP, 1995, pp. 55-69; (2) MARCOUX, Yves. "Les formats normalisés de documents électroniques." ICO Québec, vol. 6, nos 1-2, printemps 1994, pp. 56-65; (3) HUARD, Guy; MARCOUX, Yves; POULIN, Daniel. Le SGML en documentation juridique et gouvernementale: potentiel et mise en oeuvre. Éditeur officiel du Québec, Québec, 1995; (4) Marcoux, Yves et Martin Sévigny. "Querying hierarchically structured texts with generalized context-free grammars." To appear in Proceedings of the 1996 annual SIGIR conference, 1996; (5) Sévigny, Martin et Yves Marcoux. "Conception et réalisation d'une interface-utilisateurs pour l'interrogation de bases de documents structurés." Soumis pour publication, 1996. [Details on the Home Pages of Yves Marcoux and Martin Sévigny]



[CR: 19970817]

Marcoux, Yves; Sévigny, Martin. "Why SGML? Why Now?" Pages 584-592 (with 16 references) in Structured Information/Standards for Document Architectures. Edited by Elisabeth Logan and Marvin Pollard. = Journal of the American Society for Information Science, Special Issue. Volume 48, Number 7 (July 1997). New York: John Wiley & Sons Inc., 1997. ISSN: 0002-8231. Authors' affiliation: [Marcoux]: GRDS - École de bibliothéconomie et des sciences de l'information (EBSI), Université de Montréal, WWW: http://tornade.ERE.UMontreal.CA:80/~marcoux/; [Sévigny], Email: Martin.2.Sevigny@hec.ca or WWW: http://www3.sympatico.ca/msevigny/.

See the predecessor to this article in French: "Pourquoi SGML? Pourquoi maintenant?" in Actes de la conférence "SGML et inforoutes; pour la diffusion optimale de l'information gouvernementale et juridique" organisée par le Centre de recherche en droit public de l'Université de Montréal et l'EBSI, CRDP, 1995, pages 55 - 69.

See the main document entry for the complete list of articles and contributors, as well as other bibliographic information.



[CR: 19971227]

Marichal, Gilles. "What SGML-Based Software for the Visually-Impaired Can Teach the Next Generation of Interface Designers." Pages 311-316 in SGML/XML '97 Conference Proceedings. SGML/XML '97. "SGML is Alive, Growing, Evolving!" The Washington Sheraton Hotel, Washington, D.C., USA. December 7 - 12, 1997. Sponsored by the Graphic Communications Association (GCA) and Co-sponsored by SGML Open. Conference Chairs: Tommie Usdin (Chair, Mulberry Technologies), Debbie Lapeyre (Co-Chair, Mulberry Technologies); Michael Sperberg-McQueen (Co-Chair, University of Illinois). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 691 pages, CDROM; print volume contains author and title indexes, keyword and acronym lists. Author's affiliation: [Gilles Marichal]: Project Manager, Grif S.A., Immeuble Le Florestan 2, Boulevard Vauban - BP 266, F-78053 Saint Quentin en Yvelines Cedex, France; Phone: +33 1 30 12 14 30; FAX: +33 1 30 64 06 46; Email: Gilles.Marichal@grif.fr; WWW: http://www.grif.fr/.

Abstract: "MATHS (Mathematical Access for Technology and Science) is a recently completed project of the European Community which developed a SGML-based computer-oriented approach to teaching mathematics to visually-impaired students. A document-oriented architecture was implemented which permitted the software to be used by sighted, low-vision, and blind students and which supported multiple interactive input methods. SGML was the core component as it permitted the implementation of an application specific to mathematics, supported the input, output and interaction modes defined in the architecture, and enabled implementation with an existing SGML editor in a relatively short period of time. The knowledge gained in the MATHS project is not just of use to specialists in the area of accessibility but is of general applicability to human-computer interaction. In addition to describing the architecture of MATHS and its encoding of mathematics in SGML, this presentation will suggest ways to relate the results of the MATHS project to the general problem of computer application interfaces design."

"MATHS SGML markup was partly visual/presentational and partly semantic/structural, a balance which enabled a single application to provide good visual presentation and all the hooks for software processing. In order to manipulate mathematical expressions, all active objects were available to the software, providing practical lessons to the designers of the forthcoming document object model. Voice and other means of input and of output, critical to the operation of the MATHS environment, are being designed into the next generation of operating systems in order to address problems like repetitive motion syndrome, to make the computer useful to workers in types of work where it is not convenient to rely on a keyboard, and to increase the productivity of all computer users. The recognition that individuals absorb information in many different ways and the desire in interface design to make the presentation of information more flexible highlights the importance in MATHS of the great degree of customization possible in the presentation of math and its multiple modes of operation. Finally, MATHS uses SGML to implement a learning environment which is based on an abstract layer (the DTD) and is extensible (modify or replace the DTD). This extensibility was judged to be critical since mathematics is a field where both notations and pedagogical concepts are constantly evolving. The same approach to extensibility can be used to implement and evolve metaphors used in interface design."

This paper was delivered as part of the "Expert" track in the SGML/XML '97 Conference.

Note: The SGML/XML '97 conference proceedings volume is available from the Graphic Communications Association, 100 Daingerfield Road, Alexandria, VA 22314-2888; Tel: +1 (703) 519-8160; FAX: +1 (703) 548-2867. Complete information about the conference (e.g., program listing, tutorials, show guide, DTDs, conference reports) is provided in the dedicated conference section of the SGML/XML Web Page and via the GCA Web server. The electronic proceedings on CDROM was produced courtesy of Jouve Data Management (Jouve PubUser).



Marin-Navarro, José; Alevantis, Panagiotis E. "Alice in the Wonderland of SGML: Streamlining Text Entry in the CELEX Databases." The Electronic Library 9/3 (June 1991) 155-160.

Abstract: This article describes the system used for the introduction of textual data into the CELEX full-text document databases. The solution implemented is based on the establishment of a text production database for the management and validation of texts before introducing them into the CELEX dissemination databases, and the management of structured documents described with the help of an SGML syntax. Note: CELEX (Communitatis Europææ LEX) is the computerized multi-lingual documentation system for European community law.

Contact: Commission of the European Communities, Service EUROBASES, 200 rue de la Loi, B-1049 Brussels, BELGIUM; Tel: +32-235-00-01; FAX: +32-235-00-03. Another description of CELEX is Geoffrey Gudgion, "SGML Applications in the European Commission," EuroCALS Newsletter 4 (May, 1990) 17-20; the author discusses SGML applications relative to the CELEX database (European Commission Law), INFOTEX or INFO 92 (database of European tariffs) and electronic mail.

[2001-01-16 note:] "Alice in the wonderland of SGML: streamlining text entry in the CELEX databases." By J. Marin-Navarro and P.E. Alevantis (CEC, JMO C2/25 Bâtiment Jean Monnet, Plateau de Kirchberg, L-2920 Luxembourg). Brought online 16/01/2001. "Abstract: This article describes the system used for the introduction of textual data into the CELEX full-text document databases. The solution implemented is based on the establishment of a text production database for the management and validation of texts before introducing them into CELEX dissemination databases, and the management of structured documents described with the help of SGML syntax." [cache]



Marker, Hans Jørgen. "Encoding Standards for the 'Generalist' and the 'Specialist': Complex, Compound Documents as a Test Case." Pages 147-162 in Modelling Historical Data: Towards a Standard for Encoding and Exchanging Machine-Readable Texts. Edited by Daniel Greenstein. Halbgraue Reihe zur Historischen Fachinformatik, Serie A, Historische Quellenkunden, edited by Manfred Thaller, Band (A) 11. St. Katharinen: [Published for the Max-Planck-Institut für Geschichte, Göttingen by] Scripta Mercaturae Verlag, 1991. iv + 223 pages. ISBN: 3-928134-45-0.



[CR: 19950716]

Markey, B. D. "Emerging Hypermedia Standards: Hypermedia Marketplace Prepares for HyTime and MHEG." Pages 59-74 (with 11 references) in Proceedings of the Summer 1991 USENIX Conference [Summer 1991 USENIX Conference, Nashville, TN, USA, 10-14 June 1991] Berkeley, CA: USENIX Association, 1991. Author's affiliation: Digital Equipment Corporation, Maynard, MA, USA.

"Abstract: ISO/IEC JTC1 SC2/WG12, known collectively as the Multimedia and Hypermedia Information Coding Experts Group (MHEG), is developing a standard titled Coded Representation of Multimedia and Hypermedia Information. ANSI group XSVI.8M, known collectively as the Music Information Processing Standards (MIPS) committee, is developing a hypermedia document interchange standard titled HyTime/SMDI. HyTime has been officially accepted as an ISO project as well, following a successful new project ballot by ISO/IEC JTC1. The authors describes the history, technical orientation and status of the MHEG and HyTime projects, as well as their relationship to multimedia (e.g. MPEG) and document interchange (e.g. ODA and SGML) standards. The relationship between the standards is also explored, with emphasis on appropriate applications and situations where they can be used together in a complementary fashion."



[CR: 19950716]

Markey, B. D. "HyTime and MHEG." Pages 25-40 (with 11 references) in Digest of Papers. IEEE COMPCON Spring 1992. [Thirty-Seventh IEEE Computer Society International Conference, San Francisco, CA, USA, 24-28 February, 1992]. Los Alamitos, CA: IEEE Computer Society Press, 1992. Author's affiliation: Digital Equipment Corporation, Maynard, MA, USA.

"Abstract: A description is given of the history, technical orientation and status of the MHEG (Multimedia and Hypermedia Information Coding Experts Group) and HyTime projects. Their relationship to multimedia (e.g. MPEG) and document interchange (e.g. ODA and SGML) standards are also discussed. The relationship between the standards is explored, with an emphasis on appropriate applications and situations where they can be used together in a complementary fashion."



[CR: 19971227]

Martin, Fredrick Thomas. "The Future of Information Management in the US Intelligence Community: A Case Study Approach to 'Virtual Intelligence'." Pages 619-620 in SGML/XML '97 Conference Proceedings. SGML/XML '97. "SGML is Alive, Growing, Evolving!" The Washington Sheraton Hotel, Washington, D.C., USA. December 7 - 12, 1997. Sponsored by the Graphic Communications Association (GCA) and Co-sponsored by SGML Open. Conference Chairs: Tommie Usdin (Chair, Mulberry Technologies), Debbie Lapeyre (Co-Chair, Mulberry Technologies); Michael Sperberg-McQueen (Co-Chair, University of Illinois). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 691 pages, CDROM; print volume contains author and title indexes, keyword and acronym lists. Author's affiliation: [Fredrick Thomas Martin]: Deputy Director, National Security Agency, Information Services Group, Fort George G. Meade, Maryland 20755 USA.

Abstract: "This talk will describe the future of information management within the various organizations and agencies that collectively are known as the United States Intelligence Community, including the CIA, NSA, DIA, and the now declassified NRO. The central focus of this talk will address what the US Intelligence Community believes to be the 'information revolution' of the Third Millennium, with an impact similar to that experienced in past millennia in both the agriculture and industrial revolutions. Kept secret as classified information in all fifty previous years since its inception, the Intelligence Community of the US Government recently confirmed that its budget last year totaled $26.6 billion dollars. This talk will provide an explanation of the possible role and impact that the ITMRA (Information Technology Management Reform Act), passed by Congress in August 1996, will have on the future of information management in the Intelligence Community, and how that relates to this industry. It will describe the transition to web-centric, electronic publishing of our nation's intelligence reports, known as 'finished intelligence' into an integrated information space. Describing what the future world of 'Virtual Intelligence' will really look like, this talk will explore the concept of a more 'agile' intelligence enterprise, giving us insight into how the US Intelligence Community plans to achieve its goal of an electronically networked environment for the production and exchange of intelligence, a goal deemed essential to national security in the 21st Century."

This paper was delivered as part of the "Case Studies" track in the SGML/XML '97 Conference.

Note: The SGML/XML '97 conference proceedings volume is available from the Graphic Communications Association, 100 Daingerfield Road, Alexandria, VA 22314-2888; Tel: +1 (703) 519-8160; FAX: +1 (703) 548-2867. Complete information about the conference (e.g., program listing, tutorials, show guide, DTDs, conference reports) is provided in the dedicated conference section of the SGML/XML Web Page and via the GCA Web server. The electronic proceedings on CDROM was produced courtesy of Jouve Data Management (Jouve PubUser).



[CR: 19961226]

Martin, Fredrick Thomas. "SGML in the Intranet for the US Intelligence Community: 'INTELINK' - A Case Study." Pages 563-566 in SGML '96 Conference Proceedings. Celebrating a Decade of SGML. SGML '96 Conference, Boston, MA, November 18-21, 1996. Sponsored by The Graphic Communications Association (GCA). [Edited by] Conference Co-Chairs: B. Tommie Usdin and Deborah A. Lapeyre. Alexandria, VA: GCA, 1996. Extent: 711 pages. Author's affiliation: Deputy Director, Information Services Group, National Security Agency, Fort George G. Meade, MD 20755, Tel: +1 301-688-7306, Email: 104576.234@compuserve.com.

Abstract: "This talk will describe how the US National Security Agency, the Central Intelligence Agency, the Defense Intelligence Agency, the National Reconnaissance Office and other top agencies that collectively are known as the United States Intelligence Community are significantly improving their intelligence gathering and reporting operations through the development and implementation of advanced technology including networking concepts and international information standards such as SGML.

The central focus of this talk will be a description and discussion of Intelink, the classified, world wide 'Intranet' for the Intelligence Community. Intelink, and the Intelink community address one of the world's largest data management problems, involving demanding requirements that are at the extreme of what normal enterprises require.

Intelink is now operational for a broad base of intelligence customers and consumers from the warfighter to the White House. Intelink is currently being used in support of several basic and key functional areas. Perhaps the most significant of these areas is the electronic publishing and distribution of our nation's intelligence reports. This talk will discuss how our 'Signals Intelligence' (SIGINT) Reports have gone from the world of reports in only ASCII text to robust multimedia formats with distribution, using SGML, over Intelink. The talk will also address other key functional areas including analytical research, collaboration facilities, and training.

The talk will address several of the unique problems, concerns, challenges and special features that distinguish Intelink from other Intranet applications. These issues include networking; architecture and standards; analyst collaboration issues; and finally encryption and other security considerations that are unique to this special environment.

The talk also will provide specific examples of Intelink SGML applications in several agencies within the US Intelligence Community. These examples will present insights into the issues, problems, and solutions for organizations desiring to take advantage of emerging technology allowing them to realize tangible cost savings as well as to enjoy significantly improved capabilities.

The talk will conclude with an examination of the future for Intelink, including plans for enhanced analyst collaboration, security boundaries/access control, and an improved Graphical User Interface."

Note: The above presentation was part of the "SGML Case Studies" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.



[CR: 19961226]

Mason, James David. "After SGML: What Comes Next?" Pages 389-398 in SGML '96 Conference Proceedings. Celebrating a Decade of SGML. SGML '96 Conference, Boston, MA, November 18-21, 1996. Sponsored by The Graphic Communications Association (GCA). [Edited by] Conference Co-Chairs: B. Tommie Usdin and Deborah A. Lapeyre. Alexandria, VA: GCA, 1996. Extent: 711 pages. Author's affiliation: Lockheed Martin Energy Systems, 1060 Commerce Park, M.S. 6480, Oak Ridge, Tennessee 37831-6480; Email: masonjd@ornl.gov; WWW: http://www.ornl.gov/sgml/WG8/wg8home.htm.

Abstract: "SGML has been an ISO standard for ten years now. It was being adopted and implemented even before the final standard was published, and its user community is now very large, with thousands of applications. But is SGML a standard for all times?

SGML has always faced competition from systems that now are largely forgotten. Only three years ago, a distinguished consultant proclaimed that WYSIWYG was dying. Will SGML be able to continue its record of success in the face of HTML (you mean that's not SGML?), PDF, OpenDoc, OLE, and the surprising continued vitality of proprietary systems?

SGML has been a remarkably stable standard in the past decade, but will it remain so in the next? Fashions in computing and data management have changed in the years since the development of SGML was begun. In the past year, GCA's conferences have devoted an increasing amount of time to HyTime and DSSSL, new standards that may offer foreshadowings of changes to SGML itself. Perhaps the next year will bring us the long-awaited revision of the base standard. Will SGML still be SGML?

There may be no one answer for these questions. As users and proponents of SGML, we need to take a hard look at our requirements and define what we need from the standard and its implementers. More significantly, we need to understand what information is and what we expect it to do for us. Only with that understanding can we devise good SGML applications, make the right requests from vendors, make the right links between SGML data and other kinds of information - or design a good replacement for SGML."

Note: The above presentation was part of the "SGML Expert" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.



[CR: 19970817]

Mason, James David. "SGML and Related Standards: New Directions as the Second Decade Begins." Pages 593-596 in Structured Information/Standards for Document Architectures. Edited by Elisabeth Logan and Marvin Pollard. = Journal of the American Society for Information Science, Special Issue. Volume 48, Number 7 (July 1997). New York: John Wiley & Sons Inc., 1997. ISSN: 0002-8231. Author's affiliation: Lockheed-Martin Energy Systems, Oak Ridge National Laboratory, Building 2506, M.S. 6302, Oak Ridge, TN 37831-6302; Email: masonjd@ornl.gov.

Abstract: "In 1995 and early 1996, the ISO standards process that includes SGML and related standards has seen a remarkable coalescence of efforts that should be beneficial to all of us. Most notably, DSSSL and HyTime are developing a shared approach to tree structures and query languages. A consequence of this may be the development of a set of general facilities that can be shared among all SGML-based standards and that, when incorporated into products, will make our documents easier to work with and more powerful in their ability to deliver information."

See also the ISO/IEC JTC1/SC18/WG8 Web Service, WWW server for 'Information Technology -- Document Processing and Related Communication.' James Mason is the Convenor for WG8.

See the main document entry for the complete list of articles and contributors, as well as other bibliographic information.



Mason, James David. SGML Encoding for Technical Reports. Presentation/Paper for NIRMA Electronic Information Exchange (EIE). Oak Ridge: US Department of Energy, Advanced Publishing Technology Section, Publications Division, Oak Ridge National Laboratory, 1994 [1993?]. 29K, (computer file), approximately 12 pages.

The document discusses issues facing the DOE as it incorporates SGML into the processes of generating and distributing technical reports. The report may be obtained as http://nuke.handheld.com/NIRMA_Docs/EIE/Meetings/Mason.html or in mirror copy here [copy dated April 12, 1995; original filestamp July 4, 1994].



[CR: 19971125]

Matheson, Ken. "SGML and the On-Line Legislature." Page(s) 169-170 in SGML '97 Conference Proceedings. SGML Europe '97. "The Next Decade - Pushing the Envelope." Princesa Sofia Intercontinental, Barcelona, Spain. 11-15 May, 1997. Sponsored by Graphic Communications Association (GCA) and SGML Open. Conference Chair: Pamela L. Gennusa (Director, Database Publishing Systems Ltd). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 342 pages, CDROM. Author's affiliation: Technical Project Manager, Highland Consulting.

Abstract: "The Alabama State Legislature has begun an extensive re-engineering effort to improve the process and technologies used to craft and enact legislation and to improve the means through which the public can be directly involved in the legislative process. When this project is completed, the State will have an information system that provides repository based authoring and publishing, client/server legislative operation systems with a 'real-time' Internet interface. SGML is an important component of this complex application.

"In this session we will present an overview of the application and we will discuss in more detail how we used object oriented information engineering analysis and document analysis to develop a robust information model. We will discuss the design challenges we faced integrating SGML with traditional database technologies, multi-tiered client server technologies and the Internet. Most important, we will share valuable lessons learned about designing and building repository based SGML systems."

Note: The electronic conference proceedings in hypertext were produced by Inso Corporation (DynaText) and by High Text (EnLIGHTeN). Information about the SGML Europe '97 Conference may be found in the main database entry.



Matzen, Richard Walter. A formal language model for detecting ambiguity in SGML. Ph.D. Dissertation. Oklahoma State University, Department of Computer Science, 1993. xii, 144 pages.



[CR: 19990518]

Matzen, Richard Walter. "A New Generation of Tools for SGML." Markup Languages: Theory & Practice 1/1 (Winter 1999) 47-74 (with 21 references). ISSN: 1099-6622 [MIT Press]. Author's affiliation: Visiting Assistant Professor, Department of Computer Science, Oklahoma State University; Email: rmatzen@acm.org; WWW: http://b.cs.okstate.edu/ or http://www.cs.okstate.edu/~matzen/.

Abstract: "Exceptions are used in many standard DTDs, including HTML, because they add expressive power for DTD authors. However, there is a tradeoff: although they are useful, exceptions add significantly to the complexity of DTDs. Authoring DTDs is a difficult task, and existing tools are of limited use because of the lack of a suitable formal model for exceptions. This paper describes methods for constructing a static model that completely and precisely describes DTDs with exceptions. A software tool has been written to implement the methods and to demonstrate some practical applications. Examples are shown of how the tool is used for DTD authoring, and some useful extensions of the tool are described. For one example DTD, the output of the tool is converted into a regular expression grammar. Preliminary studies indicate that general case algorithms can be developed for this conversion. This would allow existing theory for the context free languages to be used in developing SGML applications. Statistical results are shown from running the software tool on a number of industry and government DTDs and for three successive versions of HTML. The results illustrate that the complexity of DTDs in practice is approaching, or has exceeded, manageable limits with existing tools. The formal model and its applications are needed for SGML and continued development of these methods may impact the evolution of HTML, XML, and related web publishing standards. Some specific projects are proposed, where continued development of the model can result in more powerful tools and new kinds of applications for SGML."

[Conclusion: The paper provides evidence to illustrate] "the complexity of DTDs with exceptions, which in turn implies high costs for DTD design and corresponding problems with quality. These results also show that the complexity of some DTDs is approaching (or has exceeded) manageable limits given existing tools for designing and understanding them. There is clearly a need for more powerful tools for DTD design and analysis and for subsequent SGML processing. The software tool described in this paper is useful for understanding (viewing) DTDs with exceptions and for detecting errors caused by the incorrect use of exceptions. Several practical extensions of the tool are described that provide other new capabilities for DTD analysis. Because exceptions are an integral part of SGML, any generalized SGML tool must support them. There are previous theoretical results for formal language models of DTDs with exceptions ([Matzen, "Model"]; [Kilpeläinen and Wood, "SGML and Exceptions"]). However, this is the first description of an implementation, and thus it provides a foundation for a new generation of applications and tools."

"The expanded DTDs output by the software tool are a powerful extension of the model; these can be used to construct DTDs without exceptions that are pseudo-equivalent to the original DTDs with exceptions. This allows authors to design DTDs using the expressive power of exceptions while managing their side-effects. Also, the methods shown for converting DTDs with exceptions to regular expression grammars provide a powerful formal foundation, the existing theory for the context free languages, to be used in developing new kinds of SGML applications. The continued development of the methods and tools described in this paper can be a significant factor in the future success of SGML, and they would affect the evolution of HTML, XML, and other standards for the World Wide Web."

The document is available online in PDF format - "A new generation of tools for SGML." [local archive copy] See also: "SGML exceptions analysis" (results from running the prototype software tool described in "A New Tool for SGML with Applications for the World Wide Web," Proceedings of the 1998 ACM Symposium on Applied Computing, February, 1998). For other articles in this issue of MLTP, see the annotated Table of Contents.

Revision: Received 22 June 1998, Revised 31 July 1998.



[CR: 19990518]

Matzen, Richard Walter; Hedrick, G. E. A New Tool for SGML with Applications for the World Wide Web. Paper presented at SAC '98 - 1998 ACM Symposium on Applied Computing. February 27 - March 1, 1998, Marriott Marquis, Atlanta, Georgia, U.S.A. February 1998. Extent: 9 pages, 12 references. Authors' affiliation: Oklahoma State University. Email: rmatzen@acm.org.

Abstract: "The Standard Generalized Markup Language (SGML) is an international standard (ISO 8879) for document definition and interchange. It is widely used in government and industry, and it has received increased attention from academia since HTML evolved to a formal application of SGML. SGML is a meta-language scheme for defining the structure of documents. A Document Type Definition (DTD) is a finite set of productions called element declarations; DTDs are similar to context free grammars, but the productions are more complex. One important optional feature of element declarations is called exceptions. Exceptions add expressive power for DTD authors, and thus are used in most industry and government standard DTDs, including HTML. Although exceptions are useful, they significantly add to the complexity of DTDs. Existing tools for DTD design and analysis are of limited use, because of the lack of a static formal model for exceptions. This paper describes a static model that completely and precisely describes the effects of exceptions on DTDs; a software tool has been written to implement the theory and to demonstrate some practical applications. Results are shown for three versions of the HTML DTD. The results show that the language model and its applications are needed for SGML, and that continued development of these methods may impact the evolution of HTML and related web publishing standards."

See the associated SGML exceptions analysis (results): The results shown in [this results set] are from running the prototype software tool described in the above paper. And see the authors' paper, "Unraveling Exceptions," Conference Proceedings: SGML/XML 97, Washington D.C., December, 1997. [local archive copy]

[Check Proceedings Volume 'Paper 121'?]

Available online in Postscript format. [local archive copy] See: SAC'98, the 1998 ACM Symposium on Applied Computing and the online Abstract.



[CR: 19971227]

Matzen, Rick. "Unraveling Exceptions. New Tools for SGML." Pages 289-296 in SGML/XML '97 Conference Proceedings. SGML/XML '97. "SGML is Alive, Growing, Evolving!" The Washington Sheraton Hotel, Washington, D.C., USA. December 7 - 12, 1997. Sponsored by the Graphic Communications Association (GCA) and Co-sponsored by SGML Open. Conference Chairs: Tommie Usdin (Chair, Mulberry Technologies), Debbie Lapeyre (Co-Chair, Mulberry Technologies); Michael Sperberg-McQueen (Co-Chair, University of Illinois). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 691 pages, CDROM; print volume contains author and title indexes, keyword and acronym lists. Author's affiliation: [Rick Matzen]: Adjunct Professor, Oklahoma State University, Computer Science Department, 219 Mathematical Sciences, Stillwater, Oklahoma 74078-1053 USA; Phone: +1 405-744-5668 FAX: +1 405-744-9097; Email: rmatzen@acm.org or matzen@a.cs.okstate.edu.

Abstract: "Authoring DTDs is a difficult task: they typically contain over fifty element declarations and they are often recursive. This complexity implies high costs for DTD design and subsequent document processing. It also means that DTDs may have corresponding problems with quality.

"Exceptions are used in many standard DTDs because they add expressive power for DTD authors. However, there is a tradeoff: although they are useful, they are also a big part of the complexity problem. It is difficult to view the effects of exceptions on DTDs, primarily because of the lack of a formal static model.

"This presentation describes a static model that gives a complete and precise view of DTDs with exceptions. The model provides a foundation for new kinds of applications for processing SGML. A software tool has been developed to implement the model and to demonstrate its potential. Some specific projects are outlined, where continued development of the model and tool will have a significant impact on the success of SGML and related web publishing standards. One proposed project is the development of an automated SGML to XML converter."

This paper was delivered as part of the "How To" track in the SGML/XML '97 Conference.

Note: The SGML/XML '97 conference proceedings volume is available from the Graphic Communications Association, 100 Daingerfield Road, Alexandria, VA 22314-2888; Tel: +1 (703) 519-8160; FAX: +1 (703) 548-2867. Complete information about the conference (e.g., program listing, tutorials, show guide, DTDs, conference reports) is provided in the dedicated conference section of the SGML/XML Web Page and via the GCA Web server. The electronic proceedings on CDROM was produced courtesy of Jouve Data Management (Jouve PubUser).



[CR: 1995]

Matzen, Richard Walter; George, K. M.; Hedrick, G. E. "A Formal Language Model for Parsing SGML." Journal of Systems and Software 36/2 (February 1997) 147-166 (with 17 references). Authors' affiliation: Department of Computer Science, Oklahoma State University, Stillwater, OK.

Abstract: "The Standard Generalized Markup Language (SGML) is an international standard for document definition (ISO 8879) that was adopted in 1986 and is rapidly gaining acceptance in industry and government. It is a meta-language system for document design rather than a specific scheme for document processing; almost any kind of document can be described using SGML. Productions called element declarations are used to define arbitrary elements of documents and the context in which they can occur. A finite set of element declarations called a document type definition (DTD) defines the high-level syntax of a set of documents. DTDs are similar to context-free grammars, but the productions are more complex. The Standard does not describe a formal language model for SGML, and there is little work in the literature on this topic."

"This article defines a formal language model for SGML; systems of finite automata from systems of regular expressions. The model is applied in two ways: a parser is constructed for DTDs, and methods are shown for automatically constructing parsers for the documents defined by a DTD. These methods for parsing SGML are new, and they include features of DTDs that have not previously been included in a static language model. The model applies directly to the syntactic constructs of SGML, and thus, the methods shown in this article have distinct advantages for parsing SGML over traditional context-free parsing methods." [online abstract]



Matzen, Richard W.; George, K. M.; Hedrick, G. E. "A model for studying ambiguity in SGML element declarations." Pages 668-676 (with 14 references) in Applied Computing: States of the Art and Practice - 1993. Proceedings of the 1993 (8th) ACM/SIGAPP Symposium on Applied Computing, Indianapolis, IN, USA, 14-16 February, 1993. Edited by E. Deaton, K. M. George, H. Berghel, and G. Hedrick. New York, NY, USA: Association for Computing Machinery, 1993. Authors' affiliation: Oklahoma State University, Stillwater, OK, USA.

Abstract: The Standard Generalized Markup Language (SGML) is a meta-language system for document representation that was adopted as an ISO standard in 1986. In SGML, element declarations define the logical components (elements) of documents; a content model is the part of an element declaration that defines the content of the elements. SGML defines and prohibits "ambiguous content models" but does not show a method for detecting them. Model groups, the only required components of content models, are expressions similar to regular expressions. This paper defines ambiguous model groups and gives an algorithm for detecting them. When the optional components of element declarations are not considered, the algorithm detects ambiguous content models as defined by the standard. The algorithm is based on a construction of indexed nondeterministic finite automata (NFAs) in which each arc is bound to a particular occurrence of an element symbol in a model group.
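
The kind of check this abstract describes can be illustrated with a short sketch. It is not the authors' indexed-NFA algorithm; it swaps in the textbook position-automaton (Glushkov) construction, in which every occurrence of an element symbol gets its own index, and reports a model group as ambiguous when two different occurrences of the same symbol compete for the next input. The tuple encoding of model groups and the names `analyze` and `is_ambiguous` are assumptions made for illustration, and the SGML "&" connector is left out.

```python
from collections import defaultdict

# Hypothetical encoding of a model group as nested tuples:
#   ("sym", name), ("seq", x, y), ("or", x, y), ("opt", x), ("star", x), ("plus", x)

def analyze(node, symbols, follow):
    """Number each symbol occurrence; return (nullable, first, last) for node."""
    kind = node[0]
    if kind == "sym":
        pos = len(symbols)          # index of this particular occurrence
        symbols.append(node[1])
        return False, {pos}, {pos}
    if kind in ("opt", "star", "plus"):
        nullable, first, last = analyze(node[1], symbols, follow)
        if kind in ("star", "plus"):
            for p in last:          # repetition: the end loops back to the start
                follow[p] |= first
        return nullable or kind != "plus", first, last
    n1, f1, l1 = analyze(node[1], symbols, follow)
    n2, f2, l2 = analyze(node[2], symbols, follow)
    if kind == "or":
        return n1 or n2, f1 | f2, l1 | l2
    if kind == "seq":
        for p in l1:                # the end of the left part leads into the right part
            follow[p] |= f2
        return n1 and n2, f1 | (f2 if n1 else set()), l2 | (l1 if n2 else set())
    raise ValueError(f"unknown node kind: {kind}")

def is_ambiguous(model):
    """True if two occurrences of one symbol can match the same next element."""
    symbols, follow = [], defaultdict(set)
    _, first, _ = analyze(model, symbols, follow)

    def clash(positions):
        names = [symbols[p] for p in positions]
        return len(names) != len(set(names))

    return clash(first) or any(clash(s) for s in follow.values())

a, b, c = ("sym", "a"), ("sym", "b"), ("sym", "c")
print(is_ambiguous(("or", ("seq", a, b), ("seq", a, c))))   # ((a, b) | (a, c)) -> True
print(is_ambiguous(("seq", a, ("or", b, c))))               # (a, (b | c))      -> False
```

The two sample groups accept the same documents, which shows that ambiguity in this sense is a property of how the model group is written rather than of the language it describes.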



[CR: 19950716]

Maziarka, Mike. "Midwest [US] SGML Forum Meeting." SGML Users' Group Newsletter 30 (March 1995) 7-8. ISSN: 0952-8008. Author's affiliation: Frame/Datalogics.

Report on a March 14, 1995 meeting of the Detroit Chapter Midwest SGML Forum. Contact for the Midwest Forum: maz@xyvision.com [Mike Maziarka].



[CR: 19951228]

Maziarka, Mike. "Midwest SGML [Users' Group] Forum News." SGML Users' Group Newsletter 31 (June 1995) 11. ISSN: 0952-8008.

Mike Maziarka of XyVision reports on the election of new officers for the Midwest SGML Forum, for 1995/1996. Contact: (Forum president) Mike Mercier, Deere & Company, email MM46100@deere.com.



[CR: 19950716]

Maziarka, Mike. "New Board for Midwest SGML Forum." SGML Users' Group Newsletter 28 (August 1994) 18. ISSN: 0952-8008. Author's affiliation: Frame/Datalogics.

Report on the election of the 1994-1995 board for the Midwest SGML Forum, and announcements for future meetings. Contact: maz@dlogics.com



[CR: 19971125]

Maziarka, Michael. "Publishing to the Web is More than Converting Data into HTML." Page(s) 181-184 in SGML '97 Conference Proceedings. SGML Europe '97. "The Next Decade - Pushing the Envelope." Princesa Sofia Intercontinental, Barcelona, Spain. 11-15 May, 1997. Sponsored by Graphic Communications Association (GCA) and SGML Open. Conference Chair: Pamela L. Gennusa (Director, Database Publishing Systems Ltd). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 342 pages, CDROM. Author's affiliation: Director, Parlance Product Management, Xyvision, Inc., USA; WWW: http://www.xyvision.com/.

Abstract: "Publishing to the Web introduces a new set of challenges, especially when added to the requirement to produce traditional documents using the same data, at little extra cost or manpower. To the casual observer, the problem seems one of converting data from its presentation format into HTML. However, the problem is much more complex. Although converting your data into HTML is one part of the solution, issues such as modularization of your data, establishing links for traversing through the information, search aids, and adding additional data that might not be found in your traditional (paper) publications (such as navigation aids) must all be considered. This presentation explains how, through the use of SGML and document management technology, publishers can create highly automated processes for using the same data to produce paper documents, a Web Product, and possibly other electronic deliveries."

Note: The electronic conference proceedings in hypertext were produced by Inso Corporation (DynaText) and by High Text (EnLIGHTeN). Information about the SGML Europe '97 Conference may be found in the main database entry.



[CR: 19961226]

Maziarka, Michael. "Representing Information Applicability Using SGML Constructs: Marked Sections or Element/Attribute Representations?" Pages 289-298 in SGML '96 Conference Proceedings. Celebrating a Decade of SGML. SGML '96 Conference, Boston, MA, November 18-21, 1996. Sponsored by The Graphic Communications Association (GCA). [Edited by] Conference Co-Chairs: B. Tommie Usdin and Deborah A. Lapeyre. Alexandria, VA: GCA, 1996. Extent: 711 pages. Author's affiliation: Xyvision, Inc., 101 Edgewater Drive, Wakefield, MA 01880, USA; TEL: +1 (617) 245-4100 x5138; FAX: (617) 246-5308; Email: maz@xyvision.com; WWW: http://www.xyvision.com/.

Abstract: "Business demands are forcing more producers and publishers of information to customize documents for their end users based upon the selection of product choices, options, or configurations. This demand presents new challenges for creating, managing, and distributing our data. Occasionally the differences are at a high enough level to permit complete modularization of information. But, more often the differences are at a finer level. For example, a removal procedure may be completely rewritten for different product configurations. But more often, configuration differences may result in an extra step, or may reference a different part number within the same procedure.

The problem is more than a simple modeling problem because it affects all areas of a complete editorial and production process. To start with, the ability to edit a procedure with multiple effectivities simultaneously affects how the information model is created. It is typically not satisfactory to create different instantiations of a large information fragment (e.g., procedure) for each configuration when only one step within that editable fragment may be different for different configurations.

From a data management point of view, the goal is to save information as a Minimum Revisable Unit. This is the unit at which a piece of information has meaning regardless of the context in which it is used. As such, this often dictates that different effectivities be contained within the same storage unit.

From a delivery point of view, be it on paper, CD, or the Web, a final representation of the information must be rendered to fit the user's requirements (i.e., effectivities must be resolved for the user). This requires a tool to resolve effectivities. Based upon the chosen approach, resolution might be accomplished through a parser or through some type of data transformation which selects the appropriate information based upon attribute combinations.

This paper discusses two approaches for specifying the applicability of information using SGML. The first is use of elements combined with attributes to indicate the effectivity of the element's content. The second approach is use of SGML Marked Sections to provide a wrapper for information which can be included or ignored based upon the use. The benefits and drawbacks of both approaches will be highlighted."
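
The first approach, elements carrying attributes that state their effectivity, can be sketched as a small data transformation. This is an illustration only, not the paper's tooling: the attribute name `effectivity`, the configuration names, and the function `resolve` are hypothetical, and well-formed XML stands in for SGML because the Python standard library has no SGML parser.

```python
import xml.etree.ElementTree as ET

def resolve(xml_text, config, attr="effectivity"):
    """Keep only the content that applies to `config`.

    Elements without the (hypothetical) effectivity attribute apply to every
    configuration; elements with it apply only to the configurations listed.
    """
    root = ET.fromstring(xml_text)

    def prune(parent):
        for child in list(parent):
            marked = child.get(attr)
            if marked is not None and config not in marked.split():
                # Note: removing an element also drops any text that follows it.
                parent.remove(child)
            else:
                child.attrib.pop(attr, None)   # resolved output needs no marker
                prune(child)

    prune(root)
    return ET.tostring(root, encoding="unicode")

SAMPLE = (
    "<procedure>"
    "<step>Remove cover.</step>"
    '<step effectivity="modelB">Disconnect sensor.</step>'
    '<step effectivity="modelA modelB">Lift unit.</step>'
    "</procedure>"
)

print(resolve(SAMPLE, "modelA"))
# <procedure><step>Remove cover.</step><step>Lift unit.</step></procedure>
```

With marked sections, the second approach, the same selection would instead be made by the parser, by declaring the relevant parameter entities as INCLUDE or IGNORE before the document is parsed.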

An online version of the paper (presentation slides) is available from GCA in PDF format: http://www.gca.org/conf/sgml96/maz.pdf [mirror copy].

Note: The above presentation was part of the "SGML User" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.



[CR: 19971227]

Maziarka, Mike. "Using SGML and Document Management to Share and Reuse Information." Pages 300-306 in SGML '95 Conference Proceedings. "SGML '95: 'Expanding the Universe'." Sheraton Boston Hotel and Towers, Boston, MA, USA. December 4 - 7, 1995. Sponsored by the Graphic Communications Association (GCA). Conference Chairs: Tommie Usdin and Debbie Lapeyre. Alexandria, VA: Graphic Communications Association (GCA), 1995. Extent: v + 526 pages. Author's affiliation: Director, PDM Product Management, Xyvision, Inc.; maz@xyvision.com.

Abstract: "Since its inception, proponents of SGML have heralded 'data reuse' as a key benefit of using SGML. But prior to the widespread use of document management systems, this statement was more visionary than reality. The reason being that many still used file mechanisms for storing data. Information was maintained as a whole logical unit, for example, 'document,' 'manual,' 'chapter,' or 'section'. As a result, information reuse was minimal. Now, with the introduction of document management systems which support storing information at a much finer granular level, document creators and users are able to use SGML as a mechanism for defining the storage units for their data. Now, data can be stored as 'tasks,' 'procedures,' 'warnings,' 'citations,' 'abstracts,' etc. At these levels, information can be written in a manner which will fit more than one context, and may be shared between different publications and different delivery media (e.g., paper, WWW, CDROM, etc.). This presentation will give the attendees an appreciation for the issues associated with reusing information within the scope of a document management system. It will also enable them to explore the use of SGML modeling constructs to achieve their desired results."

The presentation addresses the question of finding an appropriate "level of granularity" optimally supporting information re-use, while avoiding the introduction of too much complexity for the computing system and its users. The author discusses modular DTDs, content-oriented fragments or smallest sharable units called a "minimum revisable unit" (MRU).

Other information on "SGML '95: 'Expanding the Universe'" is referenced in the main conference entry.



[CR: 19971227]

Maziarka, Mike. Using SGML & Document Management to Share and Reuse Information. Paper presented at SGML Europe '96, Munich, Germany. Tuesday, 14 May 1996. Wakefield, MA: Xyvision, Inc, 1996. Author's affiliation: Director of PDM Product Management, Xyvision, Inc., USA; WWW: http://www.xyvision.com/.

[Abstract: See the abstract for Maziarka's December 1995 presentation in Boston, MA.]

See the main database entry for other details on the SGML Europe '96 Conference and Exposition.



[CR: 19980907]

Mazumdar, Subhasish; Yuan, Gary; Bao, Weifeng; Price, Jonathan. "Adding Semantics to SGML Databases." Pages 563-74 (with 20 references) in Electronic Publishing, Artistic Imaging, and Digital Typography. Proceedings of the 7th International Conference on Electronic Publishing (EP '98), Held Jointly with the 4th International Conference on Raster Imaging and Digital Typography (RIDT '98). EP '98 and RIDT '98, Saint Malo, France. March 30 - April 3, 1998. Edited by Roger D. Hersch, Jacques André, and Heather Brown. Lecture Notes in Computer Science Series, Number 1375. New York/Berlin/Heidelberg: Springer-Verlag, 1998. ISBN: 3-540-64298-6. Authors' affiliation: New Mexico Institute of Mining and Technology, Socorro, NM 87801 USA. Email: Subhasish Mazumdar.

Abstract: "Huge collections of linked documents can now be efficiently stored. However, full online access and electronic publishing through reuse of document parts require sophistication and precision in queries. Such a query facility is only possible through the inclusion of appropriate semantic information. Manually adding such information to multi gigabyte document sources is daunting for technical writers. Our approach aims at making this task feasible by exploiting a conceptual schema of the enterprise. The result is an integrated schema - one that covers the traditional information system of the enterprise as well as the information that exists solely in the world of documents."

The presentation slides are available online, also in Postscript format. See also the online abstract and the full text in PDF. [local archive copy]



McAlpine, K.; Golder, P. "A New Architecture for a Collaborative Authoring System - Collaborwriter." Computer Supported Cooperative Work (CSCW) 2/3 (1994) 159-174. 22 references. Authors' affiliation: Aston University, Birmingham, UK.

Abstract: Much research has occurred in recent years detailing computer systems which support collaborative writing. We describe a collaborative authoring system capable of handling both synchronous and asynchronous communication between authors, based upon a writing model of coordination, writing, annotation, consolidation and negotiation. This assumes that the negotiation aspects play a major role in the collaborative process. A model linking the logical structure of documents and author roles is also established, based on the Standard Generalized Markup Language (SGML).



[CR: 19970228]

McCarty, Willard. "Because It's Time: A Commentary on the [ACLS] Program Session." American Council of Learned Societies Newsletter 4/4 (February 1997) 14-20. Author's affiliation: Centre for Computing in the Humanities, King's College London; also Editor, "Humanist" Discussion Forum. Email: Willard.McCarty@kcl.ac.uk.

The article is part of a special issue which "focuses on the presentations of a program session on Internet-accessible scholarly resources held at the 1996 ACLS Annual Meeting." The issue theme is entitled "Internet-Accessible Scholarly Resources for the Humanities and Social Sciences."

On markup for text analysis and for structuring metadata, McCarty says: "Textual markup is often misunderstood, and its potential grossly underestimated, by the tendency to think of it as belonging to a preparatory, mechanical stage one must get through before serious work can be done. In some cases, of course, tagging is largely mechanical, but the more complex the phenomena one wishes to identify consistently, and so be able to process reliably, the stronger its role as a means for manifesting one's understanding of the data. (Give the job to a graduate student and you have just made him or her a colleague.) Tagging, I like to argue, is akin to translation and so forces deep knowledge as much or more through its failure to convey meaning as by its successes. Tagging is the form of tinkertoy modeling most readily available to those with the most intimate knowledge of humanities and social science data. It is an instrument of perception, a medium for thinking about text."

[. . . and] "The potential of markup is crudely and poorly represented by HTML, which is a highly simplified form of the Standard Generalized Markup Language (SGML), now adapted for scholarly purposes by the Text Encoding Initiative, thus TEI/SGML. Currently the preferred way of delivering primary and important secondary texts across the Internet is to render SGML-encoded text on demand into HTML, using so-called 'gateway' software. This has the advantage of allowing us to exploit current Web technology without requiring that we make a large investment in tagging our data with a still primitive and severely limited meta-language. As HTML improves, or when it is discarded for something better, only the gateway software need change, not the tagging of the data."

The article is available online in HTML format: http://www.acls.org/n44mccar.htm; [mirror copy, text only].



[CR: 19960202]

McClung, Patricia. "Access to Primary Sources: During and After the Digital Revolution." In [Proceedings of the Berkeley Finding Aid Project Conference]. Berkeley Finding Aid Project Conference. Morrison Room of the Doe Library, University of California, Berkeley. April 4-6, 1995. Sponsored by the Commission on Preservation and Access. Berkeley, CA: Berkeley Finding Aid Project, 1995. Author's affiliation: Research Libraries Group. Email: bl.pam@rlg.stanford.edu.

"This Finding Aids project is testing the use of Standard Generalized Markup Language (SGML) as a navigation tool for finding aids online, and it is trying to come up with a standard document type definition that would enable widespread use of this approach. SGML is being touted as the potential standard that may emerge from the several possibilities available for encoding texts. Even The Economist recently reviewed the various humanities encoding projects underway, and speculated in positive terms about the future of SGML. But if Weissman's predictions are anywhere near the mark, SGML will serve only as a stopgap in the evolution of smart tools that will transform research. The fact is that no one knows for sure; and until it all sorts out, we need to learn as much as we can from the best tools available." [extract]

Available online in HTML format: http://www.lib.berkeley.edu/AboutLibrary/Projects/BFAP/ucb3.html [mirror copy]. For more on the conference, see information on the Sunsite WWW server.



[CR: 19971227]

McCool, Michael; Prescod, Paul. "Software Component Interface Description in SGML." Pages 427-432 in SGML/XML '97 Conference Proceedings. SGML/XML '97. "SGML is Alive, Growing, Evolving!" The Washington Sheraton Hotel, Washington, D.C., USA. December 7 - 12, 1997. Sponsored by the Graphic Communications Association (GCA) and Co-sponsored by SGML Open. Conference Chairs: Tommie Usdin (Chair, Mulberry Technologies), Debbie Lapeyre (Co-Chair, Mulberry Technologies); Michael Sperberg-McQueen (Co-Chair, University of Illinois). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 691 pages, CDROM; print volume contains author and title indexes, keyword and acronym lists. Author's affiliation: [Michael McCool]: Assistant Professor, University of Waterloo Department of Computer Science, Computer Graphics Lab, Waterloo, Ontario Canada N2L 3G1; Phone: +1 (519) 888-4567 x4422; FAX: +1 (519) 885-1208; Email: mmccool@cgl.uwaterloo.ca; WWW: http://www.cgl.uwaterloo.ca/~mmccool/; [Paul Prescod]: University of Waterloo Department of Computer Science, Computer Graphics Lab, Waterloo, Ontario Canada N2L 3G1; Phone: +1 (519) 888-4567 x4422; FAX: +1 (519) 885-1208; Email: paprescod@cgl.uwaterloo.ca; WWW: http://itrc.uwaterloo.ca/~papresco.

Abstract: "SGML (and XML) documents consist of a grammar-constrained tree of typed, attributed nodes with ordered children. This structure can be used to represent almost any kind of information.

"We are using SGML to represent software engineering metainformation, specifically language-independent, formal class library interface descriptions. Standard SGML document transformation tools can then be used to translate the interface description into support code or into human-readable documentation. We use DSSSL to generate the documentation and Perl in conjunction with SGMLSPL to generate code.

"Unlike other systems designed to do similar things (i.e., CORBA's IDL) the SGML metadocument approach is extensible, and extra-language constraints such as protocols and design patterns can be represented. Furthermore, the transformation can be annotated to specialize the transformation for a particular target programming language.

"We have used this system to formally document the interface to a 3D graphics class library and automatically generate multiple language interfaces to it. Our METADOC DTD and transformation system is formed from a set of reusable DTD/DSSSL/Perl components which we have used to build other document types, and we will discuss our strategy in this regard."

[Conclusion:] "We have presented a prototype implementation of a system for expressing software object metainformation in SGML. We have used this metainformation to automatically generate code to build scripting interfaces and generic access to objects. This approach has much more flexibility than the CORBA IDL approach, and can be tuned to produce better interfaces in specific target languages from generic object descriptions. SGML can also combine formal and informal documentation in a clean way, and can itself be a target language. The essential feature of the SGML approach is its extensibility, unlike the fixed and restrictive format of the CORBA IDL. This flexibility arises because there is an additional layer of metainformation, the SGML DTDs and the transformations on them. Unlike CORBA IDL, a good SGML system permits small incremental change without making it impossible to recover the necessary information needed by an application. As stated above, we intend to extend these ideas and the lessons we have learned to METADTDs that can support multiple components. Benefits of such an approach include better documentation and reuse of document designs. Namespaces can also be added to current SGML systems without internal modification of existing SGML tools."
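The idea summarized in this abstract, a markup-encoded interface description that is mechanically translated into code, can be illustrated with a minimal sketch. Nothing here is taken from the paper: the element names (`class`, `method`, `param`), the sample description, and the `generate_stub` function are hypothetical stand-ins for the authors' METADOC DTD and DSSSL/Perl tooling, and the sketch uses XML with Python's standard `xml.etree.ElementTree` module rather than SGML with SGMLSPL.

```python
import xml.etree.ElementTree as ET

# Hypothetical interface description; element and attribute names are
# illustrative assumptions, not the METADOC DTD from the paper.
INTERFACE = """
<class name="Sphere">
  <doc>A sphere in 3D space.</doc>
  <method name="set_radius" returns="None">
    <param name="radius" type="float"/>
  </method>
  <method name="volume" returns="float"/>
</class>
"""


def generate_stub(description: str) -> str:
    """Translate the interface description into Python stub source text."""
    cls = ET.fromstring(description)
    lines = [f"class {cls.get('name')}:"]
    doc = cls.findtext("doc")
    if doc:
        # Informal documentation travels with the formal description.
        lines.append(f'    """{doc.strip()}"""')
    for method in cls.findall("method"):
        # Child order in the tree determines parameter order in the stub.
        params = ["self"] + [
            f"{p.get('name')}: {p.get('type')}" for p in method.findall("param")
        ]
        signature = f"{method.get('name')}({', '.join(params)})"
        lines.append(f"    def {signature} -> {method.get('returns')}:")
        lines.append("        raise NotImplementedError")
    return "\n".join(lines)


if __name__ == "__main__":
    print(generate_stub(INTERFACE))
```

A second generator function reading the same description could emit human-readable documentation or an interface for another target language, which is the kind of reuse the abstract attributes to the metadocument approach.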

This paper was delivered as part of the "Expert" track in the SGML/XML '97 Conference.

A version of the document is available online in HTML format: "Software Component Interface Description in SGML"; [local archive copy]

Note: The SGML/XML '97 conference proceedings volume is available from the Graphic Communications Association, 100 Daingerfield Road, Alexandria, VA 22314-2888; Tel: +1 (703) 519-8160; FAX: +1 (703) 548-2867. Complete information about the conference (e.g., program listing, tutorials, show guide, DTDs, conference reports) is provided in the dedicated conference section of the SGML/XML Web Page and via the GCA Web server. The electronic proceedings on CDROM was produced courtesy of Jouve Data Management (Jouve PubUser).



[CR: 19961202]

McFadden, John R. Hybrid Distributed Database (HDDB) and the Future of SGML. Presentation [Keynote Speech] delivered at SGML Sweden '96. Ontario, Canada: OmniMark Corporation, February 17, 1996. Author's affiliation: John McFadden is President, co-founder and majority shareholder of OmniMark Corporation.

See a summary of the presentation in the SGML Sweden '96 program; [mirror copy], and "OmniMark and the Hybrid Distributed Database Model" (OmniMark White Paper).



McFadden, John R. "The National Bureau of Standards Test Suite Errata." SGML Users' Group Bulletin 3/1 (1988) 8-12. ISSN: 0269-2538. Author's affiliation: [Software] Exoterica Corporation.

The article supplies a critique of the [CALS SGML] Validation Test Suite (VTS) developed under the direction of the National Bureau of Standards. The development of the test suite was originally commissioned by the US Department of Defense.



McFadden, John R. "Validation of SGML Systems and the Canonical Test Result." <TAG> 1/3 (1987) 4-5. Author's affiliation: Software Exoterica.



McFadden, John R.; Hayter, Ronald S. "Element Content vs Mixed Content. Where it Matters." <TAG> 1/5 (1988) 12-13. Authors' affiliation: Software Exoterica.



McFadden, John R.; Wilmott, Sam. "Ambiguity in the Instance: An Analysis." <TAG> 9 (March/April 1989) 3-5.

This article is a response to the article of John M. Graf (John Graf, "Ambiguity in the Instance," <TAG> 7 (1988) 6-9. John Graf observed that "a document created under the exact rules of a valid DTD may very well be invalid when passed through an instance parser.") John McFadden and Sam Wilmott argue that conforming parsers and validating parsers must be differentiated. On this distinction, cf. further the article of McFadden and Wilmott, "The SGML Conformance Testing Initiative," <TAG> 9 (March/April 1989) 1-3. The writers conclude that Graf's examples represent a misunderstanding of the use of SGML parsers, and that tag minimization is "safe" with proper DTD design.

For more on SGML's definitions of "ambiguous" and "unambiguous," see Brueggemann, Price, Price, Kaelbling, and Warmer (esp. pp. 80-83). More yet: William W. David, Jr., "OMITTAG Minimization," <TAG> 5/2 (February 1992) 4-5 (who notes "A little history may be of some help in understanding why the standard is the way it is. SGML was developed when desktop computers that had 64K were large. . .") and Jan Grootenhuis, "Disambiguation of SGML Document Models," <TAG> 12 (December 1989) 11-12.



McGaffey, Robert. "Automatic Tables Using SGML, C, and TeX." TUGboat: The Communications of the TeX Users Group 13/3 (October 1992) 291-294.



[CR: 19951228]

McGaffey, Robert W. "SGML versus/and TeX." TUGboat - The Communications of the TeX Users Group [= Proceedings of the 1991 Annual Meeting] 12/3 (December 1991) 406-408. ISSN: 0896-3207. Author's affiliation: Oak Ridge National Laboratory, Oak Ridge, TN. Email: mcgaffeyrw@ornl.gov.

Abstract: "Everyone who handles computer documentation faces the problem of proliferating application-specific versions of a source file and the added difficulty of merging changes back into the source. SGML is a resource for building a generalized solution. TeX and SGML offer a particularly harmonious synergism for documentation applications."

Note also the articles by Andrew E. Dobrowolski and C. Michael Sperberg-McQueen in this same issue, both dealing with SGML and TeX.



McGann, Jerome. Radiant Textuality. IATH Research Report. University of Virginia, Charlottesville: IATH [The Institute for Advanced Technology in the Humanities], 1995. approximately 12 pages, 12 references. Email Address: jjm2f@lizzie.engl.virginia.edu.

Discusses the origins of the Rossetti Archive and the larger IATH project "English Poetry 1780-1910: A Hypermedia Archive of Critical Editions." Cites some of the weaknesses and limitations of current SGML-tagged texts on the Internet.

Available [was: http://jefferson.village.virginia.edu/public/jjm2f/radiant.html] in HTML format from the IATH WWW server and in mirror copy here [June 02, 1995].



McGann, Jerome. The Rossetti Archive and Image-Based Electronic Editing. IATH Research Report. University of Virginia, Charlottesville: IATH [The Institute for Advanced Technology in the Humanities], April 7, [revision] 1995. approximately 14 pages, 15 references. Email Address: jjm2f@lizzie.engl.virginia.edu.

Discusses the role of images in textual research (as over against an SGML encoded-text-only representation). From the author's prefatory note: "This is a hypertext preprint version of an essay to appear in a print collection of essays being edited by Richard Finneran for a volume to be published by University of Michigan Press. This electronic version itself demonstrates some of the peculiar powers of hypermedia editing. The text here is hotlinked to various materials, some of them digital, some of them textual, which could never have been included in a paper-based text. Two matters are of greatest importance in this respect. First, the materials here are interactive. Second, some of the exemplary data is so extensive that it can only be included as severely truncated excerpts in the paper-based version of the essay."

Available in HTML format on the IATH server.



[CR: 19951206]

McGillivray, Murray; Gutwin, Carl; Reed, Todd. Managing Medieval Gigabytes. Paper presented at The Electric Scriptorium. Approaches to the Electronic Imaging, Transcription, Editing and Analysis of Medieval Manuscript Texts: A Physical & Virtual Conference. The University of Calgary, Calgary, Alberta [physical conference]. November 10-12, 1995. Sponsored by The University of Calgary, Calgary Institute for the Humanities, and SEENET. Conference coordinated by Dr. Murray McGillivray, Thomas Wharton, Blair McNaughton, and Robert McLean. Extent: approximately 11 pages (draft version). Author's affiliation: University of Calgary.

The authors report on a proposed solution to managing multi-megabytes of humanities electronic text: "a public-domain compression and indexing utility called "mg" ("managing gigabytes") developed by Ian Witten, Alistair Moffat, and Timothy Bell. . . The goal of this project is to construct a framework and pilot system for creating and making available large collections of humanities texts and images. The pilot system will be a suite of tools, based on mg and SGML, that will compress and index large collections of SGML-encoded e-texts and associated image files and that will allow users to search or browse the compressed archive using Web tools and retrieve either significant portions of e-texts in response to word and collocation searches, or whole e-texts."

The document is available on the Internet as part of the official conference record: see http://www.ucalgary.ca/~scriptor/papers/murray.html [mirror copy]. For further details on the Electric Scriptorium conference, see Electric Scriptorium Home Page.



[CR: 19961226]

McGrath, Sean. "SGML - A General Purpose Software Development Tool?" Pages 463-472 in SGML '96 Conference Proceedings. Celebrating a Decade of SGML. SGML '96 Conference, Boston, MA, November 18-21, 1996. Sponsored by The Graphic Communications Association (GCA). [Edited by] Conference Co-Chairs: B. Tommie Usdin and Deborah A. Lapeyre. Alexandria, VA: GCA, 1996. Extent: 711 pages. Author's affiliation: Technical Director, Development, Digitome Ltd., Dromore West, County Sligo, Ireland. Tel: +353 96 47391; FAX: +353 96 47392; Email: digitome@iol.ie; WWW: http://www.screen.ie/digitome/.

Abstract: "JSP (Jackson Structured Programming) and JSD (Jackson System Design) are software development methodologies from the late Seventies and early Eighties. Both concentrate heavily on the concept of data models. This talk explores the relationship between JSP/JSD and SGML and considers whether SGML's superiority as a data modelling language might make SGML useful as a general purpose Software Development tool. It also examines how some of the ideas of JSP/JSD can be usefully applied to more traditional SGML processing applications.

"Many of the philosophical ideas underpinning JSP/JSD are remarkably similar to those found in SGML. I.e., Concepts such as content models, exclusions, validation, LINK, tree transformation are all present albeit in disguise. JSP even grapples with CONCUR!

"As well as striking similarities between SGML and JSP/JSD, there are fundamental differences. JSP/JSD provides a modelling paradigm but does not provide any software to support implementation. In other words, JSP/JSD have a modelling language like SGML's DTD but no parsing/validating capabilities. Moreover, JSP/JSD has no direct support for recursive data structures.

"The net result of these differences is that SGML can be shown to be a more powerful modelling system than JSP/JSD. The intriguing thing about this is that it implies that SGML may have a role in fields where these methodologies have been used to good effect. These range from library booking systems to process control applications.

"SGML continues to evolve from a document markup language to a general purpose modelling tool. Related standards such as HyTime and DSSSL expand the scope of SGML based applications above and beyond 'documents' and 'publishing'.

"Comparing SGML with JSP/JSD - philosophically similar, software engineering methodologies - may give us some clues as to where SGML is headed. It may also point to the sort of SGML CASE tools we are likely to see in the future."

Note: The above presentation was part of the "SGML Expert" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.



[CR: 19980202]

McGrath, Sean. PARSEME.1ST. SGML for Software Developers. Charles F. Goldfarb Series On Open Information Management. Upper Saddle River, NJ: Prentice Hall PTR [Professional Technical Reference], [June] 1997. Extent: 364 pages, CDROM disc. ISBN: Cloth (0-13-488967-3). Author's affiliation: Chief Technology Officer, Digitome Electronic Publishing, Enniscrone, County Sligo, Ireland.

Publisher's summary: "The first comprehensive guide for software developers charged with implementing SGML solutions. In this book, an experienced SGML developer shows software engineers and developers all they need to succeed in developing SGML systems and products. Starting with the basics of SGML documents, DTDs, instances and parsing, the book introduces SGML to software developers. Once SGML basics are covered, the book presents the detailed information and worked examples that developers need to implement SGML solutions. It covers parsing in detail, then reviews SGML processing types, and considers the programming languages and techniques that may be used to implement SGML, including line-oriented, recursive descent, event-driven and tree transformation techniques. The book covers implementations of SGML generation, information reuse, dissemination and management. It also presents SGML subtleties, data variations and optional features that software engineers should be aware of. Finally, the book reviews related standards such as the HyTime hypertext/multimedia standard, and the new DSSSL standards for processing and style." [July 26, 1997]

[Another] Publisher's marketing blurb: "This book explores the many facets of SGML that are of particular relevance to Software Engineers. It is, to date, the only existing publication that approaches SGML from the Software Engineering viewpoint. (1) Answers the question "What exactly is SGML? Is it a database? A programming language? A typesetting tool?" etc; (2) Describes the valuable tools that are part of SGML - ranging from design tools to validation tools through to processing tools - and the use of ubiquitous programming tools such as C/C++ compilers, Perl etc. to perform SGML processing; (3) Does not require the use of software to explore material in the book. (4) Provides information that is useful to PC users as well as Unix users. (5) Takes a practical approach to the subject matter." See also the online Table of Contents Listing.

A summary of PARSEME.1ST by the author, together with the volume Preface and online Table of Contents, is available on the SGML/XML Web Page. See also the "Prentice-Hall SGML Series" web page.



[CR: 19980905]

McGrath, Sean. XML by Example. Building E-Commerce Applications. Charles F. Goldfarb Series on Open Information Management. The Definitive XML Series from Charles F. Goldfarb. Upper Saddle River, NJ: Prentice Hall PTR, 1998. ISBN: 0-13-960162-7. Author's affiliation: Technical Director of Digitome Electronic Publishing. Email: sean@digitome.com; WWW: http://www.digitome.com/sean.htm.

[Provisional] Summary: "This book takes a hands on approach to XML illustrating how it can be used to build electronic commerce applications on the Web. XML is covered from the ground up as are a number of XML based e-commerce initiatives such as OFX (Open Financial Exchange) and OTP (Open Trading Protocol). Important standards associated with XML (XSL, XLL and Unicode) are also covered."

Several brief reviews of XML by Example are available online from Amazon.com.



[CR: 19980410]

McGrath, Sean. "XML Programming in Python." 23/2 (February 1998) 82-104. Author's affiliation: Digitome Ltd.

Abstract: "XML, short for 'Extensible Markup Language', is a data description language developed under the auspices of the World Wide Web Consortium. Simply put, XML provides a standard way of describing and capturing the structure and content of information. Everything from flat 'name, address, and telephone number' structures to deeply hierarchical or recursive structures can be described and captured using XML. Many people see XML as the data representation format that will underpin the next generation of web applications. Python, on the other hand, is an object oriented scripting language invented and maintained by Guido van Rossum. It provides a balanced mix of functional and imperative programming features - the usual if / while / for control structures versus lists, map, and lambda functions, for instance. This highly modular, highly portable language, with its rich set of existing libraries, is easily extended - either in Python or by building Python extensions in C/C++. Python's feature mix, particularly its excellent support for object oriented and hierarchical data structures, make it well suited to processing XML encoded information. This also applies to processing HTML in Python. Add to this the variety of Internet protocols (HTTP, FTP, and the like) that Python supports, and you have an excellent Internet programming tool. In short, the combination of XML and Python is a powerful cocktail of information description, representation, and processing power." [from lmg]

Note that some XML parsing facility comes with the standard 1.5 Python distribution; see http://www.python.org/. See the database entry: "Python for SGML Processing", and also "XML and Python."
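As a small illustration of the abstract's point about flat versus recursive structures, the following sketch walks a nested address book. It is not drawn from the article: the sample data, the element names, and the `collect_entries` function are hypothetical, and it relies on the `xml.etree.ElementTree` module from the current Python standard library, not on the parsing facility shipped with Python 1.5.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: flat "name, telephone" entries that may be
# wrapped in arbitrarily nested <group> elements.
SAMPLE = """
<addressbook>
  <entry>
    <name>A. Author</name>
    <telephone>555-0100</telephone>
  </entry>
  <group label="colleagues">
    <group label="editors">
      <entry><name>B. Editor</name><telephone>555-0101</telephone></entry>
    </group>
  </group>
</addressbook>
"""


def collect_entries(element, path=()):
    """Walk the tree recursively, yielding (group path, name, telephone)."""
    for child in element:
        if child.tag == "entry":
            yield path, child.findtext("name"), child.findtext("telephone")
        elif child.tag == "group":
            # Recurse into the nested group, extending the path by its label.
            yield from collect_entries(child, path + (child.get("label"),))


if __name__ == "__main__":
    root = ET.fromstring(SAMPLE)
    for path, name, phone in collect_entries(root):
        print("/".join(path) or "(top level)", name, phone)
```

The recursion in `collect_entries` mirrors the recursion in the data, which is one reason a language with good support for hierarchical structures suits this kind of processing.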



[CR: 19980430]

McKelvie, David; Brew, Chris; Thompson, Henry S. "Using SGML as a Basis for Data-Intensive Natural Language Processing [NLP]." Computers and the Humanities (CHUM) [?]/[?] ([Draft version January 30] [Forthcoming] 1998) [25 pages total] (with 27 references). ISSN: 0010-4817. Authors' affiliation: Language Technology Group, Human Communication Research Centre, University of Edinburgh, Scotland. Email: David.McKelvie@cogsci.ed.ac.uk; WWW: http://www.cogsci.ed.ac.uk/~dmck/.

Abstract: "This paper describes the LT NSL system (McKelvie et al., 1996), an architecture for writing corpus processing tools. This system is then compared with two other systems which address similar issues: the [Sheffield] GATE system (Cunningham et al., 1995) and the IMS Corpus Workbench (Christ, 1994). In particular, we address the advantages and disadvantages of an SGML approach compared with a non-SGML database approach."

The paper is available online in Postscript format; [local archive copy]. This paper is an extended version of the ANLP '97 paper, referenced below. See also the database main entry: The HCRC Map Task Corpus.



[CR: 19980818]

McKelvie, David; Brew, Chris; Thompson, Henry. "Using SGML as a Basis for Data-Intensive Natural Language Processing." Computers and the Humanities (CHUM) 31/5 (1997/1998) 367-388 (with 25 references). ISSN: 0010-4817. Authors' affiliation: Human Communication Research Centre (HCRC).

Abstract: "This paper describes the LTNSL system (McKelvie96), an architecture for writing corpus processing tools. This system is then compared with two other systems which address similar issues, the GATE system from Sheffield and the IMS Corpus Workbench. In particular we address the advantages and disadvantages of an SGML approach compared with a non-SGML database approach." [Ref. No. HCRC/RP-93]

A version of the paper is available online in Postscript format; [local archive copy].



[CR: 19980606]

McKelvie, David; Brew, Chris; Thompson, Henry S. Using SGML as a Basis for Data-Intensive NLP. Paper prepared for ANLP '97 (Washington). University of Edinburgh, Scotland: Language Technology Group, Human Communication Research Centre, University of Edinburgh, 1997. Extent: 8 pages (with 17 references).

Abstract: "This paper describes the LT NSL system (McKelvie et al., 1996), an architecture for writing corpus processing tools. This system is then compared with two other systems which address similar issues: the [Sheffield] GATE system (Cunningham et al., 1995) and the IMS Corpus Workbench (Christ, 1994). In particular, we address the advantages and disadvantages of an SGML approach compared with a non-SGML database approach."

An expanded version of this paper is referenced above: "Using SGML as a Basis for Data-Intensive Natural Language Processing [NLP]". The paper is available online in Postscript format; [local archive copy]. Possibly published in the Proceedings, ANLP '97.



[CR: 19951220]

McKelvie, David; Thompson, Henry S. TEI-Conformant Structural Markup of a Trilingual Parallel Corpus in the ECI Multilingual Corpus 1. HCRC Technical Report, Ref. No. HCRC/TR-48. Edinburgh, Scotland: Human Communication Research Centre, June 1994. Extent: 9 pages, 6 references. Authors' affiliation: Human Communication Research Centre, University of Edinburgh, 2 Buccleuch Place, Edinburgh, Scotland. Email: eucorp@cogsci.ed.ac.uk.

Abstract: "In this paper we provide an overview of the ACL European Corpus Initiative (ECI) Multilingual Corpus 1 (ECI/MC1). In particular, we look at one particular subcorpus in the ECI/MC1, the trilingual corpus of International Labour Organisation reports, and discuss the problems involved in TEI-compliant structural markup and preliminary alignment of this large corpus. We discuss gross structural alignment down to the level of text paragraphs. We see this as a necessary first step in corpus preparation before detailed (possibly automatic) alignment of texts is possible.

"We try and generalise our experience with this corpus to illustrate the process of preliminary markup of large corpora which in their raw state can be in an arbitrary format (e.g., printers tapes, proprietary word-processor format); noisy (not fully parallel, with structure obscured by spelling mistakes); full of poorly documented formatting instructions; and whose structure is present but anything but explicit. We illustrate these points by reference to other parallel subcorpora of ECI/MC1. We attempt to define some guidelines for the development of corpus annotation toolkits which would aid this kind of structural preparation of large corpora."

Available in Postscript format: ftp://scott.cogsci.ed.ac.uk/pub/HCRC-papers/tr-48.ps.gz. Abstract: ftp://scott.cogsci.ed.ac.uk/pub/HCRC-papers/abstracts-tr.html or ftp://scott.cogsci.ed.ac.uk/pub/HCRC-papers/Abstracts/abstracts. Or see mirror copy, December 1995.



[CR: 19971008]

[McKenzie, Matt]. "David Siegel: Bad Boy of Web Design." Seybold Report on Internet Publishing 2/2 (October 1997) 3-8. ISSN: 1090-4808.

The article features an interview with David Siegel, publisher of High Five Magazine and a number of influential books on Web design. Siegel speaks his mind about the role of XML (Extensible Markup Language) at several junctures.



[CR: 19980515]

McKenzie, Matt; Walter, Mark. "Adobe, Macromedia Vie for Leadership in Vector Graphics. The Web Could Really Use a Standard: Is There Room for Both PGML and Flash?" Seybold Report on Internet Publishing 2/9 (May 1998) 21-24. ISSN: 1090-4808. Authors' affiliation: Seybold Publications.

The authors review two recent technology proposals for delivering vector graphics on the Web: Flash and PGML (Precision Graphics Markup Language). They explain why vector graphics have certain advantages over raster graphics in the Web context, and explain some of the particular advantages of PGML's use of XML encoding (e.g., direct searchability, manipulation by the client, exposure to the DOM). The Flash specification from Macromedia has now been published, making it more open to scrutiny; in some cases, its native binary format might result in smaller graphics files that may be rendered more quickly than PGML. The authors conclude that "both the open Flash and PGML announcements are good news for Web publishers."

See details on Precision Graphics Markup Language (PGML) in the dedicated database entry. A version of this article is available on the XML.com Web site: "Adobe's PGML Proposal is Built on PDF and XML."



[CR: 19950828]

McKinley, Tony. "From Books to the Web: To Digitize a Library. A Rare Combination: High-Tech Access and Age-Old Wisdom." Imaging World 4/9 (September 1, 1995) 58-59. ISSN: 1060-894X. Author's affiliation: President, Intelligent Imaging, Berwyn, PA. Email: tonymck@imagebiz.com.

The author describes an ambitious project at the Marx Law Library at the University of Cincinnati. Scanning, OCR, and SGML tagging are at the heart of the process to preserve documents and to make them available electronically. One of the motivations for using SGML is that legal scholars want to be able to retrieve information based upon queries that address low-level descriptions of text objects -- not simply string data. This "Project Diana" is under the direction of Nick Finke. Avalanche FastTag and SoftQuad's Author/Editor software are used in the structuring of the encoded data (with tagging down to paragraph level), and Electronic Book Technologies DynaWeb software is used to create electronic books in SGML for viewing and searching under DynaText. The documents are also down-translated into HTML for use on the Web.

See a related article (with text examples) "From Books to the Web: The On-line OCR Lab" available on the Internet: OCR Lab.



[CR: 19951226]

McNamara, Michael J. "The Document Database: Relational, Object Oriented or Hybrid?" In Proceedings of the Second SGML BeLux Users' Conference. SGML BeLux '95: Second annual conference on the practical use of SGML, Antwerp, Belgium. October 25, 1995. Edited by Hans C. Arents. Leuven, Belgium: Katholieke Universiteit Leuven, 1995. Author's affiliation: Xyvision Limited, 246 Bedford Avenue, Slough, Berkshire, SLI 4RJ United Kingdom. Email: Mike.McNamara@xyvuk.com.

The author argues for taking an eclectic (hybrid) approach if the situation warrants. Extract: "The whole notion of documents as data is relatively new. With the acceptance of sophisticated encoding languages, such as SGML (Standard Generalized Markup Language), it is now possible for high-end publishing and document storage and retrieval systems to break down documents into components (tables, figures, text blocks, etc.) and store these as uniquely addressable items. Those items can later be selectively retrieved, published as a whole document or combined and recombined to produce a variety of products."

The document is available online in HTML format: "The Document Database: Relational, Object Oriented or Hybrid?" [mirror copy, December 1995]. For further details on the 1995 Conference and BeLux, see the contact information for SGML BeLux.



[CR: 19950727]

McQuarrie, Liz. The Accessibility of PDF and Adobe Acrobat Viewers for the Visually Disabled. Adobe position paper, posted to Usenet News, comp.text.pdf and comp.text.sgml (6-July-1995). Adobe Systems, June 30, 1995. Author's affiliation: Adobe Systems Incorporated.

"Adobe's Portable Document Format, the native file format of the Adobe Acrobat products, is a final form description language for documents that is not tied to any operating system or application. PDF provides the document layout richness of Adobe PostScript and allows publishers to retain the look and feel of their publication. On the World Wide Web, PDF is becoming increasingly popular for documents that need the layout richness that HTML currently does not provide. Corporations are also using PDF to disseminate electronic documents over corporate networks, via e-mail, or on CD-ROM. For the visually disabled, however, there are currently some accessibility issues associated with PDF and the use of Adobe Acrobat viewers (Reader and Exchange) for viewing PDF files. This document describes Adobe's plans for making both the Adobe Acrobat viewing products and the PDF file format accessible for the visually disabled."

A significant thread on "PDF versus SGML" may be found in the CTS archives for July, 1995. The document cited here represents an official answer from Adobe on the question of its intent to support the needs of the disabled in its new technologies.

Liz McQuarrie may be reached as: lmcquarr@adobe.com. A copy of the document is available here.



[CR: 19960826]

Mea, V. D.; Beltrami, C. A.; Roberton, V.; Brunato, D. "HTML Generation and Semantic Markup for Telepathology." Computer Networks and ISDN Systems 28/7-11 (May 1996) 1085-1094 (with 15 references). Authors' affiliation: Department of Anatomic Pathology, Udine University, Italy.

"Abstract: The paper presents a new strategy for the authoring of hypermedia documents, and describes an HTML generator called HistMaker and its application to the domain of anatomic pathology. A simple extension to HTML is presented, whose aim is to introduce a general-purpose grouping construct to allow the semantic markup of hierarchically structured hypermedia documents. Such structural information can be used for the effective authoring, browsing and searching of documents. The authoring tool HistMaker is introduced on the basis of a model of a pathologic case; its implementation and practical results are also discussed."

[Based upon presentation at the Fifth International World Wide Web Conference, Paris, France, 6-10 May 1996.]



[CR: 19981109]

Megginson, David. Structuring XML Documents. Charles F. Goldfarb Series on Open Information Management. Upper Saddle River, NJ: Prentice Hall PTR, [March] 1998. Extent: xxxviii + 425 pages, CDROM. ISBN: 0-13-642299-3. Author's affiliation: Microstar Software Ltd.; WWW: http://home.sprynet.com/sprynet/dmeggins/; Email: ak117@freenet.carleton.ca or dmeggins@microstar.com.

Structuring XML Documents is likely to have very positive reviews, and to receive strong endorsements from SGML/XML experts. See the online Table of Contents and overview of the book for details. Alternately, see the publisher's descriptive information, http://www.prenhall.com/ptrbooks/ptr_0136422993.html. See also the "Prentice-Hall SGML Series" web page.

Description: "The book is appropriate for technical writers, documentation project managers, document systems implementors and consultants. It covers both XML and SGML, and reflects the final version of the XML standard. The CD-ROM includes five comprehensive, industry-standard DTDs, a leading parser and other great software tools. The book is written for users who are ready to build sophisticated XML or SGML DTDs that solve complex, real-world document systems challenges. In this book, David Megginson shares his extensive experience and wisdom about quality structured document design and DTD development. Discover proven techniques for building DTDs that are easier to learn, use, and process. Working from five detailed industry-standard models, learn how to analyze DTDs and adapt them for your specific needs. Understand how to ensure structural compatibility throughout your DTDs. Finally, learn how to use the brand-new Architectural Forms standard to simplify many of the most complex DTD problems." [adapted from the publisher's Web site, 980402]

See also the reviews: on the XMLxperts site, and by Michael R Hahn in <TAG>.

Chet Ensign published a review of this book in "Structure Rules! Why DTDs Matter After All" (Markup Languages: Theory & Practice Volume 1, Number 1 [Winter 1999]). See the abstract in the issue summary, and the expanded/annotated Table of Contents in Deborah A. Lapeyre's complementary review article.



[CR: 19950922]

Melby, Alan. "E-TIF: An Electronic Terminology Interchange Format." The Text Encoding Initiative: Background and Contents, Guest Editors Nancy Ide and Jean Véronis = Computers and the Humanities 29/2 (1995) 159-165.

Abstract: "This article begins by emphasizing the importance of terminology in this modern age of technological innovations and machine-based translation systems, establishing the need for a terminology interchange format, and distinguishing between lexicography and terminology. It then reviews previous attempts to establish terminology interchange formats and concludes with a forceful argument for a new system based on the TEI-based notions of elements and attributes."



[CR: 19971227]

Mellier, Pierre; Grize, François. "Electronic Publishing of a Chinese Encyclopaedia in French." Pages 143-150 in SGML/XML '97 Conference Proceedings. SGML/XML '97. "SGML is Alive, Growing, Evolving!" The Washington Sheraton Hotel, Washington, D.C., USA. December 7 - 12, 1997. Sponsored by the Graphic Communications Association (GCA) and Co-sponsored by SGML Open. Conference Chairs: Tommie Usdin (Chair, Mulberry Technologies), Debbie Lapeyre (Co-Chair, Mulberry Technologies); Michael Sperberg-McQueen (Co-Chair, University of Illinois). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 691 pages, CDROM; print volume contains author and title indexes, keyword and acronym lists. Authors' affiliation: [Pierre Mellier]: University of Lausanne, Computer Science Institute, Collège Propédeutique, CH-1015 Lausanne, Switzerland; Phone: 00 41 21 692 35 90; FAX: 00 41 21 692 35 85; Email: pierre.mellier@iismail.unil.ch; WWW: http://www-iis.unil.ch; [François Grize]: University of Lausanne; Email: francois.grize@iismail.unil.ch.

Abstract: "The main purpose of the Ricci Institute (Paris and Taipei) is to publish an encyclopaedia of Chinese language and culture in the main Western languages, including of course French. After issuing a first dictionary of the Chinese language in 1976, the Ricci Institute soon realized that the continuation of its work leading up to the publication of a second encyclopaedia Le Grand Ricci (the Complete Ricci) would not be possible without the use of powerful computer tools.

"Unfortunately, the computer structures developed in the early days of the project proved difficult to use. Based on techniques borrowed from compilation theory, the Computing Institute at Lausanne University developed a translator to convert the content of the dictionary into SGML (Standard Generalized Markup Language).

"Many existing types of software are incapable of mastering the problems associated with the main systems of script. A work like Le Grand Ricci obviously cannot be fitted into the straight-jacket of standardized norms (such as BIG5, UNICODE, etc.). These norms are only capable of handling current characters, whereas the dictionary is encyclopaedic."

"The Grand Dictionnaire Ricci (Complete Ricci Dictionary) of the Chinese language was begun in 1950 by a group of Jesuit sinologists, mostly of French origin. For twenty years, they assembled, selected and studied around 11,000 characters and 180,000 expressions. This thesaurus, which fills 40 volumes typewritten in two copies, led to the publication of the Petit Ricci: dictionnaire français de la langue chinoise which contains translations into a rich, clear and precise French of some 6,000 Chinese characters and 50,000 expressions. Over 18,000 copies of this work have been sold to date, not counting the versions published in other Western languages."

This paper was delivered as part of the "User" track in the SGML/XML '97 Conference.

Note: The SGML/XML '97 conference proceedings volume is available from the Graphic Communications Association, 100 Daingerfield Road, Alexandria, VA 22314-2888; Tel: +1 (703) 519-8160; FAX: +1 (703) 548-2867. Complete information about the conference (e.g., program listing, tutorials, show guide, DTDs, conference reports) is provided in the dedicated conference section of the SGML/XML Web Page and via the GCA Web server. The electronic proceedings on CDROM was produced courtesy of Jouve Data Management (Jouve PubUser).



Michalski, Jerry. "Content in Context: The Future of SGML and HTML." Release 1.0 94/9 (September 27, 1994) 1 (total 13 pages).



[CR: 19971227]

Micksch, Beth. SGML Document Management. Panel Presentation at Documation '97. Thursday, Santa Clara, California, Convention Center. New York, NY: RAI, February 27, 1997. Author's affiliation: Application Development Manager, Research Institute of America Group.

[Session] Summary: "You have heard the theoretical benefits, now hear the 'real scoop' on what SGML has done for information providers. Experienced users will present their document management applications and describe the challenges, successes and benefits in using SGML. The session will highlight practical information on cost savings, process and time-to-market improvement and the added value that SGML brought to their organizations."

See the conference description for other information.



[CR: 19971227]

Micksch, Beth. "Entity Management Across Groups and Platforms." Pages 319-326 in SGML '95 Conference Proceedings. "SGML '95: 'Expanding the Universe.'" Sheraton Boston Hotel and Towers, Boston, MA, USA. December 4 - 7, 1995. Sponsored by the Graphic Communications Association (GCA). Conference Chairs: Tommie Usdin and Debbie Lapeyre. Alexandria, VA: Graphic Communications Association (GCA), 1995. Extent: v + 526 pages. Author's affiliation: Senior Technical Manager, Workgroup Application Integration, Intergraph Corporation.

Abstract: "Systems and general text entities can easily be managed and manipulated by an SGML author or technical writer. Some of these entities require standardization within a corporation so that they can be shared, but not modified by a random user. Others must be customized for a particular document or group. A description of tools and methods used to accomplish this task as well as a description of actual user environments where it is implemented will be outlined. Special features used in document translation will also be discussed."

Other information on "SGML '95: 'Expanding the Universe'" is referenced in the main conference entry. For other information on SGML entities and entity management: SGML Entity Types, and Entity Management.



[CR: 19971227]

Micksch, Beth. ISO 12083 Announcement. Presentation at SGML '93 (December 6-9, 1993, Boston, MA, USA). Huntsville, AL: Intergraph Corporation, 1993. Author's affiliation: Intergraph Corporation; [1997: Application Development Manager, Research Institute of America Group].

Summary: "This presentation was intended to provide a brief history and update on ISO 12083 'The Electronic Manuscript Preparation and Markup' Standard. Formerly an ANSI standard (Z39.59, but generally referred to as the 'AAP'), ISO 12083 is now being fast-tracked through the ISO standards process. The first ballot on the Draft International Standard (DIS) was in November 1992, and the voting went as follows: 14 positive, 5 negative, and one abstention."

"The Standard is intended to facilitate the creation and interchange of books, articles and serials in electronic form. It is meant to provide a basic toolkit which users can pick up and modify according to their needs. The Standard is meant for use by authors, publishers, libraries, library users, and database vendors. Use of the Standard is indicated by its public identifier (e.g. ISO 12083:1993//DTD Book//EN - for the Book DTD). Elements or entity references may be removed or modified as needed. Users can declare their own elements in external parameter entities, and the parameter entities defined in ISO 12083 can be overridden to modify order and occurrence or to specify user defined elements/attributes; alias elements are not permitted. The Standard allows SHORTTAG and OMITTAG, although the revised usage examples will be fully normalized. The application must conform to ISO 8879:1986. ISO 12083 contains four DTDs: Book, Article, Serial, and Mathematics. It has a very large Annex (A) which comments on the DTDs and covers such things as design philosophy, structure descriptions, special characters, electronic review, mathematics, tables, braille/large print/computer voice facilities, and HyTime facilities. Annex B contains descriptions of the elements, and indicates how all the elements relate to one another. Annex C contains examples, some of which are normalized versions of the examples which first appeared in the ANSI standard."

"Beth closed by remarking that the second edition of Eric van Herwijnen's book 'Practical SGML' has been produced using the ISO 12083 Standard (including the HyTime capabilities), and things seem to have worked pretty well. The indications from other tests which are currently underway have been equally positive."

Note: the summary above was adapted from the SGML '93 Conference Report, by Michael Popham.

See the database entry ISO 12083 DTDs [EPSIG] for other information on the standard.



[CR: 19971227]

Micksch, Beth. "Southeast SGML Users' Group." <TAG> 6/5 (May 1993) 11. ISSN: 1067-9197. Author's affiliation: Intergraph Corporation.

The author supplies a notice on the formation of an SGML Users' Group in the southeastern region of the United States, under the aegis of the International SGML Users' Group. Previously, Beth Micksch also organized the Mid-Atlantic SGML Users' Group and the Midwest SGML Forum.



[CR: 19970212]

Mikula, Norbert H. "Electronic Databooks: Proof of Concept." In: Proceedings of the 3rd Annual Conference on the Practical Use of SGML. "A Decade of Power." Third Annual [Belux] Conference on the Practical Use of SGML. Business Faculty, Sint-Lendriksborre 6, Brussels, Belgium. October 31, 1996. Sponsored by SGML Belux (Belgian-Luxembourg Chapter of the International SGML Users' Group). Leuven, Belgium: Belux, 1996. Author's affiliation: Philips Semiconductors, Building BE-24 - P.O. Box 218, NL-5600 MD Eindhoven, The Netherlands. Email: nmikula@edu.uni-klu.ac.at.

Abstract: "The semiconductors industry of today faces a highly competitive market. It is not enough anymore to develop high quality products in less time than others. How to get information about a product to the customer is one of the key factors for success in today's business environment. Some of the most crucial factors influencing the effective dissemination of information are: (1) Online access to ensure up-to-date information; (2) Convenient access and query mechanisms to find relevant chunks of information in documents; (3) Distribution in a platform and vendor independent standard data format.

"The Pinnacles Component Information Standard (PCIS), developed by the semiconductor industry, is being used as a means of distributing rich and computer-understandable information. The approach presented in this paper targets these areas by deploying the advantages of publishing via the Internet combined with the new powerful Java programming language and support for ISO standard 10179, DSSSL (Document Style Semantics and Specification Language)."

The system discussed in this paper can be divided into 3 parts: (1) Cappuccino - An SGML parser written in Java; (2) Yade - Yet another DSSSL engine - A prototype implementation of a DSSSL engine based on a Scheme interpreter written in Java (Kawa 0.2, Copyright by Copernican Solutions). The DSSSL engine aims to incorporate the basic concepts of ISO 10179 with special emphasis on DSSSL-Online, the "light" version of DSSSL designed for online use; (3) PSC-EDB - Philips Semiconductors Electronic Databook. An application interface to the DSSSL engine using the Abstract Window Toolkit (AWT, the Java windowing interface).

Available online in HTML format: Electronic Databooks: Proof of Concept, by Norbert H. Mikula; [mirror copy]. For further information on the conference, see: (1) the description in the conference announcement and call for papers, and (2) the full program listing, or (3) the main conference entry in the SGML/XML Web Page.



[CR: 19971125]

Mikula, Norbert H. "PDoS - Pinnacles DSSSL-O Stylesheet: Stylesheet Design for Online and Paper-Based Delivery." Page(s) 233-244 in SGML '97 Conference Proceedings. SGML Europe '97. "The Next Decade - Pushing the Envelope." Princesa Sofia Intercontinental, Barcelona, Spain. 11-15 May, 1997. Sponsored by Graphic Communications Association (GCA) and SGML Open. Conference Chair: Pamela L. Gennusa (Director, Database Publishing Systems Ltd). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 342 pages, CDROM. Author's affiliation: University of Klagenfurt, Department of Informatics, Austria; Email: nmikula@edu.uni-klu.ac.at; WWW: http://www.edu.uni-klu.ac.at.

Abstract: "PCIS DSSSL-O Stylesheet (PDoS). PCIS - The Pinnacles Component Information Standard is an ISO 8879 application designed to meet the markup needs of the semiconductors industry. It was designed under the auspices of the Pinnacles Group, a consortium consisting of major companies in the semiconductors industry. The PCIS tag-set is the de-facto standard in the semiconductors industry. DSSSL-O is a subset of ISO/IEC 10179 (DSSSL), designed to address formatting requirements in the area of online-display (rendering) of SGML/XML data using DSSSL.

PDoS is a first attempt to create a DSSSL stylesheet for the PCIS DTD. This paper discusses a variety of aspects encountered/addressed during the development of this prototype stylesheet, such as modular stylesheet design and user configurable stylesheet architectures. Testing and development of the presented concepts has been done using James Clark's DSSSL engine Jade and the author's Java-based DSSSL renderer Yade."

Note: The electronic conference proceedings in hypertext were produced by Inso Corporation (DynaText) and by High Text (EnLIGHTeN). Information about the SGML Europe '97 Conference may be found in the main database entry.



[CR: 19950408]

Miller, John J. H. (editor). PROTEXT IV. Proceedings of the Fourth International Conference on Text Processing Systems. International Conference on Text Processing Systems, Boston, MA, USA 20-22 October 1987. Sponsored by INCA - Institute for Numerical Computation and Analysis. Dun Laoghaire, Ireland: Boole Press, Ltd., 1987. vii + 153 pages. ISBN: 0-906783-80-1 (hardback); 0-906783-79-8 (paperback).

The following topics were dealt with: digital type fonts for text processing systems; complexity in structured documents, user interface issues; text processing in ideographic languages; multilingual editor based on GNU emacs and TEX; run time adaptive interfaces; advanced Chinese text processing and typesetting system; text processing systems with linguistic knowledge; making WYSIWYG characters shape up; spacing a line of music; best match searching in document databases; font matching with flexifonts; parser generator for SGML; typefounder-collection of digital font creation tools; SGML as a software development tool; and editor for structured technical documents. Abstracts of individual papers can be found under the relevant classification codes in this or other issues.



Miller, R. S. "Textual Databases: An Object-based Model." Pages 313-317 (with 4 references) in 14th National Online Meeting. Proceedings - 1993 (National Online Meeting, New York, NY, 4-6 May 1993.) Edited by M. E. Williams. Medford, NJ: Learned Information, 1993. xii + 452 pages.

Abstract: "Large information delivery systems of the future will be distributed, object-based systems. These systems will very likely be implemented with client/server architectures where special-purpose servers work together in a loosely coupled environment to deliver information to their clients. This information will include text and also other forms of data such as image, graphic or tabular data. These systems will incorporate and extend the idea of markup to include the content of a variety of object types as well as text objects. This extended marking of objects will function in much the same way that SGML functions today for documents. This marking will provide for data interchange, structuring and output interpretation. Object content marking (tokenization) will add intelligence and programmability to database objects and provide the basis for standardizing object structure and database technology. The idea of open systems will be extended to databases and the objects they contain. Therefore, text objects will be more formal in order to function in future information delivery systems. These text objects will contain structures of instance variables which will enable objects to identify themselves to the system and also provide for ad hoc association in sets and hierarchies."



[CR: 19960806]

Milowski, R. Alexander. A Theory of Documents: How SGML Can Change the Way We Look at Information. Paper presented at the SGML '95 Conference (December 4-7, 1995). Minneapolis, MN: Copernican Solutions Inc., December 1995. Extent: approximately 14 pages. Author's affiliation: Copernican Solutions.

"Traditional information systems have usually treated the problem of producing documents with a process of extraction. That is, one must run a "report query" and the information is extracted, processed, transformed, and either presented for viewing or printing. Such a process means that document generation can be a tedious process and can easily get out of synchronization with the information that produced the documents."

"In the following passages, a document-centric framework will be outlined such that SGML application systems and SGML information can be produced that encapsulates all facets of the production and process of information. These systems will use SGML as the native interchange and medium for semantics in both presentation and manipulation." [from the Overview]

"Three major points have been made in this essay: (1) Repositories of information can become entities through storage managers; (2) DTD and instance components allow the reuse of units of information; (3) Semantic Attachment allows documents (and repositories) to execute as many different kinds of applications. . . What this means is that a document-centric system can be created such that a user is always presented with, processing, and using documents. These documents may be in the form of raw SGML to SGML applications in which the user does not even know that SGML is being used. In any case, SGML is glue that pulls many kinds of information together and presents a process-independent way of representing information on which many semantics may be attached." [from the Conclusions]

Available online: the full text of the paper, in SGML format; slides from the presentation.



[CR: 19961226]

Milowski, R. Alexander. "Transformation as the Basis of Application: DSSSL in Practice." Pages 449-462 in SGML '96 Conference Proceedings. Celebrating a Decade of SGML. SGML '96 Conference, Boston, MA, November 18-21, 1996. Sponsored by The Graphic Communications Association (GCA). [Edited by] Conference Co-Chairs: B. Tommie Usdin and Deborah A. Lapeyre. Alexandria, VA: GCA, 1996. Extent: 711 pages. Author's affiliation: President/Principal Researcher, Copernican Solutions Incorporated, 1313 Fifth St. SE, Suite 311, Minneapolis, Minnesota 55414, USA. Tel: +1 (612) 379-3608; Email: alex@copsol.com; WWW: http://www.copsol.com/.

Abstract: "Transformations allow a developer and user to think of their documents as active parts of a system. In doing so, we can re-orient our documentation systems and other document-related systems to use transformations as the means by which documents are processed or produced.

With the advent of DSSSL as a standard, we now have the means to be able to create systems that not only read both standard documents but also standard transformations. Simple tasks like editing can be re-oriented as a transformation process. Thus, transformation takes 'center stage' as the 'conductor' of the processes necessary to produce your documents.

This talk will introduce the concept of transformation as the basis of an application and cover the infrastructure necessary to produce such systems using SGML, HyTime, and DSSSL."

See, for example, the description of the SENG Transformation Engine from Copernican Solutions Incorporated as an experimental DSSSL engine that is now [December 1996] being extended to include support for the DSSSL transformation language: SDQL has been implemented, and the WIP 0.1 preview will support abstract groves.

Note: The above presentation was part of the "SGML Expert" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.



[CR: 19971227]

Milowski, R. Alexander. "Web Application Frameworks. Applications of SGML, XML, DSSSL, and Java for a Web Environment." Pages 473-486 in SGML/XML '97 Conference Proceedings. SGML/XML '97. "SGML is Alive, Growing, Evolving!" The Washington Sheraton Hotel, Washington, D.C., USA. December 7 - 12, 1997. Sponsored by the Graphic Communications Association (GCA) and Co-sponsored by SGML Open. Conference Chairs: Tommie Usdin (Chair, Mulberry Technologies), Debbie Lapeyre (Co-Chair, Mulberry Technologies); Michael Sperberg-McQueen (Co-Chair, University of Illinois). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 691 pages, CDROM; print volume contains author and title indexes, keyword and acronym lists. Author's affiliation: [R. Alexander Milowski]: President/Principal Researcher, Copernican Solutions Incorporated, 1313 Fifth St. SE, Suite 311, Minneapolis, Minnesota 55414 USA; Email: alex@copsol.com; WWW: http://www.copsol.com/.

Abstract: "This presentation will introduce the concept of Web Application Frameworks for standards technology like SGML, XML, and DSSSL within web environments. It will outline the relationship of these standards to web-oriented languages like Java. The talk will focus on examples of solving problems and delivering solutions through frameworks in a web environment. The idea of 'resources' will be introduced and several standard resources will be identified that should be available on both the client (browser) and server (web server) through these examples. The presentation will introduce several design patterns for how these standards in conjunction with programming language and implementation standards can be used to deliver complex document applications to arbitrary web clients.

"In addition, this presentation will introduce several of the application programming interface standards initiatives that have taken place within the industry recently. From the examples presented within the talk, a vision of standard Web Application Framework will be developed outlining what further development in both standards and technology needs to happen to realize such frameworks."

This paper was delivered as part of the "Expert" track in the SGML/XML '97 Conference.

Note: The SGML/XML '97 conference proceedings volume is available from the Graphic Communications Association, 100 Daingerfield Road, Alexandria, VA 22314-2888; Tel: +1 (703) 519-8160; FAX: +1 (703) 548-2867. Complete information about the conference (e.g., program listing, tutorials, show guide, DTDs, conference reports) is provided in the dedicated conference section of the SGML/XML Web Page and via the GCA Web server. The electronic proceedings on CDROM was produced courtesy of Jouve Data Management (Jouve PubUser).



[CR: 19950716]

Min Zheng; Rada, R. "A Model for Computer Supported Collaborative Work and Document Re-use." Intelligent Tutoring Media 4/1 (February 1993) 3-14 (with 30 references). Authors' affiliation: Institute of Computer Science and Technology, Peking University, China.

"Abstract: A model is presented for computer supported collaborative work (CSCW) which shows collaborative authoring and reading and information interchange to be facilitated by international standards (SGML, HyTime), and a widely accepted hypertext reference model (the Dexter model). The model, called the Dexter-groupware (DG) model, is an extension of the Dexter hypertext reference model and incorporates a conversion and interchange model (SGML-HyTime-Dexter model) that provides an interchange mechanism to support collaborative work. Based on the DG model, an open environment for CSCW and document re-use is proposed, and a prototype, the MUCH (many using and creating hypermedia) environment, is described. The MUCH system forms the centre of the environment and a set of import/export tools allows the MUCH system to interchange information with other systems by going through international standards. The MUCH environment also provides a mechanism which enables non-hypermedia systems to share the information in the MUCH database by exporting a part of the database into a linear document, reading and editing in a text editor, and then importing back into MUCH without losing any hypermedia information."



[CR: 19950716]

Min Zheng; Rada, R. "Text-hypertext Mutual Conversion and Hypertext Interchange through SGML." Pages 139-147 (with 31 references) in Proceedings of the Second International Conference on Information and Knowledge Management. CIKM 93. Second International Conference on Information and Knowledge Management, Washington, DC, USA, 1-5 November, 1993. Edited by B. Bhargava, T. Finin and Y. Yesha. New York, NY: ACM Press, 1993. Authors' affiliation: Institute of Computer Science & Technology, Peking University, Beijing, China.

"Abstract: The paper presents an SGML-MUCH (S-M) system, the I/O subsystem of a collaborative authoring and reuse system called MUCH (Many Using and Creating Hypermedia), for text and hypertext mutual conversion and hypertext interchange. The S-M system can dynamically generate text documents from the MUCH document database by traversing a subgraph of the hypertext database. With various options on traversal, different versions of documents can be generated from the same set of nodes and links in the document database. The documents generated from the MUCH document database include not only document content but also groupware and hypertext information that can be used for hypertext interchange. With this generated document, hypertext for different systems such as Guide and Hyperties can be automatically generated and new features, like alternate outline and automatically derived indices can also be incorporated. The S-M system, which was developed using several public domain development tools, adheres to standards where possible, and generally focuses on 'openness'."



[CR: 19971106]

Miner, Robert R.; Ion, Patrick D. F. "HTML-Math. Mathematical Markup Language Working Draft." Pages 83-89 in XML: Principles, Tools, and Techniques. Guest Edited by Dan Connolly. World Wide Web Journal [edited by Rohit Khare] Volume 2, Issue 4. Sebastopol, CA: O'Reilly & Associates, Fall 1997. Extent: xxii + 248 pages. ISBN: 1-56592-349-9. ISSN: 1085-2301. Authors' affiliation: [Miner]: Geometry Center, University of Minnesota; [Ion]: American Mathematical Society.

Abstract: "The HTML-Math Working Group recently released another version of its Working Draft of MathML. The full text of this Working Draft is available at http://www.w3.org/TR/WD-math. This note should serve to point the way to the proposal outlined in the full Working Draft, and will describe a little of the history, current state, and the future of the HTML-Math work."

Note: The Working draft of July 10, 1997 clarifies the relationship of HTML-Math to XML as follows: "Mathematical Markup Language, or MathML, is an XML application for describing mathematical notation and capturing both its structure and content. The goal of MathML is to enable mathematics to be served, received, and processed on the Web, just as HTML has enabled this functionality for text. [...] The Mathematical Markup Language, or MathML, working draft defines an XML compliant mark-up language for describing equation content and presentation. Equation presentation mark up is carried out in a way which respects logical expression structure. This allows content description to be attached in a natural and effective way. For example, one might add an annotation to a superscript indicating it denotes function composition instead of power in this expression. Alternatively, authors may directly use equation content tags to mark up common things like trig functions, powers, and so on. Content tags are associated with notational conventions, for example, adding mark up for a power, which is by default rendered as a superscript."
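
The distinction the draft draws between presentation markup and content markup can be illustrated with a minimal Python sketch for "x squared". The element names used here (msup, mi, mn, apply, power, ci, cn) are taken from later MathML Recommendations and are an assumption for illustration; the July 1997 Working Draft may have used different names.

```python
import xml.etree.ElementTree as ET

# Presentation markup: describes the notation (a base with a superscript).
presentation = ET.Element("msup")
ET.SubElement(presentation, "mi").text = "x"   # identifier
ET.SubElement(presentation, "mn").text = "2"   # number

# Content markup: describes the meaning (the power function applied to x and 2).
content = ET.Element("apply")
ET.SubElement(content, "power")                # operator, an empty element
ET.SubElement(content, "ci").text = "x"        # content identifier
ET.SubElement(content, "cn").text = "2"        # content number

presentation_xml = ET.tostring(presentation, encoding="unicode")
content_xml = ET.tostring(content, encoding="unicode")
print(presentation_xml)
print(content_xml)
```

A renderer would by default display both forms as a superscript, but only the content form states unambiguously that a power, rather than function composition, is intended.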



[CR: 19980218]

Mittelbach, Frank; Rowley, Chris. "Language Information in Structured Documents: Markup and Rendering -- Concepts and Problems." TUGboat: The Communications of the TEX Users Group [Proceedings of the 1997 Annual Meeting] 18/3 (September 1997) 199-205. ISSN: 0896-3207. Authors' affiliation: LaTeX3 Project. Email, [Mittelbach]: Frank.Mittelbach@eds.com; [Rowley]: C.A.Rowley@open.ac.uk.

Abstract: "In this paper we discuss the structure and processing of multi-lingual documents, both at a general level and in relation to a proposed extension to the (no longer so new) standard LaTeX. Both in general and in the particular case of this proposal, our work would be impossible without the enormous support, both practical and moral, that we get from our fellow members of the LaTeX3 project team (who maintain and enhance LaTeX) and from people all over the world who contribute to the development of LaTeX with their suggestions and comments."

Compare the LaTeX(3) solution discussed in the article to that in XML 1.0 (February 1998): "In document processing, it is often useful to identify the natural or formal language in which the content is written. A special attribute named xml:lang may be inserted in documents to specify the language used in the contents and attribute values of any element in an XML document. In valid documents, this attribute, like any other, must be declared if it is used. The values of the attribute are language identifiers as defined by [IETF RFC 1766], Tags for the Identification of Languages. . . The intent declared with xml:lang is considered to apply to all attributes and content of the element where it is specified, unless overridden with an instance of xml:lang on another element within that content."
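
The scoping rule quoted above (a declaration applies to the element and its content unless overridden on a nested element) can be sketched in Python with the standard xml.etree.ElementTree module. The sample document and the helper name effective_languages are hypothetical and serve only as illustration.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

def effective_languages(element, inherited=None):
    """Yield (element, language) pairs, applying xml:lang inheritance."""
    # An xml:lang on this element overrides whatever was inherited.
    lang = element.get(XML_LANG, inherited)
    yield element, lang
    for child in element:
        yield from effective_languages(child, lang)

doc = ET.fromstring(
    '<doc xml:lang="en">'
    '<p>Hello</p>'
    '<p xml:lang="de">Guten Tag <q>Hallo</q></p>'
    '</doc>'
)

langs = [(el.tag, lang) for el, lang in effective_languages(doc)]
print(langs)
# [('doc', 'en'), ('p', 'en'), ('p', 'de'), ('q', 'de')]
```

The first p inherits "en" from the root, while the q element inherits "de" from the p that overrides it.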

The paper was presented at the Second International Symposium on Multilingual Information Processing, March 26-28, 1997, Tsukuba, Japan, sponsored by the Electrotechnical Laboratory, MITI, in the session "TeX for multilingual environment" (11:30-12:15, March 27), organized by Dr. Yannis Haralambous. See also the conference entry. For more on SGML/XML and TeX, see the dedicated database entry and the topical bibliography listing.



[CR: 19980409]

Mittelbach, Frank; Rowley, Chris. "The LaTeX3 Project." TUGboat: The Communications of the TEX Users Group [Proceedings of the 1997 Annual Meeting] 18/3 (September 1997) 195-198. ISSN: 0896-3207. Authors' affiliation: LaTeX3 Project. Email, [Mittelbach]: Frank.Mittelbach@eds.com; [Rowley]: C.A.Rowley@open.ac.uk.

Abstract: "This article describes the motivation, achievements, and future of the LaTeX3 Project, which was established to produce a new version of LaTeX, the widely-used and highly-acclaimed document preparation system. It also describes how you can help us to achieve our aims."

Summary: According to this article, several facilities are being designed and developed to directly support the processing of SGML/XML-encoded documents through LaTeX. See the full text of the "Description" from the article for details. The primary requirements being fulfilled in this effort include:

  • Provision of a syntax that allows highly automated translation from popular SGML DTDs into LaTeX document classes; to be provided as standard with the new version of LaTeX
  • Support for the SGML concepts of 'entity,' 'attribute,' and 'short reference' in the syntax of the new LaTeX user interface, implemented in a way that makes it possible to map these constructs directly to the corresponding SGML features
  • Support for hyperlinks of the kind used in HTML and XML, and support for other features of online documents
  • Straightforward style-designer interface to support the independent specification of typographic requirements and their mapping to SGML constructs in document instances, so that different layouts may be specified for the same DTD
  • Visual, menu-driven interface for typographic style design
  • Support for the DSSSL specification in the interface, and for HTML/XML stylesheets

For other literature on SGML/XML and (La)TeX, see the dedicated document with references.



[CR: 19971125]

Möller, Henning. "Towards an SGML Diff Using Temporal Documents." Pages 251-259 in SGML '97 Conference Proceedings. SGML Europe '97. "The Next Decade - Pushing the Envelope." Princesa Sofia Intercontinental, Barcelona, Spain. 11-15 May, 1997. Sponsored by Graphic Communications Association (GCA) and SGML Open. Conference Chair: Pamela L. Gennusa (Director, Database Publishing Systems Ltd). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 342 pages, CDROM. Author's affiliation: Siemens AG, Corporate Technology, München, Germany; WWW: http://www.inf.fu-berlin.de/inst/ag-db/moeller/hm.htm; Email: henning.moeller@mchp.siemens.de.

Abstract: "A method to develop an SGML diff is presented. The extendible approach is based on a change-oriented model for structured documents containing objects of type 'data', 'element', 'attribute', or 'link'. The difference between two documents, A and B, is computed and expressed as a sequence of changes that must be incrementally performed on the objects in A in order to obtain the state B.

"Companies that must handle huge amounts of technical documentation can use SGML to mark up their documents in order to process, interchange, and archive them independently of any particular proprietary format. An author often faces different versions and variants of a document. In case these are two SGML document instances that are similar, but where the relation between them is not clear, he needs a meaningful description of their differences.

"As SGML is used to encode additional information about the structure of a document an SGML diff should reflect the data structure of the documents it compares. It is not enough to get a description of differences between two SGML document instances on the basis of their unstructured textual contents. [...] The aim of this paper is to propose a method to build an SGML diff that computes the difference between two SGML document instances A and B as the sequence of changes that would produce B if incrementally performed on A. The task of developing such an SGML diff is threefold: first, the object types of an SGML document must be determined, and second the possible changes for those objects have to be specified. Finally an algorithm is needed to compute the delta between two documents."

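The idea in the abstract above, expressing the difference between instances A and B as a sequence of changes that yields B when applied to A, can be sketched in Python. This is not the algorithm of the paper: it uses XML and xml.etree.ElementTree as a stand-in for an SGML parser, matches child elements purely by position, treats text that follows a child element as part of that child, and omits the 'link' object type. All function names and change labels are hypothetical.

```python
import copy
import xml.etree.ElementTree as ET

def diff(a, b, path=()):
    """Return a list of changes that turns element a into element b."""
    changes = []
    for name in sorted(set(a.attrib) | set(b.attrib)):
        if a.get(name) != b.get(name):
            # A value of None means the attribute is removed.
            changes.append(("attribute", path, name, b.get(name)))
    if a.text != b.text:
        changes.append(("data", path, b.text))
    common = min(len(a), len(b))
    for i in range(common):
        if a[i].tag == b[i].tag and a[i].tail == b[i].tail:
            changes.extend(diff(a[i], b[i], path + (i,)))
        else:
            changes.append(("replace", path + (i,), copy.deepcopy(b[i])))
    # Delete surplus children from the end so earlier indices stay valid.
    for i in range(len(a) - 1, common - 1, -1):
        changes.append(("delete", path + (i,)))
    for i in range(common, len(b)):
        changes.append(("insert", path + (i,), copy.deepcopy(b[i])))
    return changes

def walk(root, path):
    """Follow a tuple of child indices down from root."""
    node = root
    for i in path:
        node = node[i]
    return node

def apply(changes, root):
    """Apply the changes incrementally to a copy of root."""
    root = copy.deepcopy(root)
    for kind, path, *args in changes:
        if kind == "data":
            walk(root, path).text = args[0]
        elif kind == "attribute":
            node = walk(root, path)
            if args[1] is None:
                del node.attrib[args[0]]
            else:
                node.set(args[0], args[1])
        else:
            parent, index = walk(root, path[:-1]), path[-1]
            if kind == "delete":
                del parent[index]
            elif kind == "replace":
                parent[index] = copy.deepcopy(args[0])
            else:  # "insert"
                parent.insert(index, copy.deepcopy(args[0]))
    return root

a = ET.fromstring('<doc id="a1"><title>Manual</title><p>Old text</p><note>obsolete</note></doc>')
b = ET.fromstring('<doc id="a1" status="draft"><title>Manual</title><p>New text</p></doc>')

delta = diff(a, b)
result = apply(delta, a)
print(delta)
print(ET.tostring(result) == ET.tostring(b))
```

Positional matching keeps the sketch short but reports a shifted child as a series of replacements; a practical structure-aware diff would need a better matching step.
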
Note: The electronic conference proceedings in hypertext were produced by Inso Corporation (DynaText) and by High Text (EnLIGHTeN). Information about the SGML Europe '97 Conference may be found in the main database entry.



[CR: 19961226]

Moffatt, Godfrey. "Introducing SGML Into the RAF Flight Manuals World OR Throttle to Bottle in Two Extraordinary Years." Pages 643-650 in SGML '96 Conference Proceedings. Celebrating a Decade of SGML. SGML '96 Conference, Boston, MA, November 18-21, 1996. Sponsored by The Graphic Communications Association (GCA). [Edited by] Conference Co-Chairs: B. Tommie Usdin and Deborah A. Lapeyre. Alexandria, VA: GCA, 1996. Extent: 711 pages. Author's affiliation: Royal Air Force Handling Squadron, Boscombe Down, Salisbury, Wilts SP4 0JF, UK.

Abstract: "RAFHS produce the Aircrew Manuals and Flight Reference Cards required by the aircrew of all three United Kingdom services - Army, Navy and Airforce. Members of RAFHS team are specialists in the aircraft types flown by the Forces. They are not computer professionals and therefore the system acquired had to be intuitive, modern and have excellent user interface. RAFHS produced Camera Ready Copy (CRC) using a commercial DTP application. Information was received from a variety of sources including paper and proprietary word processing format. Graphics were always provided on paper and needed to be scanned-in by the authors and saved electronically. Any changes to graphics had to be returned to the originators for amendment and the whole process started again. Management of all these documents was a manual paper based system, as was the audit trail for revisions.

"We learned these lessons on the way: (1) Assemble a small in-house team who are aware of your business processes and are forward thinking; (2) Educate all concerned because a little background goes a long way; (3) Know the principles of SGML; (4) Plan for change; (5) What do you require from your system and therefore your hardware and software; (6) Work closely with the consultants to ensure they understand your requirements; (7) Insist on a thorough, comprehensive document review; (8) Plan for change; (9) Understand your document structures and graphics requirements; (10) Graphics Packages; (11) Decide on the type of DTD - modular or document based; (12) Plan for change; (13) Mark-up examples of your product; (14) Plan for change; (15) Test the DTD against examples of your documentation; (16) Don't forget the small details, like attributes; (17) Plan for change; (18) Expect to rework, again and again; (19) Sort out the problems of publication; (20) The styles (FOSI? DSSSL? System specific output?).

"The RAFHS installed system provides them with an integrated solution providing SGML author/editing, document management, revision tracking to provide future proofed data, an airworthiness audit trail, and finally output formatting and pagination by a composition engine."

Note: The above presentation was part of the "SGML Case Studies" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.



[CR: 19971125]

Moffatt, Godfrey. "Introducing SGML into the RAF Flight Manuals World or Throttle to Bottle in Two Extraordinary Years." Pages 83-89 in SGML '97 Conference Proceedings. SGML Europe '97. "The Next Decade - Pushing the Envelope." Princesa Sofia Intercontinental, Barcelona, Spain. 11-15 May, 1997. Sponsored by Graphic Communications Association (GCA) and SGML Open. Conference Chair: Pamela L. Gennusa (Director, Database Publishing Systems Ltd). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 342 pages, CDROM. Author's affiliation: Royal Air Force Handling Squadron, Boscombe Down, Salisbury, Wilts, United Kingdom.

Abstract: "RAFHS produce the Aircraft Manuals and Flight Reference Cards required by the aircrew of all three United Kingdom services -- Army, Navy, and Airforce. Members of the RAFHS team are specialists in the aircraft types flown by the Forces. They are not computer professionals and therefore the system acquired had to be intuitive, modern, and have an excellent user interface. The RAFHS system provides an integrated solution including SGML author/editing, document management, revision tracking to provide future-proofed data, an airworthiness audit trail, and finally output formatting and pagination by a composition engine."

"RAFHS (ROYAL AIR FORCE HANDLING SQUADRON) produce the AM (Aircrew Manuals) and FRC (Flight Reference Cards) required by the aircrew of all three United Kingdom services -- Army, Navy and Air Force. Members of RAFHS are specialists in the aircraft types and roles flown by the forces. The mix is approximately 50/50 serving officers and retired officers. The retired officers, in general, are over 55 and only just computer literate having moved in 1989, reluctantly, from a cut and paste editorial system. A system that owed its genesis to advanced technological change much earlier in the century; essentially coloured pencils, scissors and glue.

"AM are substantial documents covering the handling of the aircraft under normal and abnormal flight conditions, the description of the aircraft structure and systems and their operation under normal and failure modes and the limitations that apply to all phases of flight. Extensive use is made of A3 and A4 illustrations in both black and white and colour. FRC are, for most modern aircraft, large documents in A5 format and can be as large as 160 sheets. The contents are checks and drills to be carried out by the crew during the period they occupy the flight deck or cockpit and include normal and emergency procedures.

"Moving to SGML provides considerable benefits, many well documented which need not be rehearsed again here, others less well known like the focusing, within our organisation, of the authors' energies. To achieve these benefits cost effectively and efficiently many lessons can be drawn from our, initially quite naive, but ultimately very successful change to the standard. [...] We now have the basis of a system which will protect our information in the long term."

Note: The electronic conference proceedings in hypertext were produced by Inso Corporation (DynaText) and by High Text (EnLIGHTeN). Information about the SGML Europe '97 Conference may be found in the main database entry.



[CR: 19961119]

Möller, Anders. En introduktion till Standard Generalized Markup Language. Lund, Sweden: Utbildningshuset Studentlitteratur [Bratt International AB], 1994. Extent: [?] pages. ISBN: 91-44-47001-0. Author's affiliation: [KTH]; E-mail: moller@msi.se, or moller@msivax.sunet.se.

See the online description from the publisher; Studentlitteratur AB, P.O. Box 141, S-221 00 Lund, Sweden; Phone: +46 - 46 31 22 02, fax: +46 - 46 30 49 62.



[CR: 19970429]

Moeller, Michael. "Markup Language Takes HTML to Task." PC Week 14/15 (April 14, 1997) 6.

"Abstract: The World Wide Web Consortium revealed details of the initial draft specification for constructing complex hyperlinks in eXtensible Markup Language (XML) during the week of April 7, 1997. XML is an innovative linking technology that is based in SGML but reduces SGML's superfluous features. XML expands, however, on HTML by allowing for the creation of larger and more structured documents as well as complex, multiple hyperlinking. XML is promised to also differ from HTML by allowing users to develop custom tags. XML distinguishes content formats from presentation formats that allows XML Web pages to be used on various devices including PDA's and smart phones. XML has been in development for the past year and is almost complete. Microsoft, Adobe Systems, Sun Microsystems, Novell and HP have endorsed XML and Netscape is considering the potential benefits of XML technology."

The text of the article is available online. See the main XML entry for more information on the Extensible Markup Language, a "sub-variant" of SGML.



[CR: 19950716]

Moen, Bill. "Metadata for Network Information Discovery and Retrieval: Workshop Report." Information Standards Quarterly 7/2 (April 1995) 1-4. Author's affiliation: Syracuse University.

Discusses the origins of the Dublin Core Metadata Element Set (defined in a DTD), intended for describing network information. The workshop was held on March 1-3, 1995; it was sponsored by OCLC and NCSA (National Center for Supercomputing Applications). During the workshop, Michael Sperberg-McQueen and Susan Hockey contributed insights gained in the development of the TEI header. See the entry for the workshop on the conference page, with further links.

A record of the workshop is available via the OCLC WWW server.



Mohan, Suruchi. "Markup Language a Mixed Bag for Publishers." Computerworld 29/14 (April 3, 1995) 53.

"Abstract: The Research Institute of America (RIA) uses the Standard Generalized Markup Language (SGML) to automate publication of its reference materials covering taxation and other regulatory information. The company was able to save $2.2 million a year with the new publishing system, but editors feel deprofessionalized because they now do work that clerks had been doing previously, according to RIA VP and CIO Paul Jensen. The publishing system includes an SGML database with tools for editing and composition; editors determine what needs to be edited and updated, pull it from the database, and use the SGML editor to edit the material. The text is sent back to the database after markup, and the materials are composed automatically and printed on a laser printer. Dataquest principal analyst Jennifer Mitchell says SGML forces authors to think in terms of fragments of information rather than sequences of still, paginated images."

Discusses the use of SGML by the Research Institute of America (RIA); cf. RIATAX.



Mohr, W.; Rostek, L. "TEDI: An Object-Oriented Terminology Editor." Pages 363-374 (with 8 references) in TKE '93. Terminology and Knowledge Engineering. Proceedings of the Third International Congress on Terminology and Knowledge Engineering. International Congress on Terminology and Knowledge Engineering [TKE'93], Cologne, Germany, 25-27 August 1993. Edited by: K.-D. Schmitz. Frankfurt/Main, Germany: Indeks Verlag, 1993. viii + 472 pages. Authors' affiliation: GMD-IPSI, Darmstadt, Germany.

Abstract: The paper discusses TEDI, a prototype object-oriented terminology editor, which was developed within the research department PaVE at GMD IPSI during the last three years. PaVE stands for Publication and Visualization Environment and its primary task is to develop concepts, models, and innovative tools for the preparation and presentation of hypermedia documents. Applications are hypermedia reference works and an Individualized Electronic Newspaper. TEDI is the central component of the Editor's Workbench within this publication development environment and is an object-oriented tool for building a consistent terminology base that allows for the extraction of print as well as hypermedia publications. TEDI supports import and export of SGML files.



Moline, Judi; Benigni, Dan; Baronas, Jean (eds). Proceedings of the Hypertext Standardization Workshop (January 16-18, 1990 National Institute of Standards and Technology, Gaithersburg, MD). Gaithersburg, MD: NIST, March, 1990.

Several papers in this proceedings volume reference SGML, HyTime and SMDL as potentially valuable in creating hypertext/hypermedia standards. Reports from the workshop's Data Interchange Group and User Requirements Discussion Groups likewise identified SGML or SGML-like GIs as having probable priority in emerging standards formulations. [NIST Special Publication 500-178. CODEN: NSPUE2.]



[CR: 19950925]

Moller, H. "CODE-Consistent Document Engineering: Consistency and Correctness." Pages 145-162 (with 18 references) in Hypertext - Information Retrieval - Multimedia. Proceedings of HIM '95. Hypertext, Information Retrieval, Multimedia '95, Konstanz, Germany, 5-7 April, 1995. Edited by R. Kuhlen and M. Rittberger. Konstanz, Germany: Universitätsverlag Konstanz, 1995. Author's affiliation: Siemens AG, Germany.

"Abstract: With the increasing use of SGML (Standard Generalized Markup Language) for the exchange of technical documentation usually comes along a semantic structuring of such documents. This leads to new ways to support authors in the creation of new versions. Starting with a short analysis of the problems that arise in the development of extensive technical documentation, this article investigates the central problem, how to preserve consistency. A model for documentation is presented which extends essential concepts of SGML for non-hierarchical structures. Two semantic classes of access structures are identified, namely procedures, cycles, and the concept of a road net is introduced. To model consistency, four possible relations are described -- identity, translation, comes-before, explains -- and their relation to the access structure. After the definition of the terms correctness and consistency, an example shows how inconsistencies can be checked and corrected, either automatically or interactively by the author."



[CR: 19961226]

Montgomery, Neil E.; Fye, Robert F.; Billington, Timothy. "A Study in Contrast: An IETM Compared to an ETM." Pages 687-700 in SGML '96 Conference Proceedings. Celebrating a Decade of SGML. SGML '96 Conference, Boston, MA, November 18-21, 1996. Sponsored by The Graphic Communications Association (GCA). [Edited by] Conference Co-Chairs: B. Tommie Usdin and Deborah A. Lapeyre. Alexandria, VA: GCA, 1996. Extent: 711 pages. Authors' affiliation: Aquidneck Management Associates, LTD, Aquidneck Corporate Park, 28 Jacome Way, Newport, RI 02842, USA; Tel: 401-849-8900; FAX: 401-848-0638; Email: NMontgomery@amaltd.com; Email: RFye@amaltd.com; WWW: http://amaltd.com.

Abstract: "Electronic Technical Manuals (ETMs) vary from simple raster 'page turners' to complete IETMs. For each type, an overview of major aspects will be presented. SGML-based ETMs and SGML-based IETMs (Interactive Electronic Technical Manuals) will be compared and contrasted, highlighting fundamental differences in function, architecture, and applicability. An ETM display engine and sample ETM document will be used with an IETM display engine for the demonstration. As part of the presentation the information structuring capabilities of the MIL-PRF-87269 IETM DTD (Document Type Definition) will be covered."

Note: The above presentation was part of the "And More..." track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.



Moore, Mark; Bowen, Ted Smalley. "SGML: The Lingua Franca of Tomorrow's Open Systems?" PC Week 11/2 (January 17, 1994) 31-32.

"Abstract: Standard Generalized Markup Language (SGML) is rapidly becoming the standard to which document management system vendors must conform to achieve true cross-platform compatibility. SGML, which has been in existence for 10 years, defines the properties of documents so they may be viewed and manipulated by any system. Documents are broken down into component data objects, rather than being stored as monolithic files, enabling organizations to distribute, reuse and maintain the electronic files in heterogeneous environments. Every SGML document is associated with a document type definition (DTD), which defines structural rules. Thereby, all graphics, text and multimedia attributes are tagged, and their formatting and structural organization is recorded. Electronic Book Technologies Inc. utilizes SGML in its DynaText electronic information publishing tool, which is being used for document processing by such firms as AT&T and navigational chart vendor Jeppesen Sanderson Inc."



[CR: 19980413]

Moran, Kevin; Waldt, Dale. "Industrial-Strength SGML on the Web." <TAG>: The SGML Newsletter 11/3 (March 1998) 4-12. ISSN: 1067-9197. Authors' affiliation: Research Institute of America Group.

The article provides a case study of the RIA Group's [RIAG, Research Institute of America Group] implementation of OnPoint System and CHECKPOINT, the latter being a Web-based delivery of a multi-gigabyte SGML repository. RIAG publishes information on US tax law, human resources, and corporate financial data; it needs to deliver this information on four channels: print, CDROM, online services, and the World Wide Web.

See also the following entry.



[CR: 19971227]

Moran, Kevin; Waldt, Dale. "Industrial-Strength SGML on the Web." Pages 585-592 in SGML/XML '97 Conference Proceedings. SGML/XML '97. "SGML is Alive, Growing, Evolving!" The Washington Sheraton Hotel, Washington, D.C., USA. December 7 - 12, 1997. Sponsored by the Graphic Communications Association (GCA) and Co-sponsored by SGML Open. Conference Chairs: Tommie Usdin (Chair, Mulberry Technologies), Debbie Lapeyre (Co-Chair, Mulberry Technologies); Michael Sperberg-McQueen (Co-Chair, University of Illinois). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 691 pages, CDROM; print volume contains author and title indexes, keyword and acronym lists. Authors' affiliation: [Kevin Moran]: VP Product Technology, Research Institute of America Group, Inc.; Email: kmoran@riag.com; [Dale Waldt]: VP Product Systems, Research Institute of America Group, Inc.; Email: dwaldt@riag.com.

Abstract: "The largest commercial application of SGML being fed to the Web was released this summer by RIAG, Inc. CHECKPOINT (tm) is a paid-access site offering gigabytes of tax-related law, regulations, cases and analytical material to professionals in the accounting and corporate finance markets. The frequency of update, depth and breadth of coverage, and sophisticated functionality of this system have challenged even the most powerful Web-based tools and systems. RIAG also had to reengineer internal SGML editorial systems to take advantage of some of these unique capabilities."

"The main Architect of this system, Kevin Moran, VP of Product Technology at RIAG, Inc., will describe and demonstrate the capabilities of CHECKPOINT (tm) as well as the intense development project requirements needed to deploy CHECKPOINT. Dale Waldt, VP of Product Systems will describe the reengineering efforts of the SGML-based publishing systems that were needed to support the CHECKPOINT system data feeds."

This paper was delivered as part of the "Case Studies" track in the SGML/XML '97 Conference.

Note: The SGML/XML '97 conference proceedings volume is available from the Graphic Communications Association, 100 Daingerfield Road, Alexandria, VA 22314-2888; Tel: +1 (703) 519-8160; FAX: +1 (703) 548-2867. Complete information about the conference (e.g., program listing, tutorials, show guide, DTDs, conference reports) is provided in the dedicated conference section of the SGML/XML Web Page and via the GCA Web server. The electronic proceedings on CDROM was produced courtesy of Jouve Data Management (Jouve PubUser).



Morris, Robert A.; Blachman, Ed M.; Meyer, C. "A constraint-based editor for linguistic scholars." Electronic Publishing: Origination, Dissemination, and Design (EPODD) [Proceedings of EP '94, Fifth International Conference on Electronic Publishing, Document Manipulation, and Typography, Darmstadt, Germany, 13-15 April 1994] 6/4 (December 1993) 349-360 (with 10 references). Authors' affiliation: Department of Mathematics and Computer Science, University of Massachusetts, Boston, MA, USA.

Abstract: "A constraint-based interactive structure editor for use by linguists is described. Multiple, interrelated constraint sets are supported. A novel search mechanism is introduced which modifies itself locally dependent on document structure as the search progresses."



[CR: 19971202]

Morrison, Alan; Fix, Jakob. "Delivering Electronic Texts Over the Web -- The Experiences and Practices of the Oxford Text Archive." Pages 99-102 in TEI 10: A Conference in Celebration of the Tenth Anniversary of the Text Encoding Initiative. Abstracts. TEI 10: Text Encoding Initiative, Tenth Anniversary User Conference, Brown University, Providence, Rhode Island. November 14-16, 1997. Sponsored by Martin Hensel Corporation, Kluwer Academic Publishers, and MIT Press. Hosted by Brown University Library, and Computing and Information Services. Providence, RI: Brown University, 1997. Authors' affiliation: Oxford Text Archive, Oxford University; Email: jakob.fix@COMPUTING-SERVICES.OXFORD.AC.UK.

Summary: "The Oxford Text Archive specializes in the area of electronic texts, and strongly advocates the use of TEI-conformant SGML. The majority of the Archive's collection is now stored as TEI Lite texts, and it is these materials that we seek to distribute as part of our contribution to the workings of the AHDS. However, we also need to operate within the framework of the AHDS, which will mean making our holdings accessible via the AHDS' integrated catalogue (which aims to integrate the collections held at all the Service Providers), and catering for the requirements of end users who may have little knowledge of either SGML or TEI. The Archive is currently facing the dual challenge of how to make our texts accessible through the AHDS catalogue (which is likely to offer little or no support for TEI-conformant texts), and how to deliver our texts to end-users in a format they will find useful for their purposes. We believe that the issues faced by the Oxford Text Archive are in many ways more crucial than those confronting the other Service Providers, as many of those cater for dedicated, subject-oriented communities (e.g. archaeologists, historians etc.), whereas electronic texts are frequently of interest to a broad range of humanities disciplines, and beyond."

The extended abstract for the document is available online: http://www.stg.brown.edu/webs/tei10/tei10.papers/fix.html; [local archive copy]. See the main database entry for additional information about the conference, or the Brown University web site.



Moser, Karen D. "SGML Standard Gaining Momentum: Market Grows for Standard-Based Wares that Ease Creation of Compound Documents." PC Week 11/9 (March 7, 1994) 27-28.

"Abstract: The Standard Generalized Markup Language (SGML) specification emerges from obscurity as vendors release software products for creating compound documents at the Documation '94 trade show in Los Angeles. Information Systems (IS) organizations require the specification to integrate documents created in disparate applications into a single resource. SGML, which originated as a US Dept of Defense standard, offers a strong framework to manage multiple documents and facilitates the search for information. The standard divides documents into data, structure and format information types and names each piece of information. SoftQuad International Inc. intends to employ SGML, as do Information Dimensions Inc, HAL Computer Systems and Folio Corporation."

The text of the article is available online via the SoftQuad WWW server.



[CR: 19961226]

Motoyama, Tetsuro. "Brief Introduction to Standard Page Description Language: ISO/IEC 10180." Pages 203-222 in SGML '96 Conference Proceedings. Celebrating a Decade of SGML. SGML '96 Conference, Boston, MA, November 18-21, 1996. Sponsored by The Graphic Communications Association (GCA). [Edited by] Conference Co-Chairs: B. Tommie Usdin and Deborah A. Lapeyre. Alexandria, VA: GCA, 1996. Extent: 711 pages. Author's affiliation: Ricoh Corporation, 3001 Orchard Parkway, San Jose, CA 95134-2088, USA; Tel: +1 408-954-5445; Email: motoyama@rsivax.str.ricoh.com.

Abstract: "SPDL was published in 1995 as a language to describe the final form of a document. The document processing model of ISO/IEC JTC1/SC18 WG8 described SPDL as the final stage of three steps; creation/edit (SGML), format (DSSSL), and presentation (SPDL). SPDL had two editors, one from Xerox and the other from Adobe. Having two editors might have had some impact on the publication schedule of SPDL.

The architecture of SPDL has influence of both Xerox Interpress and Adobe PostScript. Unlike PostScript, SPDL has a document structure using elements such as Picture and Pageset. This hierarchical structure defines the scope of various settings such as dictionaries, dictionary stack, and various imaging parameters. Under SPDL, Picture is a unit for imaging and can contain other Pictures. Because of the document structure, SPDL knows when to image one page by keying on the highest level of Picture. Therefore, SPDL does not require an operator, showpage, of PostScript to notify the imaging device to perform rendering on the imaging medium. SGML is used by the clear text encoding of the document structure.

PDL in one of the subordinate element (Token Sequence) under Picture describes the images and graphics to be rendered. This PDL is a stack oriented languages very similar to PostScript. In fact, many PDL operators of SPDL in clear text format are taken from PostScript.

SPDL uses ISO/IEC 9541 for font handling and glyph referencing. One way to reference glyphs in the mapping between integers and glyphs within SPDL is to use the registration numbers assigned by AFII. One of the format is afiixxxx where xxxx is the AFII registration number.

In addition to the clear text encoding, SPDL has binary encoding using ASN.1 for the document structure and the own encoding for PDL section. Except the positions of comments in the document structure, SPDL clear text encoding and binary encoding can map to each other easily.

One possible application of SPDL is to incorporate Picture element as an imaging portion in such a application as HTML. Using Picture, graphs can be sent as PDL programs rather than images. The start tag of Picture contains an attribute to identify the content to be ISO/IEC 10180 SPDL.

The C source code to translate SPDL document to PostScript for printing is available at ftp://ftp.ornl.gov/pub/sgml/wg8/spdl/software and http://www.ornl.gov/sgml/wg8/spdl/software/."

Further information on SPDL (Standard Page Description Language) is available in the main entry of the SGML/XML Web Page.

Note: The above presentation was part of the "SGML User" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.



[CR: 19951113]

Mulder, Koen H. Pitfalls [Markup 1990 Keynote Address]. [SGML] Markup 1990 Conference, Charleston, South Carolina. Tuesday, June 5, 1990. Sponsored by GCA. Arnhem, Netherlands: Gouda Quint, 1990. Extent: 25 pages. ISBN: 90-6000-738-7. Author's affiliation: .

This booklet is the monograph version of the text of Koen Mulder's keynote address at Markup 1990. It featured a series of lessons called "pitfalls" - surprises which likely await large companies making the migration to SGML (as Mulder's company did), if critical features of the paradigm shift are not understood within the corporation.



Mumford, Anne (editor). Document Exchange: The Use of SGML in the UK Academic and Research Community. Workshop Proceedings 5-7 March 1990. Advisory Group on Computer Graphics, 1990.

This proceedings volume contains several important contributions on SGML (submitted by Anne Mumford, Paul Ellison, Martin Bryan, Angella Scheller, David Duce and Ruth Kidd, Tim Niblett, Lou Burnard, John Larmouth, Paul Bacsich and Paul Lefrere, Malcolm Clark, and Kathleen Crennell). The volume is available from the organizer: Ann M. Mumford, Computer Centre, Loughborough University, Loughborough LE11 3TU, UNITED KINGDOM; TEL: 44 509 222312; FAX: 44 392 211603. See a full list of contributors and presentation-titles in "Document Exchange in UK Universities," SGML Users' Group Newsletter 17 (August 1990) 10.



[CR: 19961018]

Munson, Ethan V. "A New Presentation Language for Structured Documents." Pages 125-138 (with 18 references and 6 figures) in EP '96. Proceedings of the Sixth International Conference on Electronic Publishing, Document Manipulation and Typography. [ = Journal Special Issue: Electronic Publishing - Origination, Dissemination and Design (EPODD), June & September 1995, Volume 8, Issues 2-3.] Sixth International Conference on Electronic Publishing, Document Manipulation and Typography, Palo Alto, California. September 24-26, 1996. Sponsored by Adobe Systems Incorporated; School of Information Management and Systems, University of California at Berkeley; Xerox Corporation. [Proceedings Volume] Edited by Allen Brown, Anne Brüggemann-Klein, and An Feng; [Journal] Editors David F. Brailsford and Richard K. Furuta. Chichester/New York: John Wiley & Sons, 1996. ISSN: 0894-3982. Author's affiliation: Department of Electrical Engineering and Computer Science, University of Wisconsin-Milwaukee. Email: munson@cs.uwm.edu.

Abstract: "PSL is a new presentation specification language for structured documents. It is the first such language that is fully configurable, and it is also extensible. PSL is able to support a very general form of out-of-order layout without having to provide a general system of tree transformations. PSL also makes an explicit distinction between the specified layout of the elements of a document and the actual layout that results from the formatting process. PSL's syntax and semantics are simple and general. This paper describes the syntax and semantics of PSL using a simple text document as a running example, and compares PSL to a number of other presentation specification languages."

PSL is "the specification language of Proteus, a portable presentation specification system that supports multiple synchronized presentations." The advantages of PSL (relative to DSSSL, for example) are said to be the high degree of configurability; an interface function mechanism for extensibility; a means for specifying out-of-order layout without requiring a tree-transformation process (e.g., as with DSSSL's STTP); an explicit distinction between specified layout and actual layout; and simplicity.

For other conference information, see the main conference entry for EP '96, or the brief history of the conference as sixth in a series since 1986. See the volume main bibliographic entry for a linked list of other EP '96 titles relevant to SGML and structured documents.



[CR: 19980507]

Murata Makoto. Data Model for Document Transformation and Assembly. Extended Abstract for PODDP '98 Presentation. Takatsu-ku, Kawasaki-shi, Japan: Fuji Xerox Information Systems Co., Ltd, 1995. Extent: 13 pages (with 14 references). Author's affiliation: Fuji Xerox Information Systems. Postal: KSP/R&D 9FA-7, 2-1 Sakado 3-chome, Takatsu-ku, Kawasaki-shi, Kanagawa 213, Japan. Email: murata@apsdc.ksp.fujixerox.co.jp; WWW: Home Page.

"Abstract. This paper shows a data model for transforming and assembling document information such as SGML or XML documents. The biggest advantage over other data models is that this data model simultaneously provides: (1) powerful patterns and contextual conditions, and (2) schema transformation. Patterns and contextual conditions capture conditions on subordinates and those on superiors, siblings, subordinates of siblings, etc, respectively, and have been recognized as highly important mechanisms for identifying document components in the document processing community. Meanwhile, schema transformation has been, since the RDB, recognized as crucial in the database community. However, no data models have provided all three of patterns, contextual conditions, and schema transformation.

"This data model is based on the forest-regular language theory. A schema is a forest automaton and an instance is a finite set of forests (sequences of trees). Since the parse tree set of an extended-context free grammar is accepted by a forest automaton, this model is a generalization of Gonnet and Tompa's grammatical model. Patterns are captured as forest automatons; contextual conditions are pointed forest representations (a variation of Podelski's pointed tree representations). Controlled by patterns and contextual conditions, an operator creates an instance from an input instance and also creates a reasonably small schema from an input schema. Furthermore, the created schema is often minimally sufficient; any forest permitted by it may be generated by some input instance."

Conclusion: "We have presented a data model that provides patterns and contextual conditions as well as schema transformation. Patterns and contextual conditions have been heavily used by SGML/XML transformation engines, while schema transformation is common in the database theory. But none of the previous works provides all three of patterns, contextual conditions, and schema transformation. We believe that this data model provides a theoretical foundation of future SGML/XML database systems. However, there are many remaining issues. First, we do not really know if our operators are powerful enough. There might be some other useful operator that cannot be mimicked by our operators. Second, in order to perform pattern matching and contextual condition checking without scanning the entire document, we probably have to impose some restrictions on patterns and contextual conditions. Such restrictions help to provide index files for examining patterns and contextual conditions."

The document is an extended abstract for a PODDP '98 presentation by Murata in the session "Document models and structures". PODDP '98: Workshop on Principles of Digital Document Processing, March 29-30, 1998, Saint Malo, France. The paper shows a data model for transforming and assembling document information such as SGML and XML documents; uses a declarative query language based upon principles articulated in "DTD Transformation by Patterns and Contextual Conditions", published also in an earlier related version in LNCS 1293, "Transformation of Documents and Schemas by Patterns and Contextual Conditions." Available online in PDF format; [local archive copy]. See also the database section, SGML/XML and Forest/Tree Automata Theory.



[CR: 19971227 MD: 19980507]

Murata, Makoto. "DTD Transformation by Patterns and Contextual Conditions." Pages 325-332 in SGML/XML '97 Conference Proceedings. SGML/XML '97. "SGML is Alive, Growing, Evolving!" The Washington Sheraton Hotel, Washington, D.C., USA. December 7 - 12, 1997. Sponsored by the Graphic Communications Association (GCA) and Co-sponsored by SGML Open. Conference Chairs: Tommie Usdin (Chair, Mulberry Technologies), Debbie Lapeyre (Co-Chair, Mulberry Technologies); Michael Sperberg-McQueen (Co-Chair, University of Illinois). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 691 pages, CDROM; print volume contains author and title indexes, keyword and acronym lists. Author's affiliation: [Makoto Murata]: Fuji Xerox Information Systems; Email: murata@fxis.fujixerox.co.jp; KSP R&D 9A7, 2-1 Sakado 3-chome, Takatsu-ku, Kawasaki-shi 213 Japan; Phone: +81-44-812-7230; FAX: +81-44-812-7231.

Abstract: "On the basis of the tree automaton theory, this paper demonstrates DTD transformation. Controlled by patterns and contextual conditions, operators transform not only XML documents, but also DTDs. It is guaranteed that transformation of XML documents permitted by the input DTD creates XML documents permitted by the output DTD. Furthermore, the output DTD is minimally sufficient. Patterns are conditions on (possibly non-immediate) subordinate nodes, and contextual conditions are conditions on non-subordinate nodes (e.g., superior nodes, ancestor nodes, sibling nodes, and subordinates of sibling nodes)."
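Editorial illustration (not from the paper): the guarantee described in the abstract, that documents valid against the input DTD remain valid against the output DTD after transformation, can be sketched with a toy example. The sketch below treats a DTD as a very simple bottom-up tree acceptor and uses element renaming as the only operator. The element names, the `DTD` dictionary, and the functions `accepts`, `rename_element`, and `rename_in_tree` are all hypothetical; Murata's operators, driven by patterns and contextual conditions, are considerably more general.

```python
import re

# Hypothetical toy DTD: element name -> content model, written as a regular
# expression over child element names (each name followed by one space).
DTD = {
    "doc": r"(title )(section )+",
    "section": r"(title )(para )*",
    "title": r"",
    "para": r"",
}

def accepts(tree, dtd):
    """Bottom-up check: a node is accepted if it is declared, all of its
    children are accepted, and its child-name sequence matches its model."""
    label, children = tree
    if label not in dtd:
        return False
    if not all(accepts(child, dtd) for child in children):
        return False
    child_names = "".join(child[0] + " " for child in children)
    return re.fullmatch(dtd[label], child_names) is not None

def rename_element(dtd, old, new):
    """Schema-side operator: rename an element in declarations and models."""
    name_re = re.compile(r"\b" + re.escape(old) + r"\b")
    return {
        (new if name == old else name): name_re.sub(new, model)
        for name, model in dtd.items()
    }

def rename_in_tree(tree, old, new):
    """Instance-side operator: apply the same renaming to a document tree."""
    label, children = tree
    return (new if label == old else label,
            tuple(rename_in_tree(child, old, new) for child in children))

# A document tree is a (label, children) pair.
doc = ("doc", (
    ("title", ()),
    ("section", (("title", ()), ("para", ()), ("para", ()))),
))

new_dtd = rename_element(DTD, "para", "p")
new_doc = rename_in_tree(doc, "para", "p")
```

In this toy case `doc` is accepted by `DTD`, and `new_doc` is accepted by `new_dtd`, which mirrors (in a much weaker form) the input-DTD/output-DTD guarantee stated in the abstract.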

"This presentation shows transformations of DTDs and XML documents. [...] Such DTD transformation is highly important for at least two reasons. First, it helps DTD evolution; by writing an update program with operators, we can update not only instances, but also DTDs. Second, transformation from a DTD to another DTD becomes much easier; we can examine DTDs created by transformation operators. This work is based on the theory of tree automatons. A DTD is first translated to a tree automaton. This tree automaton is then repeatedly transformed. Finally, this tree automaton is translated back to a DTD. Patterns and contextual conditions of operators are also captured as tree automatons."

"[Conclusion and future work:] We have presented an example of DTD transformation. Although this example uses a single source DTD, the underlying framework can easily handle multiple source DTDs. Implementation on top of an automaton construction toolkit called 'Grail' is in progress. A declarative query language for document database systems is also in progress."

This paper was delivered as part of the "Expert" track in the SGML/XML '97 Conference.

Available online in HTML format: "DTD Transformation by Patterns and Contextual Conditions."; [local archive copy]. See also the database section, SGML/XML and Forest/Tree Automata Theory.

Also available online, twenty-five slides from the presentation have been made available by the author. See also the "mathematical" version of this paper, restricted to binary trees: "Transformation of Documents and Schemas by Patterns and Contextual Conditions", pages 153-169 (with 13 references) in Principles of Document Processing. Proceedings of the Third International Workshop, 1996.

Note: The SGML/XML '97 conference proceedings volume is available from the Graphic Communications Association, 100 Daingerfield Road, Alexandria, VA 22314-2888; Tel: +1 (703) 519-8160; FAX: +1 (703) 548-2867. Complete information about the conference (e.g., program listing, tutorials, show guide, DTDs, conference reports) is provided in the dedicated conference section of the SGML/XML Web Page and via the GCA Web server. The electronic proceedings on CDROM was produced courtesy of Jouve Data Management (Jouve PubUser).



[CR: 19971227]

Murata, Makoto. DTD Transformation by Patterns and Contextual Conditions. Paper presented at SGML/XML '97 [97/12/8]. Kanagawa, Japan: Fuji Xerox Information Systems, December 1997. Author's affiliation: Fuji Xerox Information Systems, Kanagawa, Japan; Email: murata@apsdc.ksp.fujixerox.co.jp or makoto@mars.netspace.or.jp.

Abstract: "DTD transformation using tools based on the theory of tree automata are described. Controlled by patterns (conditions on descendant nodes) and contextual conditions (conditions on ancestors, siblings, and descendants of siblings), operators can transform not only SGML instances, but the DTDs which govern them. Transformations of SGML instances permitted by the input DTD are guaranteed to conform to the output DTD. Such tools can help manage DTD evolution, keep document instances synchronized with an evolving DTD, develop transformations between unrelated DTDs, and guarantee that SGML-to-SGML transformations produce documents which actually conform to the target DTD."

Twenty-five slides from the presentation have been made available online. See also the "mathematical" version of the paper, restricted to binary trees: "Transformation of Documents and Schemas by Patterns and Contextual Conditions".



[CR: 19970726]

Murata, Makoto. "File Format for Documents Containing both Logical Structures and Layout Structures." Electronic Publishing: Origination, Dissemination and Design (EPODD) 8/4 (December 1995 [appeared July 1997]) 295-317. With 12 references. ISSN: 0894-3982. Author's affiliation: Fuji Xerox Information Systems Co. Ltd., KSP 9A7, 2-1 Sakado 3-Chome, Takatsu-ku, Kawasaki-shi, Kanagawa-ken 213, Japan. Email: murata@apsdc.ksp.fujixerox.co.jp. Note: Murata is also a member of the Advisory Editorial Board for EPODD.

"Abstract: A file format for documents containing both the logical and layout structures is presented. Unlike CONCUR of SGML, this format can represent a large class of WYSIWYG documents, such as documents containing footnotes. The key idea is to untangle the logical-layout relationship by introducing embedding nodes and mould nodes. Embedding nodes are nodes representing logical-layout correspondences; mould nodes are temporary dummy nodes, which will later be replaced by layout nodes. Once the logical-layout relationship is untangled, it becomes possible to represent a document by a sequential data stream which can be readily stored in a file. This file format has the following advantages: first, it is compact. No leaf nodes are duplicated; tags for embedding nodes and mould nodes do not require much space; the text formatting output can be omitted. Second, the depth-first traversal of logical structures and that of layout structures are efficient and easy to implement. Files are sequentially read only once, and neither logical structures nor layout structures need to be copied into the main memory. Third, the logical-layout correspondences are explicitly represented by tags for embedding nodes."

Summary: The file format presented in the paper may be regarded "as an extension to [SGML] CONCUR", emerging from research on the theory behind CONCUR. CONCUR, it is said, will handle only simple documents; the proposed file format can handle a large class of WYSIWYG documents.

[Paper received February 20, 1995; revised November 16, 1995.]

See also, substantially: Murata, Makoto, File Format for Documents Containing Both Logical Structures and Layout Structures, Presentation at SGML '94 Conference, Theory Track.



[CR: 19960724]

Murata Makoto. File Format for Documents Containing Both Logical Structures and Layout Structures. Presentation at SGML '94 Conference, Theory Track. Green-Tech Nakai, Nakai Town, Ashigarakami County, Kanagawa Prefecture, Japan: Fuji Xerox, [Presented] Thursday, November 10, 10:45 AM, 1995. Author's affiliation: Fuji Xerox Information Systems. Postal: KSP/R&D 9FA-7, 2-1 Sakado 3-chome, Takatsu-ku, Kawasaki-shi, Kanagawa 213, Japan. Email: murata@apsdc.ksp.fujixerox.co.jp.

Abstract: "A file format for documents containing both the logical and layout structures is presented. The key idea is to disentangle the logical-layout relationship by introducing embedding nodes and mold nodes. Embedding nodes are ones representing logical-layout correspondences; mold nodes are temporary dummy nodes, which will later be replaced by layout nodes."

"To store a document in a file, embedding nodes and mold nodes are first introduced into the document. Then, a data stream is created from the document and stored in the file. This file format has the following advantages. First, it is compact. No leaf nodes are duplicated; tags for embedding nodes and mold nodes do not require much space; the text formatting output can be omitted. Second, the depth-first traversal of logical structures and that of layout structures are efficient and easy to implement. Files are sequentially read only once, and neither logical structures nor layout structures need to be copied into the main memory. Third, the logical-layout correspondences are explicitly represented by tags for embedding nodes. Last, unlike CONCUR of SGML, many WYSIWYG documents, such as documents containing footnotes, can be represented." [Alternate abstract]

A version of this document has been accepted for publication as "File Format for Documents Containing Both Logical Structures and Layout Structures" in Electronic Publishing: Origination, Dissemination and Design (EPODD), Volume 8 Number 4. See also a favorable review and appraisal of the paper by C. M. Sperberg-McQueen in the SGML '94 Trip Report [". . . Makoto Murata, of Fuji Xerox, gave what I thought was the most substantial technical paper of the conference. . ."].



[CR: 19980505]

Murata Makoto. Forest-Regular Languages and Tree-Regular Languages. Technical Report. [Green-Tech Nakai, Nakai Town, Ashigarakami County, Kanagawa Prefecture,] Japan: Fuji Xerox, May 26, 1995. Extent: 12 pages (with 7 references). Author's affiliation: Fuji Xerox Information Systems. Postal: KSP/R&D 9FA-7, 2-1 Sakado 3-chome, Takatsu-ku, Kawasaki-shi, Kanagawa 213, Japan. Email: murata@apsdc.ksp.fujixerox.co.jp.

Summary: "Forest-regular languages were studied by Pair et al. [PQ68] and Takahashi [Tak 75]. They are extensions of tree-regular languages [Tha87]. We borrow some concepts from these papers but adopt definitions more similar to those for string-regular languages." [from the Introduction]

The document is available online in PDF format; [local archive copy]. See Murata's other papers referenced in this database (e.g., the PODP '96 paper) for relevance to SGML/XML documents.



[CR: 19980505]

Murata, Makoto. "Transformation of Documents and Schemas by Patterns and Contextual Conditions." Pages 153-169 (with 13 references) in Principles of Document Processing. Proceedings of the Third International Workshop. PODP '96, Third International Workshop. Palo Alto, California. September 23, 1996. Edited by Charles Nicholas (Department of Computer Science and Electrical Engineering, UMBC, Baltimore, MD) and Derick Wood (Department of Computer Science, HKUST, Clear Water Bay, Kowloon, HONG KONG). Lecture notes in artificial intelligence. Lecture Notes in Computer Science, 1293. Berlin / London: Springer-Verlag, 1997. ISBN: 354063620X. Author's affiliation: Fuji Xerox.

Abstract: "On the basis of the tree-regular language theory, we study document transformation and schema transformation. A document is represented by a tree $t$, and a schema is represented by a tree-regular language $L$. Document transformation is defined as a composition of a marking function $m_C^P$ and a linear tree homomorphism $h$, where $P$ is a pattern and $C$ is a contextual condition. Pattern $P$ is a tree-regular language, and contextual condition $C$ is a pointed tree representation. Marking function $m_C^P$ marks a node if the subtree rooted by this node matches $P$ and the envelope (the rest of the tree) satisfies $C$. Linear tree homomorphism $h$ then rewrites the tree, for example, by deleting or renaming marked nodes. Schema transformation is defined by naturally extending document transformation; that is, the result of transforming a schema $L$, denoted $h(m_C^P(L))$, is $\{h(m_C^P(t)) \mid t \in L\}$. Given a tree automaton that accepts $L$, we can effectively construct a tree automaton that accepts $h(m_C^P(L))$. This observation provides a theoretical basis for document transformation engines and document database systems."
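Editorial illustration (not from the paper): the two-step transformation in the abstract, marking followed by rewriting, can be sketched in a few lines of Python. This is a minimal sketch under strong simplifying assumptions: the pattern is an arbitrary predicate on a subtree rather than a tree-regular language, the contextual condition inspects only the labels of ancestor nodes rather than the full envelope, and the rewriting step only renames marked nodes. All function and element names are hypothetical.

```python
# A tree is a (label, children) pair; a marked tree is (label, marked, children).

def mark(tree, pattern, context, ancestors=()):
    """Marking step: flag a node when its subtree satisfies `pattern` and
    its ancestor labels satisfy `context` (a stand-in for the envelope)."""
    label, children = tree
    is_marked = pattern(tree) and context(ancestors)
    marked_children = tuple(
        mark(child, pattern, context, ancestors + (label,))
        for child in children
    )
    return (label, is_marked, marked_children)

def rename_marked(marked_tree, new_label):
    """Rewriting step: relabel every marked node and drop the marks."""
    label, is_marked, children = marked_tree
    return (new_label if is_marked else label,
            tuple(rename_marked(child, new_label) for child in children))

def transform(tree, pattern, context, new_label):
    """Document transformation as the composition of the two steps."""
    return rename_marked(mark(tree, pattern, context), new_label)

# Pattern: a childless "note" element. Context: somewhere inside a "section".
is_empty_note = lambda t: t[0] == "note" and len(t[1]) == 0
inside_section = lambda ancestors: "section" in ancestors

doc = ("doc", (
    ("note", ()),
    ("section", (("note", ()), ("para", ()))),
))

result = transform(doc, is_empty_note, inside_section, "footnote")
# Only the note inside the section is renamed:
# ("doc", (("note", ()), ("section", (("footnote", ()), ("para", ())))))

# Applying the same transformation to every tree in a finite sample of
# documents mimics schema transformation on that sample only; the paper
# works with the whole (generally infinite) language via tree automata.
sample = {doc, ("doc", (("section", (("note", ()),)),))}
transformed_sample = {
    transform(t, is_empty_note, inside_section, "footnote") for t in sample
}
```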

The paper is available online in PDF format; [local archive copy]. See also the database section, SGML/XML and Forest/Tree Automata Theory.



Murata Makoto; Hayashi, Koichi. "Formatter Hierarchy for Structured Documents." Pages 77-94 (with 11 references) in EP [Electronic Publishing] 92: Proceedings of Electronic Publishing, 1992 International Conference on Electronic Publishing, Document Manipulation, and Typography. Swiss Federal Institute of Technology, Lausanne, Switzerland. April 7-10, 1992. Sponsored by the Swiss Federal Institute of Technology and the Swiss National Science Foundation. Edited by Christine Vanoirbeek and Giovanni Coray [EPF, Lausanne, Switzerland]. The Cambridge Series on Electronic Publishing. Cambridge: Cambridge University Press, 1992. ISBN: 0-521-43277-4. Authors' affiliation: System Technology Research Lab, Fuji Xerox Co., Ltd.

Abstract: This paper describes a formatting model of structured documents. In this model, a document is formatted by a hierarchy of co-interacting formatters. Each formatter creates layout subtrees, by pouring logical streams into layout streams. This formatting model was originally proposed in Interscript. We extend it for tnt, and clearly illustrate the formatting algorithm. Finally, we propose some new techniques for incremental formatting, reduced formatting, and parallel formatting.



[CR: 19970726]

Murata Makoto; Nakatsuyama, H. "A Theoretical Foundation of the DSSSL Location Model." Mathematical and Computer Modelling 25/4 (February 1997) 95-107 (with 12 references). Author's affiliation: Fuji Xerox Information Systems Co. Ltd., KSP 9A7, 2-1 Sakado 3-Chome, Takatsu-ku, Kawasaki-shi, Kanagawa-ken 213, Japan.

"Abstract: In the location model of the Document Style Semantics and Specification Language (DSSSL), one can use tree patterns to locate nodes in logical structures of documents. A tree pattern consists of conditions on nodes and those on their hierarchical relationships. As a first step towards efficient implementations, the paper shows a theoretical foundation of the location model. Tree patterns are first expressed by sentences of branching time temporal logic. These sentences are then converted to well formed attribute grammars. Thus, the library of attribute grammar evaluation techniques can be used to implement the location model. It is our belief that this observation is significant for future implementers of DSSSL. Furthermore, the converted attribute grammars can be evaluated by traversing logical structures several times. The number of required traversals can be found by examining the original sentences."



[CR: 19960912]

Murphy, Gregory. "Review of Making Hypermedia Work: A User's Guide to HyTime, by Steven J. DeRose and David Durand." Computers and the Humanities (CHUM) 30/1 (1996) 93-97. ISSN: 0010-4817. Author's affiliation: CETH, Text Systems Manager.

See the bibliographic reference for Making Hypermedia Work: A User's Guide to HyTime.



[CR: 19950828]

Murphy, Gregory. "Using the TEI to Encode Textual Variations: Some Practical Considerations." Pages 83-84 [extended abstract from paper] in ACH/ALLC '95: The 1995 Joint International Conference. Conference Abstracts, Posters and Demonstrations. ACH/ALLC '95: The Joint International Conference, Santa Barbara, California, July 11-15, 1995. Santa Barbara: University of California/ACH/ALLC, 1995.

The author discusses strategies for using TEI/SGML for the encoding of textual variants. The work described has been done at CETH (Center for Electronic Texts in the Humanities) using currently-available SGML software.



[CR: 19951226]

Murray, Alan J. "Electronic document management for North Sea offshore oil and gas platforms." In Proceedings of the First SGML BeLux Users' Conference. SGML BeLux '94, Brussels. March 22, 1994. Edited by Hans C. Arents. Leuven, Belgium: Katholieke Universiteit Leuven, 1994. Author's affiliation [Murray]: IT Portfolio Head, Shell U.K. Exploration and Production, United Kingdom.

"Abstract: Shell U.K. Exploration and Production have implemented an integrated electronic document management system following a thorough analysis of the document lifecycle process. The system integrates a number of commercially available software products which operate to international standards (e.g. SGML-ISO 8879; CGM-ISO 8632) and have been integrated in such a way that they can be changed out as and when the need arises. The resulting system ensures that information is accurate and up to date. This information is distributed as electronic books to offshore platforms via a CD-ROM, allowing offshore personnel to search for and view documents using a PC. The project faced a number of challenges, some of which were predicted, others had to be addressed as they arose. The learning points include a confirmation of the usefulness of human factors expertise, the value of data and process analysis, the need to adopt and stand by international standards, and the criticality of senior management commitment and support."

The document is available online in HTML format: "Electronic document management for North Sea offshore oil and gas platforms" [mirror copy, December 1995]. For further details on the Conference and BeLux, see the contact information for SGML BeLux.



[CR: 19971107]

Murray-Rust, Peter. "Chemical Markup Language: A Simple Introduction to Structured Documents." Pages 135-147 in XML: Principles, Tools, and Techniques. Guest Edited by Dan Connolly. World Wide Web Journal [edited by Rohit Khare] Volume 2, Issue 4. Sebastopol, CA: O'Reilly & Associates, Fall 1997. Extent: xxii + 248 pages. ISBN: 1-56592-349-9. ISSN: 1085-2301. Author's affiliation: The Virtual School of Molecular Sciences (VSMS).

Abstract: "Structured documents in XML are capable of managing complex documents with many separate information components. In this article, we describe the role of the XML-LANG specification in supporting this. Examples are supplied explaining how components can be managed and how documents can be processed, with an emphasis on scientific and technical publishing. We conclude that structured documents are sufficiently powerful to allow complex searches simply through the use of their markup."

"XML is the ideal language for the creation and transmission of database entries. The use of entities means it can manage distributed components, it maps well onto objects and it can manage complex relationships through its linking scheme. Most of the software components are already written." [conclusion, online version]

A version of this document is available online in HTML format: http://www.venus.co.uk/omf/cml/doc/tutorial/xml.html, or http://www.ch.ic.ac.uk/ectoc/echet96/+CML/epub/xml.html.



[CR: 19971107]

Murray-Rust, Peter. "JUMBO. An Object-Based XML Browser." Pages 197-206 in XML: Principles, Tools, and Techniques. Guest Edited by Dan Connolly. World Wide Web Journal [edited by Rohit Khare] Volume 2, Issue 4. Sebastopol, CA: O'Reilly & Associates, Fall 1997. Extent: xxii + 248 pages. ISBN: 1-56592-349-9. ISSN: 1085-2301. Author's affiliation: Virtual School of Molecular Sciences.

Abstract: "JUMBO (Java Universal Markup Language) is an object-oriented XML browser/editor and transformation tool, written in Java. It has been developed as a development tool to explore the emerging XML-LANG and XML-LINK specifications, and implements most of the current proposals. Its emphasis is on the management of structured documents; specifically, their interpretation as trees. It provides behavior for ELEMENTS by providing Java classes for rendering and transformation. It is particularly aimed at nontextual applications where ELEMENTs (such as those in technical disciplines) require complex processing. JUMBO also implements much of the current XML-LINK spec, including TEI extended pointers and simple aspects of EXTENDED XML-LINKs."

Note: other information on JUMBO may be found on the VSMS server or on a server located in San Diego.



[CR: 19970804]

Murray-Rust, Peter. "Scientific Publishing in the 21st Century - XML!" International SGML Users' Group Newsletter 3/3 (July 1997) 14. ISSN: 0952-8008. Author's affiliation: Virtual School of Molecular Sciences (URL: http://www.vsms.nottingham.ac.uk/).

The author explains why he thinks XML provides a suitable framework for solving the "data exchange" problem insofar as scientific information and documentation can be described uniformly using an agreed-upon syntax. TechML (Technical Markup Language) and its DTD are also discussed.



Nadile, Lisa. "Electronic Publishing Moves Toward Object Technology." PC Week 12/11 (March 20, 1995) 33-34.

"Abstract: XSoft introduces at the Documentation show in Long Beach, California, its Astoria document management system, which can manage reusable components throughout a document. The program employs object-oriented technology developed by XSoft that supports multiple document types simultaneously. Also included are revision control, full-text retrieval and application-development tools. Storing information in reusable components is imperative for electronic publishing, and useful document-management systems should be SGML-aware, offer version-control tools and support the ability to replace pages in documents. Astoria is due to ship by mid 1995."



[CR: 19951227]

Naggum, Erik. "About 1500 postings to CTS [Usenet newsgroup comp.text.sgml] offering informed, intelligent, and lucid technical commentary on the SGML standard [ISO 8879:1986], many being necessitated as corrective to misinformation supplied on CTS by less expert contributors." Editorial vigilance and critical comment by maintainer of the official CTS archive. 1990-1995 [and continuing]. Author affiliation: SGML Repository, and Naggum Software (Postboks 1570 Vika, 0118 OSLO, NORWAY. Telephone: +47 2295 0313. Email: erik@naggum.no).

Editorial Note: Postings to CTS (since about 1989) are not generally cited in this bibliography, since they now [December 1995] number some 11,000 contributions. Many, however, deserve recognition as "published" articles based upon their technical merit and research quality. This characterization applies preeminently to the CTS postings submitted by Erik Naggum (the vast majority of them), who, while probably not infallible as regards understanding and interpretation of the Standard, has consistently contributed the highest quality technical commentary in his articles. His careful research and generous contribution of time merit special mention in this ad hoc bibliographic entry, as well as the deep gratitude of researchers who can now benefit from Erik's work through judicious use of the CTS archives. These archives are held in electronic format at the SGML Repository, as well as at some mirror sites.



[CR: 19970624]

Naggum, Erik. "Arguments against SGML." 1996. Published as part of the author's Web page.

The article summarizes a number of poorly designed features of SGML (especially some in the FEATURES clause, and those intended to minimize markup). Other formal language experts having a deep understanding of the 8879:1986 Standard have written criticisms in the same vein, though few so succinctly, or with such powerful credentials.

Summary: "All of the features and facilities which were created to satisfy specific needs are so ill-designed as to interfere with each other, creating an unknown set of features for anything smaller than an entire document. This actually means that to obtain interoperability and reuse, general agreement must be obtained on a specific set of features, precisely what SGML was trying to avoid."

Available online: "Arguments against SGML", by Erik Naggum [mirror copy].

A series of three posters delivered at SGML '94 presents some of the same criticisms in outline format: (1) poster 1 [mirror copy]; (2) poster 2 [mirror copy]; (3) poster 3 [mirror copy]. See also: a posting from July 1996 which summarizes Erik's decision to abandon active work in the SGML arena.



Naggum, Erik. "Answers to Frequently-Asked-Questions (FAQs) - for the UseNet Newsgroup comp.text.sgml." FAQ Document, Draft Version 0.0 Oslo, December 15, 1991.

The FAQ is available via Internet anonymous-FTP as ftp.ifi.uio.no:pub/SGML/FAQ.0.0. The latest version of the FAQ document may be fetched at any time from this public disk region, generously sponsored by The University of Oslo, Department of Informatics with oversight by Erik Naggum. The FAQ will also be found on servers which archive collections of FAQs. Suggestions for additional questions (or answers) to be included in the FAQ may be directed to the author: Erik Naggum; Naggum Software; Boks 1570, Vika; 0118 OSLO, NORWAY; Email: erik@naggum.no OR enag@ifi.uio.no on the Internet.



Naggum, Erik. "DSSSL." Posting, Congratulatory note on DSSSL. December 5, 1994. Author affiliation: SGML Repository, and Naggum Software (+47 2295 0313).

Posting to Usenet Newsgroup comp.text.sgml, December 5, 1994. Preliminary comments on DSSSL draft, i.e., [ISO/IEC DIS 10179.2:1994. Information Technology - Text and Office Systems - Document Style Semantics and Specification Language (DSSSL)]; see the DIS full citation. Gives positive affirmation of the work of James Clark, Sharon Adler, and other members of the DSSSL team. The article will be found in the comp.text.sgml archives, as well as in the comp.text.sgml Digest (see digest entry) Volume 5, Issue 6 (1994-12-05). A copy of the article is also provided here.



[CR: 19980304]

National Bureau of Standards [US]. Computer Graphics Metafile (CGM). Federal information processing standards publication, FIPS PUB, 128. Gaithersburg, MD: U.S. Dept. of Commerce/National Bureau of Standards, 1987.

For sale by the National Technical Information Service, Springfield, VA. Shipping list no. 87-330-P. 1987 March 16. See also the FIPS 128-2 edition. For other information on CGM, see the main database entry for Computer Graphics Metafile.



[CR: 19970312]

Nica, Anisoara; Rundensteiner, Elke Angelika. "Uniform Structured Document Handling Using a Constraint-based Object Approach." Pages 83-101 in Digital libraries: research and technology advances. ADL '95 Forum. Selected Papers. Forum on Research and Technology Advances in Digital Libraries, ADL '95. McLean, Virginia, USA, May 15-17, 1995. Sponsored by NASA. Edited by Adam, Nabil R.; Bhargava, Bharat K.; Halem, Milton; Yesha, Yelena. Lecture Notes in Computer Science, volume 1082. Berlin/Heidelberg, Germany: Springer-Verlag, 1996. ISBN: 3-540-61410-9. ISSN: 0302-9743. Authors' affiliation: Department of Electrical Engineering & Computer Science, 1301 Beal Avenue, University of Michigan, Ann Arbor, MI, USA. WWW: Home Page [Rundensteiner]; Tel: +1 (313) 936-2971; Fax: +1 (313) 763-1503.

Abstract: "Complex multimedia document handling, including modeling, decomposition, and search across digital documents, is one of the primary services that must be provided by digital library systems. We present a general approach for handling structured documents (e.g., SGML documents) by exploiting object-oriented database technology. For this purpose, we propose a constraint-based object model capable of capturing in a uniform manner all SGML constructs typically used to encode the structural organization of complex documents. We present a general strategy for mapping arbitrary document types (e.g., article, journal, and book DTDs) expressed using standard SGML into our model. Most importantly, we demonstrate that our model is designed to handle the integration of diverse document types into one integrated schema, thus avoiding the generating of numerous redundant class definitions for similar document subtypes. The resulting document management system (DMS) is thus capable of supporting the dynamic addition of new document types, and of uniformly processing queries spanning across multiple document types. We also describe the implementation of our approach on the commercial DBMS system Illustra to demonstrate the ease with which our approach can be realized on current OODB technology, without requiring any special purpose constructs. Our DMS system provides support for integrated querying of both structural as well as content-based predicates across arbitrarily complex document types."

Available online in Postscript format: ftp://ftp.eecs.umich.edu/people/rundenst/papers/r-95-8.ps; [mirror copy]. See also: A. Nica and E. A. Rundensteiner, "A Constraint-based Object Model for Structured Document Management," Electrical Engineering and Computer Science Department, University of Michigan, Ann Arbor, Technical Report CSE-TR-237-95, 1995; Abstract for the TR; [mirror copy].



National Information Standards Organization. American National Standard for Electronic Manuscript Preparation and Markup. (ANSI/NISO Z39.59-1988). New Brunswick, NJ: Transaction Publishers [Published for NISO (National Information Standards Organization) by Transaction], 1991. xv +167 pages. ISBN: 0-88738-945-7. ISSN: 1041-5653.

This AAP (Association of American Publishers) standard is an application of SGML. An earlier form of the document was Standard for Electronic Manuscript Preparation and Markup. (ANSI/NISO Z39.59-1988). 1987, 1988. ANSI Z39.59-1988 was promoted to ISO DIS in 1992, and was to be published in revised format as ISO 12083:1993 in late 1993. See now "ISO 12083". The AAP/EPSIG application is SGML-conforming, and provides a suggested tagset for authors and publishers. The standard is said to "represent the first industry wide application of SGML (Standard Generalized Markup Language, ISO 8879). The standard defines the format syntax of the application of SGML publication of books and journals. The standard achieves two goals. First, it establishes a standard way to identify and tag parts of an electronic manuscript so that computers can distinguish between these parts. Second, it provides a logical way to represent special characters, symbols, and tabulator material, using only the ASCII character set usually found on a standard keyboard." The standard is available for $75 (75 US dollars) from Transaction Publishers or from NISO: Transaction Publishers, Department NIS091, Rutgers--The State University, New Brunswick, NJ 08903, TEL: (1 908) 932-2280; FAX: (1 908) 932-3138; NISO is at National Information Standards Organization, P.O. BOX 1056, Bethesda, MD 20827, Tel: (1 301) 975-2814, FAX: (1 301) 869-8071; Email (Internet): niso@enh.nist.gov (or BITNET) niso@nbsenh. Discounts are available for purchase of multiple copies. Equally, the volume may be ordered from EPSIG.



[CR: 19980104]

National Information Standards Organization (NISO). Z39.18-1995. Scientific and Technical Reports. Elements, Organization, and Design. National information standards series, 1041-5653. Bethesda, MD: NISO Press, 1995. Extent: viii + 38 pages. ISBN: 1-880124-24-6.

Summary: "Provides explicit guidance on the preparation of reports in the traditional print environment and includes de facto document type definitions (DTDs) to describe the structure of reports so the document can be electronically processed using document imaging, OCR, compression/decompression, and optical media storage of full text. Describes the data elements that should appear on the cover and title page of a report, the scope of each section of a report and instruction on the best way to present textual and visual information and tabular materials." [from the publisher]

"ANSI/NISO Z39.18-1995. Developed by the National Information Standards Organization, approved March 21, 1995 by the American National Standards Institute." See the publisher's web site: http://www.niso.org/



[CR: 19971107]

Nelson, Theodor Holm. "Embedded Markup Considered Harmful." Pages 129-134 in XML: Principles, Tools, and Techniques. Guest Edited by Dan Connolly. World Wide Web Journal [edited by Rohit Khare] Volume 2, Issue 4. Sebastopol, CA: O'Reilly & Associates, Fall 1997. Extent: xxii + 248 pages. ISBN: 1-56592-349-9. ISSN: 1085-2301. Author's affiliation: Project Xanadu; WWW: Ted Nelson's Home Page.

Summary: "I want to discuss what I consider one of the worst mistakes of the software world, embedded markup; which is, regrettably, the heart of such current standards as SGML and HTML [...] There is no one reason this approach is wrong: I believe it is wrong in almost every respect."



[CR: 19970121]

National Information Standards Organization. Codes for the Representation of Languages for Information Interchange (ANSI/NISO Z39.53-1994). Bethesda, MD: NISO Press [for NISO], 1994. ISBN: 1-880124-10-6. ISSN: 1041-5653.

Overview: "The National Information Standards Organization (NISO) has published a revised standard for language codes. Codes for the Representation of Languages for Information Interchange (ANSI/NISO Z39.53-1994) is used by libraries, information services, and publishers as the standard for designating languages in which documents or document handling records (such as order records or bibliographic records) have been created. The revised standard reflects a thorough review of the 1987 edition and includes many changes requested by users. Codes have been added for 28 languages or language groups previously not represented. The list codifies names for 399 languages. Numerous minor changes also have been made to reflect current accepted usage in language names. The USMARC Code List for Languages is kept consistent with ANSI/NISO Z39.53 and will be revised to incorporate the changes in this new edition." [from a NISO-L news announcement; see the complete text for details.] [Note: need to clarify the relationship between this and ISO 639/2.]

The standard was approved on September 21, 1994, by the American National Standards Institute. It was developed for NISO by an ad hoc working group composed of John Byrum (Chair), Rebecca Guenther, Sally H. McCallum, and Millicent Wewerka. It is a revision of ANSI Z39.53-1987. The 399 language codes are for contemporary and historical languages. The codes are based (largely) upon an existing MARC list of language names, where the MARC language codes have been used in the cataloging of millions of bibliographic works in a library setting. See unofficially: NISO 3-character language codes (Z39.53-1994), [mirror copy]. Also, for several proposed additions and deletions to Z39.53-1994, approved as of January 1997: see the update to USMARC Code List for Languages from November 15, 1996: "Any changes listed below [in this MARC code list] that were not included in Z39.53 will be incorporated at the next revision of that standard"; [mirror copy].



[CR: 19950828]

Neuman, Michael. "You Can't Always Get What You Want. Deep Encoding of Manuscripts and the Limits of Retrieval." Pages 84-86 [extended abstract from paper] in ACH/ALLC '95: The 1995 Joint International Conference. Conference Abstracts, Posters and Demonstrations. ACH/ALLC '95: The Joint International Conference, Santa Barbara, California, July 11-15, 1995. Santa Barbara: University of California/ACH/ALLC, 1995.

The author discusses the limitations of "markup" to encode critical analysis of text in a way that makes the encoded information optimally available to researchers. When encoding content objects, compositional features, and editorial activities within the same text, the results -- experienced during the tagging process and in the final result -- are not always satisfying. Problems of "deep encoding" are compared to similar challenges faced with indexing and retrieval in library science (descriptive cataloging).



[CR: 19980630]

Neumann, Andreas. Unambiguity of SGML Content Models - Pushdown Automata Revisited. Forschungsbericht Nr. 97-05. Paper presented at the Third International Conference on the Developments in Language Theory (DLT '97), Thessaloniki, Greece. Trier: University of Trier - Computer Science, July 1997. Extent: 22 pages, 12 references. Author's affiliation: Universität Trier - Mathematik/Informatik; Email: neumann@PSI.Uni-Trier.DE.

Abstract: "We consider the property of unambiguity for regular expressions, extended by an additional operator &. This denotes concatenation in any order, and must have arbitrary arity since it is not associative. This extension gives us high succinctness in expressing equivalent regular expressions without &. The property of unambiguity means for a regular expression e, that a symbol in a word from its language must not match two different occurrences of that symbol in e without lookahead. We extend this notion to &-unambiguity which helps us deal with operator &. We then give a first method for deciding in polynomial time whether a regular expression with & is &-unambiguous, and if it is, whether it is unambiguous. Our method is constructive - it provides a deterministic automaton with polynomial representation that accepts the language of the expression, if it is unambiguous and &-unambiguous. If it is only unambiguous then the automaton accepts a subset of the language."

See also the database section "(The SGML Notion of) Ambiguity" and the collection of postings from Fall 1997, "SGML and Ambiguity".

The document is available in DVI and Postscript formats: conference paper, Postscript, [local archive copy]; technical report, Postscript, [local archive copy]; presentation slides, Postscript. See also A. Neumann's publications page.
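
The unambiguity property quoted in the abstract above (a symbol must never match two different occurrences in the expression without lookahead) can be illustrated with a short sketch. The code below is not Neumann's method and does not handle the & operator; it is the classical position-based check for ordinary expressions, written under the assumption that a content model is given as nested tuples. The helper names (sym, seq, alt, star, analyse, is_unambiguous) are hypothetical and chosen only for this illustration.

```python
from itertools import count

# Hypothetical constructors for a tiny expression tree (no & operator).
def sym(name): return ("sym", name)
def seq(a, b): return ("seq", a, b)
def alt(a, b): return ("alt", a, b)
def star(a): return ("star", a)

def analyse(expr, ids, follow):
    """Return (nullable, first, last) and fill the follow table.

    A position is an (id, symbol) pair, so two occurrences of the
    same symbol in the expression are kept apart.
    """
    kind = expr[0]
    if kind == "sym":
        pos = (next(ids), expr[1])
        follow[pos] = set()
        return False, {pos}, {pos}
    if kind == "star":
        _, first, last = analyse(expr[1], ids, follow)
        for p in last:              # the body may repeat
            follow[p] |= first
        return True, first, last
    n1, f1, l1 = analyse(expr[1], ids, follow)
    n2, f2, l2 = analyse(expr[2], ids, follow)
    if kind == "alt":
        return n1 or n2, f1 | f2, l1 | l2
    # kind == "seq": the end of the left part is followed by the start of the right part
    for p in l1:
        follow[p] |= f2
    first = f1 | f2 if n1 else f1
    last = l1 | l2 if n2 else l2
    return n1 and n2, first, last

def is_unambiguous(expr):
    """True if no symbol competes with itself at the start or after any position."""
    follow = {}
    _, first, _ = analyse(expr, count(), follow)

    def clash(positions):
        symbols = [s for _, s in positions]
        return len(symbols) != len(set(symbols))

    return not clash(first) and not any(clash(f) for f in follow.values())

ambiguous = alt(seq(sym("a"), sym("b")), seq(sym("a"), sym("c")))  # (a, b) | (a, c)
factored = seq(sym("a"), alt(sym("b"), sym("c")))                  # a, (b | c)
print(is_unambiguous(ambiguous), is_unambiguous(factored))
```

In the first model a parser reading "a" cannot tell which occurrence of a it has matched without looking ahead, so the check fails; the factored model describes the same language and passes.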



[CR: 19980630]

Neumann, Andreas; Seidl, Helmut. Locating Matches of Tree Patterns in Forests. Forschungsbericht Nr. 98-08. Paper submitted to FST&TCS '98. Trier: University of Trier - Computer Science, July 1997. Extent: 18 pages, with 13 references. Authors' affiliation: Universität Trier - Mathematik/Informatik; Email: neumann@PSI.Uni-Trier.DE.

Abstract: "We deal with matching and locating of patterns in forests of arbitrary arity. A pattern consists of a structural and a contextual condition for subtrees of a forest, both of which are given as tree or forest regular languages. We adopt the notation of mu-formulae for uniformly specifying both kinds of conditions. In order to implement pattern matching we introduce the class of pushdown forest automata. We identify a special class of contexts such that not only pattern matching but also locating all of a forest's subtrees matching in context can be performed in a single traversal. We also give a method for computing the reachable states of an automaton in order to minimize the size of transition tables."

See also the database section "SGML/XML and Forest Automata Theory."

The document is available in DVI and Postscript format: Postscript [local archive copy]. See also A. Neumann's publications page.



[CR: 19950716]

Newcomb, Steven R. "Multimedia Interchange Using SGML/HyTime. Part I: Structures." IEEE MultiMedia 2/2 (Summer 1995) 86-89 (with 2 references). ISSN: 1070-986X. Author's affiliation: TechnoTeacher, Inc.

"Abstract: HyTime is a standard-neutral markup language for representing hypertext, multimedia, hypermedia and time-based documents in terms of their logical structure. Documents represented in HyTime conform fully to the ISO Standard Generalized Markup Language (SGML). In effect, HyTime extends SGML by adding certain syntactic conventions called SGML architectural forms, with which it represents certain semantic constructs. HyTime cannot be understood or evaluated without understanding, at least to some extent, the significance and usefulness of the SGML standard on which it is based. This article examines the structure of the two standards. HyTime places unprecedented demands on document processing systems-demands which they have yet to meet. For example, a full implementation of HyTime would allow one to create a hyperlink to whatever happens to be going on at a particular time and/or place in a finite coordinate space (FCS), even if the event, location and time are not yet known, because of binding-time issues. HyTime allows a document to specify where and/or when the results of traversing a hyperlink will be rendered. HyTime provides constructs for specifying how events scheduled in one FCS are to be 'projected' onto another, e.g., from a 3D FCS to a 2D FCS, or from a virtual measurement domain to a real one."

Part I of a 2-part article.



[CR: 19950716]

Newcomb, Steven R. "Multimedia Interchange Using SGML/HyTime. Part II: Applications." IEEE MultiMedia 2/3 ([forthcoming] 1995) xxx-xxx. ISSN: 1070-986X. Author's affiliation: TechnoTeacher, Inc.

Part II of a 2-part article.



[CR: 19951020]

Newcomb, Steven R. "SGML Architectures: Implications and Opportunities for Industry." <TAG> The SGML Newsletter 8/8 (August 1995) 1-5. ISSN: 1067-9197. Author's affiliation: TechnoTeacher, Inc.

"SGML has always offered the means whereby the syntax and semantics associated with a given information construct (an element type) can be expressed; these are expressed in element definitions in document type definitions (DTD). Differing document types may contain some similar or identical element definitions, and sets of software applications can be made to contain or use similar or identical software modules for processing such similar or identical element types. In this way, anyone who controls some set of DTDs can heighten the application-neutrality of the information contained in documents conforming to those DTDs, save money on software development, and reduce expensive confusion in general, by maximizing the generality of each information construct (element type), and by avoiding, insofar as possible, any duplication of semantics which do not also duplicate syntax. However, until quite recently, with the advent of the HyTime (ISO/IEC 10744) international standard, there was no agreed-upon formalism for the expression of similarity in structure and semantics. Now these things can be expressed formally, and enforced, at least to some extent, automatically, in a new, more abstract kind of document type definition called a ``meta-DTD.'' A meta-DTD describes the structure and semantics of a class of documents which therefore conforms to an `SGML architecture.'" [from the Introduction, online version]

The document is available in a revised form on the Internet; [mirror copy, October 12, 1995], or: (June 1996) mirror copy.



[Newcomb, Steven R.] "Standard Generalized Markup Language (SGML; ISO/IEC 8879/1986)." Communications of the Association for Computing Machinery 34/11 (November 1991) 72-73. ISSN 0001-0782.

Abstract: "The Standard Generalized Markup Language (SGML) is designed to describe documents in terms of their logical structure. SGML provides a meta-syntax for expressing agreed-upon syntaxes for individual document types, and for the syntax of the generic coding in the documents themselves. The language allows one document to appear transparently on dissimilar systems, even when those systems require distinct distribution methods among various files. Both private and public enterprises are turning to SGML as a general solution for their information-handling problems; SGML is amenable to certain kinds of processing, and all SGML documents can be validated by a single validating parser. The biggest commercial user of SGML today is perhaps the US Defense Department's Computer-aided Acquisition and Logistic Support Initiative." (Sidebar to the article on HyTime, by Steven R. Newcomb; see the related presentation.)



[CR: 19950716]

Newcomb, Steven R. "Standards. Standard Music Description Language Complies with Hypermedia Standard." IEEE Computer 24/7 (July 1991) 76-79. ISSN: 0018-9162. Author's affiliation: Florida State University, Tallahassee, FL, USA; TechnoTeacher, Inc.

"Abstract: The Standard Music Description Language (SMDL), an application of the HyTime Hypermedia/Time-based document structuring facilities, is described. The discussion covers the domains of information that SMDL associates with any piece of music, the timing of cantus events, pitch in cantus events, gamut-based pitches, just-intoned pitches, user-defined functions for pitches, chords and chord symbols, instrumental and vocal sounds, and non-western music."

See also the main entry for SMDL.



Newcomb, Steve. "TechnoTeacher's MarkMinder and HighMinder Engines." SGML Users' Group Newsletter 26 (February 1994) 23-24. Steven R. Newcomb, TechnoTeacher Inc., PO Box 3208, 1810 High Road, Tallahassee, FL 32303-3208, USA.



[CR: 19960331]

Newcomb, Steven R. Using the Information Addressing Model of HyTime (ISO 10744) to Add Hypermedia Functionality to Legacy Data and Systems. Paper presented at the Second International Workshop on Incorporating Hypertext Functionality into Software Systems, held in conjunction with the ACM Hypertext '96 conference, Washington, U.S.A. Rochester, NY: TechnoTeacher, Inc., March 1996. Extent: approximately 5 pages. Author's affiliation: President, TechnoTeacher, Inc. (Email: srn@techno.com).

"The HyTime standard allows any pieces of information to be addressed (and therefore to become the anchors of hyperlinks) in any convenient terms. These terms can be expressed in _any_ notation and they can be used to address _any_ information in _any_ notation. In addition, HyTime provides more specialized mechanisms to meet frequently-encountered addressing needs, including: (1) addressing of information represented in SGML in terms of all of the inherent properties of information represented in SGML (e.g., hierarchy, attribute values, all other markup phenomena); (2) addressing of information represented in HyTime in terms of all of the inherent properties of information represented in HyTime (e.g., anchor status, position, extent); (3) addressing of information represented in any other (i.e., non-HyTime) SGML architecture in terms of all of the inherent properties of information represented in that architecture; (4) addressing of information that has, as one or more of its inherent properties, position in any arbitrary finite coordinate space." [extracted]

Available on the Internet in HTML format: http://space.njit.edu:5080/HTFII/Newcomb.html [mirror, partial links].
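
The first addressing mode in the extract above (locating information by hierarchy or by attribute values) can be sketched in a few lines. This is only an illustration using Python's standard xml.etree.ElementTree module on a made-up XML fragment; it is not HyTime syntax, and the document, element names, and function names (by_tree_position, by_attribute) are assumptions introduced here.

```python
import xml.etree.ElementTree as ET

# A made-up document used only for this illustration.
DOC = ET.fromstring(
    "<manual>"
    "<chapter id='intro'><para>One</para><para>Two</para></chapter>"
    "<chapter id='usage'><para>Three</para></chapter>"
    "</manual>"
)

def by_tree_position(root, path):
    """Address an element by hierarchy: follow 1-based child indices from the root."""
    node = root
    for index in path:
        node = list(node)[index - 1]
    return node

def by_attribute(root, name, value):
    """Address an element by attribute value: first match in document order."""
    for node in root.iter():
        if node.get(name) == value:
            return node
    return None

second_para = by_tree_position(DOC, [1, 2])      # first chapter, second paragraph
usage_chapter = by_attribute(DOC, "id", "usage")
print(second_para.text, usage_chapter[0].text)
```

Either address could serve as the anchor of a link without changing the addressed document, which is the point the extract makes about adding hypermedia functionality to existing data.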



[CR: 19971205 MD: 19971227]

Newcomb, Steven R. "Document Architectures. What You Need to Know About the New HyTime." International SGML Users' Group Newsletter 3/4 (October 1997) 6-8. ISSN: 0952-8008. Author's affiliation: President, TechnoTeacher Inc, Rochester, NY; Email: srn@techno.com.

Summary: Newcomb predicts that SGML users with any of a dozen common requirements will want to "read the appropriate section of Annex A ('SGML Extended Facilities') of the HyTime Second Edition: A.2 for lexical modeling of element content and attribute values; A.3 for information about inheriting the semantic and syntactic characteristics of other DTDs in your own DTD. Personally, I think A.3 is the single most revolutionary and far-reaching aspect of the new HyTime standard, and I would urge most SGML veterans to start there. The HyTime standard is now primarily two things: the HyTime architecture itself (which is essentially a very abstract DTD for hyperdocument structuring described in clauses 1-11), and the SGML Extended Facilities (Annex A). In the original edition of HyTime, there was only the HyTime DTD, whose element types were called 'architectural forms'. Each of the element types were designed to be inherited by actual element types in actual DTDs (that's why the HyTime DTD was called a 'meta-DTD'). Now, though, the HyTime DTD is regarded as one aspect of the definition of the 'HyTime Architecture', and A.3 describes how to inherit not only the HyTime architecture, but any combination of architectures (i.e., any combination of DTDs) you like. A.3 is called the 'Architectural Form Definition Requirements' or, simply, 'AFDR'."

"Annex A.4 contains information about the 'property set' and 'grove' paradigms. Many would say that this is the single most revolutionary and far-reaching aspect of the second edition of HyTime, and it would be hard to argue with them. A 'property set' is the formal description of all the kinds of things that a parser ('notation processor') might find in an information resource expressed in some particular notation, and the relationships between those things. An SGML parser is expected to recognize certain kinds of things in an SGML document; those things are described in the 'SGML Property Set' that can be found in Annex A.7."

An online version of the document is available in HTML format as: "What You Need to Know About the New HyTime," by Steven R. Newcomb, of TechnoTeacher Inc. Now accessible via a link from HyTime User's Group Web server. [local archive copy]



Newcomb, Steven R.; Kipp, Neill A.; Newcomb, Victoria T. "The 'HyTime' Hypermedia/Time-based Document Structuring Language." Communications of the Association for Computing Machinery 34/11 (November 1991) 67-83. ISSN: 0001-0782.

Abstract: HyTime, a proposed standard for digital communications, should enable authors of electronic documents to incorporate active references to other on-line documents regardless of their notations. HyTime, which stands for Hypermedia/Time-based Document Structuring Language, is built on the Standard Generalized Markup Language (SGML). SGML/HyTime enables all types of documents to package the 'information about information' using standard 'markup,' which provides information about the notations and structure of the document so that any application with an appropriate data importation facility can understand and interpret it. The documents' structured character will also make them useful for querying, access and version control, maintenance and nonsequential browsing.



[CR: 19950716]

Newcomb, Steven R.; Newcomb, V. T. "Some Background Information about HyTime." Journal of the Institute of Image Electronics Engineers of Japan 21/5 (October 1992) 459-467 (with 23 references). Authors' affiliation: TechnoTeacher Inc., Tallahassee, FL, USA.

"Abstract: HyTime, the Hypermedia/Time-based Structuring Language, is a result of cooperation among representations from many fields. Groups of information users and creators in these fields recognized a common need for a standard representation for the interchange of complex information. The Standard Generalised Markup Language (SGML) provided the substrate upon which HyTime, and the standard from which it emerged, the Standard Music Description Language (SMDL), were built."



[CR: 19951229]

Nicholas, Charles K.; Welsch, Lawrence A. "On the Interchangeability of SGML and ODA." Electronic Publishing: Origination, Dissemination and Design (EPODD) 5/3 (September 1992) 105-130. 15 references. ISSN: 0894-3982. Authors' affiliation: Comput. Syst. Laboratory, National Institute of Standards and Technology (NIST), Gaithersburg, MD, USA.

"Abstract: SGML and ODA are international standards for the markup and interchange of electronic documents. These standards are incompatible in the sense that in general a document encoded using SGML cannot be used directly in an ODA-based system, and vice versa. The authors first describe these two standards, and suggest criteria under which a bridge between the two standards could he evaluated. They evaluate the Office Document Language (ODL), an SGML application specifically designed for ODA documents, with respect to these criteria. They describe conditions under which reliable automatic translation between SGML and ODA can be achieved, and describe a translation program that converts SGML documents to ODA and back."

Apparently based upon the NIST Technical Report. See the bibliographic entry for NISTIR 4681. Available online: "On the Interchangeability of SGML and ODA" [mirror copy, December 1995].



Nicholas, Charles K.; Welsch, Lawrence A. On the Interchangeability of SGML and ODA. Technical Report, NISTIR 4681. Gaithersburg, MD: U.S. Department of Commerce, National Institute of Standards and Technology (NIST), January 1992. ii + 19 pages.

Apparently published in EPODD. See the bibliographic entry.



[CR: 19971125]

Nicholson, Simon. "Authoring and Translation for the International Market." Page(s) 73-79 in SGML '97 Conference Proceedings. SGML Europe '97. "The Next Decade - Pushing the Envelope." Princesa Sofia Intercontinental, Barcelona, Spain. 11-15 May, 1997. Sponsored by Graphic Communications Association (GCA) and SGML Open. Conference Chair: Pamela L. Gennusa (Director, Database Publishing Systems Ltd). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 342 pages, CDROM. Author's affiliation: Chrystal Software Inc., A Xerox New Enterprise Company, Slough, Berks, UK; WWW: www.chrystal.com; Email: simon_nicholson@chrystal.co.uk.

Abstract: "Only a few markets around the globe can mandate the universal use of a single language for documentation. Further, it was once the case that the author had some sight of the user of the information. With global markets this luxury has all but vanished. Today authors have little sight of the intended user of the information. It is more than likely the user will work in a different culture, a different language, using different media. Organisations must be cogniscent of these conditions of entry into the market, and in most cases the requirement to provide localised, translated information must be absorbed as part of the cost of entry. Such costs can rapidly exceed the original startup costs for production of the source language version. Today the pressure is on to find ways to reduce startup and ongoing costs and time frames whilst maintaining or improving quality.

"The presentation discusses such initiatives. The key argument presented will be that translation activity and management of information encoded in SGML (Standard Generalized Markup Language) can provide reductions in cost and timescales whilst offering real opportunity to improve the quality and consistency of the content. The advantages offered by the SGML context when applied against technologies and concepts such as Translation Memory, SGML Element-level management and Controlled Terminology will be presented and discussed. The application of these capabilities will then be presented as part of a Component Based Document Management System providing a concurrent translation processing environment.

"Within the presentation references will be made to ongoing initiatives to implement such systems, providing an analysis of the issues to be addressed, and the savings generated have provided towards the justification for the use of SGML."

Note: The electronic conference proceedings in hypertext were produced by Inso Corporation (DynaText) and by High Text (EnLIGHTeN). Information about the SGML Europe '97 Conference may be found in the main database entry.



[CR: 19971227]

Nicholson, Simon. "The Need for Component Methodologies in Global Applications." Pages 241-248 in SGML/XML '97 Conference Proceedings. SGML/XML '97. "SGML is Alive, Growing, Evolving!" The Washington Sheraton Hotel, Washington, D.C., USA. December 7 - 12, 1997. Sponsored by the Graphic Communications Association (GCA) and Co-sponsored by SGML Open. Conference Chairs: Tommie Usdin (Chair, Mulberry Technologies), Debbie Lapeyre (Co-Chair, Mulberry Technologies); Michael Sperberg-McQueen (Co-Chair, University of Illinois). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 691 pages, CDROM; print volume contains author and title indexes, keyword and acronym lists. Author's affiliation: [Simon Nicholson]: Business Development Manager, Chrystal Software Inc, A Xerox New Enterprise Company, 1606 19th Street NW, Washington, DC USA 20009; Phone: +1 (202) 332 7882; FAX: +1 (619) 676 7710; Email: simonn@chrystal.com; WWW: http://www.chrystal.com.

Abstract: "Only a few markets around the globe can mandate the universal use of a single language for documentation. Further, it was once the case that the author had some sight of the user of the information. With global markets this luxury has all but vanished. Today authors have little sight of the intended user of the information. It is more than likely the user will work in a different culture, a different language, using different media. Organisations must be cogniscent of these conditions of entry into the market, and in most cases the requirement to provide localised, translated information must be absorbed as part of the cost of entry. Such costs can rapidly exceed the original startup costs for production of the source language version. Today the pressure is on to find ways to reduce startup and ongoing costs and time frames whilst maintaining or improving quality.

"The presentation discusses such initiatives. The key argument presented will be that translation activity and management of information encoded in SGML (Standard Generalized Markup Language) can provide reductions in cost and timescales whilst offering real opportunity to improve the quality and consistency of the content. The advantages offered by the SGML context when applied against technologies and concepts such as Translation Memory, SGML Element-level management and Controlled Terminology will be presented and discussed. The application of these capabilities will then be presented as part of a Component Based Document Management System providing a concurrent translation processing environment.

"Within the presentation references will be made to ongoing initiatives to implement such systems, providing an analysis of the issues to be addressed, and the savings generated have provided towards the justification for the use of SGML."

This paper was delivered as part of the "User" track in the SGML/XML '97 Conference.

Note: The SGML/XML '97 conference proceedings volume is available from the Graphic Communications Association, 100 Daingerfield Road, Alexandria, VA 22314-2888; Tel: +1 (703) 519-8160; FAX: +1 (703) 548-2867. Complete information about the conference (e.g., program listing, tutorials, show guide, DTDs, conference reports) is provided in the dedicated conference section of the SGML/XML Web Page and via the GCA Web server. The electronic proceedings on CDROM was produced courtesy of Jouve Data Management (Jouve PubUser).



[CR: 19950916]

Nicol, Gavin Thomas. The Multilingual World Wide Web Technical Report. [Privately published], [September], 1995. Extent: ca. 67K HTML document. Author affiliation: Electronic Book Technologies, Japan [1-29-9 Tsurumaki, Setagaya-ku; Tokyo 154, Japan; Tel: +81-3-3706-7351; Email: gtn@ebt.com].

Introduction: "The World Wide Web has enjoyed explosive growth in recent years, and there are now millions of people using it all around the world. Despite the fact that the Internet, and the World Wide Web, span the globe, there is, as yet, no well-defined way of handling documents that contain multiple languages, character sets, or encodings thereof. Rather, ad hoc solutions abound, which, if left unchecked, could lead to groups of users suffering in incompatible isolation, rather than enjoying the true interoperability alluded to by the very name of the World Wide Web. This paper discusses the issues, and hopes to offer at least a partial solution to the problems discussed."

Draft version(s) available online: HTML version [or ditto, from December 1995] of 'The Multilingual WWW' (or see interim draft mirror copy, September 16, 1995; December 1995).



[CR: 19960529]

Nicol, Gavin Thomas. DynaWeb: Interfacing large SGML repositories and the WWW Technical Paper, WWW4. Fourth International World Wide Web Conference. [MIT, OSF, etc.], December 11-14, 1995. Author affiliation: Electronic Book Technologies, Japan [1-29-9 Tsurumaki, Setagaya-ku; Tokyo 154, Japan; Tel: +81-3-3706-7351; Email: gtn@ebt.com].

"Many companies are now establishing a presence on the World Wide Web, and are facing the problem of how to make their data available in an efficient, cost effective, and presentable manner. For large documents in non-HTML formats, the traditional approach has been to convert the data to a large number of small HTML pages. These pages are then made available on the WWW, however this process results in lost information fidelity, and increased costs due to double-handling. DynaWeb is an HTTP 1.0 compatible server and CGI script that performs the conversion, and fragmentation at runtime, and uses the very same data used for publishing to other media. The rationale for this is that it dramatically simplifies the information management process, and thereby reduces the costs of publishing on the Internet. This paper discusses the design of DynaWeb, and the concepts behind it."

Available online: [mirror copy, text only, Postscript].



[CR: 19950716]

Nonnekes, J. P. C. "Latest News from the Dutch [SGML Users' Group] Chapter [November 1994]." SGML Users' Group Newsletter 30 (March 1995) 6. ISSN: 0952-8008.

Report on a meeting "The Vendor's Viewpoint", with about 150 members in attendance. Presentations by Jan Maasdam and vendor representatives.



[CR: 19950716]

Nonnekes, J. P. C. "Next Dutch Chapter's SGML Congress." SGML Users' Group Newsletter 28 (August 1994) 17-18. ISSN: 0952-8008. Author's affiliation: Shell Common Information Services, Rijswijk.

Note on the Dutch preparations for an annual SGML Congress. Congress proceedings are available for the 1993 Congress [14-September-1993]. Contact: Jan Maasdam at Intermedia bv, +31-1720-66-612.



[CR: 19960312]

Nordhausen, Bernd. Improving the Quality of SGML Documents. Presentation at the SGML '95 Conference, December 1995. Cupertino, CA: Passage Systems. Extent: approximately 6 pages. Author's affiliation: Passage Systems.

Available online in HTML format: [mirror copy, partial links].



Nordin, Brent; Barnard, David T.; Macleod, Ian A. A Critique of the Standard Generalized Markup Language (SGML). Technical Report 91-308. Kingston, Ontario: Department of Computing and Information Science, Queen's University, Kingston, Ontario, Canada, 1991.



Nordin, Brent; Barnard, David T.; Macleod, Ian A. "A Review of the Standard Generalized Markup Language (SGML)." Computer Standards and Interfaces (Amsterdam, Netherlands: Elsevier Science Publishers B.V./North-Holland) 15/1 (May 1993) 5-19. 33 references. ISSN: 0920-5489. [Nordin]: ZIFTech Computer Systems, Inc., 120 Herchmer Crescent, Kingston, Ontario, Canada K7M 2V9; [Barnard and Macleod]: Department of Computing and Information Science, Queen's University, Kingston, Ontario, Canada K7L 3N6.

Abstract: The international standard ISO 8879:1986 and its related material describes both a text markup scheme and an implementation of a text parser based on that markup scheme. By trying to clarify the relationship between the documents and an implementation, the authors show that optional SGML features properly belong to separate applications. The result suggests more general and powerful mechanisms which could be obtained.



[CR: 19960127]

Noreault, Terry R.; Crook, Mark A. Page Image and SGML: Alternatives for the Digital Library. Paper presented on August 23, 1995, at ISDL'95: International Symposium on Digital Libraries 1995. Dublin, OH: OCLC, August 1995. Extent: approximately 12 pages. Author's affiliation: [Noreault]: Director of Research and Special Products, OCLC, 6565 Frantz Rd., Dublin OH 43017; Tel. (614) 764-4392; FAX: (614) 764-2344; email: noreault@oclc.org; WWW: http://www.oclc.org:5047/~noreault/; [Crook]: Sr. Consulting Systems Analyst, OCLC Office of Research; email: mark_crook@oclc.org; WWW: http://www.oclc.org:5046/~crook/markpage.html.

"Abstract: As the Digital Library evolves, important questions about technology selection for information capture and delivery arise. Today, the two most widely-used technologies are tagged (e.g., SGML) documents and document page images. OCLC Online Computer Library Center, Inc. (OCLC) has successfully applied both techniques in the Electronic Journals Online and SiteSearch Image Extension systems. OCLC's experience in both technologies is a source of practical insight into the relative strengths and weaknesses of the respective approaches. This paper compares these methods of electronic access and delivery of digitized, journal literature and highlights considerations for the evolving Digital Library."

Draft version available in HTML format. A refereed version of this paper will be published (with notes) in the Proceedings of the International Symposium on Digital Libraries 1995 which was held in Japan on August 22-25, 1995. The symposium was sponsored by the University of Library and Information Science, Tsukuba Science City, Ibaraki, Japan. The electronic proceedings are available in page image format: see http://www.dl.ulis.ac.jp/ISDL95/proceedings/. See the Symposium Program for further conference details.



[CR: 19971002]

Light, Richard; North, Simon; Allen, Charles A. Presenting XML. Edited and with a Foreword by Tim Bray. Indianapolis, IN: SAMS.NET [Sams Publishing, Macmillan Publishing USA], 1997. 414 pages. ISBN: 1-57521-334-6. Authors' affiliation: [Light] Richard Light Consultancy; [North] Synopsys Inc; [Allen] WebMethods; [Bray] Textuality.

Through no fault of its own, a reviewer says, the book "suffers from being a snapshot of a moving target, but [is] a worthy first volume in the soon-to-be-large XML library." Description: ". . .this reference takes you on an introductory tour of this robust technology, showing you how the technology can work to your advantage. You'll learn to create XML documents, separate style from content, and create power links with XML. In addition, you'll find out how XML is being used today and what impact it will have in the future. With Presenting XML, you'll get a quick, efficient introduction to XML and everything it has to offer, and you'll learn why this dynamic markup language is the wave of the future." [publisher's blurb]

See provisionally the description from Macmillan's superlibrary.com server, or the announcement from Simon North. Alternately, check the companion web site for the volume.

"For the technically minded, we authored in SGML (TEI-lite DTD) and used Jade to produce the RTF for publishing in Microsoft Word, and the HTML (and XML) for online use." [North, CTS posting]


Document URI: http://xml.coverpages.org/bib-mn.html
Robin Cover, Editor: robin@oasis-open.org