Cover Pages: SGML/XML Bibliography Part 4, I

[CR: 19971227]

Iantosca, Michael J. "The Power of Parameter Entities." Pages 99-104 in SGML/XML '97 Conference Proceedings. SGML/XML '97. "SGML is Alive, Growing, Evolving!" The Washington Sheraton Hotel, Washington, D.C., USA. December 7 - 12, 1997. Sponsored by the Graphic Communications Association (GCA) and Co-sponsored by SGML Open. Conference Chairs: Tommie Usdin (Chair, Mulberry Technologies), Debbie Lapeyre (Co-Chair, Mulberry Technologies); Michael Sperberg-McQueen (Co-Chair, University of Illinois). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 691 pages, CDROM; print volume contains author and title indexes, keyword and acronym lists. Author's affiliation: [Michael J. Iantosca]: Advisory Software Engineer, IBM Corporation, IBM Information Strategy and Technology, Research Triangle Park, North Carolina 27709; Phone: +1 (919) 254-0534; FAX: +1 (919) 543-4118; Email: publish@ibm.net.

Abstract: "Parameter entities were once thought to be the domain of only DTD designers. Parameter entities, and their references, can also be placed in the internal DTD subset of document instances. By doing so, authors can indirectly include shared entity declarations or collections of entity declarations. Such indirection can enable groups of authors to share and reuse entities that change frequently. Whereas parameter entities enable entity sharing and reuse, HyTime content location addressing can provide granular reuse of elements within file entities. When combined, parameter entities and content location addressing can enable sharing and reuse of SGML components in either local or far-flung environments."

"The scenario [discussed] is not fictitious; the problems are real and the requirements and objectives are quite common for Company X, as they are for many organizations, large and small. Of course, this paper was written in the referenced DTD and uses all of the features discussed; the SGML markup for this document is available from publish@ibm.net. The creative and judicial use of the features described in this paper provide a reasonable degree of reuse and data management across an organization of virtually any size, without requiring the use of an SGML-enabled data manager. However, a capable SGML-enabled data manager, combined with one or more of these features, can provide an organization with a formidable, extensible, and highly automated reuse environment."

This paper was delivered as part of the "User" track in the SGML/XML '97 Conference.

Note: The SGML/XML '97 conference proceedings volume is available from the Graphic Communications Association, 100 Daingerfield Road, Alexandria, VA 22314-2888; Tel: +1 (703) 519-8160; FAX: +1 (703) 548-2867. Complete information about the conference (e.g., program listing, tutorials, show guide, DTDs, conference reports) is provided in the dedicated conference section of the SGML/XML Web Page and via the GCA Web server. The electronic proceedings on CDROM were produced courtesy of Jouve Data Management (Jouve PubUser).

[CR: 19951130]

Ide, Nancy. "Encoding Standards for Large Text Resources: The Text Encoding Initiative." Pages 574-578 in Proceedings of the 15th International Conference on Computational Linguistics, COLING'94. International Conference on Computational Linguistics, Miyako Hotel, Kyoto, Japan, August 5-9, 1994. Sponsored by the International Committee on Computational Linguistics. Edited by [??]. [pubLocation]: [publisher], 1995 [?].

The volume apparently has not yet been published (November 1995).

[CR: 19971202]

Ide, Nancy; McGraw, Tim; Welty, Chris. "Representing TEI Documents in the CLASSIC Knowledge Representation System." Pages 85-91 in TEI 10: A Conference in Celebration of the Tenth Anniversary of the Text Encoding Initiative. Abstracts. TEI 10: Text Encoding Initiative, Tenth Anniversary User Conference, Brown University, Providence, Rhode Island. November 14-16, 1997. Sponsored by Martin Hensel Corporation, Kluwer Academic Publishers, and MIT Press. Hosted by Brown University Library, and Computing and Information Services. Providence, RI: Brown University, 1997. Authors' affiliation: Department of Computer Science, Vassar College, Poughkeepsie, New York 12604-0252; WWW addresses: Nancy Ide, Tim McGraw, Chris Welty.

Summary: "Beyond the need to query and retrieve based on tags which exist in a TEI document, a means to manipulate and query classes of objects is also desirable. The TEI DTD uses SGML entity definitions to create "classes" of elements and attributes, in particular, for groups of elements with common structural properties (e.g., all elements that can appear between paragraphs), groups of attributes which apply to certain classes of elements (e.g., attributes for pointer elements), etc. In addition to grouping together elements and attributes with common structural properties, the definition of such classes recognizes common semantic properties among elements and attributes. However, the SGML entity definition mechanism provides only for string substitution within the DTD itself, thereby enabling easy reference to these classes in later element definitions; the common semantic properties that are implicit in the classification scheme are lost for the purposes of retrieval and document manipulation. Obviously, a means to refer to and manipulate classes of elements and attributes in a query and retrieval system would provide substantial additional power for the user."

"We are experimenting with the representation of a DTD and associated documents (i.e., documents conformant to the DTD) in a knowledge representation (KR) system, in order to provide more sophisticated query and retrieval from TEI documents than current systems provide. We are using CLASSIC, a frame-based representation system developed at AT&T Bell Laboratories. Like many KR systems, CLASSIC enables the definition of structured concepts/frames, their organization into taxonomies, the creation and manipulation of individual instances of such concepts, and inference such as inheritance, relation transitivity, inverses, etc. In addition, CLASSIC provides for the key inferences of subsumption and classification. By representing a document as an individual instance of a hierarchy of concepts derived from the DTD, and by allowing the creation of additional user-defined concepts and relations, sophisticated query and retrieval operations can be performed. This paper briefly describes the CLASSIC system, the representation of a DTD and a document conforming to that DTD in CLASSIC, and provides an overview of the kind of query and retrieval that can be performed."
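The summary's point that parameter entities give only string substitution can be illustrated with a minimal Python sketch. It is not taken from the paper: the function name, the regular expressions, and the sample declarations (a "phrase" class and an "inter" class) are hypothetical, and the sketch handles only simple double-quoted parameter entity declarations, not the full SGML syntax.

```python
import re

# Matches simple declarations such as: <!ENTITY % name "replacement text">
PE_DECL = re.compile(r'<!ENTITY\s+%\s+([\w.-]+)\s+"([^"]*)"\s*>')
# Matches parameter entity references such as: %name;
PE_REF = re.compile(r'%([\w.-]+);')

# Hypothetical DTD fragment; the class and element names are illustrative only.
SAMPLE_DTD = '''
<!ENTITY % phrase "emph | foreign | term">
<!ENTITY % inter "list | note | quote">
<!ELEMENT p - O (#PCDATA | %phrase; | %inter;)*>
<!ELEMENT item - O (#PCDATA | %phrase;)*>
'''


def expand_parameter_entities(dtd_text: str) -> str:
    """Replace %name; references with their declared text, in document order."""
    entities: dict[str, str] = {}
    expanded = []

    def substitute(text: str) -> str:
        # Plain string substitution: the class name does not survive.
        return PE_REF.sub(lambda ref: entities[ref.group(1)], text)

    for declaration in re.findall(r'<![^>]*>', dtd_text):
        match = PE_DECL.fullmatch(declaration)
        if match:
            name, replacement = match.groups()
            # Keep the first declaration of a name, as SGML does.
            entities.setdefault(name, substitute(replacement))
        else:
            expanded.append(substitute(declaration))
    return "\n".join(expanded)


if __name__ == "__main__":
    print(expand_parameter_entities(SAMPLE_DTD))
```

After expansion the element declarations list their members directly, and nothing records that emph, foreign, and term once formed a class. That is the information the authors propose to keep by representing the classes as concepts in a KR system.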

The extended abstract for the document is available online: http://www.stg.brown.edu/webs/tei10/tei10.papers/ide.html; [local archive copy]. Also on the Vassar server: http://www.cs.vassar.edu/~ide/papers/tei10.html. See the main database entry for additional information about the conference, or the Brown University web site.

[CR: 19950823]

Ide, Nancy; Sperberg-McQueen, C. M. "The Text Encoding Initiative: Its History, Goals, and Future Development." The Text Encoding Initiative: Background and Context, Guest Editors Nancy Ide and Jean Véronis = Computers and the Humanities 29/1 (1995) 5-15.

Abstract: "This paper traces the history of the Text Encoding Initiative, through the Vassar Conference and Poughkeepsie Principles to the publication, in May 1994, of the Guidelines for Electronic Text Encoding and Interchange. The authors explain the types of questions that were raised, the attempts made to resolve them, the TEI project's aims, the general organization of the TEI committees, and they discuss the project's future."

[CR: 19950922]

Ide, Nancy; Véronis, Jean. "Encoding Dictionaries." The Text Encoding Initiative: Background and Context, Guest Editors Nancy Ide and Jean Véronis = Computers and the Humanities 29/2 (1995) 167-179.

Abstract: "This article describes the major problems in devising a TEI encoding format for dictionaries, which, because of their high degree of structuring and compression of information, are among the most complex text types treated in the TEI. The major problems for this task were: (1) the tension between generality of the description, in order to be widely applicable across dictionaries, and descriptive power, that is, the ability to describe with precision the particular structure of any given dictionary; and (2) the need to accommodate different views and uses of the encoded dictionary, for example, as printed object and as a database of information."

[CR: 19951130]

Ide, Nancy; Véronis, Jean. "MULTEXT (Multilingual Text Tools and Corpora)." Pages 90-96 (with 11 references) in Proceedings of the International Workshop on Sharable Natural Language Resources (SNLR). Workshop on Sharable Natural Language Resources (SNLR). Ikoma, Nara, Japan. August 10-12, 1994. Sponsored by Nara Institute of Science and Technology, with support from the Foundation of Nara Institute of Science and Technology. Edited by Yuji Matsumoto and Takenobu Tokunaga. Nara, Japan: Nara Institute of Science and Technology, 1995 [?]. Authors' affiliation: LABORATOIRE PAROLE ET LANGAGE, CNRS & Université de Provence, 29, Avenue Robert Schuman, 13621 Aix-en-Provence Cedex 1 (France). E-mail: ide@fraix11.univ-aix.fr, veronis@fraix11.univ-aix.fr.

Abstract: "MULTEXT (Multilingual Text Tools and Corpora) is the largest project funded in the Commission of European Communities Linguistic Research and Engineering Program. The project will contribute to the development of generally usable software tools to manipulate and analyse text corpora and to create multi-lingual text corpora with structural and linguistic markup. It will attempt to establish conventions for the encoding of such corpora, building on and contributing to the preliminary recommendations of the relevant international and European standardization initiatives. MULTEXT will also work towards establishing a set of guidelines for text software development, which will be widely published in order to enable future development by others. All tools and data developed within the project will be made freely and publicly available."

The goals of MULTEXT are the creation of "reusable software for multi-lingual linguistic corpus annotation and exploitation; software standard for tool design; TEI-based markup standard for corpus encoding; multi-lingual corpus (English, Dutch, German, French, Italian, Spanish), including a small speech corpus, partially parallel, portions marked up and validated for part of speech and alignment." As for markup: "The TEI Guidelines provide the basis for markup at levels 0 (the TEI header), 1 and 2 as well as many elements of level 3. In collaboration with Eagles, MULTEXT is extending the TEI scheme in order to specify a TEI-conformant Corpus Encoding Style (CES) that is optimally suited to NLP research and can therefore serve as a widely accepted TEI-based style for European corpus work. Application of the CES to CEE languages, which may require minor modifications to accommodate CEE language-specific information and structures, will provide a test of both the TEI Guidelines and MULTEXT and Eagles' extensions to it."

The paper is available on the Internet at ftp://ftp.aist-nara.ac.jp/pub/nlp/conferences/SNLR/papers/14.ps.gz in conjunction with the online proceedings; see also the mirror copy in Postscript format and in PDF format. MULTEXT work sponsored by the Commission of European Communities Linguistic Research and Engineering Project 62-050. For more on the project, see the main entry. For other SNLR conference papers, see the online TOC: http://cactus.aist-nara.ac.jp/lab/events/SNLR/snlr.html.

[CR: 19960922]

Ide, Nancy; Véronis, Jean, (editors, with a volume preface by Charles F. Goldfarb and volume bibliography by Robin C. Cover). The Text Encoding Initiative: Background and Context. Dordrecht, Netherlands: Kluwer Academic Publishers, [August] 1995. Extent: vi + 242 pages. ISBN: 0-7923-3689-5 (hardbound); 0-7923-3704-2 (paperback).

"The Text Encoding Initiative (TEI) Guidelines for Electronic Text Encoding and Interchange are the result of over six years' work by dozens of scholars from all over the world. As such, they represent a pioneer effort in an area where only occasional and isolated attempts were made before. They will certainly serve as the primary basis for encoding texts in electronic form for the foreseeable future. The work of participants in the TEI not only involved consideration of problems of text encoding that are likely to be with us for decades to come, but also required the development of a methodology - from scratch - for approaching these problems. These pioneering efforts, while likely to be refined and extended, must not be lost: they provide the intellectual basis upon which text encoding practices will build in the future. This collection therefore documents the course of these efforts. `The TEI Guidelines are extraordinary. Even if they were never adopted they would stand as a significant contribution to scholarship for their detailed analysis of the information sets of a huge range of complex text types.' (From the Preface by Charles F. Goldfarb, inventor of the Standard Generalized Markup Language)."

The contents of this volume are also published as a special triple-issue of Computers and the Humanities (CHUM volume 29, numbers 1-3, 1995). The volume bibliography on SGML/TEI (pages 233-242), however, is included only in this book version. Articles in the first CHUM issue: Charles F. Goldfarb, Preface; Nancy Ide and Michael Sperberg-McQueen, The Text Encoding Initiative: Its History, Goals, and Future Development; C. M. Sperberg-McQueen and Lou Burnard, The Design of the TEI Encoding Scheme; Lou Burnard, What is SGML and How Does It Help; Harry Gaylord, Character Representation; Richard Giordano, The TEI Header and the Documentation of Electronic Texts; Dominic Dunlop, Practical Considerations in the Use of TEI Headers in Large Corpora. Articles in the second issue: David Chisholm and David Robey, Encoding Verse Texts; John Lavagnino and Elli Mylonas, The Show Must Go On: Problems of Tagging Performance Texts; Robin Cover and Peter Robinson, Encoding Textual Criticism; Daniel Greenstein and Lou Burnard, Speaking With One Voice: Encoding Standards and the Prospects for an Integrated Approach to Computing in History; Stig Johansson, The Encoding of Spoken Texts; Alan Melby, E-TIF: An Electronic Terminology Interchange Format; Nancy Ide and Jean Véronis, Encoding Dictionaries. Articles in the third issue: Steven J. DeRose and David Durand, The TEI Hypertext Guidelines; David Barnard, Lou Burnard, Jean-Pierre Gaspart, Lynne A. Price, C.M. Sperberg-McQueen, and Giovanni Battista Varile, Hierarchical Encoding of Text: Technical Problems and SGML Solutions; D. Terence Langendoen and Gary Simons, Rationale for the TEI Recommendations for Feature-Structure Markup.

See a volume description for further details, and the order blank from Kluwer.

For other journal special issues and monographs dedicated to the Text Encoding Initiative, see the relevant subentry for TEI.

[CR: 19950823]

Ide, Nancy; Véronis, Jean; Durand, David. "A Data Architecture for Multi-lingual Linguistic Corpora." Pages 60-62 [extended abstract] in ACH/ALLC '95: The 1995 Joint International Conference. Conference Abstracts, Posters and Demonstrations. ACH/ALLC '95 Joint International Conference, July 11-15, 1995. Santa Barbara, California: University of California/ACH/ALLC, 1995.

[CR: 19950825]

Ide, Nancy M.; Le Maitre, Jacques; Véronis, Jean. "Outline of a Model for Lexical Databases." Information Processing and Management 29/2 (1993) 159-186.

See a similar article below.

Ide, Nancy M.; Véronis, Jean; Le Maitre, Jacques. "Outline of a Database Model for Electronic Dictionaries." Pages 375-393 (with 25 references) in Intelligent Text and Image Handling: Proceedings of a Conference on Intelligent Text and Image Handling, "RIAO91" Barcelona, Spain, 2-5 April, 1991 [Conference organized by the Centre de Hautes Etudes Internationales d'Informatique Documentaire (CID), Center for the Advanced Study of Information Systems, Inc. (CASIS). Sponsored by the Commission of the European Communities, Minister of Education and Sciences, Spain; Minister of "Industrie et Aménagement du Territoire", France; et al.] Edited by André Lichnerowicz [Collège de France, Académie des Sciences de Paris]. Amsterdam/London/New York/Tokyo: Elsevier, 1991. Extent: xiii + 999 pages. ISBN: 0-444-89361-X. Authors' affiliation: [Ide] Department of Computer Science, Vassar College; [Véronis, Le Maitre] Groupe Représentation et Traitement des Connaissances, Centre National de la Recherche Scientifique.

The growing availability of dictionaries in electronic form calls for a model sophisticated enough to represent the richness of entries and enable complex information retrieval. Electronic dictionaries are a special kind of object, intermediary between a text and a database. Textual models are not powerful enough to handle complex information retrieval, and conventional database models are not flexible enough to handle the richness of their information. In this paper, we outline a scheme for representing electronic dictionaries which departs from previously proposed models. In particular, it allows for a full representation of sense nesting and defines an inheritance mechanism which enables the elimination of redundant information. The model provides flexibility which seems able to handle the varying structures of different monolingual dictionaries.

Ide, Nancy; Véronis, Jean. "Toward a Comprehensive Set of Text Encoding Principles [Let's Not Rebuild the Encoding Tower of Babel]." Pages 109-110 in Colloque International "Consensus ex Machina?". Abstracts International Joint Conference of the ALLC (Association for Linguistic and Literary Computing) and ACH (Association for Computers and the Humanities), Sorbonne, Paris, 19-23 avril 1994. Paris: Laboratoire "Lexicométrie et textes politiques" (INaLF, CNRS), and Ecole Normale Supérieure de Fontenay - Saint Cloud, 1994. 244 pages.

[CR: 19961226]

Imago, Satosi; Nishimura, Mina. "An Architecture to Derive SGML DTDs." Pages 601-608 in SGML '96 Conference Proceedings. Celebrating a Decade of SGML. SGML '96 Conference, Boston, MA, November 18-21, 1996. Sponsored by The Graphic Communications Association (GCA). [Edited by] Conference Co-Chairs: B. Tommie Usdin and Deborah A. Lapeyre. Alexandria, VA: GCA, 1996. Extent: 711 pages. Authors' affiliation: [Imago]: RICOH Co., Ltd., Information and communication R&D center, 3-2-3 Sin'yokohama, Kohoku-ku, Yokohama, 222 Japan; Email: imago@ic.rdc.ricoh.co.jp; [Nishimura]: RICOH Co., Ltd.

Abstract: "SGML, which is used for document interchange among various environments, is a meta language to describe documents. Before marking up a document, we need to prepare a DTD that defines a document structure.

In general, a DTD applicable to diverse document classes is incompatible with a DTD focusing on the semantic features of documents. If the number of DTDs grows, the costs of developing application programs for the DTDs would also skyrocket.

To apply a DTD focusing on the semantic features to diverse document classes, we developed a system which, from a base generic DTD, derives a different DTD for each document class. Our system also has a function that translates derived DTD instances to base DTD instances. This function frees us from the burden of developing application programs separately for each of the derived DTDs."

Note: The above presentation was part of the "SGML Case Studies" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.

"InterConsult Updates SGML Market Study: Sales Projections, Vendor Profiles." Seybold Report on Publishing Systems 24/1 (September 9, 1994) 26.

The article summarizes important findings in a recent market research report published by InterConsult, Inc. The 1994 market study on SGML (Standard Generalized Markup Language) "asserts that SGML expenditures now represent 21% of the overall publishing software market, and predicts that the percentage will rise to 30% by 1998, as worldwide revenues for SGML software and services continues to grow more than 30% annually." According to the report, revenues from SGML services for 1993 were "$77 million higher than what was predicted in 1992..." "The study predicts market revenues for the next four years in nine market segments: integration services, conversion services, electronic delivery, parsing, composing, graphics, database and document management, autotagging and conversion software and authoring. Some of these will still be growing well in four years; other segments will peak as SGML becomes more of a mainstream technology..." Contact: InterConsult, 366 Massachusetts Ave., Arlington, MA 02174; Tel: (617) 646-9600, FAX (617) 646-9615.

Interleaf, Inc. The SGML Guide. Corporate document M73071-001. Waltham, MA: Interleaf, 1994. vi + 83 pages.

No personal author is given. The volume was announced as available for free: call 1-800-955-5323; or contact via surface mail: Interleaf, Inc., Prospect Place, 9 Hillside Avenue, Waltham, MA 02154. A copy of the work is also available via HTTP (HTML format): connect with a WWW client to Interleaf, or in case of link failure, use this mirror copy.

[CR: 19970121]

International Organization for Standardization (ISO). ISO 639:1988 (E/F). Code for the Representation of Names of Languages. First edition, 1988-04-01. Reference number: ISO 639:1988 (E/F). Geneva: International Organization for Standardization, 1988. iii + 17 pages.

Revision and addition of part 2 (alphabetic 3-character codes) is underway: see ISO 639-2 below. For ISO 639:1988, see provisionally (a) the primary data from the 1988 standard as given here from Keld, or (b) a different compilation of the ISO 639:1988 language codes, or (c) the comparable MARC 3-character language codes, from about 1991, and (d) now, the update to USMARC Code List for Languages from November 15, 1996; [mirror copy].

ISO 639:1988 is a technical revision of ISO 639:1967, prepared by Technical Committee ISO/TC 37. The two-character language codes of ISO 639 are relevant to SGML encoding in two respects. First, the SGML standard (ISO 8879) itself specifies that declaration of 'public text language' should be given using the language code(s) from ISO 639; see ISO 8879-1986(E) page 36, section 10.2.2.3. Second, the WSD (Writing System Declaration) implemented in the Text Encoding Initiative uses the two-character language code of ISO 639 (as amended) as a 'language.code' attribute of the 'nat.language' declaration, specifying the language in which the WSD is written.
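The position of the 'public text language' inside an SGML formal public identifier can be sketched in Python. This is a simplified illustration, not a conforming parser: the names `PublicId`, `parse_fpi`, and `has_iso639_shape` are hypothetical, the optional unavailable-text indicator and display version are not handled, and the language check tests only the shape of the code (two upper-case letters), not membership in the ISO 639 list.

```python
from typing import NamedTuple


class PublicId(NamedTuple):
    owner: str        # e.g. "ISO 8879:1986" or "-//W3C"
    text_class: str   # e.g. "ENTITIES", "DTD"
    description: str  # free-text public text description
    language: str     # public text language, e.g. "EN"


def parse_fpi(fpi: str) -> PublicId:
    """Split a formal public identifier into its main fields.

    Simplified: assumes the form owner//class description//language.
    """
    prefix, rest = "", fpi
    # Registered ("+//") and unregistered ("-//") owner identifiers carry a
    # prefix; ISO owner identifiers do not.
    if fpi.startswith(("+//", "-//")):
        prefix, rest = fpi[:3], fpi[3:]
    parts = rest.split("//")
    if len(parts) < 3:
        raise ValueError(f"not a formal public identifier: {fpi!r}")
    owner, text_identifier, language = parts[0], parts[1], parts[2]
    text_class, _, description = text_identifier.partition(" ")
    return PublicId(prefix + owner, text_class, description, language)


def has_iso639_shape(code: str) -> bool:
    """True if the code is two upper-case ASCII letters (shape check only)."""
    return len(code) == 2 and code.isascii() and code.isalpha() and code.isupper()


if __name__ == "__main__":
    pid = parse_fpi("ISO 8879:1986//ENTITIES Added Latin 1//EN")
    print(pid.language, has_iso639_shape(pid.language))
```

In the sample identifier the final field, EN, is the ISO 639 code for English written in upper case.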

ISO 639 contains much other information about the use of language symbols, registration of new symbols, etc. The language codes of ISO 639 are said to be "devised primarily for use in terminology, lexicography and linguistics, but they may be used for any application requiring the expression of languages in coded form." The registration authority for ISO 639 is given as Infoterm, Österreichisches Normungsinstitut (ON), Postfach 130, A-1021 Vienna, AUSTRIA.

The two-character language codes of ISO 639 are recognized as being inadequate for use as SGML language attributes when tagging text, viz, for use as global 'lang' attributes attached to any element to identify the language of the text element or a language shift. In principle, there should be nothing wrong with tagging language using SGML elements rather than attributes, if the encoder has principled reasons for not using attributes (e.g., indexing engines which read simple tags but not SGML attributes). But the two-character codes of ISO 639 are neither sufficiently mnemonic nor complete for the world's languages: whereas ISO 639 supplies codes for only about 136 languages, the Ethnologue published by the Summer Institute of Linguistics identifies over 6100 languages (see Ethnologue: Languages of the World, ed. Barbara Grimes. 11th edition. Dallas, TX: Summer Institute of Linguistics, 1988). A revision of ISO 639 completed late 1990 supplies 3-character language codes (following MARC 3-character language codes in part), based upon the code sequence of the American National Standard (ANSI Z39.53). This draft will be circulated for worldwide review in 1991/92. See below under ISO CD 639/2:1991 (CD part 2). [entry needs update]

International Organization for Standardization (ISO). ISO CD 639/2:1991. Code for the Representation of Names of Languages: alpha-3 Code. Geneva: International Organization for Standardization, 1991. iii + 52 leaves.

Abstract: "This part of ISO 639 provides 3-character alphabetic symbols for the (re)presentation of names of languages. The symbols were devised primarily for libraries, information services, and publishers to use to indicate language in the exchange of information, especially in computerized systems. These symbols have been widely used in the library community, however, they may be used for any application requiring the expression of language in coded form, including use by terminologists and lexicographers. The list is considered to be an open list. This part of ISO 639 also includes guidance on the creation of language symbols and on their use in some of these applications. Languages designed exclusively for machine use, such as computer programming languages, are not included in this code list." There are about 404 language names in the list. See, for comparison: the bibliography entry for the ANSI/NISO standard, or NISO 3-character language codes (Z39.53-1994) [unofficial], [mirror copy]. ISO 639-2 codes are supposed to be based upon (?) the ANSI/NISO set.

ISO CD 639/2:12/16/91 culminates more than three years of intense collaboration between the representatives of ISO TC 37/SC2 (Layout of Vocabularies) and ISO TC46/SC4 (Computer Applications in Information and Documentation). It preserves the principal features of ISO 639-1 (the existing alpha-2 list) while articulating a code that meets the needs of librarians, managers of bibliographic services, and information specialists. The document is out for DIS ballot until April 15, 1992; it is anticipated that executive action will be taken on the DIS following the meeting of ISO TC 46 in London, May 18-22, 1992. Since the list of 3-character language codes is considered to be an open list, the ISO Council has designated a registration authority for 639 part 2. Proposals for allocating new language symbols should be directed to this authority. It is the Library of Congress, c/o Collection Services, Washington, DC 20540. See the list of language codes from a 1992 draft version.

International Organization for Standardization (ISO). ISO 3166:1993 (E/F). Codes for the Representation of Names of Countries. Fourth edition. Reference number: ISO 3166:1993 (E/F). Geneva: International Organization for Standardization, 1993. ii + 30 pages.

Under: Technical committee / subcommittee: TC 46. Online lists: FTP from the RIPE server: ftp://info.ripe.net/iso3166-countrycodes, [mirror copy], or: Codes for Representation of Names of Countries (ISO 3166-1993 (E)), [mirror copy].

See also: ISO/DIS 3166-1 Codes for the representation of names of countries and their subdivisions -- Part 1: Country codes (Revision of ISO 3166:1993); and: ISO/DIS 3166-2 Codes for the representation of names of countries and their subdivisions -- Part 2: Country subdivision code.

[CR: 19980304]

International Organization for Standardization/International Electrotechnical Commission (ISO/IEC). ISO/IEC 8632-1[-4] 1992(E). Information technology--computer graphics--metafile for the storage and transfer of picture description information. [Technologies de l'information--infographie--métafichier de stockage et de transfert des informations de description d'images.] Genève, Switzerland: ISO/IEC, 1992.

ISO/IEC 8632-1[-4] 1992(E). Second edition. Part 1. Functional specification; Part 2. Character encoding; Part 3. Binary encoding; Part 4. Clear text encoding. This standard supersedes the earlier standard: CGM:1986 (ANSI X3.122-1986). For other information on CGM, see the main database entry for Computer Graphics Metafile.

[CR: 19951208]

International Organization for Standardization (ISO). ISO 8879:1986. Information Processing - Text and Office Systems - Standard Generalized Markup Language (SGML) [= Traitement de l'information, Systèmes bureautiques, Langage standard généralisé de balisage (SGML)]. First edition 1986-10-15; Reference No. ISO 8879:1986 (E). Geneva: International Organization for Standardization, 15 October 1986.

With Amendment A1 (1988), ISO 8879 constitutes the core specification for SGML. A subset of SGML became a US FIPS (Federal Information Processing Standard) in 1988. The British Standards Institution adopted SGML as a national standard (BS 6868) in 1987, and in 1989 SGML was adopted by the CEN/CENELEC Standards Committees as a European standard, #28879. Australia has dual numbered versions of ISO 8879 SGML and ISO 9069 SDIF (AS 3514 - SGML 1987; AS 3649 - 1990 SDIF). The full text of this ISO standard with Amendment A1 is incorporated into the text of Charles Goldfarb's commentary (SGML Handbook), and is available in electronic form on a CDROM disc published by Exoterica Corporation. [Entry needs update to mention results of 5-year review, and the revision process.] ISO 8879 SGML sales figures, 1986-1995.

International Organization for Standardization (ISO). ISO 8879:1986 / A1:1988 (E). Information Processing - Text and Office Systems - Standard Generalized Markup Language (SGML), Amendment 1 [Traitement de l'information, Systèmes bureautiques, Langage standard généralisé de balisage (SGML), Amendment 1]. Geneva: International Organization for Standardization, July 01 1988. 15 pages.

This amendment is incorporated into the text of Charles Goldfarb's SGML commentary (SGML Handbook).

International Organization for Standardization (ISO). ISO 9069:1988. Information Processing - SGML Support Facilities - SGML Document Interchange Format (SDIF) [= Traitement de l'information - Bureautique - Format d'échange de document SGML (SDIF)]. First edition: 1988-09-15; Reference number ISO 9069:1988 (E). Geneva: International Organization for Standardization, 15 September 1988.

Also available as The British Standard Guide to SGML Document Interchange Format (SDIF), BS 7138:1989 (ISO 9069:1988); see "Snippets," SGML Users' Group Newsletter 14 (October 1989) 12. [needs update]

International Organization for Standardization (ISO). ISO/IEC 9070:1991. Information Processing - SGML Support Facilities - Registration Procedures for Public Text Owner Identifiers [= Technologies de l'information, Facilités de support SGML, Procédures d'enregistrement pour identificateurs de propriétaire de texte public]. Second edition, 15 April 1991; Reference number ISO/IEC 9070:1991 (E). Geneva: International Organization for Standardization/International Electrotechnical Commission, 15 April 1991. iv + 12 pages.

The "public text" envisioned in this standard as applied to SGML might be DTDs (Document Type Definitions), or declaration subsets of DTDs, public entity sets, etc. Names include an owner name and an object identifier. Equivalent encodings for the names in ASN.1 and SGML may be supplied for interchange purposes. Note: "The intention of the amendment that has resulted in a 2nd edition is to extend 9070 beyond the simple boundaries of SGML only. It is now used by 9541 (and 10036) for the definition of 'structured names'. A New Work Item Proposal is being submitted to change the title and scope of 9070 to show its extended usefulness." (note from Paul Ellison, December 1991) [needs update]

[May 1996]: See also the main entry for ISO 9070 with information on the relevant WWW site.

[CR: 19970604]

International Organization for Standardization (ISO). ISO/IEC TR 9573:1988 (E). Information Processing - SGML Support Facilities - Techniques for Using Standard Generalized Markup Language (SGML). Edited by Anders Berglund. Geneva: International Organization for Standardization/International Electrotechnical Commission, December 09 1988. vi + 124 pages.

A major revision of this TR, underway as of May 1990, will result in a new TR with 16 parts: (1) SGML Tutorial (2) Basic Techniques (3) Advanced Techniques (4) Using Short References for Identifying Markup (5) Using non-Latin Alphabets (6) Referencing and Synchronisation (7) Mathematics and Chemistry (8) Tables (9) Using SGML for Computer-to-Computer Interchange (10) Designing Applications for Database Interfacing (11) Application at ISO CS for International Standards and Technical Reports (12) Public Entity Sets for General and Publishing Symbols (13) Public Entity Sets for Mathematics and Science (14) Public Entity Sets for Latin Based Alphabets (15) Public Entity Sets for non-Latin Based Alphabets (16) Public Entity Sets for Ideograms (adapted from Ludo Van Vooren, "SGML Standards Committee Update: Activities of ISO SC 18 WG8," <TAG> 14 (May 1990) 11-12). See also Joan M. Smith in "More Liaison Statements to ISO," SGML Users' Group Newsletter 13 (August 1989) 6-7. A description of this ISO document is found in "Publication of Techniques for Using SGML," SGML Users' Group Newsletter 11 (January 1989) 3-4. Further update of parts 1-5 of TR 9573 will be delayed until the 5-year revision of SGML (ISO 8879) is completed. [needs update]

See further information on this standard in the Related Standards page. A future version is to include an "ISO chemical character set, ISOchem"; see a note by Martin Bryan (September 1995).

See also: SGML Public Entity Sets, Proposals. [relative to: http://www.ornl.gov/sgml/wg8/9573ent/ENTITIES.HTM]. Sample collections of entities and glyphs (proposed) for potential inclusion into ISO 9573. For: Ugaritic, Old Persian, Glagolitic, Croatian, Buginese, Cherokee, and Gothic Uncials. Developed by Anders Berglund and others.

International Organization for Standardization (ISO). ISO/IEC 10036:1993 Information Technology - Font Information Interchange - Procedure for Registration of Glyph and Glyph Collection Identifiers. Geneva: International Organization for Standardization/International Electrotechnical Commission, 1993. [needs update]

International Organization for Standardization (ISO). ISO/IEC TR 10037:1991. Information Processing - SGML and Text Entry Systems - Guidelines for SGML Syntax-Directed Editing Systems. Geneva: International Organization for Standardization/International Electrotechnical Commission, 15 March 1991.

The document supplies technical guidance for the development of context-sensitive SGML editors. See "Guidelines for Syntax-Directed Editing Systems," SGML Users' Group Newsletter 14 (October 1989) 3. [needs update]

International Organization for Standardization (ISO). ISO/IEC DIS 10179.2:1994. Information Technology - Text and Office Systems - Document Style Semantics and Specification Language (DSSSL). Edited by Sharon Adler [and James Clark (?)] "ISO/IEC 10179 was prepared by Joint Technical Committee ISO/IEC JTC 1, Information Technology." Geneva: International Organization for Standardization/International Electrotechnical Commission, 1994. viii + 142 pages.

Voting on the current DIS began 1994-08-10 [and was to end mid-December 1994 or early 1995]. A posting to CTS in early 1995 by James Clark confirmed that negative votes had not been received, and that the vote was therefore expected to pass.

SUMMARY: "This International Standard defines the Document Style Semantics and Specification Language (DSSSL) used to specify formatting and other transformations of SGML-encoded documents. The initial focus of DSSSL is on formatting for both paper and electronic media, and on the conversion of SGML documents encoded according to different DTDs.

This International Standard has been structured to permit future sections to be added to this International Standard to cover the other areas of document processing and data management.

The main objective of the DSSSL Standard is to provide a specification language for expressing formatting and other document processing specifications in a formal and rigorous manner so that these specifications may be processed by a broad range of formatters, either natively or using a translation mechanism.

The DSSSL specification language will include tree transformation specifications and formatting specifications and other semantics to allow users to specify the types of formatting to be applied to various objects during composition and layout and pagination.

For formatting, a DSSSL-driven implementation can create a style sheet language that can be mapped into the DSSSL typographic characteristics and other composition and layout semantics.

In addition to the basic formatting semantics, DSSSL includes a language for writing a general transformation specification that provides the capability to transform documents from one SGML application into another.

DSSSL is designed to allow for specifications that apply to a class of documents. These specifications are applicable to all possible document instances in an SGML application as well as to a particular document instance.

The DSSSL specification language is declarative; it is not intended to be a complete programming language, although it contains constructs normally associated with such languages and provides a well-defined interface to a user-selected programming language, if such a capability is required. DSSSL specifications can be unambiguously parsed and interpreted among heterogeneous systems. In addition, DSSSL specifications can be used by existing formatting systems through the use of "front-end" DSSSL processors and translators. DSSSL has no bias toward batch or WYSIWYG formatting systems and does not prescribe any predefined formatting algorithms.

The standardization of formatting semantics is provided in DSSSL through a set of basic structures known as flow objects and the associated set of formatting characteristics that are applied to these objects. DSSSL provides mechanisms for defining and extending the semantic constructs so that a DSSSL application designer can construct a DSSSL application in a manner that best reflects his application environment." [transcription from the Introduction (DIS 1994-08-10)]

International Organization for Standardization (ISO). ISO/IEC DIS 10180:1995. Information Processing - Text Composition - Standard Page Description Language (SPDL). Geneva: International Organization for Standardization/International Electrotechnical Commission, 1995.

For a summary, see: (1) SGML Users' Group Newsletter 20 (September 1991) 17-18; Peter J. Robinson, and Stephen M. Strasen, "Standard Page Description Language," Computing Communications 12/2 (April 1989) 85-92; (2) "Text Composition Standards," SGML Users' Group Newsletter 15 (January 1990) 7-8. Note: "ISO/IEC 10180 has now passed DIS ballot with no negative votes. The joint editors are expected to have the final text ready for publication during 1992" (so Paul Ellison, December 1991). [needs update]

See now [June 1995] further information in a separate SPDL entry within this database, including pointers to availability of the 1995 draft standard via the Internet (e.g., from the WG8 FTP server and from the SGML Repository).

International Organization for Standardization (ISO). ISO/IEC DIS 10743:1995. Information Technology - Standard Music Description Language (SMDL). Geneva: International Organization for Standardization/International Electrotechnical Commission, July 1995. [Was: ISO/IEC CD 10743:1991 (April 1, 1991).]

Description from the 1991 CD version: SMDL "defines a language for the representation of music information, either alone, or in conjunction with text, graphics, or other information needed for publishing or business purposes." Multimedia time sequence information is also supported. SMDL is a HyTime application conforming to ISO/IEC DIS 10744 Hypermedia/Time-based Structuring Language (HyTime), and an SGML application conforming to Standard Generalized Markup Language (ISO 8879:1986). An earlier version was published by ANSI (American National Standards Institute), as ANSI X3V1.8M Journal of Development. ANSI Project X3.542-D. Standard Music Description Language (SMDL). X3V1.8M/SD-8. 60 pages. Sixth Draft. April 15, 1990. See a description of SMDL in an overview article: Steven R. Newcomb, "Standards. Standard Music Description Language Complies with Hypermedia Standard," IEEE Computer 24/7 (July 1991) 76-79. (See the full bibliographic record.)

See now [July 1995] further information in a separate SMDL entry within this database, including pointers to availability of the 1995 draft standard (DIS) via the Internet. Or see an overview taken from the DIS.

International Organization for Standardization (ISO). ISO/IEC 10744:1992. Information Technology - Hypermedia/Time-based Structuring Language (HyTime). Edited by Charles F. Goldfarb (with assistance from Steven R. Newcomb). Geneva: International Organization for Standardization/International Electrotechnical Commission, 1992.

"HyTime is a standard neutral markup language for representing hypertext, multimedia, hypermedia, and time- and space-based documents in terms of their logical structure. Its purpose is to make hyperdocuments interoperable and maintainable over the long term. HyTime can be used to represent documents containing any combination of digital notations. HyTime is parsable as Standard Generalized Markup Language (ISO 8879:1986). HyTime provides standardized means of expressing (1) intra- and extra-document locations, and arbitrary links between them, (2) the scheduling of multimedia objects in 'finite coordinate spaces,' and (3) rendering instructions for arbitrarily projecting such objects onto other finite coordinate spaces, and other constructs." [taken from an abstract in CACM 34/11 (November 1991) 67-83.]

For further information on HyTime, see (1) the WWW SGML Page HyTime main entry, (2) the book by Steve DeRose and David Durand, (3) the book by Eliot Kimber, and (4) the CACM article by Steve Newcomb.

See also Technical Corrigendum 1 to ISO/IEC 10744 [by Charles F. Goldfarb], Draft for ballot: March 27, 1995. The relevant documents are available from the SGML Repository or via this server as three text files: httc1.txt (24K), hi1anarc.txt (46K), and hi1anfsi.txt (22K).

[CR: 19961029]

International Organization for Standardization (ISO). ISO 12083:1993(E) Information and documentation - Electronic manuscript preparation and markup. First edition, 1994-01-15. Geneva: International Organization for Standardization, 1994. 96 pages.

The standard was prepared by Technical Committee ISO/TC 46, Information and documentation, Subcommittee SC4, Computer applications in information and documentation. The title "ISO..." appeared on the print copy distributed in mid-1994 by NISO/EPSIG, despite errors: it was apparently a premature printing. This "ISO" standard supersedes the 1988 (EPSIG/AAP) standard authorized by ANSI/NISO; see the bibliographic reference. The standard included three public DTDs (books, articles, serials) in "final" form and a provisional DTD for mathematics. The ISO 12083 DTDs [though not now in final form (November 1994)] are available on the Exeter SGML Project server and elsewhere; try: Exeter ftp://info.ex.ac.uk/ISO-12083/ or else ftp://actd.saic.com/pub/SGML/ISO-12083/. Although several requests have been made on CTS for release of electronic copies of the DTDs into public space, it remains unclear whether ISO will authorize this form of distribution for the DTDs.

See the EPSIG description "About the Standard"; [mirror copy]

[CR: 19961222]

International Organization for Standardization (ISO). ISO/IEC CD 13240 Information Technology - Standard Hypermedia/Multimedia Scripting Language (SMSL). Geneva: International Organization for Standardization, 199?. 15 pages.

SMSL "Extends HyTime by providing SGML meta-DTD architectural forms for describing the object classes, virtual functions, messages, aggregates and class/data membership used in a multimedia presentation's script. Also contains definitions for a starter-set of functions used by scripting languages." [from: Index of OII Standards Report.]

The SMSL Committee Draft ISO/IEC 13240 is available in Postscript format; [mirror copy, December 22, 1996]. See the main SMSL entry for other details.

International Organization for Standardization (ISO). ISO/IEC DIS 13673:1993 Information Technology - Text and Office Systems - Conformance Testing for Standard Generalized Markup Language (SGML) Systems. First edition. Geneva: International Organization for Standardization/International Electrotechnical Commission, 1993.

Voting was 1993-08-12 thru 1994-02-12. [Entry needs update. Make links to Conformance Testing (Initiative) on main page.]

[CR: 19971227]

Ives, Donna. "BNA's Publishing Systems Project." Pages 109-111 in SGML/XML '97 Conference Proceedings. SGML/XML '97. "SGML is Alive, Growing, Evolving!" The Washington Sheraton Hotel, Washington, D.C., USA. December 7 - 12, 1997. Sponsored by the Graphic Communications Association (GCA) and Co-sponsored by SGML Open. Conference Chairs: Tommie Usdin (Chair, Mulberry Technologies), Debbie Lapeyre (Co-Chair, Mulberry Technologies); Michael Sperberg-McQueen (Co-Chair, University of Illinois). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 691 pages, CDROM; print volume contains author and title indexes, keyword and acronym lists. Author's affiliation: [Donna Ives]: Director, Data Administration, The Bureau of National Affairs, Inc. (BNA), Data Administration, 1231 25th St., N.W., Room III-420, Washington, D.C. 20037; Phone: +1 (202) 785-6850; Email: dives@bna.com.

Abstract: "This presentation explains the goals for BNA's new publishing system, why BNA chose SGML as an integral part of that system, and provides an overview of how BNA implemented the system. Topics covered include undertaking business process re-engineering, adopting SGML, converting legacy data, and lessons learned during the process. BNA (The Bureau of National Affairs, Inc.) and its subsidiaries provide labor, legal, economic, and regulatory information to business, professional, government, and academic users."

"It really all boils down to the data and the fact that the data is the company's most valuable resource (second only to the people who create it). We used the term 'data repository' to refer to BNA's entire collection of documents and other data, including primary source laws, regulations, opinions, internally created news stories, legal headnotes, and reference materials. BNA has acknowledged that we must manage documents as a corporate asset and we must have the ability to search, retrieve, and update documents throughout the publishing life cycle. SGML was chosen as a way to identify and protect the data. BNA started over 50 years ago using typesetting instructions for Linotype operators. In the 70s we used two digit 'locator codes' to identify typesetting instructions. In 1980 we switched to proprietary (Atex coding) to produce our notification and daily publications. In 1985, with the purchase of a Datalogics system to produce our looseleaf publications, we began using unparsed SGML-like coding. Oh, if we could only recover from the blunder of using unparsed data!"

This paper was delivered as part of the "User" track in the SGML/XML '97 Conference.

[CR: 19970524]

Jaakkola, Jani; Kilpeläinen, Pekka. Using sgrep for querying structured text files. Department of Computer Science, University of Helsinki Report C-1996-83. Helsinki, Finland: University of Helsinki, Department of Computer Science, November 1996. Extent: 11 pages.

Abstract: "Sgrep is a Unix tool for searching the contents of text files. Sgrep implements an algebra of unrestricted text fragments called regions. The algebra allows the retrieval of document components, represented as regions, based on conditions on their relative containment and ordering. This simple yet powerful model is suitable for querying structured document formats like electronic mail, RTF, LaTeX, HTML, or SGML documents. We describe the sgrep query language and give examples of its use. Especially, we explain how sgrep can be used for querying and assembling SGML documents."

Available online in Postscript format: ftp://ftp.cs.helsinki.fi/pub/Reports/by_Project/DocMan/Using_sgrep_for_querying_structured_text_files.ps.gz; [mirror copy]. See also the software main entry: 'sgrep' grep-like searching of structured documents.
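The region algebra summarized in the abstract above can be illustrated with a small sketch. This is not sgrep's query syntax or its implementation; it is a hedged Python approximation in which a region is a pair of character offsets, and the helper names (`find_regions`, `contained_in`, `containing`) are hypothetical names chosen for this illustration. The sketch uses non-strict containment and ignores nested delimiters.

```python
from typing import List, Tuple

# A region is a (start, end) pair of character offsets, end exclusive.
Region = Tuple[int, int]


def find_regions(text: str, start_tag: str, end_tag: str) -> List[Region]:
    """Collect regions running from each start_tag to the next end_tag.

    Simplification: nested occurrences of the same delimiter are not handled.
    """
    regions: List[Region] = []
    pos = 0
    while True:
        start = text.find(start_tag, pos)
        if start == -1:
            break
        end = text.find(end_tag, start + len(start_tag))
        if end == -1:
            break
        regions.append((start, end + len(end_tag)))
        pos = end + len(end_tag)
    return regions


def contained_in(inner: List[Region], outer: List[Region]) -> List[Region]:
    """Keep the regions of `inner` that lie within some region of `outer`."""
    return [r for r in inner
            if any(o[0] <= r[0] and r[1] <= o[1] for o in outer)]


def containing(outer: List[Region], inner: List[Region]) -> List[Region]:
    """Keep the regions of `outer` that enclose some region of `inner`."""
    return [o for o in outer
            if any(o[0] <= r[0] and r[1] <= o[1] for r in inner)]


# Hypothetical sample document: one title inside a section, one outside.
doc = "<sec><title>A</title><p>x</p></sec><title>B</title>"
titles = find_regions(doc, "<title>", "</title>")
sections = find_regions(doc, "<sec>", "</sec>")
titles_in_sections = [doc[s:e] for s, e in contained_in(titles, sections)]
```

Here `titles_in_sections` holds only the title that falls inside a section, which corresponds to the kind of containment condition the abstract describes.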

[CR: 19980423]

Jaakkola, Jani; Kilpeläinen, Pekka; Lindén, Greger. TranSID: An SGML Tree Transformation Language. Department of Computer Science, University of Helsinki Report C-1997-36. Helsinki, Finland: University of Helsinki, Department of Computer Science, May 1997. Extent: 14 pages (with 15 references). Authors' affiliation: Department of Computer Science, University of Helsinki.

Abstract: "We present a powerful document transformation language called TranSID, which is targeted at structured (SGML) documents. The language is based on a powerful model where the entire input document tree may be referenced during the transformation process. The evaluation is performed in a bottom-up manner. A language evaluator has been implemented which runs in Unix environments."

Note also the longer work by Greger Lindén: Structured Document Transformations, PhD Thesis, Report A-1997-2, Department of Computer Science, University of Helsinki, June 1997. 122 pages. Available online in Postscript format, via FTP.

The document is available online in Postscript format: via FTP; [local archive copy].

The paper was also published in the Proceedings of The Fifth Symposium on Programming Languages and Software Tools, Jyväskylä, Finland, June 7-8, 1997, ed. Jukka Paakki, pages 72-83, Technical Report C-1997-37, University of Helsinki, Department of Computer Science, June 1997.
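The two ideas named in the TranSID abstract, bottom-up evaluation and access to the whole input tree during transformation, can be sketched as follows. This is not TranSID syntax and not the authors' evaluator; it is a minimal Python illustration under assumed names (`Node`, `transform`, `depth`, and the sample rules are all hypothetical).

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass(eq=False)
class Node:
    name: str
    children: List["Node"] = field(default_factory=list)
    text: str = ""
    # Parent link lets a rule inspect any part of the input tree.
    parent: Optional["Node"] = field(default=None, repr=False)

    def __post_init__(self) -> None:
        for child in self.children:
            child.parent = self


def depth(node: Node) -> int:
    """Number of ancestors of `node` in the input tree."""
    d = 0
    while node.parent is not None:
        node = node.parent
        d += 1
    return d


# A rule receives the input node and the already-transformed children.
Rule = Callable[[Node, List[str]], str]


def default_rule(node: Node, results: List[str]) -> str:
    return node.text + "".join(results)


def transform(node: Node, rules: Dict[str, Rule]) -> str:
    """Bottom-up: transform the children first, then the node itself."""
    results = [transform(child, rules) for child in node.children]
    return rules.get(node.name, default_rule)(node, results)


# Hypothetical rules: a title's heading level depends on its position
# in the input tree, found by walking the parent links.
rules: Dict[str, Rule] = {
    "title": lambda n, r: f"<h{depth(n)}>{n.text}</h{depth(n)}>",
    "para": lambda n, r: f"<p>{n.text}</p>",
}

doc = Node("doc", [
    Node("title", text="Intro"),
    Node("sec", [Node("title", text="Sub"), Node("para", text="Hi")]),
])
output = transform(doc, rules)
```

Because each rule sees the original node with its parent link, a rule can consult context elsewhere in the input while the results are still assembled from the leaves upward.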

[CR: 19961130]

Jackson, William. "On-line Patent Filing is 'critical,' PTO Commissioner Says." Government Computer News 15/17 (July 15, 1996) 14-16.

"Abstract: Patent and Trademark Office (PTO) Commissioner Bruce A. Lehman reported to a House subcommittee that the development of an electronic filing system is 'critical' to the Office's efforts to reduce patent filing time. The PTO, which recently unveiled the Automated Patent System, hopes to reduce patent processing time to 12 months, down from the high of about 3 years in the mid-1980s. The electronic filing system is necessary to reach this goal while supporting a workload that grows 6 percent annually. The PTO will choose between two off-the-shelf applications based on SGML, one developed by InContext, the other by Microstar Software. The candidates will be tested at small companies starting in August 1996, and Lehman hopes electronic filing to be available within three years."

[CR: 19971227 MD: 19971229]

James, Zarella; Harvey, Betty; Welling, Doug. "Railroad Industry Forum (RIF) Electronic Parts Catalog Exchange Standard (EPCES)." Pages 625-630 in SGML/XML '97 Conference Proceedings. SGML/XML '97. "SGML is Alive, Growing, Evolving!" The Washington Sheraton Hotel, Washington, D.C., USA. December 7 - 12, 1997. Sponsored by the Graphic Communications Association (GCA) and Co-sponsored by SGML Open. Conference Chairs: Tommie Usdin (Chair, Mulberry Technologies), Debbie Lapeyre (Co-Chair, Mulberry Technologies); Michael Sperberg-McQueen (Co-Chair, University of Illinois). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 691 pages, CDROM; print volume contains author and title indexes, keyword and acronym lists. Authors' affiliation: [Zarella James]: ISOGEN International Corporation, 2200 N. Lamar, Suite 230, Dallas, Texas; WWW: http://www.isogen.com/; [Betty Harvey]: Electronic Commerce Connection, Inc., Germantown, Maryland USA 20874; Phone: +1 (301) 540-8251; FAX: +1 (301) 540-4268; Email: harvey@eccnet.com; WWW: http://www.eccnet.com; [Doug Welling]: Managing Director of SGML Nexus, software products division of ISOGEN Corporation.

Abstract: "The Railroad Industry Forum (RIF) is a team of the National Association of Purchasing Managers who were tasked to develop a standard for the exchange of electronic parts catalog data within the North American railroad industry. The RIF members are comprised of major railroads and railroad manufacturers. Mary McCarthy and Betty Harvey, Electronic Commerce Connection, Inc. developed the EPCES DTD. EPCES - Electronic Parts Catalog Exchange Standard, is a standard that was developed by the RIF for interchange and presentation of illustrated parts catalogs. The presentation of EPCES information has been designed to facilitate point and click capability. LinkOne is an electronic parts catalog and service manual delivery system. It has been developed to enable electronic viewing of parts and service information for manufactured equipment and processes. LinkOne provides point and click functionality between graphics and textual information. ISOGEN International Corporation has developed an EPCES filter for LinkOne to support importing and/or exporting parts catalog information from the manufacturers or railroads in SGML compliant to the EPCES standard."

This paper was delivered as part of the "Case Studies" track in the SGML/XML '97 Conference.

For more information on RIF, see the dedicated database entry Railroad Industry Forum: Electronic Parts Catalog Exchange Standard (EPCES), or the description provided by Betty Harvey via the Electronic Commerce Connection web server.

[CR: 19960827]

Jelliffe, Rick. "No Fear With SGML." Byte 21/8 (August 1996) 20. Author's affiliation: ISO WG 8 and Allette Systems.

Letter to the editor ("Inbox" department), suggesting that some of the problems identified in Byte's article "Work Flow Without Fear" can be addressed through SGML: "SGML can be used to define the interface format of the documents through the work flow."

Jelliffe, Rick; Nicol, Gavin Thomas. "Scrutable Asia." <TAG>: The SGML Newsletter 8/5 (May 1995) 5-7. ISSN: 1067-9197. Authors' affiliation: [Jelliffe] Allette Systems, Sydney, Australia (email ricko@allette.com.au); [Nicol] EBT Japan (email: gtn@ebt.com).

The article is a tour of East Asia focused upon SGML issues. Central issues include the character repertoire, character encodings, and native-language document markup. The sidebar article by Gavin Nicol ("Postcard from Tokyo on HTML") discusses ERCS and ISO-2022-IPEUC in relation to Asian language support in SGML/HTML.

[CR: 19980601]

Jelliffe, Rick. The XML and SGML Cookbook. Recipes for Structured Information. Charles F. Goldfarb Series on Open Information Management. Upper Saddle River, NJ: Prentice Hall PTR, 1998. ISBN: 0-13-614223-0. Author's affiliation: Allette Systems; WWW Page: http://www.allette.com.au/allette/allette/ricko/bio.htm; Email: ricko@allette.com.au.

See the provisional volume description in a separate document. See also provisionally the Amazon.com entry: "Synopsis: SGML experts are in short supply and in high demand. This book will help jump start SGML users by providing 'cookbook recipes' for the most common SGML document type definitions (DTDs). The CD-ROM contains hundreds of sample DTDs that users can cut and paste from to create their own DTD." [amazon.com]

[CR: 19990302]

Jenssen, Astrid E.; Sandahl, Tone Irene. Conflicts between the possibilities and the reality in the field of structured electronic documents. Experiences from a large-scale SGML-project. Technical Report. []: [], [1996]. Authors' affiliation: [Jenssen] Center for Information Technology Services (USIT), Box 1059 Blindern, N-0316 Oslo, Norway; [Sandahl] Department of Informatics, Box 1080 Blindern, N-0316 Oslo, Norway; both University of Oslo, Norway.

Abstract: "The paper presents experiences based on the study of a pilot project integrating an SGML-based document processing system at the University of Oslo, Norway. The experiences are examined from three perspectives in order to discuss them in relation to different aspects of the system; the use situation, the organizational benefits and challenges, and the technological requirements. Improving the system based on experiences within one perspective may lead to conflicts to consider when improving the system based on experiences found within other perspectives. The paper states and discusses some of the conflicts in SGML-based document systems. The paper concludes with challenges in development and use of SGML-based document systems, and states some issues for further research."

The document is available online in HTML format or PDF; [local archive copy].

Jian Zhang. "Application of OODB and SGML Techniques in Text Database: An Electronic Dictionary System." SIGMOD Record 24/1 (March 1995) 3-8. 12 references. Affiliation: Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA, USA.

"Abstract: An electronic dictionary system (EDS) is developed with object-oriented database techniques based on ObjectStore. The EDS is composed of two parts: the Database Building Program (DBP), and the Database Querying Program (DQP). DBP reads in a dictionary encoded in SGML tags, and builds a database composed of a collection of trees which holds dictionary entries, and several lists which contain items of various lexical categories. With text exchangeability introduced by the SGML, DBP is able to accommodate dictionaries of different languages with different structures, after easy modification of a configuration file. The tree model, the Category Lists, and an optimization procedure enables DQP to quickly accomplish complicated queries, including context requirements, via simple SQL-like syntax and straightforward search methods. Results show that compared with relational database, DQP enjoys much higher speed and flexibility. With EDS this paper demonstrates how to apply OODBMS's to systems that handle text information with strong yet varied intrinsic hierarchies."

[CR: 19950922]

Johansson, Stig. "The Encoding of Spoken Texts." The Text Encoding Initiative: Background and Contents, Guest Editors Nancy Ide and Jean Véronis = Computers and the Humanities 29/2 (1995) 149-158.

Abstract: "There is a great deal of variation in the encoding of spoken texts in electronic form, both with respect to the types of features represented and the way particular features are rendered. This paper surveys problems in the electronic representation of speech and presents the solutions proposed by the Text Encoding Initiative. The special tags needed for the encoding of spoken texts are discussed, including a mechanism for temporal alignment. Further work is needed on phonological aspects, parallel representation, and on the development of software which connects the systematic underlying representation with a workable format for input and display."

[CR: 19950828]

Johansson, Stig. English-Norwegian Parallel Corpus: Manual. Internal document for the English-Norwegian Parallel Corpus. May, 1994. Extent: 13 pages.

The manual describes the TEI/SGML encoding scheme used to mark up text samples used in the parallel text project. Available on the Internet in HTML format: http://www.hd.uib.no/doc.html: ENPC Documentation [mirror copy].

[CR: 19950828]

Johansson, Stig; Ebeling, Jarle. "The English-Norwegian Parallel Corpus: Introduction and Applications." Paper submitted to The XXVIII International Conference on Cross-Language Studies and Contrastive Linguistics, Rydzyna, Poland, December 15 - 17, 1994. Oslo, 1994. 14 pages, Postscript, 142 KB.

Available on the Internet in Postscript format: ftp://ftp.hd.uib.no/pub/corpora/enpc.poznan.ps [mirror copy].

[CR: 19950828]

Johansson, Stig; Ebeling, Jarle; Hofland, Knut. "Coding and Aligning the English-Norwegian Parallel Corpus." Paper presented at the Symposium Languages in Contrast, Department of English, Lund University, April 1994. To appear in the Proceedings. Oslo, 1994. Extent: 19 Pages, Postscript, 90 KB.

The document contains a print version of the (TEI/SGML) DTD used in the parallel text corpus, and examples. Available on the Internet in Postscript format: ftp://ftp.hd.uib.no/pub/corpora/enpc.lund.ps [mirror copy]. For further details, see the main entry for the English-Norwegian Parallel Corpus.

[CR: 19950828]

Johansson, Stig; Hofland, Knut. "Towards an English-Norwegian Parallel Corpus." Pages 25-37 in Creating and Using English Language Corpora. Papers from the Fourteenth International Conference on English Language Research on Computerized Corpora, Zürich, 1993, edited by Udo Fries, Gunnel Tottie, and Peter Schneider. Amsterdam/Atlanta: Rodopi, 1994. ISBN: 9051836295.

Based upon a paper from the Fourteenth International Conference on English Language Research on Computerized Corpora, Zürich, May 19-23, 1993.

[CR: 19950804]

Johnson, Eric. "Electronic Shakespeare: Making Texts Compute." Computer-Assisted Research Forum 1/3 (Spring-Summer 1993) 1-3. Eric Johnson is Professor of English and Dean of the College of Liberal Arts at Dakota State University, Madison, SD 57042 U.S.A. He is the Editor of TEXT Technology, and he has published more than fifty articles and reviews about computers, writing, and literary study. He can be reached by electronic mail as JohnsonE@dsuvax.dsu.edu.

Republished in version 2.0 as "Electronic Texts and their Use for Literary Research". See Electronic Texts and Computer Research by Eric Johnson.

Johnson, Eric. "The Electronic Texts We Want and Need." TEXT Technology: The Journal of Computer Text Processing 4/2 (Summer, 1994) 90-92. ISSN: 1053-900X. Eric Johnson is Editor of TEXT Technology. Email: JohnsonE@dsuvax.dsu.edu.

The article is available online via Eric Johnson's WWW server: 'Electronic Texts' [mirror copy here, text only].

Johnson, Eric. "Oxford Electronic Text Library Edition of the Complete Works of Jane Austen [Technical Review]." Computers and the Humanities 28/4-5 (August-October, 1994-1995) 317-321. ISSN: 0010-4817.

"Scholars in the humanities today are routinely doing textual and linguistic research that a generation ago would have been impossible or would have required the dedication of a lifetime. Such research is now feasible because humanists use computers and because texts of major writers are available in electronic form.

The Oxford Electronic Text Library edition of The Complete Works of Jane Austen (OETL Austen) is exactly the kind of electronic text that modern scholars need. It is an accurate rendering of R. W. Chapman's Oxford Illustrated Jane Austen, the standard scholarly edition of Austen, and it contains a wealth of useful information encoded in Standard Generalized Markup Language (SGML). The OETL Austen is distributed in both MS-DOS and Macintosh formats, and a site license is available. It will be used in a multitude of ways by students of Austen for years to come." [from the Introduction]

Johnson favorably reviews the OETL Austen, which uses SGML to structure the electronic text. A copy of the document is available online in HTML format. [Pages under construction: try simply "http://www.dsu.edu/~johnsone/" if the previous link fails.] Full information for ordering the electronic text edition is given in the review. See also a summary of the review by Mary Mallery.

Johnson, Jeff; Beach, Richard. "Styles in Document Editing Systems." IEEE Computer 21/1 (January 1988) 32-43. 16 references. ISSN: 0018-9162. Authors' affiliation: Xerox Corporation.

Abstract: A logical history of document editing mechanisms is presented. The design space for document style mechanisms is analyzed. Six primary design issues and the subsidiary issues they raise are discussed. Some major style issues that are seen as the subject of future research are identified.

[CR: 19950716]

Johnson, Rick. "Review of Practical SGML, 2nd edition, by Eric van Herwijnen." Technical Communication: Journal of the Society for Technical Communication 42/1 (First Quarter, February 1995) 124-125. ISSN: 0049-3155. Author's affiliation: Legislative Service Center, Olympia, WA.

The author writes a positive review, delineating improvements in the second edition.

[CR: 19971120]

Johnston, H. "Internet Tools Take On Real Science." Scientific Computing World 32 (October 1997) 31-35.

Abstract: "As Internet tools become more sophisticated, many scientists are abandoning conventional methods of communication, such as the journal and the scientific conference, in favour of electronic means. The obvious benefit of Internet-based communication is the ability to share and discuss data, analysis techniques and conclusions without leaving the laboratory. More importantly, however, the Internet is also inspiring the creation of completely new ways of communication that may have a profound effect on how science is done. The paper discusses the Chemical Markup Language (CML) which facilitates the exchange of chemical information on the Internet. The CML project aims to ensure that chemical software and databases are compatible for use with CML, by means of collaboration with their creators." [CML is an experimental application of XML, Extensible Markup Language.]

SCW web site.

Joloboff, Vania. "Document Representation: Concepts and Standards." Pages 75-105 in Structured Documents. Edited by Jacques André, Richard Keith Furuta, and Vincent Quint. The Cambridge Series on Electronic Publishing. Cambridge/New York: Cambridge University Press, 1989. vii + 220 pages, with bibliographic references and index [193-213]. 0-521-36554-6.

This article examines the problem of document representation in computer systems for printing, editing or interchange among heterogeneous systems. After a discussion of the various possibilities for defining document representation formalisms, it considers a number of standard representations typical of their class: page description languages, SGML, Interscript, ODA. Several other articles in the volume are of direct or marginal relevance to SGML as a metalanguage for document-structuring.

[CR: 19951113]

Joloboff, Vania. "Trends and Standards in Document Representation." Pages 107-124 (with 11 references) in Text Processing and Document Manipulation. Proceedings of the International Conference, University of Nottingham, 14-16 April 1986. Edited by J. C. [Hans] van Vliet. The British Computer Society Workshop Series. Cambridge: Cambridge University Press [on behalf of the British Computer Society], 1986. ISBN: 0-521-32592-7. Author's affiliation: Bull Research Center (France).

Abstract: "This paper starts by tracing the architecture of document preparation systems. Two basic types of representations appear: at the page level or at logical level. The paper then focuses on logical level representation and tries to survey three existing formalisms: SGML, Interscript, and ODA."

[CR: 19950925]

Judd, Peggy; Johnson, F. Scott. "SGML Enables Full-Text Scientific Publishing on the Web." Computers in Physics 9/4 (July - August 1995) 369-70. ISSN: 0894-1866. Authors' affiliation: American Institute of Physics (AIP), Woodbury, NY.

Abstract: "Two technologies have come together to make online technical publishing begin to work. The first and foremost of these technologies is the Internet. Without this massive network of computers and communication equipment, putting a digital library icon on a lab workstation and on an office desktop would have been both problem-plagued and expensive. The second of these facilitating technologies is Standard Generalized Markup Language (ISO 8879: SGML). SGML is, by one definition, a meta-language with which one can capture the structure and semantics of a class of documents. It is internationally recognized as a standard for document representation. Although SGML products have been available for years, the past two years have seen a real growth in interest and use of this technology. AIP has adopted ISO 12083 as the basis for its SGML documents. As a standard, ISO 12083 is overseen by an international working group but not owned by any one organization."

[CR: 19990803]

Kaelbling, Michael John. Braced Languages and a Model of Translation for Context-Free Strings: Theory and Practice. Ph.D. Dissertation, Department of Computer and Information Science, The Ohio State University. Columbus, Ohio: Ohio State University, 1987. Extent: 111 pages. Advisor: Sandra A. Mamrak.

Available from UMI: University Microfilms International, Inc., Number 8804059.

Kaelbling, Michael J. On Improving SGML. OSU-CIRSC-7/88-TR22. Columbus, Ohio: The Ohio State University, 1988.

Summary: Several improvements are suggested to the syntax of SGML, the recent international standard for the description of electronic document types. These improvements ease processing by existing tools, remove ambiguity cleanly, and increase human usability. They also indicate some guidelines that should be followed in the design and specification of computer-software standards. By following accepted computer-science conventions for the description of languages the design of a standard may be improved, and the subsequent implementation task simplified.

Draft version 18-October-1988, "accepted for publication in Electronic Publishing: Origination, Dissemination and Design." Department of Computer and Information Science; The Ohio State University; 2036 Neil Avenue Mall; Columbus, OH 43210.

See also the response of Ron Hayter, "Comments on 'On Improving SGML'," Technical Bulletin 4. Software Exoterica Corporation [OmniMark], 1988. Ron Hayter argues that Kaelbling's "improvements" to SGML are based upon a misunderstanding of the intent of the standard. The original draft known to Hayter was apparently that of 16-March-1988; Kaelbling's revised draft of 18-October-1988 responds to Hayter's comments.

Kaelbling, Michael J. "On Improving SGML." Electronic Publishing: Origination, Dissemination and Design (EPODD) 3/2 (May 1990) 93-98. 14 references. ISSN: 0894-3982. Author affiliation: Siemens AG.

Abstract: "Several improvements are suggested to the syntax of SGML, the recent international standard for the description of electronic document types. These improvements ease processing by existing tools, remove ambiguity cleanly, and increase human usability. They also indicate some guidelines that should be followed in the design and specification of computer-software standards. By following accepted computer-science conventions for the description of languages the design of a standard may be improved, and the subsequent implementation task simplified."

Received 16-March-1988, Revised 18-May-1990. Another version of the paper is found in OSU-CIRSC-7/88-TR22. Author affiliation: Siemens AG, ZFE IS EA 11; Corporate Applied Computer Sciences; Otto-Hahn-Ring 6; 8000 Munich 83, FRG.

[CR: 19951212]

Kahlisch, Thomas; Vogel, Gunthild. A Journal Header Reader program for the blind: Access to scientific journal article headers. Technical Report, Dresden University of Technology. Dresden: Dresden University of Technology, Department of Computer Science, Institute of Information Systems, 1995. Extent: approximately 9 pages. Author's affiliation: Department of Computer Science; kahlisch@inf.tu-dresden.de [or email: journal@iis350.inf.tu-dresden.de].

Abstract: "The advantage of structured markup in SGML (Standard Generalized Markup Language) has recently become clear. This technology is being used to automatically convert documents into accessible forms for blind people. In Germany one of the first sets of documents available in SGML is the scientific journal article headers from the 'Springer Verlag Journal Preview Service'. This article gives a description of the 'Journal Header Reader' application. We developed this application to make scientific documents in several formats accessible to blind people. The following chapter gives an overview of the SGML facilities used in our project." [from the document introduction]

Available on the Internet in HTML format: A Journal Header Reader program for the blind, [mirror copy, November 1995].

[CR: 19971120]

Kahn, Charles E. "A generalized language for platform-independent structured reporting." Methods of Information in Medicine 36/3 (August 1997) 163-171 (with 41 references). ISSN: 0026-1270. Author's affiliation: Department of Radiology, Medical College of Wisconsin, Milwaukee, USA; Email: ckahn@mcw.edu; WWW: http://www.mcw.edu/midas/recent-papers.html.

Abstract: "Structured reporting systems allow health-care workers to record observations using predetermined data elements and formats. The author developed the Data-entry and Reporting Markup Language (DRML) to provide a generalized representational language for describing concepts to be included in structured reporting applications. DRML is based on the Standard Generalized Markup Language (SGML), an internationally accepted standard for document interchange. The use of DRML is demonstrated with the SPIDER system, which uses public-domain internet technology for structured data entry and reporting. SPIDER uses DRML documents to create structured data-entry forms, outline-format textual reports, and datasets for analysis of aggregate results. Applications of DRML include its use in radiology results reporting and a health status questionnaire. DRML allows system designers to create a wide variety of clinical reporting applications and survey instruments, and helps overcome some of the limitations seen in earlier structured reporting systems."

See the main database entry for SPIDER - Structured Platform-Independent Data Entry and Reporting, or the web site for SPIDER. An online document (Postscript) is available which describes DRML: http://www.mcw.edu/midas/papers/AMIA96-DRML.ps; local archive copy.

[CR: 19971120]

Kahn, Charles E.; Huynh, Phiem N. "Knowledge representation for platform-independent structured reporting." Proceedings of the 1996 AMIA Annual Fall Symposium 8 (?) (1996) 478-482. Authors' affiliation: Department of Radiology, Medical College of Wisconsin, Milwaukee, USA.

Abstract: "Structured reporting systems allow health care providers to record observations using predetermined data elements and formats. We present a generalized language, based on the Standard Generalized Markup Language (SGML), for platform-independent structured reporting. DRML (Data-entry and Report Markup Language) specifies hierarchically organized concepts to be included in data-entry forms and reports. DRML documents serve as the knowledge base for SPIDER, a reporting system that uses the World Wide Web as its data-entry medium. SPIDER generates platform-independent documents that incorporate familiar data-entry objects such as text windows, checkboxes, and radio buttons. From the data entered on these forms, SPIDER uses its knowledge base to generate outline-format textual reports, and creates datasets for analysis of aggregate results. DRML allows knowledge engineers to design a wide variety of clinical reports and survey instruments."

See the main database entry for SPIDER - Structured Platform-Independent Data Entry and Reporting, or the web site for SPIDER. An online version of the document is available in Postscript format: http://www.mcw.edu/midas/papers/AMIA96-DRML.ps; local archive copy.

[CR: 19971118]

Kahn, Charles E.; Pfeifer, Kurt J. "Interactive Creation of Structured Reporting Applications." Radiological Society of North America Electronic Journal (RSNA EJ) 1/ (1997) [na.]. ISSN: 1090-7629. Authors' affiliation: University of Wisconsin - Milwaukee, and Section of Information and Decision Sciences, Medical Informatics and Decision Science (MIDAS) Consortium; WWW: http://www.mcw.edu/midas/kahn.html; Email: kahn@mcw.edu.

Abstract: "Structured reporting systems allow physicians to record findings by using predefined vocabularies and data-entry formats. The data-entry and reporting markup language (DRML) is used to define structured reporting applications for the SPIDER (structured platform-independent data entry and reporting) system. World Wide Web technology can be used to implement systems for structured entry and retrieval of medical data. The SPIDER system and its DRML report-definition language provide simple, platform-independent tools for structured reporting that conform to internationally recognized standards. The article guides readers through the use of DRML and SPIDER, and allows readers to interactively create structured reporting applications."

"DRML is a generalized report-specification language that simplifies the creation and maintenance of structured reporting applications. The specification of DRML as an SGML document type definition provides standardization that allows DRML documents to be used and exchanged across various computing platforms. Systems for publishing and on-screen editing of SGML documents are available commercially [. . .] . Such programs allow interactive, on-screen editing of DRML documents. Software is also available for validating the syntax of SGML documents [...] By including the DRML document type definition within a document (either explicitly or by reference), such software can be used to check the syntax of a DRML report definition. World Wide Web technology can be used to implement systems for structured entry and retrieval of medical data. The SPIDER system and its DRML report-definition language provide simple, platform-independent tools for structured reporting that conform to internationally recognized standards. This article has demonstrated their use for interactively creating structured reporting applications." [from the conclusion]

The document is available online in HTML format; see also [target URL, registration may be requested].

[Received January 22, 1997; revision requested February 26; revision received and accepted March 3; posted March 10. Supported in part by The Whitaker Foundation (Biomedical Engineering Research Grant to C.E.K.) and the National Library of Medicine (USPHS grant G08 LM05705). Presented in part as infoRAD exhibit 9111WKS at the 82nd Scientific Assembly and Annual Meeting of the Radiological Society of North America, Chicago, December 1.]

Källgren, Gunnel; Eriksson, Gunnar; Höglund, Magnus. "Introducing the SUC: A Large Balanced Corpus, Linguistically Analyzed and Marked-up in Accordance with the Recommendations Issued by the Text Encoding Initiative." Pages 119-120 [partial abstract] in Colloque International "Consensus ex Machina?". Abstracts International Joint Conference of the ALLC (Association for Linguistic and Literary Computing) and ACH (Association for Computers and the Humanities), Sorbonne, Paris, 19-23 avril 1994. Paris: Laboratoire "Lexicométrie et textes politiques" (INaLF, CNRS), and Ecole Normale Supérieure de Fontenay - Saint Cloud, 1994. 244 pages. Authors' affiliation: Stockholm University.

[CR: 19990519]

Karben, Alan. "News You Can Reuse. Content Repurposing at The Wall Street Journal Interactive Edition [Project Report]." Markup Languages: Theory & Practice 1/1 (Winter 1999) 33-45. ISSN: 1099-6622 [MIT Press]. Author's affiliation: Associate Director, Interactive Development, The Wall Street Journal Interactive Edition; Email: karben@wsj.com; WWW: http://wsj.com; Tel: +1 (212) 416-2975 FAX: +1 (212) 416 3291.

Abstract: "The content-reuse system of The Wall Street Journal Interactive Edition makes extensive use of SGML and XML to reorganize and reformat the content presented in the main wsj.com website. This paper discusses how the structures that define an Interactive Journal edition and its component articles are queried, processed, and converted by automatically triggered content-processors, allowing us to quickly fill requests by potential publishing partners to feature our branded content in their contexts."

[Conclusion:] '. . . All of our content-reuse processes owe their flexibility and ease of implementation to our use of SGML and XML. Articles created in SGML have been translated and served out in all sorts of flavors of HTML and other plain text formats. Edition structures and configuration files specified in XML are processed and tailored by custom software that allows our editors to specify what constitutes a mini-edition. And when our automatically generated content falls short of serving their audiences completely, an editor can step in and finish the job. . . . Our editors and designers are charged with constantly improving how our news can be accessed, navigated through, presented, and used. And our business-development staff is constantly seeking new ways to raise the visibility of our brand, which often means spreading excerpts from our trove of content out to places and platforms that our primary web site would not otherwise reach. Having our news, and the processes that direct where that news belongs, in an extensible format has proved to be the key to fulfilling their requirements.'

The document is available online in PDF format - "News you can reuse." [local archive copy] For other articles in this issue of MLTP, see the annotated Table of Contents.

Revision: Received 7 July 1998, Revised 12 August 1998.

[CR: 19971227 MD: 19971229]

Karben, Alan. "Ready for Tomorrow's Browsers. The News Production System of The Wall Street Journal Interactive Edition." Pages 567-570 in SGML/XML '97 Conference Proceedings. SGML/XML '97. "SGML is Alive, Growing, Evolving!" The Washington Sheraton Hotel, Washington, D.C., USA. December 7 - 12, 1997. Sponsored by the Graphic Communications Association (GCA) and Co-sponsored by SGML Open. Conference Chairs: Tommie Usdin (Chair, Mulberry Technologies), Debbie Lapeyre (Co-Chair, Mulberry Technologies); Michael Sperberg-McQueen (Co-Chair, University of Illinois). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 691 pages, CDROM; print volume contains author and title indexes, keyword and acronym lists. Author's affiliation: [Alan Karben]: Associate Director, Interactive Development, The Wall Street Journal Interactive Edition, 200 Liberty St., New York, NY 10281 USA; Phone: +1 212-416-2975; FAX: +1 212-416-3291; Email: karben@interactive.wsj.com; WWW: http://www.karben14.com.

Abstract: "Using SGML within our Web publishing system not only allows us to create better-looking and more complicated HTML than editors could otherwise have authored using a native formatting language, but it also allows our editors and designers to massage the look of the edition as often as desired, and to produce spin-off products without additional editorial effort. To be presented will be an architectural overview describing how our publishing system offers editors a tremendous menu of publish-time choices."

"At The Wall Street Journal Interactive Edition, we have been using SGML to mark up news articles since our launch in April, 1996. The elements and attributes we use in our authoring system attempt to answer the question 'What is this content, and what makes it different?' as opposed to 'How do we want this to look in a Web browser?' Even though we may want a byline to wind up looking bold, we mark it up with a <byline> tag, not a <b> tag. Only later in the publishing process do we translate our documents into HTML and its variants. This paper will outline the benefits of this approach, and then describe in some detail how we create our SGML, and how we format it."

This paper was delivered as part of the "Case Studies" track in the SGML/XML '97 Conference.
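
The content-versus-presentation split that Karben describes (authoring a <byline> element and mapping it to <b> only at publish time) can be sketched in a few lines. This is an illustrative sketch only, not the Interactive Edition's actual system: the element names other than byline, the STYLE_MAP table, and the sample article are hypothetical, and well-formed XML stands in for SGML so that Python's standard library can parse it.

```python
import xml.etree.ElementTree as ET

# Hypothetical publish-time mapping: semantic element -> (HTML tag, attributes).
# Only "byline" -> "b" comes from the quoted paper; the rest is assumed.
STYLE_MAP = {
    "article": ("div", {"class": "article"}),
    "headline": ("h1", {}),
    "byline": ("b", {}),
    "para": ("p", {}),
}


def to_html(elem):
    """Recursively rebuild a semantic element tree as presentational HTML."""
    # Unknown elements fall back to a <span> that records the original name.
    tag, attrs = STYLE_MAP.get(elem.tag, ("span", {"class": elem.tag}))
    out = ET.Element(tag, dict(attrs))
    out.text = elem.text
    out.tail = elem.tail
    for child in elem:
        out.append(to_html(child))
    return out


def render(source):
    """Parse semantically marked-up source and serialize it as HTML text."""
    root = ET.fromstring(source)
    return ET.tostring(to_html(root), encoding="unicode")


SAMPLE = (
    "<article><headline>Markets Rally</headline>"
    "<byline>By A. Reporter</byline>"
    "<para>Stocks rose.</para></article>"
)

if __name__ == "__main__":
    print(render(SAMPLE))
```

Because the look lives in the mapping table rather than in the articles, changing how a byline is displayed means editing one table entry, with no change to the stored content.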

[CR: 19950716]

Kay, Emily. "Airline Pushes for Technology Standards." InformationWeek 534 (July 3 1995) 64.

Federal Aviation Administration guidelines prescribe compliance with SGML for specified information deliverables, and USAir Group Inc.'s maintenance division selects software products that conform to the SGML standard. The article describes how the workflow accounts for non-SGML data in the USAir information system as well.

[CR: 19951220]

Kazman, Rick. Structuring the Text of the Oxford English Dictionary through Finite State Transduction. UWaterloo Technical Report, Data Structuring Group, CS-86-20. Waterloo, Ontario: University of Waterloo Computer Science Department, 1986. Extent: viii + 117 pages. Author's affiliation: [present, 1995] University of Waterloo, Department of Computer Science, 200 University Ave. West, Waterloo, Ontario N2L 3G1 CANADA. Tel: +1 519 888 4567 x4870; Fax: +1 519 885 1208. Email: rnkazman@cgl.uwaterloo.ca. WWW: http://watcgl.uwaterloo.ca/~rnkazman.

Abstract: "By Fall 1986 the Oxford English Dictionary will have been completely entered into machine-readable form as a first step toward creating an integrated version of the Dictionary and its Supplement. The ability to update and revise the OED requires the addition of a considerable amount of structure to the keyboarded text. Various software approaches to transducing the text of the OED in order to add this structure were evaluated, and eventually INR and lsim were chosen. The use of INR, a program for computing finite automata, necessitated that the structure of the OED be described as a regular language. The methods used to describe the OED, resolve ambiguities and deal with space limitations are detailed. These methods are not limited to the OED, but may be applied to any text in which one wishes to augment the structural information."

The document was also submitted as a master's thesis (Master of Mathematics in Computer Science) to the University of Waterloo. See further on researches related to the production of NOED2 in the main entry for NOED.

[CR: 19980420]

Kelsey, Robert L.; Hartley, Roger T.; Webster, Robert B. "An Object-Oriented Methodology for Knowledge Representation in SGML." Pages 304-11 (with 13 references) in Proceedings of the Ninth IEEE International Conference on Tools with Artificial Intelligence. IEEE International Conference on Tools with Artificial Intelligence, Newport Marriott, Newport Beach, CA. November 4-7, 1997. Sponsored by IEEE Computer Society Technical Committee on PAMI [Pan American Center for Earth and Environmental Studies]; Pan American Center for Earth and Environmental Studies. Los Alamitos, CA: IEEE Computer Society, 1997. Authors' affiliation: Los Alamos National Laboratory, New Mexico, USA; and New Mexico State University. Email: rth@cs.nmsu.edu, rkelsey@cs.nmsu.edu, robw@lanl.gov, rob@xdiv.lanl.gov.

Abstract: "An object-based methodology for knowledge representation and its Standard Generalized Markup Language (SGML) implementation is presented. The methodology includes class, perspective, domain and event constructs for representing knowledge within an object paradigm. The perspective construct allows for the representation of knowledge from multiple and varying viewpoints. The event construct allows actual use of knowledge to be represented. The SGML implementation of the methodology facilitates usability, structured, yet flexible knowledge design, and sharing and re-use of knowledge class libraries."

The article is available online in Postscript format; [local archive copy].

[CR: 19960204]

Kennedy, Dianne. "[Review of] ABCD... SGML: A User's Guide to Structured Information, by Liora Alschuler." <TAG>: The SGML Newsletter 9/1 (January 1996) 9-10. ISSN: 1067-9197. Author's affiliation: Dianne Kennedy is founder of the SGML Resource Center and currently chairs the DTD Working Group for SAE J2008. Email: dken@mcs.com.

Kennedy summarizes the main features of Alschuler's book, and highlights its unique contributions among the available published books on SGML. For additional information, see the bibliographic entry for Alschuler's ABCD...SGML.

[CR: 19970825]

Kennedy, Dianne. "Approaches to DTD Design." <TAG> 9/5 (May 1996) 1-4. ISSN: 1067-9197. Author's affiliation: SGML Resource Center.

The author provides a survey of DTD development approaches, describing the advantages and disadvantages of each in various contexts.

[CR: 19970314]

Kennedy, Dianne J. "Are We Straying From the Goal of SGML?" SGML Users' Group Bulletin 3/1 (1988) 13-15. ISSN: 0269-2538. Author's affiliation: Datalogics, Inc.

The author analyzes the tendency to let production demands override the long-term maintainability of SGML implementations and SGML-encoded data. The tendency typically shows up as attempts to make the data fit the DTD and related constraints, rather than to improve the DTD and application design.

[CR: 19980413]

Kennedy, Dianne. "Converting SAE J2008 to an XML DTD Using Near & Far Designer 3.0." XML Files: The XML Magazine Issue 04 (March 17, 1998) 6-13. Author's affiliation: XMLXperts Inc.

Summary: "Near & Far Designer 3.0 was specially designed to help make the transition from SGML to XML as smooth and straightforward as possible. Near & Far Designer 3.0 can evaluate any valid SGML DTD and interactively convert all mappings that are one-for-one. It will also highlight any remaining discrepancies, evaluate end user resolutions, and complete the transformation from an SGML DTD to an XML DTD - taking all guess work out of this task. Near & Far Designer 3.0 was designed to enable organizations to make the transition from SGML to XML in a cost and resource effective manner. Following the transition from SGML to XML, the graphical interface of Near & Far Designer 3.0 makes the ongoing creation of XML DTDs an easy task in the future. Designer now offers the document analyst a choice to create either new SGML DTDs or to create XML DTDs directly."

Available online: "Converting SAE J2008 to an XML DTD Using Near & Far Designer 3.0."

[CR: 19960714]

Kennedy, Dianne. "Datalogics Closing Its Doors; The End of an Era . . .and the Next Generation." <TAG>: The SGML Newsletter 9/4 (April 1996) 1, 12-13. ISSN: 1067-9197. Author's affiliation: Dianne Kennedy is founder of the SGML Resource Center and currently chairs the DTD Working Group for SAE J2008. Email: dken@mcs.com.

The article describes the decisions leading up to the closure of Datalogics, scheduled for Spring 1996. A list of notable past and recent employees of Datalogics is printed in the article. A new company, Datalogics Inc., will assume responsibility for supporting the core products. The new Datalogics will be partially owned by Adobe, together with Steve Brown and Jim McNeill (CEO). A new users' group (DLSIG) is being formed to work with the new company.

[CR: 19970825]

Kennedy, Dianne. "EPSIG Meeting." International SGML Users' Group Newsletter 3/3 (July 1997) 8-9. ISSN: 0952-8008. Author's affiliation: SGML Resource Center, Elmhurst, IL.

A report on the EPSIG meeting of May 12, 1997, and on the April 1-2, 1997 meeting in New York. Most of the current work on ISO 12083 relates to mathematics. Among other recent decisions: "It was determined that an ad hoc mathematics group be formed and meet to make recommendations before the formal ISO 12083 meeting in December [1997] in Washington DC. Dianne Kennedy will coordinate that work. DLI list was provided to begin work via email: dli-math@ncsa.uiuc.edu. A second Ad Hoc Committee should be formed to review 12083 and make recommendations. This committee will be responsible for collecting publishers' requirements and documenting how publishers are using ISO 12083 today and how people are currently changing the standard models." [Extracted; see the complete text of the article in the ISUG Newsletter.]

[CR: 19970308]

Kennedy, Dianne. "An Introduction to DSSSL (ISO/IEC 10179) [Part 1]." <TAG> 10/2 (February 1997) 11-4. ISSN: 1067-9197. Author's affiliation: Dianne Kennedy is founder of the SGML Resource Center and currently chairs the DTD Working Group for SAE J2008. Email: dken@mcs.com.

Part 1 of a multipart series of articles on DSSSL.

"One of the newest ISO standards positioned to impact the publishing world is ISO/IEC 10179. In 1988, ISO/IEC JTC1 SC18/WG8, the working group which developed SGML, HyTime, and some of the other SGML-related ISO standards, began writing this new standard. The working group had representatives from the United States, France, Japan, Germany, Ireland, Norway, the United Kingdom, and other countries as well. The new standard, also known as Document Style Semantics and Specification Language (DSSSL), became international standard in April 1996. So, what exactly is DSSSL? How does it fit with SGML and HyTime? And why do we need DSSSL anyway?" [from the Introduction]

See a related version of the article online; [mirror copy]. For more information on DSSSL (Document Style Semantics and Specification Language), see the main entry in the SGML/XML Web Page, and the dedicated section on DSSSL Software Tools.

[CR: 19970331]

Kennedy, Dianne. "DSSSL (Part 2): An Overview of the Languages." <TAG> 10/3 (March 1997) 1-4. ISSN: 1067-9197. Author's affiliation: Dianne Kennedy is founder of the SGML Resource Center and currently chairs the DTD Working Group for SAE J2008. Email: dken@mcs.com.

Part 2 of a multipart tutorial article on DSSSL - Document Style Semantics and Specification Language, ISO 10179. See the first of the serialized articles in the February 1997 issue of <TAG>.

[CR: 19970816]

Kennedy, Dianne. "J2008 DTD Committee." International SGML Users' Group Newsletter 3/3 (July 1997) 11. ISSN: 0952-8008. Author's affiliation: SGML Resource Center, Elmhurst, Illinois, USA.

The article provides an update on SAE J2008, a family of standards pertaining to the automotive and truck industry, particularly for emission-related (clean air) information. The actions and issues of the April 11, 1997 meeting are highlighted in the report.

"SAE J2008 is a family of standards developed by the membership of the Society of Automotive Engineers in response to the mandate of the Clean Air Act to partition and provide easy access to emission-related automotive service information. At the heart of this SGML standard is a relational Data Model for Automotive Service Information rather than any particular document model. The SGML definition set forth within J2008 provides a hierarchical representation of the Data Model. In addition, this standard provides models for common text constructs such as tables, paragraph, lists, and procedures which are found within automotive service information." [Extracted; see the complete text of the article in the ISUG Newsletter.]

[CR: 19961201]

Kennedy, Dianne. "J2008 Task Force Update." <TAG> 9/11 (November 1996) 1, 10. ISSN: 1067-9197. Author's affiliation: SGML Resource Center.

Update on the Draft SAE J2008 Standard. The California Air Resources Board (CARB) is proposing that automobiles made (sold?) in California conform to SAE J2008 for 2002 vehicles; the trucking industry is making progress on a new standard, T2008, based upon J2008. For additional information on SAE J2008 and T2008, see the main entry for automotive and truck industry use of SGML.

[CR: 19970620]

Kennedy, Dianne. "Journal Publishers Explore XML." <TAG>: The SGML Newsletter 10/6 (June 1997) 1, 9-10. ISSN: 1067-9197. Author's affiliation: SGML Resource Center.

The author summarizes the April 1997 tutorial for journal publishers, sponsored by GCA and taught by her and Murray Maloney. The focus was upon the emerging XML standard. Kennedy reports a growing interest in SGML/XML among journal publishers, including those who are using ISO 12083 as a basis for enterprise DTDs. The article also addresses W3C math in XML (Extensible Markup Language) documents.

[CR: 19960828]

Kennedy, Dianne. "New Roles for SGML Consultants." <TAG> 9/8 (August 1996) 1-4. ISSN: 1067-9197. Author's affiliation: SGML Resource Center.

The author discusses the variety of services now being delivered by SGML consultants, using her own experiences and those of other consultants as examples (The Sagebrush Group, Mulberry Technologies, L. A. Burman Associates, and Information Architects).

[CR: 19961113]

Kennedy, Dianne. "Seybold San Francisco 1996 [Conference Report]." <TAG> 9/10 (October 1996) 8-10. ISSN: 1067-9197. Author's affiliation: SGML Resource Center. Email: dken@mcs.com.

Summary of the major annual Seybold conference from the perspective of SGML interests. Keynote speeches were delivered by Marc Andreessen of Netscape and Brad Chase of Microsoft. The latter talked about the integrated desktop of Explorer 4.0, "which heralds the importance of integration of the Web with traditional desktop products, whether mainstream or SGML-based."

New products: (1) XyVision SGML Conductor, a compound document management solution integrating PDM and FrameMaker+SGML; (2) Folio 4.0, which is strongly aligned with Microsoft, and sports features aimed at protecting copyrighted information that is delivered electronically ["rights management functionality"]; (3) Corel Ventura 7.0 - a publishing package completely re-written to support 32-bit processing, and having SGML support in the form of the DTD Designer, SGML Layout, and SGML Editor tools; (4) Near and Far Author 2.0, which includes integration with Microsoft Word 7.

According to the author, the conference evidenced the emergent concept of "Mainstream SGML" - somewhat in opposition to "Industrial Strength SGML." The former is "a strategy to bring SGML to mainstream business applications at office-software price levels." Microstar and several partner companies have developed a logo to symbolize the development and marketing focus. Other industry partners are skeptical of the merits of this programme, or are doubtful of the net effects: "will the result be simply hierarchical HTML with certain content extensions?"

[CR: 19960716]

Kennedy, Dianne. "SGML for Journal Publishing." <TAG>: The SGML Newsletter 9/6 (July 1996) 1-3. ISSN: 1067-9197. Author's affiliation: Founder, SGML Resource Center. Email: dken@mcs.com.

The article is a report and evaluation of the annual conference of the Society for Scholarly Publishing, held in Minneapolis on May 30-31, 1996. The author describes three major DTDs used in journal publishing and some of the challenges presented by cultural and economic issues. In most environments, it is found to be necessary to modify "industry standard" DTDs to meet the requirements of the stakeholders. The ISO 12083 DTD and its predecessor (AAP) are in use, as well as a DTD developed by Elsevier Science Journals. InContext and Folio are implementing a turnkey journal production system (SGML Journal Publisher) using ISO 12083 as the basis for DTD design.

[CR: 19951015]

Kennedy, Dianne. "The SGML Implementation Guide: A Blueprint for SGML Migration." <TAG>: The SGML Newsletter 8/10 (October 1995) 5-6. ISSN: 1067-9197. Author's affiliation: SGML Resource Center; email: dken@mcs.com.

An invited review of the book The SGML Implementation Guide: A Blueprint for SGML Migration, by Brian Travis and Dale Waldt. An HTML version of the review is available on the SGML Resource Center WWW Page [mirror, partial links].

[CR: 19951208]

Kennedy, Dianne. "Tales from the Front. SGML Database Technologies: Relational vs Object - Which is Appropriate Today?" <TAG>: The SGML Newsletter 8/11 (November 1995) 9-11. ISSN: 1067-9197. Author's affiliation: SGML Resource Center. Email: dken@mcs.com, Tel: 1-708-941-8195.

`Tales from the Front' is a new column in <TAG> beginning with issue 8/11. In the current article, Kennedy describes situations in which either of the two database technologies would be preferable, and suggests that OODBs now have a niche place in the database market, especially within the context of the SGML market.

[CR: 19970331]

Kennedy, Dianne. "[Review of] Document Management for the Enterprise, by Michael J. D. Sutton." <TAG> 10/3 (March 1997) 8-9. ISSN: 1067-9197. Author's affiliation: Dianne Kennedy is founder of the SGML Resource Center and currently chairs the DTD Working Group for SAE J2008. Email: dken@mcs.com.

Review of a book on document management solutions, written by an advisory member of the ISO committee responsible for SGML. Though SGML is not a central topic in the book, the author discusses SGML as playing an important role in document engineering.

[CR: 19960312]

Kennedy, Dianne. "Tales from the Front. Understanding Structured Documents." <TAG> 9/2 (February 1996) 6-8. ISSN: 1067-9197.

The article addresses fundamentals of document structure, and the role it plays in information management using SGML.

[CR: 19961226]

Kennedy, Dianne. "Tools for Implementing SGML-Based Information Systems." Pages 27-36 in SGML '96 Conference Proceedings. Celebrating a Decade of SGML. SGML '96 Conference, Boston, MA, November 18-21, 1996. Sponsored by The Graphic Communications Association (GCA). [Edited by] Conference Co-Chairs: B. Tommie Usdin and Deborah A. Lapeyre. Alexandria, VA: GCA, 1996. Extent: 711 pages. Author's affiliation: SGML Resource Center, 146 North End, Suite 100, Elmhurst, IL 60126, USA; Tel: 630-941-8197; FAX: 630-941-8196; Email: dken@mcs.com; WWW: http://www.mcs.net/~dken/.

Abstract: "Implementing SGML can be an enormous task. To be successful, an implementor must have a good technical background in SGML and must have a clear understanding of data flow and SGML system functionality. Gaining a understanding of the key components of an SGML system is critical. This afternoon's presentations are designed to provide the SGML newcomer with an overview of the major classes of SGML tools and a brief review of the products commercially available today. Presenters for this session are independent SGML consultants who specialize in the design and implementation of SGML-based information systems."

Note: The above presentation was part of the "SGML Newcomer" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.

[CR: 19980413]

Kennedy, Dianne. "XML-Data: A Schema Language for Structured Data." XML Files: The XML Magazine Issue 04 (March 17, 1998) 2-3. Author's affiliation: Editor, XML Files.

Summary: "In XML terms, XML-Data is an XML tag set which enables us to precisely describe text structures, relational schema and much more. At the core of XML-Data is a DTD for DTDs. To that, elements have been added to describe schema, either relational or object-oriented. The idea is that with XML-Data we can describe any schema. Then, when XML-coded data is delivered via the Web along with an XML-Data Schema, the receiving system will be able to understand what it is getting. It will not only understand the hierarchy of data, but can also understand other relationships. If the data is relational, a client can understand which data elements are keys and which are foreign keys. Or in an object world, the client will clearly understand which elements are in the same 'class', something our standard XML-coded, well-formed data or even XML DTDs do not communicate today."

Online: XML-Data: A Schema Language for Structured Data. See also the main database entry for XML-Data.

[CR: 19971227]

Kennedy, Dianne; Burman, Linda. "Tools for Implementing SGML-Based Information Systems." Pages 23-32 in SGML/XML '97 Conference Proceedings. SGML/XML '97. "SGML is Alive, Growing, Evolving!" The Washington Sheraton Hotel, Washington, D.C., USA. December 7 - 12, 1997. Sponsored by the Graphic Communications Association (GCA) and Co-sponsored by SGML Open. Conference Chairs: Tommie Usdin (Chair, Mulberry Technologies), Debbie Lapeyre (Co-Chair, Mulberry Technologies); Michael Sperberg-McQueen (Co-Chair, University of Illinois). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 691 pages, CDROM; print volume contains author and title indexes, keyword and acronym lists. Authors' affiliation: [Dianne Kennedy]: SGML Resource Center, 146 North End, Suite 100, Elmhurst, IL 60126 USA; Phone: +1 630-941-8197; FAX: +1 630-941-8196; Email: dken@mcs.com; WWW: http://www.mcs.net/~dken; [Linda Burman]: President, L. A. Burman Associates, 23 Hambly Avenue, Toronto, ON, Canada M4E 2R5; Phone: (416) 699-7198; FAX: (416) 699-1198; Email: linda@interlog.com.

Abstract: "Implementing SGML can be a daunting task. To be successful, an implementor must have a good technical background in SGML and must have a clear understanding of data flow and SGML system functionality. Gaining a understanding of the key components of an SGML system is critical. This afternoon's presentations are designed to provide the SGML newcomer with an overview of the major classes of SGML tools and a brief review of the products commercially available today."

This paper was delivered as part of the "Newcomer" track in the SGML/XML '97 Conference.

[CR: 19971227]

Keough, Janis Allison. "Welcome to the SGML Funhouse." Pages 33-36 in SGML/XML '97 Conference Proceedings. SGML/XML '97. "SGML is Alive, Growing, Evolving!" The Washington Sheraton Hotel, Washington, D.C., USA. December 7 - 12, 1997. Sponsored by the Graphic Communications Association (GCA) and Co-sponsored by SGML Open. Conference Chairs: Tommie Usdin (Chair, Mulberry Technologies), Debbie Lapeyre (Co-Chair, Mulberry Technologies); Michael Sperberg-McQueen (Co-Chair, University of Illinois). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 691 pages, CDROM; print volume contains author and title indexes, keyword and acronym lists. Author's affiliation: [Janis Allison Keough]: SGML Analysis Manager, The Bureau of National Affairs, Inc. (BNA), Data Administration; 1231 25th Street, N.W., Room III-414, Washington, D.C. 20037; Phone: +1 (202) 452-7587; Email: jkeough@bna.com.

Abstract: "This presentation uses a carnival Funhouse as a metaphor for implementing SGML for the first time. The speaker will describe three main areas in the funhouse and the hazards presented in each and some tips for surviving the experience: document analysis, DTD writing, and data markup, including legacy conversion and training users to mark up data."

This paper was delivered as part of the "Newcomer" track in the SGML/XML '97 Conference.

[CR: 19961226]

Kersher, George; Paciello, Michael; Treviranus, Jutta. "Access to Information by People with Disabilities." Pages 87-92 in SGML '96 Conference Proceedings. Celebrating a Decade of SGML. SGML '96 Conference, Boston, MA, November 18-21, 1996. Sponsored by The Graphic Communications Association (GCA). [Edited by] Conference Co-Chairs: B. Tommie Usdin and Deborah A. Lapeyre. Alexandria, VA: GCA, 1996. Extent: 711 pages. Authors' affiliation: [Kersher]: Recordings for the Blind and Dyslexic; [Paciello]: Yuri Rubinsky Insight Foundation; [Treviranus]: Adaptive Technology Resource Centre, University of Toronto.

Abstract: "Information access for people with disabilities is creating numerous opportunities and challenges within the SGML (Standard Generalized Markup Language) community. Additionally, as a result of the increasing paradigm shift by the publishing industry toward Internet and WWW-based document delivery systems, the importance of producing accessible information using SGML mechanisms has increased immeasurably.

The primary focus of this paper involves the production of electronic documents. However, the key principles involved in the design, production, and delivery of information apply regardless of the document medium.

In this showcase the presenters will: identify major problems in information and software design that deny access, demonstrate successful products that can be used by people with disabilities to access publications, point to resources that assist developers in creating accessible products in the future. The goals of the showcase are to educate participants about accessible electronic text delivery systems, and direct participants toward resources which help them create or choose accessible products."

Note: The above presentation was part of the "SGML Newcomer" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.

[CR: 19960125]

Key, Martin. "Theory and Practice: Working with SGML, PDF and LATEX at Elsevier Science." Baskerville [The Annals of the UK TEX Users' Group] 5/2 (March 1995) 25-27. ISSN: 1354-5930. Author's affiliation: Elsevier Science Ltd. Email: m.key@elsevier.co.uk.

This issue of Baskerville makes available a number of papers presented at a joint meeting of the UK TEX Users' Group and BCS Electronic Publishing Specialist Group (January 19, 1995) [mirror copy]. See the link to Baskerville, or email: baskerville@tex.ac.uk. Issue 5/2 of Baskerville has other articles on SGML: "Portable Documents: Why use SGML?" (David Barron); "Formatting SGML Documents" (Jonathan Fine); "HTML & TeX: Making them sweat" (Peter Flynn); "The Inside Story of Life at Wiley with SGML, LaTeX and Acrobat" (Geeti Granger); "SGML and LaTeX" (Horst Szillat). See the special bibliography page for other articles on SGML and (LA)TEX.

Keyhani, A. "Building an electronic journal." Pages 257-261 (with 4 references) in 15th National Online Meeting. Proceedings - 1994. National Online Meeting, New York, NY, USA, 10-12 May 1994. Sponsored by Learned Information. Edited by Martha E. Williams. Medford, NJ, USA: Learned Information, 1994. xii + 464 pages. Author Affiliation: OCLC, Dublin, OH, USA.

Abstract: Electronic publishing is under close scrutiny by publishers, who are faced with increasing pressure to publish faster, reduce costs and increase circulation. Before moving forward, publishers need to determine whether the time is right, and then to decide how to implement an electronic version of their print journals or a totally new electronic-only journal. Decisions must be made on SGML vs. scanned pages, and CD-ROM vs. online. Most importantly, publishers need to determine how their electronic products can offer superior value to scholars and researchers, because the journals will fail if they are perceived to be less valuable than their print counterparts. As telecommunications access speeds increase and online storage costs decrease, the distribution of journals, complete with high-quality photographs, tables and equations, through online systems becomes increasingly viable. The electronic medium can be exploited to add links to relevant bibliographic databases as well as to other relevant journals. Comprehensive information can be made instantly available to users through one easy-to-use interface.

[CR: 19971107]

Khare, Rohit; Rifkin, Adam. "Capturing the State of Distributed Systems with XML." Pages 207-217 in XML: Principles, Tools, and Techniques. Guest Edited by Dan Connolly. World Wide Web Journal [edited by Rohit Khare] Volume 2, Issue 4. Sebastopol, CA: O'Reilly & Associates, Fall 1997. Extent: xxii + 248 pages. ISBN: 1-56592-349-9. ISSN: 1085-2301. Authors' affiliation: [Khare]: MCI Internet Architecture (Boston); [Rifkin]: Caltech Infospheres Project, California Institute of Technology.

Abstract: "This paper discusses the challenges of capturing the state of distributed systems across time, space, and communities, and looks to XML as an effective solution. First, when recording a data structure for future reuse, XML format storage is self-descriptive enough to extract its schema and verify its validity. Second, when transferring data structures between different machines, XML's link model in conjunction with Web transport protocols reduces the burden of marshaling entire data sets. Third, when sharing collaborative data structures between disparate communities, it is easier to compose new systems and convert data definitions to the degree that XML documents are adopted for the World Wide Web. Just as previous generations of distributed system architectures emphasized relational databases or object-request brokers, the Web generation has good reason to adopt XML as its common archiving tool, because XML's sheer generic power has value in knowledge representation across time, space, and communities."

A version of this document is available online in HTML format: http://www.cs.caltech.edu/~adam/papers/xml/xml-for-archiving.html; [local archive copy].

[CR: 19971008]

Khare, Rohit; Rifkin, Adam. "XML: A Door to Automated Web Applications." IEEE Internet Computing [IEEE] 1/4 (July - August 1997) 78 - 87. Authors' affiliation: [Khare]: MCI Internet Architecture, Email: khare@alumni.caltech.edu; [Rifkin]: California Institute of Technology, Email: adam@cs.caltech.edu, WWW: http://www.cs.caltech.edu/~adam/.

Abstract: "HTML allows the structural markup of Web documents, distinguishing the elements of a page with tags and declaring the physical relationships among the various document elements. This organizes the display of information and allows humans to read and use it. To give machines this capability, however, requires semantic markup, identifying what each particular element means on its own (for example, 'this is a home street address' or 'this is an e-mail address'). Semantic markup would change what is now simply displayed content to machine-readable, structured content."

"The eXtensible Markup Language (XML) specification makes it dramatically easier to develop and deploy domain- and mission-specific Web pages. In this article, we describe the evolution of the Web's data representation from display formats to structural markup to semantic markup.

"The shift from structural HTML markup to semantic XML markup is a critical phase in the struggle to transform the Web from a universal information space into a knowledge network."

A related version of the article was made available as "X Marks the Spot: eXtensible Markup Language opens the door to a motherlode of automated Web applications", [archive copy, August 4, 1997.]

The published abstract from IEEE (Institute of Electrical and Electronics Engineers, Inc.) is available in HTML format: http://www.computer.org/internet/ic1997/w4078abs.htm; the full text of the article is available from IEEE in PDF format, [local archive copy].
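The distinction the abstract draws between display-oriented markup and semantic markup can be sketched in a few lines of Python. This is only an illustration using the standard `xml.etree.ElementTree` module; the element names (`contact`, `name`, `street`, `email`) and the sample data are hypothetical and do not come from the article.

```python
import xml.etree.ElementTree as ET

# Display-oriented markup: a program sees only a paragraph and line breaks.
DISPLAY = "<p>Jane Doe<br/>12 Elm Street<br/>jane@example.org</p>"

# Semantic markup: each element name says what its content means.
# (Hypothetical vocabulary, invented for this sketch.)
SEMANTIC = """
<contact>
  <name>Jane Doe</name>
  <street kind="home">12 Elm Street</street>
  <email>jane@example.org</email>
</contact>
"""


def email_addresses(xml_text):
    """Return the text of every <email> element in the document."""
    root = ET.fromstring(xml_text)
    return [el.text for el in root.iter("email")]


if __name__ == "__main__":
    print(email_addresses(DISPLAY))   # nothing is labelled as an e-mail address
    print(email_addresses(SEMANTIC))  # the address can be picked out by name
```

With the display version a program would have to guess which line is the address; with the semantic version the element name carries that information.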

Khatchadourian, Haroutioun; Modiano, Nicole; Heyer, Gerhard; Waldhör, Klemens. "Use and Importance of Standard[s] in Electronic Dictionaries: The Compilation Approach for Lexical Resources." Literary and Linguistic Computing 9/1 (1994) 55-64. 11 references. ISSN: 0268-1145.

Authors discuss [esp. pages 60-61] the development and use of the 'MLEXd' SGML DTD within the MULTILEX project's efforts to standardize access to lexical data. [Abstract needed]

[CR: 19980529]

Kilpeläinen, Pekka. SGML & XML Content Models. Department of Computer Science, University of Helsinki, Report C-1998-12. Helsinki: Department of Computer Science, University of Helsinki, May 1998. Extent: 16 pages, with 17 references. Authors' affiliation: [Kilpeläinen]: University of Helsinki, Department of Computer Science, P. O. Box 26 (Teollisuuskatu 23), FIN-00014 University of Helsinki, Finland; Email: Pekka.Kilpelainen@cs.helsinki.fi; Tel: +358 9 7084 4227; FAX: +358 9 7084 4441; WWW: http://www.cs.Helsinki.FI/~kilpelai/.

Abstract: "The SGML and XML standards use a variation of regular expressions called content models for modeling the markup structures of document elements. SGML content models may include so called and groups, which are excluded from XML. An and group, which is a sequence of subexpressions separated by an &-operator, denotes the sequential catenation of its subexpressions in any possible order. If one wants to shift from SGML to XML in document production, one has to translate SGML content models to corresponding XML content models.

"The allowed content models in both SGML and XML are restricted by a requirement of determinism, which means that a parser recognizing document element contents has to be able to decide without lookahead, which content model token to match with the current input token, while processing the document from left to right. It is known that not all SGML content models can be expressed as an equivalent XML content model. It is also known that transforming an SGML content model into an equivalent XML content model may cause an exponential growth in the length of the content model. We discuss methods of eliminating and groups and analyze the circumstances where they can be applied. We derive a tight bound of e n! on the number of symbols in the result of eliminating an and group of n symbols, where e = 2.71828... is the base of natural logarithms. We present the analysis in a pedagogical manner, emphasizing mathematical methods which are typical to the analysis of algorithms. We also show that minimal deterministic automata for recognizing an and group of n distinct element names contain 2^n states and n 2^(n-1) transitions, excluding the failure state and transitions leading to it."

See the online abstract. The full text is available in Postscript format, [local archive copy]
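The growth figures quoted in the abstract can be checked on small cases with a short Python sketch. It is not the report's algorithm: the function names are hypothetical, the rewrite used is the straightforward choice-of-sequences expansion, and counting only element-name occurrences as "symbols" is an assumption about the report's measure. The automaton is modelled as the set of names seen so far, which is one natural reading of the construction.

```python
import math
import re
from itertools import combinations


def expand_and_group(names):
    """Rewrite the SGML and group (n1 & n2 & ...) without the & operator.

    Every branch of the resulting choice starts with a different name, so a
    parser can select the branch without lookahead.
    """
    names = list(names)
    if len(names) == 1:
        return names[0]
    branches = []
    for i, first in enumerate(names):
        rest = names[:i] + names[i + 1:]
        branches.append(f"({first}, {expand_and_group(rest)})")
    return "(" + " | ".join(branches) + ")"


def count_names(model):
    """Count element-name occurrences in a content model string."""
    return len(re.findall(r"[A-Za-z]\w*", model))


def and_group_automaton(names):
    """Build a subset automaton: a state is the set of names seen so far."""
    names = frozenset(names)
    states = [frozenset(c)
              for k in range(len(names) + 1)
              for c in combinations(sorted(names), k)]
    # From state s, each name not yet seen leads to the state s plus that name.
    transitions = {(s, a): s | {a} for s in states for a in names - s}
    return states, transitions


if __name__ == "__main__":
    print(expand_and_group(["a", "b"]))
    for n in range(1, 7):
        names = [f"e{i}" for i in range(n)]
        size = count_names(expand_and_group(names))
        states, transitions = and_group_automaton(names)
        print(n, size, round(math.e * math.factorial(n), 1),
              len(states), len(transitions))
```

For three names the expansion contains 15 name occurrences, just under e * 3! (about 16.3), and the automaton has 8 states and 12 transitions, matching 2^n and n 2^(n-1).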

[CR: 19961017]

Kilpeläinen, Pekka. Tree Matching Problems with Applications to Structured Text Databases. Department of Computer Science, University of Helsinki, Report A-1992-6 (PhD Dissertation). Helsinki: Department of Computer Science, University of Helsinki, November 1992. Extent: 114 pages.

Available in Postscript format via the Internet: ftp://ftp.cs.helsinki.fi/pub/Reports/by_Author/Kilpel%E4inen_Pekka/Tree_Matching_Problems_with_Applications_to_Structured_Text_Databases.ps.gz; [mirror copy].

[CR: 19971206]

Kilpeläinen, Pekka; Wood, Derick. "SGML and Exceptions." Pages 39-49 (with 12 references) in Principles of Document Processing. Proceedings of the Third International Workshop. PODP '96, Third International Workshop. Palo Alto, California. September 23, 1996. Edited by Charles Nicholas (Department of Computer Science and Electrical Engineering, UMBC, Baltimore, MD) and Derick Wood (Department of Computer Science, HKUST, Clear Water Bay, Kowloon, HONG KONG). Lecture notes in artificial intelligence. Lecture notes in computer science, 1293. Berlin / London: Springer-Verlag, 1997. ISBN: 354063620X. Authors' affiliation: [Kilpeläinen]: University of Helsinki, Department of Computer Science.

Abstract: "The Standard Generalized Markup Language (SGML) allows users to define document type definitions (DTDs), which are essentially extended context free grammars in a notation that is similar to extended Backus-Naur form. The right hand side of a production is called a content model and its semantics can be modified by exceptions. We give precise definitions of the semantics of exceptions and prove that they do not increase the expressive power of SGML. For each DTD with exceptions we can construct a structurally equivalent extended context free grammar. On the other hand, exceptions are a powerful shorthand notation; eliminating them may cause exponential growth in the size of a DTD."

[CR: 19961017]

Kilpeläinen, Pekka; Wood, Derick. SGML and Exceptions. Paper presented at PODP '96 Workshop. Helsinki: University of Helsinki, 1996. Extent: [??]. Authors' affiliation: [Kilpeläinen]: Department of Computer Science, University of Helsinki, Finland; WWW Home Page; [Wood]: Department of Computer Science, Hong Kong University of Science and Technology (HKUST), Clear Water Bay, Kowloon, Hong Kong. Tel. +852.2358.6988; Fax +852.2358.1477; E-Mail dwood@cs.ust.hk; WWW Home Page.

Paper presented at PODP '96 Workshop on the Principles of Document Processing, Palo Alto, September 23, 1996. To be published by Springer-Verlag in the Conference Proceedings. See the note of Derick Wood [October, 1996]: "Pekka Kilpelainen, Helen Cameron, and Chris Cleverley and I are currently examining the issues of exceptions and their expressive power, the decidability of structural equivalence of DTDs, and how tag minimization can be defined in a general way." [from "FOUNDATIONS OF MARKUP, HTML, AND SGML"]. Other papers [need bibliog. work] include: (1) P. Kilpelainen and D. Wood, Exceptions in SGML document grammars, (1996), 30 pages. Also appeared as Technical Report HKUST-CS95-??; (2) P. Kilpelainen and D. Wood, SGML and Exceptions, (1996), 13 pages. Also appeared as Technical Report HKUST-CS95-??; (3) H. A. Cameron and D. Wood, Structural equivalence of regular extended context-free grammars and SGML DTDs, in preparation, 1996.

Kimber, W. Eliot. HyTime and SGML: Understanding the HyTime HyQ Query Language Technical Report, Version 1.1. IBM Corporation, August 2, 1993. 40 pages.

Abstract: "This document is intended to provide a brief tutorial introduction to the HyQ language. It is assumed that you have a working knowledge of SGML and have a copy of the HyTime standard, ISO 10744 [Hypermedia/Time-based Structuring Language ISO/IEC 10744:1992], at hand, although it does not assume that you have more than a passing familiarity with HyTime." [from About this Document] Note: Eliot Kimber is authoring a full-length book on HyTime that will be published in 1995; see the bibliographic entry.

This tutorial is available in compressed Postscript format from the Exeter SGML Project FTP server as Kimber-on-HyQ-1.1.ps.Z (note binary mode FTP transfer required), or in compressed text (ASCII) format FTP to SGML Project. Alternately, it is available in plain text (ASCII) format from the SGML Repository.

[CR: 19971227]

Kimber, W. Eliot. "HyTime Show and Tell." Pages 185-194 in SGML/XML '97 Conference Proceedings. SGML/XML '97. "SGML is Alive, Growing, Evolving!" The Washington Sheraton Hotel, Washington, D.C., USA. December 7 - 12, 1997. Sponsored by the Graphic Communications Association (GCA) and Co-sponsored by SGML Open. Conference Chairs: Tommie Usdin (Chair, Mulberry Technologies), Debbie Lapeyre (Co-Chair, Mulberry Technologies); Michael Sperberg-McQueen (Co-Chair, University of Illinois). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 691 pages, CDROM; print volume contains author and title indexes, keyword and acronym lists. Author's affiliation: [W. Eliot Kimber]: ISOGEN International Corp., 2200 N. Lamar St., Suite 230 Dallas, Texas 75202; Phone: 512.339.1400; Email: eliot@isogen.com; WWW: http://www.isogen.com.

Abstract: "[The presentation] describes several demonstrations of using various tools with the HyTime architecture to do useful and unique tasks. Demonstrations include the creation, management, and presentation of editorial notes, the use of HyTime to create 'virtual' and 'compound' documents. Demonstrates the power of the HyTime architecture both as a set of useful facilities and as a standard that enables interchange and interoperation."

This paper was delivered as part of the "User" track in the SGML/XML '97 Conference.

[CR: 19960331]

Kimber, W. Eliot. Position Paper for Workshop on Incorporating Hypertext Functionality into Software Systems II: Using Data Standards to Enable Hypermedia Interoperation. Paper presented at the Second International Workshop on Incorporating Hypertext Functionality into Software Systems, held in conjunction with the ACM Hypertext '96 conference, Washington, U.S.A. Cupertino, CA: Passage Systems Inc., March 1996. Extent: approximately 5 pages. Author's affiliation: Passage Systems, Inc. (Email: kimber@passage.com).

"A basic premise of an open hypertext system is that it must be possible to create hyperlinks unilaterally among data objects to which write access is not available. In other words, it must always be possible to create hyperlinks that are independent of the data objects they connect. From this it follows that hyperlinks should always be conceived of and managed as first-class independent objects, at least for the purpose of defining general data models and management schemes. . . The database of hyperlinks must have the characteristics of a traditional database. It must be in a neutral data format that can be accessed with a minimum of cost. It must be general enough to support unanticipated uses. It must provide sufficient expressive power to enable the description of relationships as richly and precisely as authors desire." [extracted]

Available on the Internet in HTML format: http://space.njit.edu:5080/HTFII/Kimber.html [mirror copy, partial links].
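The premise quoted above, that hyperlinks should be managed as independent objects kept apart from the data they connect, can be pictured with a minimal Python sketch. The class names (`Anchor`, `Link`, `LinkBase`) and the addressing scheme are hypothetical and are not taken from the position paper or from HyTime; the point is only that no linked document needs to be writable.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Anchor:
    """An address into a document that we may not be able to modify."""
    document: str   # hypothetical document identifier or URL
    location: str   # hypothetical location within it, e.g. an element ID


@dataclass(frozen=True)
class Link:
    """A relationship stored apart from the documents it connects."""
    kind: str
    anchors: tuple


class LinkBase:
    """A tiny in-memory stand-in for a database of hyperlinks."""

    def __init__(self):
        self._links = []

    def add(self, link):
        self._links.append(link)

    def links_touching(self, document):
        """Find every link with at least one anchor in the given document."""
        return [link for link in self._links
                if any(a.document == document for a in link.anchors)]


if __name__ == "__main__":
    base = LinkBase()
    base.add(Link("commentary",
                  (Anchor("report.sgm", "sec1"), Anchor("notes.sgm", "p7"))))
    # Neither document was edited; the link lives only in the link base.
    print(base.links_touching("report.sgm"))
```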

[CR: 19980203]

Kimber, W. Eliot. Practical Hypermedia: An Introduction to HyTime. Charles F. Goldfarb Series On Open Information Management. New York: Prentice-Hall Professional Technical Reference, [forthcoming] 1997. Extent: 256 pages. ISBN: 0-13-309899-0. Author affiliation: Passage Systems, Inc. (W. Eliot Kimber (kimber@passage.com); Systems Analyst and HyTime Consultant; Passage Systems, Inc., 2608 Pinewood Terrace; Austin TX 78757 (512)339-1400; 465 Fairchild Dr., Suite 201; Mountain View, CA 94043 (415) 390-0911).

Summary: HyTime is an ISO standard (ISO/IEC 10744) that is an extension to SGML. It is intended to support electronic documents which use hyperlinking and multi-media elements. In this book, Kimber focuses on the most practical aspects of the HyTime standard, explaining how to use HyTime to move information from the traditional print-based medium to hypermedia. [publisher's pre-publication description]

The book "Provides an introduction to the HyTime standard, ISO/IEC 10744. Intended primarily for people who have some experience with SGML, especially people doing technical publishing and documentation. A knowledge of SGML syntax is not required but will help make the details and examples easier to understand. The book does include an introduction to SGML syntax and terminology." [author's summary]

Another summary: "This beginner-level conceptual overview of HyTime explains how HyTime is used both in traditional information processing applications and in multi-media/hypermedia applications. It sorts out the basic concepts from the confusing details in the HyTime standard, and shows readers how the standard can be applied relatively simply and easily to existing SGML and hypermedia applications. . . [the book] (1) discusses the basic problem that HyTime (and by extension SGML) tries to solve, explaining in general terms how HyTime solves that problem, and introducing the necessary syntax; (2) explains the role that SGML-encoded data plays in information management processing; (3) discusses how HyTime addressing methods are used to locate all types of data; (4) considers how to define and use HyTime property sets to access data of any type; (5) explains how to implement HyTime functions using the facilities of existing SGML-based products and systems; (6) includes samples of HyTime markup, descriptive illustrations, and problems; (7) shows how to incorporate HyTime concepts and architectural forms into an application and includes an application specification for a small HyTime application; (8) [has] an accompanying diskette [containing] sample code and a public domain SGML parser." [from the PTR server, unstable URL]

See "Dr. Macro's Books for Review" (access to review drafts of books under development - sign up to review the book in advance). Provisionally, see also Eliot Kimber's HyQ tutorial.

More/recent information on the book is/was available via the Prentice Hall WWW server's search facility: http://www.prenhall.com/, or [mirror copy of the book abstract, from February 1996]. Possible (unstable) URL for the volume Table Of Contents.

[Note February 03, 1998] See the announcement from Eliot Kimber (ISOGEN International Corporation) for an updated review draft of his forthcoming book Practical Hypermedia: An Introduction to HyTime. The draft incorporates 1) a "new and improved HTML version with useful navigation aids, working cross references, and hyperlinks to the standard itself; 2) an update of the first five chapters to reflect the final text of ISO/IEC 10744:1997, through Hyperlinking; 3) an updated summary of changes for HyTime Second edition (Appendix B in the volume), which you can also find at http://www.hytime.org/papers/hytime-2ed-soc.html." [adapted] HyTime users will recognize the significance of this important reference work, and the value of the online draft version, for which the author now solicits critical review and feedback.

[CR: 19960418]

Kimber, W. Eliot. Re-Usable SGML: A Plea for SUBDOC. Passage Systems Technical Paper. A poster session first presented at SGML '95. Cupertino, CA: Passage Systems, Inc., 1995-1996. Extent: approximately 6 pages. Author's affiliation: Passage Systems, Inc.

Summary: "External general text entities are not generally re-usable because: (1) IDs and Entity names not guaranteed unique; (2) Fragments cannot be validated in isolation; (3) No SGML-defined structural constraints on general text entities . . . Subdocument entities eliminate the re-use problems inherent in general text entities because they are themselves complete documents."

Available online: "Re-Usable SGML: A Plea for SUBDOC" (W. Eliot Kimber); [SGML version]; [mirror copy]

[CR: 19961226]

Kimber, W. Eliot. "Re-Usable SGML: Why I Demand SUBDOC." Pages 431-440 in SGML '96 Conference Proceedings. Celebrating a Decade of SGML. SGML '96 Conference, Boston, MA, November 18-21, 1996. Sponsored by The Graphic Communications Association (GCA). [Edited by] Conference Co-Chairs: B. Tommie Usdin and Deborah A. Lapeyre. Alexandria, VA: GCA, 1996. Extent: 711 pages. Author's affiliation: Highland Consulting Division of ISOGEN International Corporation, Dallas, TX.

Abstract: "This paper discusses the issues of SGML re-use and shows why they can only be solved generally through the use of subdocuments. The paper explores the following general issues:

General text entities are not re-usable
How to enable interoperation of documents with possibly different document types?
How to effect the cross-document addressing needed when a single document is composed of many subdocuments?

The SGML standard only defines two object types that can have independent existence: documents and subdocuments. Thus it is clear that only documents and subdocuments can be reliably re-used. In particular, external general text entities are not useful candidates for general re-use. My plea then is for tools to add the functions necessary to support the use of subdocuments for the re-use of semantic fragments. For most applications, such as browsers, this means treating the content of subdocument entities as though it had occurred in a general text entity for the purpose of processing (not parsing). For parsers, it means providing a mechanism to either parse multiple documents in parallel or to suspend the parsing of the parent document while the subdocument is parsed and then integrating the parsing result of the subdocument with the data resulting from the parsing of the parent document. For editors, it means allowing the declaration and editing of subdocument entities. Editors, in particular, may also need to provide ways to define constraints on what document types or architectures are to be allowed for subdocuments in specific application environments (families of DTDs).

I think that these conventions provide a clear and simple way to make the use of subdocuments in general less problematic and more fruitful. The full promise of SGML cannot be realized until the problem of fragment re-use is solved and I am firmly convinced that subdocuments are the key to that solution."

See the online version of the paper: "Re-Usable SGML: Why I Demand SUBDOC", SGML '96 presentation by W. Eliot Kimber of ISOGEN International Corp.; [mirror copy]. An SGML version is also accessible via the ISOGEN server, as well as a package containing HyBrowse styles and instructions for using HyBrowse.

Note: The above presentation was part of the "SGML Expert" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.

[CR: 19960418]

Kimber, W. Eliot. SGML Document Management. Passage Systems Technical Paper. Cupertino, CA: Passage Systems, Inc., 1995. Extent: approximately 12 pages. Author's affiliation: Passage Systems, Inc.

"As SGML moves into the main stream as a preferred information representation method, enterprises are faced with the problem of managing their SGML data. . .Attempts to apply information management techniques borrowed from relational databases and program source code management have largely failed. Relational databases are inappropriate because they are intended for data that breaks down well into small, discrete units that organize into tables, which documents largely do not do. Program code management systems fail because documents are generally not record-oriented, complicating the problems of change tracking and management that largely depend on the record-oriented nature of most programing languages. . .SGML holds the potential to solve some of these problems. . ." [from the Introduction]

Available online: "SGML Document Management" (W. Eliot Kimber); [SGML version]; [mirror copy]

[CR: 19971125]

Kimber, W. Eliot. "Tastes Great - Less Filling: SGML for the 21st Century." Page(s) 333 in SGML '97 Conference Proceedings. SGML Europe '97. "The Next Decade - Pushing the Envelope." Princesa Sofia Intercontinental, Barcelona, Spain. 11-15 May, 1997. Sponsored by Graphic Communications Association (GCA) and SGML Open. Conference Chair: Pamela L. Gennusa (Director, Database Publishing Systems Ltd). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 342 pages, CDROM. Author's affiliation: Senior SGML Consultant, Highland Consulting, Dallas, TX.

Abstract: [for the Closing Keynote address] "With developments like the World Wide Web, intranets, and increased focus on standardization by major software vendors, SGML and its related standards are being revised and enhanced to reflect new technologies and new requirements. This presentation looks at recent events--including the publication of the DSSSL standard, the HyTime Technical Corrigendum, and the XML specification--and projects the trends they represent into the future of SGML. The major trends are more functionality at a lower cost of entry, providing greater overall value."

The printed version of the presentation is available online in SGML format: see W. Eliot Kimber's Closing Keynote Address: "Tastes Great - Less Filling: SGML For the 21st Century."; see also the index page at ISOGEN for the slides and other formats. URL: slides and paper, .ZIP archive; [local archive copy].

Note: The electronic conference proceedings in hypertext were produced by Inso Corporation (DynaText) and by High Text (EnLIGHTeN). Information about the SGML Europe '97 Conference may be found in the main database entry.

[CR: 19960313]

Kimber, W. Eliot. Why I Want the SGML LINK Feature. Presentation at the SGML '95 Conference, December 1995. Cupertino, CA: Passage Systems, 1995. Extent: approximately 6 pages. Author's affiliation: Passage Systems.

"Abstract: The paper discusses why the SGML LINK feature is important, the role it plays in the standardization and interchange of document processing information. Also discusses the specific ways in which LINK can be used effectively. Proposes a simple convention for the use of LINK. Discusses the relationship between LINK and other style and transformation specification mechanisms, such as DSSSL."

Available online in HTML format: [mirror copy, partial links].

[CR: 19970817]

Kimber, W. Eliot; Woods, Julia A. "Application of HyTime Hyperlinks and Finite Coordinate Spaces to Historical Writing, Analysis, and Presentation." Pages 603-613 (with 15 references) in Structured Information/Standards for Document Architectures. Edited by Elisabeth Logan and Marvin Pollard. = Journal of the American Society for Information Science, Special Issue. Volume 48, Number 7 (July 1997). New York: John Wiley & Sons Inc., 1997. ISSN: 0002-8231. Authors' affiliation: [Kimber]: Isogen International Corporation, 2200 N. Lamar, Dallas, TX 75202; Email: eliot@isogen.com; WWW: http://www.drmacro.com/; [Woods]: Department of History, University of Texas, Austin 2608, Pinewood Terrace, TX 78757.

Abstract: "Defines a method of using the constructs defined by the HyTime standard (ISO/IEC 10744:1992) to both structure scholarly writing by capturing the abstract relationships within it and to affect its presentation in ways that express those relationships through the use of dynamic multimedia presentations. The design assumes that the data to be accessed comes from an essentially unbounded set of networked resources, rather than from a self-contained database. By using HyTime, the design separates the logical structuring and abstract functional definition of the system from any specifics of implementation, including details of data location and access, with the specific goal of enabling interchange of both structured source data and presentation specifications among disparate systems, or implementations of the same basic system, while also enabling the use of the data by other SGML or HyTime applications for other unanticipated uses."

See the main document entry for the complete list of articles and contributors, as well as other bibliographic information.

Kimura, Gary Dean. "A Structure Editor for Abstract Document Objects." IEEE Transactions on Software Engineering SE-12/3 (March 1986) 417-435. 13 references. Author affiliation: Digital Equipment Corp., Bellevue, WA, USA.

Abstract: The author presents an interactive document editor based on an expressive abstract document model for paper and electronic documents. The model introduces the notions of abstract and concrete objects, hierarchical composition of ordered and unordered objects, sharing of components, and reference links. It has been used to specify a wide variety of document objects, and is the basis for a document processing system that allows its users to edit the logical structure of a document using specific structure editing commands. This system introduces two new ideas. The first involves computational objects; each object can be programmed to generate its own unique view of the document, and each of these views can be displayed in a separate window on the screen. The second involves multiple windows to display the document structure. The windows are arranged hierarchically as sets and sequences, depending on the composite structure of the document. This system is used for both editing and viewing documents.

[CR: 19971206]

King, P. R. "A Logic Based Formalism for Temporal Constraints in Multimedia Documents." Pages 87-101 (with 11 references) in Principles of Document Processing. Proceedings of the Third International Workshop. PODP '96, Third International Workshop. Palo Alto, California. September 23, 1996. Edited by Charles Nicholas (Department of Computer Science and Electrical Engineering, UMBC, Baltimore, MD) and Derick Wood (Department of Computer Science, HKUST, Clear Water Bay, Kowloon, HONG KONG). Lecture notes in artificial intelligence. Lecture notes in computer science, 1293. Berlin / London: Springer-Verlag, 1997. ISBN: 354063620X. Author's affiliation: .

Abstract: "This paper describes how an executable interval temporal logic may be used as a formalism for specifying and manipulating temporal constraints among objects within multimedia documents. The paper presents a taxonomy of such constraints, based in part upon the functionality of existing systems such as the HyTime standard, Firefly and MHEG. It then shows, largely by a series of examples, how each of the elements of this taxonomy can be accommodated in this formalism. It also suggests how this formalism could assist the author in modelling and testing such sets of temporal constraints, and hence serve as an aid in prototyping such documents."

[CR: 19961018]

King, P. R. "Modelling Multimedia Documents." Pages 95-110 (with 15 references) in EP '96. Proceedings of the Sixth International Conference on Electronic Publishing, Document Manipulation and Typography. [ = Journal Special Issue: Electronic Publishing - Origination, Dissemination and Design (EPODD), June & September 1995, Volume 8, Issues 2-3.] Sixth International Conference on Electronic Publishing, Document Manipulation and Typography, Palo Alto, California. September 24-26, 1996. Sponsored by Adobe Systems Incorporated; School of Information Management and Systems, University of California at Berkeley; Xerox Corporation. [Proceedings Volume] Edited by Allen Brown, Anne Brüggemann-Klein, and An Feng; [Journal] Editors David F. Brailsford and Richard K. Furuta. Chichester/ New York: John Wiley & Sons, 1996. ISSN: 0894-3982. Author's affiliation: Department of Computer Science, The University of Manitoba, Winnipeg, Manitoba, Canada R3T 2N2. Email: prking@cs.umanitoba.ca.

Abstract: "This paper discusses the need for models for multimedia documents and describes a particular formal model. The model makes use of an executable Interval Temporal Logic as its basis. The paper describes how temporal constraints among media items may be specified for subsequent manipulation and for use in prototyping. In particular, it uses the powerful notion of interval projection, both as a device for specifying variable display rates for media items and also for providing a scripting mechanism. The paper also outlines how this model may be used as the basis of an authoring tool for such documents."

The article focuses upon document modelling and authoring systems designed to take advantage of formal models. The author's interest extends beyond the notion of attribute grammar, which serves as the core formalism in the authoring of structured documents in SGML and HyTime; he seeks to develop a model which formally addresses manipulation, including temporal aspects of authoring.

For other conference information, see the main conference entry for EP '96, or the brief history of the conference as sixth in a series since 1986. See the volume main bibliographic entry for a linked list of other EP '96 titles relevant to SGML and structured documents.

[CR: 19961226]

Kinney, Diane. "Reengineering SGML Implementation: Second-Generation SGML Systems." Pages 579-582 in SGML '96 Conference Proceedings. Celebrating a Decade of SGML. SGML '96 Conference, Boston, MA, November 18-21, 1996. Sponsored by The Graphic Communications Association (GCA). [Edited by] Conference Co-Chairs: B. Tommie Usdin and Deborah A. Lapeyre. Alexandria, VA: GCA, 1996. Extent: 711 pages. Author's affiliation: Director, Data Management, West Information Publishing Group, Thomson Legal Publishing, The Aqueduct Building, 50 Broad Street East, Rochester, New York 14626, USA. Tel: +1 716-327-6211; FAX: +1 716-327-690; Email: dgkinney@lcp.com; WWW: http://www.lcp.com/ (Lawyers Cooperative Publishing).

Abstract: "Thomson Legal Publishing has re-engineered aging SGML-based systems to meet current needs. Tools were chosen from solid companies that did not expose the SGML to users, did not restrict the use of SGML in any way, that have the capacity to emulate structure and that have API's. Users now work in an environment that does not force them to place thirty elements/attributes in the data to enter one judicial case citation. Instead, a couple of clicks of the mouse, and in goes the case cite. Our savings in output processing have been enormous; a process that used to cost $18.00/page now costs $0.95 per page. The system's simplicity from the user's point of view will be demonstrated, and the complexity of the data created and the resulting flexible output will be shown."

Note: The above presentation was part of the "SGML Case Studies" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.

[CR: 19970607]

Kipp, Neill A. "Button Bars, Sticky Notes, Hep Cats, and Ripple: Using HyTime Hyperlinking to Make Lasting Intranet Relationships." <TAG>: The SGML Newsletter 10/5 (May 1997) 11-14. ISSN: 1067-9197. Author's affiliation: Virginia Polytechnic Institute and State University, Electronic Thesis and Dissertation Project.

[CR: 19971227]

Kipp, Neill A. "Case Study: Digital Libraries with a Spatial Metaphor. Do You Remember Your First Visit to the Library?" Pages 631-640 in SGML/XML '97 Conference Proceedings. SGML/XML '97. "SGML is Alive, Growing, Evolving!" The Washington Sheraton Hotel, Washington, D.C., USA. December 7 - 12, 1997. Sponsored by the Graphic Communications Association (GCA) and Co-sponsored by SGML Open. Conference Chairs: Tommie Usdin (Chair, Mulberry Technologies), Debbie Lapeyre (Co-Chair, Mulberry Technologies); Michael Sperberg-McQueen (Co-Chair, University of Illinois). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 691 pages, CDROM; print volume contains author and title indexes, keyword and acronym lists. Author's affiliation: [Neill A. Kipp]: Virginia Polytechnic Institute and State University, Department of Computer Science, 660 McBryde Hall, Blacksburg, Virginia USA 24061; Email: nkipp@vt.edu; WWW: http://etd.vt.edu/~nkipp/.

Abstract: "As the Web grows wildly, so do archives of electronic documents. Unfortunately, while novice computer users have a right to electronic information, they are often ill-equipped to master the intricacies of boolean search, SQL, and natural language interfaces. Furthermore, novice users are not versed in how to navigate Web topologies effectively, like determining possible work-arounds for '404: Web page not found' errors. With spatial-oriented user interfaces built directly from SGML / HyTime document databases, we can utilize users' a priori knowledge of information spaces. Users everywhere can enjoy navigating the epitome of well-tended hypermedia databases: digital libraries."

This paper was delivered as part of the "Case Studies" track in the SGML/XML '97 Conference.

See the "SGML '97 Talk Slides" provided by the author, and other information on the Digital Library 3D Interface Project.

[CR: 19970228]

Kipp, Neill A. Document Type Definition for Electronic Theses and Dissertations. Technical Report, Virginia Tech Graduate School [With Edward A. Fox, John L. Eaton, and Gail McMillan]. Blacksburg, Virginia: Virginia Tech Graduate School, November 9 1996. Extent: approximately 18 pages. Author's affiliation: Virginia Polytechnic Institute and State University, Electronic Thesis and Dissertation Project, Department of Computer Science, 660 McBryde Hall, Blacksburg, Virginia 24061, USA. Email: nkipp@vt.edu; WWW: http://etd.vt.edu/etd/.

Abstract: "The Virginia Tech Graduate School requires a specific form for the submission of Electronic Theses and Dissertations (ETDs) to maintain the consistency of these complex documents. The formal statement of these guidelines serves graduate students submitting ETDs, the faculty with whom they work, and scholars who study the submitted ETDs. We defined a Document Type Definition (DTD) in the Standard Generalized Markup Language (SGML) for the representation of ETDs, a logical choice for encoding complex electronic documents. To build the DTD, we analyzed constructs in existing theses and dissertations and studied the rules for their submission. Here we present definitions, annotations, and rationale for each document construct, and we explain the connection of the document constructs into an integrated DTD."

Available online: Document Type Definition for Electronic Theses and Dissertations, by Neill A. Kipp; [mirror copy]

[CR: 19970207]

Kipp, Neill A. "A Five-Year Portent: It's the End Tag of the World as We Know It." <TAG>: The SGML Newsletter 10/1 (January 1997) 1-4. ISSN: 1067-9197. Author's affiliation: Department of Computer Science, Virginia Tech, Blacksburg, Virginia.

The author speculates on industry trends and the future role of SGML in the coming years.

[CR: 19980719]

Kipp, Neill A. "A Form is A Link and Other Adventures in Hyperspace." <TAG>: The SGML Newsletter 11/7 (July 1998) 7-9. ISSN: 1067-9197. Author's affiliation: Department of Computer Science, Virginia Tech, Blacksburg, Virginia.

The author builds upon an earlier essay in <TAG> in which he showed how one might conceive of an HTML document as "an n-ary hyperlink." Here, he explains how the "everything is a link" paradigm can facilitate the design of interactive information systems. The NDLTD (Networked Digital Library of Theses and Dissertations) project uses the notion of "form-as-a-link."

[CR: 19970306]

Kipp, Neill A. "Growing Your Infra-SSStructure to SSSupport Global MarketSSS with DSSSL." <TAG>: The SGML Newsletter 10/2 (February 1997) 8-12. ISSN: 1067-9197. Author's affiliation: Department of Computer Science, 660 McBryde Hall, Blacksburg, Virginia 24061, USA. Email: nkipp@vt.edu; WWW: http://etd.vt.edu/etd/.

The author overviews DSSSL's main features and then suggests how these features assist information providers with the goal of delivering SGML-encoded data into international markets.

For more information on DSSSL (Document Style Semantics and Specification Language), see the main entry in the SGML/XML Web Page.

[CR: 19980301]

Kipp, Neill A. "Liberating Your Links and Flying First Class." <TAG>: The SGML Newsletter 11/2 (February 1998) 5-8. ISSN: 1067-9197. Author's affiliation: Department of Computer Science, 660 McBryde Hall, Blacksburg, Virginia 24061, USA. Email: nkipp@vt.edu; WWW: http://etd.vt.edu/etd/.

The author discusses the notion of "link" in the context of HTML and XML documents, and in terms of HyTime link concepts.

[CR: 19981007]

Kipp, Neill A. "MetaStructures 1998 Conference [Report]." <TAG>: The SGML Newsletter 11/9 (September 1998) 6-10. ISSN: 1067-9197. Author's affiliation: .

Kipp provides a detailed conference report for the MetaStructures 1998 Conference (August 17 - 19, 1998) held at Le Centre Sheraton Hotel, Montréal, Québec, Canada. The conference was hosted by GCA, and chaired by Steve Newcomb (TechnoTeacher) and Carla Corkern (ISOGEN International Corp). Papers from some of the (more than twenty) presentations will be made available online from http://www.hytime.org/. "In summary," Kipp writes, "MetaStructures '98 had more attendees discussing deeper issues than last year. It had more live design and product demos than any year before. Even so, the ideas of interoperable metastructures are still on the 'bleeding edge' of technology. Therefore, if you feel that the SGML/XML trade shows are secret plots to numb your mind, then next year's MetaStructures will provide the intellectual stimulation you absolutely need."

[CR: 19950925]

Kipp, Neill A. "The Second International HyTime Conference." <TAG>: The SGML Newsletter 8/9 (September 1995) 1-13. ISSN: 1067-9197. Author's affiliation: Kipp Software Services, and Virginia Tech.

The author gives a detailed summary of the presentations made at the second HyTime conference, August 16-17, 1995, in Vancouver, British Columbia.

[CR: 19970207]

Kipp, Neill A. "SGML '96: Ten Years and Kicking Like a Reindeer." <TAG>: The SGML Newsletter 9/12 (December 1996) 1-9. ISSN: 1067-9197. Author's affiliation: Department of Computer Science, Virginia Tech, Blacksburg, Virginia.

The author supplies a thorough review of the SGML '96 Conference. In product news there is (a) OmniMark V3, enabling Internet transaction servers; (b) AIS Balise Double-Byte Edition, with native support for Unicode, JIS or other large character sets; (c) Stilo Structured Document Editor, with XML support; (d) ArborText's ADEPT Release 7, which will bring the Windows version into sync with the UNIX version 6 platform, add more Asian support, and offer Visual ACL. Other highlights covered in Kipp's report: XML, SGML revision ("SGML '97"), DSSSL, "semantic" DTDs for mathematics, HyTime tools.

[CR: 19961226]

Kipp, Neill A. "SGML Usability and DTD Design." Pages 419-430 in SGML '96 Conference Proceedings. Celebrating a Decade of SGML. SGML '96 Conference, Boston, MA, November 18-21, 1996. Sponsored by The Graphic Communications Association (GCA). [Edited by] Conference Co-Chairs: B. Tommie Usdin and Deborah A. Lapeyre. Alexandria, VA: GCA, 1996. Extent: 711 pages. Author's affiliation: Virginia Polytechnic Institute and State University, Electronic Thesis and Dissertation Project, Department of Computer Science, 660 McBryde Hall, Blacksburg, Virginia 24061, USA. Email: nkipp@vt.edu; WWW: http://etd.vt.edu/etd/.

Abstract: "SGML is the logical choice for encoding electronic documents, and Virginia Tech encourages (and will later require) students to submit Electronic Theses and Dissertations (ETDs) in SGML. Our DTD must work with translators as well as be usable for students preparing SGML directly. A usability test for tagging ETDs according to our DTD involved teaching SGML-novice graduate students to code using our DTD, observing them tagging their own documents, and having them narrate their thoughts during the process. Our results show that subjects require high-quality system documentation (replete with examples of correct usage), that learning to author the simplest hypermedia in SGML is inherently nonintuitive, and that our line-edited, batch-processed ETD formatting system is easy to use.

This work was funded in part by the Southeastern Universities Research Association (SURA) 1996 project, 'Development and Beta Testing of the Monticello Electronic Library Thesis and Dissertation Program'."

More detailed information on the Electronic Theses and Dissertations project may be found in the SGML/XML Web Page; or see http://etd.vt.edu/etd/. See especially the brief project description [mirror copy], and a related write-up in the September 1996 issue of D-Lib Magazine [mirror copy, December 1996]

Note: The above presentation was part of the "SGML Expert" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.

[CR: 19971014]

Kipp, Neill A. "Tool-Time With a HyTime Paradigm." <TAG>: The SGML Newsletter 10/9 (September 1997) 1-4. ISSN: 1067-9197. Author's affiliation: Virginia Polytechnic Institute and State University, Electronic Thesis and Dissertation Project.

The author provides an in-depth review of the 1997 International Conference on the Application of HyTime (IHC '97, August 19 - 20, 1997, Quebec). The article contains a sidebar on HL7 (Health Level 7) and the proposed Kona architecture, which used HyTime constructs.

Note that the author's own paper for IHC '97 "HyTime Engine Peer-Peer Protocol. HEP Cats Jam Java in the Digital Library" is available as part of the online conference record/proceedings; [archive copy].

[CR: 19971230]

Kipp, Neill A. "Will XML-Linking Be Useful for Me - As I'm Building My Very Own Digital Library?" <TAG>: The SGML Newsletter 10/12 (December 1997). ISSN: 1067-9197. Author's affiliation: Department of Computer Science, Virginia Tech, Blacksburg, Virginia; Email: nkipp@vt.edu; WWW: http://etd.vt.edu/~nkipp/.

The author discusses the relationship of XLL [Extensible Linking Language] to other standards, reviewing the major linking facilities proposed in the draft specification. For the context of the presentation (linking within digital libraries), see also the "SGML '97 Talk Slides" provided by the author, and other information on the Digital Library 3D Interface Project.

[CR: 19970918]

Kirschenbaum, Matthew G; Fox, Ed. "Electronic Theses and Dissertations in the Humanities." Pages 102 - 104 in ACH-ALLC '97. The 1997 Joint International Conference of the Association for Computers and the Humanities and the Association for Literary and Linguistic Computing. Conference Abstracts. ACH-ALLC '97. Queen's University at Kingston, Ontario, Canada. June 3 - 7, 1997. Compiled by Greg Lessard and Michael Levison. Ontario, Canada: Queen's University, 1997. ISBN: 0-88911-760-8. Authors' affiliation: [Kirschenbaum]: University of Virginia, Email: mgk3k@virginia.edu; [Fox]: Virginia Polytechnic Institute and State University (Virginia Tech), Email: fox@vt.edu.

[Extract:] "The rationales for Virginia Tech's ETD project include: (1) Preparing graduate students for their professional careers by training them in the use of digital libraries and introducing them to electronic publishing; (2) Promoting collaboration between graduate research programs at separate universities by making graduate scholarship visible and accessible via a network archive; and, (3) More efficient use of the university's library and administrative resources. The channels Virginia Tech has established to guide the finished thesis or dissertation from the student's personal computer to the offices of the graduate school and to the library's on-line archive will be reviewed, stressing that an important component of the Virginia Tech project has been to develop potential models (as opposed to absolute standards) for ETD production elsewhere. Also to be discussed are document formats for the completed ETD (PDF and SGML), multimedia applications, and the archiving of the ETD with UMI. Finally, a statistical analysis of existing Virginia Tech ETDs will be presented. . ."

Abstract available online in HTML format: "Electronic Theses and Dissertations in the Humanities", by Matthew G. Kirschenbaum, Ed Fox; [archive copy]. See the earlier presentation: "Electronic Publishing and Doctoral Dissertations in the Humanities", by Matthew G. Kirschenbaum, 1996 Convention of the Modern Language Association, Washington DC. [archive copy]. See more information on the Electronic Thesis and Dissertation Project and related efforts in the dedicated database entry, and the August-September 1997 discussion, or Electronic Theses and Dissertations in the Humanities: Directory of Resources, by Matthew G. Kirschenbaum.

Additional information on the ACH-ALLC '97 Conference is available in the SGML/XML Web Page main conference entry, or [August 1997] via the Queen's University WWW server.

[CR: 19970314]

Kiser, Betsy "." SGML Users' Group Bulletin 3/2 (1988) 55-61. ISSN: 0269-2538. Author's affiliation: Online Computer Library Center (OCLC).

The author describes the components of EIDOS -- "Electronic Information Delivery Online System." EIDOS is part of OCLC's endeavor to deliver resources to libraries. Some 6000 libraries can access the EIDOS database, which was set up using SGML, following the AAP DTD. The article also overviews the Twayne project in which 150 volumes are being encoded in SGML (AAP) for the G. K. Hall Twayne Series.

The article is based upon a paper presented at Markup '88, Ottawa, 24-26 May 1988.

Klas, Wolfgang; Aberer, Karl; Neuhold, Erich J. "Object-Oriented Modeling for Hypermedia Systems Using the VODAK Modeling Language (VML)." Pages xxx-xxx (with 40 references) in Advances in Object-Oriented Database Systems, edited by A. Dogac, T. Ozsu, A. Biliris, and T. Sellis. NATO ASI Series. Berlin/Heidelberg: Springer Verlag, 1994.

Abstract: This paper introduces the key principles and main features of the VODAK Model Language VML. This includes the standard concepts in object-oriented data modelling needed in the subsequent discussion, like objects, classes, types, inheritance and methods. Then the application of VML for modelling hypermedia documents is discussed. Advanced features in order to tailor data models towards particular application scenarios, that will be needed in order to provide adequate models for hypermedia documents, are introduced. Among these are metaclasses, parametrized object types, semantic relationships and dynamic method delegation. Modelling primitives are designed to model typed hierarchical and hypertext document structures. Particular attention is paid to provide operations that maintain consistency by observing the compositional constraints of document types.

Available in Postscript format as P-93-15.ps.Z from the GMD-IPSI FTP server.

[CR: 19971113]

Klein, Bertin; Fankhauser, Peter. "Error tolerant Document Structure Analysis." GMD-IPSI Technical Report. Darmstadt, Germany: GMD-IPSI [Integrated Publication & Information Systems Institute], 1997. 16 pages. Authors' affiliation: GMD-IPSI; Email contact: Bertin.Klein@gmd.de.

Abstract: "Successful applications of digital libraries require structured access to sources of information. This paper presents an approach to extract the logical structure of text documents. The extracted structure is explicated by means of SGML (Standard Generalized Markup Language). Consequently, the extraction is achieved on the basis of grammars that extend SGML with recognition rules. From these grammars parsing automata are generated. These automata are used to partition a flat text document into its elements, to discard formatting information, and to insert SGML markups. Complex document structures and fallback rules needed for error tolerant parsing strategy has been developed that ranks and prunes ambiguous parsing paths."

The document is available online in Postscript format: ftp://ftp.darmstadt.gmd.de/pub/dimsys/reports/P-97-05.ps.Z; [local archive copy]. A version of this paper was published in International Journal on Digital Libraries Volume 1, Number 4 (1997).

[CR: 19951113]

Klensin, John C. Defining Tags. Rationale and Proposed Rules. INFOODS Working Paper, Number IS N16. Cambridge, MA: INFOODS / United Nations University, Revised March 13, 1987. Extent: 5 pages.

Drafted September 17, 1986, Revised March 13, 1987. See other documents from the INFOODS project (by Klensin and Romberg) which explain this early example of database use of SGML.

[CR: 19951113]

Klensin, John C; Feskanich, D; Lin, V.; Truswell, A. S.; Southgate, D. A. T. Identification of Food Components for INFOODS [International Network of Food Data Systems] Data Interchange. Food and Nutrition Bulletin, Supplement volume 16. Tokyo, Japan: The United Nations University, 1989. ISBN: 928080734X.

Author's annotations from a personal note: "This document provides an example (although we slightly disguised it to make it more accessible to nutritionists) of the application of fairly deep semantics to SGML. It ... all fits a simplified model of the syntax. There are two meta-rules on the grammar ... (i) With one exception, in the outermost tag, we solved the 'attributes vs nested tags' problem with a firm 'no attributes' rule. (ii) Beyond a specific level (defined semantically and determinable lexically by nesting depth), there is a firm rule that any GI that requires an end-tag is spelled with a trailing slash and any GI that does not permit an end-tag is spelled without one. As I think I mentioned earlier, we don't allow minimization or anything else that is permitted but not required at the lexical level. There are elements that need not appear at all (almost all of them), but that is another issue." John C. Klensin is Director, INFOODS (International Network of Food Data Systems) Secretariat, and Chairman of the Standards Committee for ACM. Address: Massachusetts Institute of Technology; Room N52-457; 77 Massachusetts Avenue; Cambridge, MA 02139; 617-253-8004; FAX 617-491-6266; TELEX 921473 MITCAM. See other INFOODS publications under the name Roselyn Romberg.

UN document identifiers: United Nations sales no. E.89.III.A.8. WHTR-14/UNUP-734.

[CR: 19951113]

Klensin, John C. INFOODS [International Food Data Systems Project] Food Composition Data Interchange Handbook. Tokyo, Japan: United Nations University Press, 1993. Extent: 165 pages, bibliography (160-163), index. ISBN: 9280807749. Author's affiliation: United Nations University.

The International Food Data Systems Project (INFOODS) is part of the United Nations University's Food and Nutrition Programme, and uses SGML in the organization of information collected on nutrient composition of foods worldwide. The handbook provides guidelines on the organization and content of food composition tables and databases; it also specifies procedures for the accurate international interchange of such SGML-structured data.

This book will be of interest to researchers designing SGML database applications. In addition to details of implementation specific to food data interchange, considerable thought has been given to problems of meta data (including space, tab and line breaks), use of data formulas, non-ISO 646 character sets, and sub-structuring of textual objects that are needed in many databases (e.g., postal addresses, email addresses). Some interesting work-arounds are also implemented, and will be of theoretical interest to researchers applying the SGML standard to databases.

This handbook supersedes a number of other technical and working papers which nevertheless may be of historical interest: (1) John C. Klensin, Intermediate Structural Tags. Working Paper INFOODS/IS N34. Cambridge, MA. December 7, 1987. 3 pages; (2) John C. Klensin, A New Structural Tag Category -- Derived Measures. Working Paper INFOODS/IS N32. Cambridge, MA. 87.11.16. 2 pages; (3) John C. Klensin, Syntax and Semantics. INFOODS Data Interchange Scheme. Working Paper INFOODS/IS/N 6. Cambridge, MA 85.12.05. 17 pages; (4) Roselyn M. Romberg, Additional Discussion of the Interchange Scheme. Summary of Conclusions from Review of 'INFOODS/IS N6'. Working Paper INFOODS/IS N17. Cambridge, MA. 78.06.17, revised 87.07.10. 5 pages; (5) Roselyn M. Romberg and John C Klensin, Initial Root (Structural) Tag List. Working Paper INFOODS/IS N15. Cambridge, MA. 87.03.17, revised 87.06.19, final 87.07.16. 6 pages.

John Klensin [1992] is Chairman of the Standards Committee for the ACM (Association for Computing Machinery), thus serving on the board that oversees all Information Technology and Information Sciences standards activities in the US. He has also chaired one of the X3 Technical Committees in the language area for many years, and maintained formal liaison to X3J6 for a long time while what is now SGML was under development. Contact [current 1992]: John C. Klensin; Director, INFOODS Secretariat; Massachusetts Institute of Technology; Room N52-457; 77 Massachusetts Avenue, Cambridge, MA 02139; Tel: 1 617 253-8004; FAX: 1 617 491-6266; Telex: 6502688345; MCI Cable: MITCAM.

Volume summary: Pt. I. Introduction and Overview. 1. Introduction to the Interchange System. 2. Technical Overview. 3. Introduction to the Reference Material -- Pt. II. The Reference Sections. 4. The Header Elements. 5. The Food Element and Subelements. 6. Data Values and Data Description -- Pt. III. Processing Data and Interchange Files. 7. Registering Elements. 8. Conversion of Data to Interchange Format. 9. Conversion of Data from Interchange Format -- Appendix A: Registered International Food Record Identifiers -- Appendix B: Element Registration Form.

UN Identifiers: WHTR-16/UNUP-774. United Nations sales no. E.91.III.A6

Klensin, John C. "When the Metadata Exceed the Data: Data Management With Uncertain Data." Statistics and Computing 5/1 (March 1995) 73-84. 22 references. Author's affiliation: United Nations Univ., Boston, MA, USA.

"Abstract: With many types of scientific data, the amount of descriptive and qualifying information associated with the data values is quite variable and potentially large compared with the number of actual data values. This problem has been found to be particularly acute when dealing with data about the nutrient composition of foods, and a system-based on textual markup rather than, for example, the relational model-has been developed to deal with it. This paper discusses the types of metadata encountered and the problems associated with dealing with them, and then describes this alternative approach. The approach described has been installed in several locations around the world, and is in preliminary use as a tool for interchanging data among different databases as well as local database management.

Kochtanek, T. R. "Standards for Full Text Document Storage." Pages 301-307 (with 17 references) in 15th National Online Meeting. Proceedings - 1994. National Online Meeting, New York, NY, USA, 10-12 May 1994. Sponsored by Learned Information. Edited by Martha E. Williams. Medford, NJ, USA: Learned Information, 1994. xii + 464 pages. Author affiliation: Missouri Univ., Columbia, MO, USA.

Abstract: Several models have been developed for designing and searching full text document databases. Standards, both proposed and official, have been developed to respond to questions relating to the structure, linkages, searching and transmission capabilities of full text databases. These standards include the open document architecture (ODA), the Standard Generalized Markup Language (SGML) and its derivatives, and the information retrieval service definition and protocol specification for library applications (ANSI/NISO Z39.50-1992). An overview of these standards is presented, along with their application to full text databases. The paper concludes with a challenge for information professionals to participate in the development and refinement of such encoding standards.

[CR: 19950716]

Koegel, J. F.; Rutledge, Lloyd W.; Rutledge, J. L.; Keskin, C. "HyOctane: a HyTime Engine for an MMIS." Pages 129-136 (with 7 references) in Proceedings ACM Multimedia '93 [Proceedings of the First ACM International Conference on Multimedia, Anaheim, CA, USA, 2-6 August, 1993]. New York, NY: ACM, 1993. Authors' affiliation: Department of Computer Science, Massachusetts University, Lowell, MA, USA.

"Abstract: The authors are interested in the development of distributed multimedia information systems (MMIS) which use the HyTime international standard as the data model and interchange format. They have developed and implemented a prototype system in which interactive multimedia presentations can be stored and retrieved. Sample document instances are externally encoded in HyTime and stored in the database using the HyTime data model. The architecture and operation of the system are presented. Issues related to using a HyTime engine for general multimedia presentation and interchange are discussed."

[CR: 19971227 MD: 19971229]

Kondrach, George. "DTD Testing: Find the Devils in the Details." Pages 133-142 in SGML/XML '97 Conference Proceedings. SGML/XML '97. "SGML is Alive, Growing, Evolving!" The Washington Sheraton Hotel, Washington, D.C., USA. December 7 - 12, 1997. Sponsored by the Graphic Communications Association (GCA) and Co-sponsored by SGML Open. Conference Chairs: Tommie Usdin (Chair, Mulberry Technologies), Debbie Lapeyre (Co-Chair, Mulberry Technologies); Michael Sperberg-McQueen (Co-Chair, University of Illinois). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 691 pages, CDROM; print volume contains author and title indexes, keyword and acronym lists. Author's affiliation: [George Kondrach]: Principal, ISOGEN International Corporation, 2200 N. Lamar St. Suite #230, Dallas, Texas 75202; Phone: +1 (214) 953-0004 ext. 106; FAX: +1 (214) 953-3152; Email: george@isogen.com; WWW: http://www.isogen.com.

Abstract: "After the decision to use SGML (or XML), the intense activities of Document Analysis, Information Modeling, and DTD Creation must follow. Many committed SGML/XML enthusiasts might announce a 'success' for their SGML/XML effort upon clearing these hurdles, and achieving the profound status of 'ownership of a valid DTD'. Many other casual SGML/XML adopters may 'simply download' a public domain DTD, and presume their success on the same basis, possession of a valid DTD. Any celebration of success is premature, however, until DTDs are subjected to, and pass, rigorous testing, which appraises their applicability to the production purposes for which they were intended."

[Conclusion:] "As with many of the activities necessary to deploy productive SGML and XML, testing is not glamorous, fun, or easy, but it is productive, rewarding, and economically sound. It has been frequently said of SGML/XML that "...standards based systems satisfy users who possess any technology adoption bias; strategists like the technology impact of standards on the organization, pragmatists like the sound economic metrics of standards, and conservatives like the safety and security that standards afford..." As these different markets demand different types of psychic, economic, and reassurance returns on their respective investments of trust, money, and discomfort, they also speak to their demand for meaningful testing by qualified SGML/XML implementors, before those systems are proffered for full-scale production!"

This paper was delivered as part of the "User" track in the SGML/XML '97 Conference.

[CR: 19971125]

Kraft, Matthias; Hohoff, Simon. "Information, Documents, and Products: Introducing a Data Repository to a Legal Publishing House." Pages 171-177 in SGML '97 Conference Proceedings. SGML Europe '97. "The Next Decade - Pushing the Envelope." Princesa Sofia Intercontinental, Barcelona, Spain. 11-15 May, 1997. Sponsored by Graphic Communications Association (GCA) and SGML Open. Conference Chair: Pamela L. Gennusa (Director, Database Publishing Systems Ltd). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 342 pages, CDROM. Authors' affiliation: Research and Development, Electronic Publishing, Verlag C.H. Beck, Germany.

Abstract: "This presentation describes how the information repository of a publishing house was integrated into the environment of the company. The attempt was made to combine the entity relationship approach of an SQL database and the document-driven approaches of SGML. This led to more than one SQL database with an identical microdocument architecture to store the information elements. This presentation closes with a view to the future plans of integrated composing of products with the microdocuments of the database."

"Introducing SGML to a conservative publishing house is a long way to go. In the case of C. H. Beck, the leading company for legal publications in Germany, the efforts were driven by the demands of a continuous growing market for electronic publications, on line as well as CD-ROM.

"Since information is the main business of a publishing company, to create an effective information repository was the first step to go. The efforts were driven into two different directions.

"On one hand the information, the sources and the publication process was structured in classic entity relationship models. The analysis brought three different information models (legislative documents, court decisions and intellectually authored texts) implicating three different databases. Two of three databases represent an entity relationship model of the information. The third database (storing the authored texts like books) is document driven and mirrors the structure of the source publication. To enable the best flexibility and an easy handling of the data, in each case the documents were broken apart into micro documents of almost the same class.

"On the other hand the source documents and the resulting publications where examined in order to create a DTD. The resulting DTD is divided into several modules, that represent overall document structures (books, journals, sections etc.) and modules to indicate detailed information (tables, highlighting etc.). the overall DTD is intended as an abstract model in order to derive various different process specific DTDs. Thus the detailed element model corresponds with the micro documents of the information repository. The global document structures are created by the export function of the databases.

"In the future there will be a combining project management system, which will enable the product manager to create publications containing micro documents of all three databases and an overall structure.

[CR: 19950716]

Kruger, M. "[in German; English translated title,] Technical Publication: Strategies and Systems." Nachrichten für Dokumentation 41/5 (October 1990) 285-289. Affiliation: MID/Inf. Logistics Group GmbH, Heidelberg, Germany.

"Abstract: Technical publications typically are complex, bulky, and have a long life. Their compilation, updating, and editing for different target groups can only be achieved with an efficient organisation. The shortage of time and budget requires automation and modularisation of the production process. To meet these requirements in the framework of a compound document processing architecture SGML (ISO 8879), a system independent description language of the document structure is a first important milestone. Others like e.g. DSSSL or SPDL currently are under development."

[CR: 19950804]

Kruse, Susan E. "Only 250 error messages! Teaching the TEI Guidelines to undergraduates." Electronic Texts and the Text Encoding Initiative [Special Issue] = TEXT Technology: The Journal of Computer Text Processing 5/3 (Autumn, 1995) 221-224. ISSN: 1053-900X. Author's affiliation: Research Unit in Humanities Computing, King's College London.

See the main entry for this special issue of TEXT Technology dedicated to the TEI, edited by Lou Burnard.

[CR: 19970726]

Kuikka, Eila. Processing of Structured Documents Using a Syntax-Directed Approach. Supervisors: Martti Penttonen, Matti Linna. PhD Thesis. University of Kuopio, Department of Computer Science and Applied Mathematics, Finland. Kuopio, Finland: Department of Computer Science and Applied Mathematics, Kuopio University, 1996. Extent: 76 pages + 4 Appendices.

Abstract: "Documents possess a natural structure which can be visualized when they are printed on paper. Traditionally, text processing systems do not support the structured creation of documents. Instead, the formatting specified by the author defines implicitly the structure. The system itself is not aware of this structure. However, if the structure could be defined in advance, the system could use it to direct the author to create documents with the correct structure. Further, the structure could then be used to produce different formats for the same content, to modify it and to search documents.

"Electronic document processing consists of the creation, updating, formatting, storage, retrieval and dissemination of documents. Systems currently used for various document management tasks have been built using different principles and different ad hoc methods. Due to their various approaches, these systems require the author to use different representations, different operations, and different processing philosophies. This can cause many problems later when authors need to convert from one electronic representation to another version with the same content. What an author really needs is a uniform document processing system where all these activities would exist. In addition, this system should be convenient and easy to use.

"This thesis investigates how a syntax-directed translation method, previously applied to programming languages, may be used as the basis of a processing system for structured documents. In this case, the grammar serves as a tool to define the structure of a document, what kinds of representations the user wishes, which documents the user wants to retrieve from a set of documents, and what form the new structure should take. The document processing is considered as transformations from one representation to another. The system uses grammars to generate tools, automatically or semi-automatically, used during various processing phases. Parse trees for grammars form the interface between tools. Theapproach has been tested by building a prototype of an integrated syntax-directed document processing system.

"The thesis shows that structured processing is a feasible tool for the manipulation and management of electronic documents, and that a syntax-directed approach can be used for the subtasks of document management. The approach forms a consistent and application independent basis to produce and disseminate different representations from the same content using either the same or modified structures. The results were developed with a relatively small set of documents. However, this approach may be used also for the large and expanding number of documents available in information sources on worldwide computer networks." [Universal Decimal Classification: 519.68; 519.7; 681.3.068]

Published also as: Kuopio University Publications C. Natural and Environmental Sciences 53, 1996. Public defense of the doctoral thesis was on November 22nd, 1996. See http://www.cs.uku.fi/~kuikka/thesis.html (bibliographic data), or the thesis online in Postscript format. [Thesis, archive copy.]

[CR: 19970726]

Kuikka, Eila; Penttonen, Martti. "Transformation of Structured Documents." Electronic Publishing: Origination, Dissemination and Design (EPODD) 8/4 (December 1995 [appeared July 1997]) 319-341. With 44 references. ISSN: 0894-3982. Authors' affiliation: [Kuikka:] Department of Computer Science and Applied Mathematics, University of Kuopio, PO Box 1627, 70211 Kuopio, Finland; Email: kuikka@cs.uku.fi; WWW: http://www.cs.uku.fi/~kuikka/index.html; [Penttonen:] Department of Computer Science, University of Joensuu, PO Box 101, 80101 Joensuu, Finland; Email: Martti.Penttonen@cs.joensuu.fi; WWW: http://www.cs.joensuu.fi/info/staff/penttonen/penttonen.html.

"Abstract: Many documents have a definable structure. Some document formatting systems, like the LaTeX formatter, use a structural notation. In recent years the general mark-up language SGML has gained popularity. In this paper we study the transformation of one structure to another. For example, technical journals have their structure definitions, and an article originally written for one journal must be restructured before it can be submitted to another journal. We assume that structure definitions are grammatical, and study the transformations that can be automated or at least semi-automated.

"We took a collection of computer science journals and compared their structure definitions. We classified differences as simple, local and global. As transformation techniques we studied syntax directed translation schemata and tree transducers. Our conclusion was that simple and local transformations can be automated or semi-automated, depending on whether additional information is needed, while global transformations are difficult to automate. Transformations were tested on our prototype syntax-directed document processing system. The system has one module for editing a document under one structure definition, and another module for changing a document from one structure definition to another."

[Paper received December 4, 1995; revised July 22, 1996.]

Earlier report, available online: Eila Kuikka and Martti Penttonen, Transformation of Structured Documents. University of Waterloo, Computer Science Department, Technical Report CS-95-46, 73 pages; URL: ftp://cs-archive.uwaterloo.ca/cs-archive/CS-95-46/. See also: Eila Kuikka, Processing of Structured Documents Using a Syntax-Directed Approach. Ph.D. Thesis. University of Kuopio, Department of Computer Science and Applied Mathematics, Finland. Kuopio University Publications C. Natural and Environmental Sciences 53, 1996. 76 pages + 4 Appendices. URL: http://www.cs.uku.fi/~kuikka/Thesis/thesis.ps.gz.

[CR: 19970726]

Kuikka, Eila; Penttonen, Martti. Transformation of Structured Documents. University of Waterloo Technical Report CS-95-46. Waterloo, Ontario, Canada: Department of Computer Science, University of Waterloo, October 1995. Extent: 73 pages, 6 appendices. Authors' affiliation: Eila Kuikka: Department of Computer Science, University of Waterloo, Canada; Email: ekuikka@watsol.uwaterloo.ca, or permanent address Email: kuikka@cs.uku.fi; Martti Penttonen: Department of Computer Science, University of Joensuu, Finland; Email: penttonen@cs.joensuu.fi.

"Abstract: Structure definitions of documents have been used successfully for inputting and formatting in text processing systems. This report considers transformations between different representations of structured documents and studies possiblities to extend the use of structure definitions to document transformations and to discover algorithmic methods for carrying out transformations. Documents are presented as parse trees for context-free grammars and transformations are made from parse tree to parse tree. First, the report describes differences of manuscript styles demanded by various scientific journals and presents a declarative classification for structure differences between two parse trees. Second, a set of tree transformation methods are described and their suitability for transformations between documents having a structure difference in each defined class is analyzed. For each class several methods may or must be used and only certain kinds of differences can be managed automatically. Finally, instead of designing a system where a method accommodates for all kinds of differences or where different methods are used in various transformations, the report presents a model for a document transformation system that presents a possibility of using various methods according to differences in document representations. The system is divided two modules. In the first one transformations are made automatically and they do not change the hierarchical structure of a document. In the second one transformations are made semiautomatically or nonautomatically and the hierarchical structure changes. Differences between the existing and the required representation of a document are analyzed and methods selected according to the classified differences."

Available online: ftp://cs-archive.uwaterloo.ca/cs-archive/CS-95-46/. [archive copy]

[CR: 19961018]

Kuikka, Eila; Salminen, Airi. "Filtering Structured Documents in the SYNDOC Environment." Pages 181-193 (with 29 references and 6 figures) in EP '96. Proceedings of the Sixth International Conference on Electronic Publishing, Document Manipulation and Typography. [ = Journal Special Issue: Electronic Publishing - Origination, Dissemination and Design (EPODD), June & September 1995, Volume 8, Issues 2-3.] Sixth International Conference on Electronic Publishing, Document Manipulation and Typography, Palo Alto, California. September 24-26, 1996. Sponsored by Adobe Systems Incorporated; School of Information Management and Systems, University of California at Berkeley; Xerox Corporation. [Proceedings Volume] Edited by Allen Brown, Anne Brüggemann-Klein, and An Feng; [Journal] Editors David F. Brailsford and Richard K. Furuta. Chichester/New York: John Wiley & Sons, 1996. ISSN: 0894-3982. Authors' affiliation: [Kuikka]: Department of Computer Science and Applied Mathematics, University of Kuopio, P.O. Box 1627, 70211 Kuopio, Finland. Office: Teknia-building, Savilahdentie 6A, 3rd floor, Room 3967; Phone: +358-71-162576, Fax: +358-71-162595. E-mail: kuikka@cs.uku.fi, Eila.Kuikka@cs.uku.fi; [Salminen]: Department of Computer Science and Information Systems, University of Jyväskylä, PO Box 35, 40351 Jyväskylä, Finland. Email: airi@cs.jyu.fi.

Abstract: "This paper describes the filtering approach for searching documents whose structure is defined by a grammar. The method is based on the theoretical model for defining filters to specify information interest of a user. It is employed to find documents in SYNDOC, a syntax-directed text processing system. The method is suitable, for example, for SGML and ODA documents. The user selects a grammar and indexes only documents for the selected grammar. A filter generated in a syntax-directed way using the grammar describes conditions for indexed documents integrating structure and content constraints. The user compares a filter with indexed documents, and either edits, browses or prints original documents using the selected output form. Indexed documents, filters and retrieved documents can be stored for further purposes."

Keywords: structured document, filtering, context-free grammar, parse tree.

[CR: 19961018]

Kuikka, Eila; Mykkänen, Jouni; Ryynänen, Arto; Salminen, Airi. Implementation of Two-dimensional Filters for Structured Documents in SYNDOC Environment. University of Joensuu, Department of Computer Science, Report Series A. Report A-1995-4. Joensuu, Finland: University of Joensuu, 1995. Extent: 31 pages, 12 references. ISBN: 951-708-383-1.

Abstract: "Filtering is used to select a subset, corresponding to the information interests of a user, from a set of information items. The information interests are described in a filter which is created to control the selection. In our earlier work we have described a theoretical framework for specifying filters to express content-based and structure- oriented constraints on structured text. In the filters, the information interests of the user are expressed by constraints and annotations on two-dimensional templates. The templates are created from the grammar associated with the structured text.

"This report describes a prototype for the filtering method in a syntax-directed document processing system called SYNDOC. In SYNDOC, a filter is applied to documents associated with a common grammar. The application of a filter means finding the documents that match the filter. From user's point of view, filtering a subset of a given document document collection consists of the following six steps. First, a filter for a given grammar is defined; second, a directory containing documents associated with the grammar is chosen; third, indexing is applied to the documents (unless indexed documents were chosen); fourth, the filter is applied to the indexed document of the chosen directory; fifth, the form of the output is defined; and sixth, the filtered documents are displayed in the specified form. In the current phase of the implementation, the matching test is applied to one document at the time, and in case of matching, the document is displayed using the default output form."

The report is available in Postscript format by ftp from the University of Joensuu: ftp://ftp.cs.joensuu.fi:/pub/Reports/A-1995-4.ps [mirror copy]. See: other publications of Eila Kuikka on "Filtering of Structured Documents" [mirror copy], or the WWW Home Page. The report is also available via postal mail: Department of Computer Science, University of Joensuu, P.O. Box 111, FIN-80101 Joensuu, Finland. See also: E. Kuikka and A. Salminen, "Two-dimensional filters for structured texts," to be published in Information Processing and Management.

[CR: 19980218]

Kuikka, Eila; Nikunen, Erja. Survey of Software for Structured Text. Report A-1998-1. Kuopio, Finland: University of Kuopio, Department of Computer Science and Applied Mathematics, 1998. Extent: []. Authors' affiliation: [see below].

February 09, 1998. Announcement from Eila Kuikka for the public availability of a revised report Survey of Software for Structured Text. This report, available in HTML (hypertext) and Postscript format, surveys some 207 software tools that claim to support the processing of structured documents. This publication updates the 1994 survey which reviewed 89 software packages (see immediately below). Most of these software tools are SGML/XML compliant or aware. Description, contact information, references, and prices are listed for each software package. The database entries are accessible via alphabetical (name) listing, by software 'type' (in eighteen categories), and by price. This revised and expanded 1998 edition of the Survey is authored by Eila Kuikka (Department of Computer Science and Applied Mathematics, University of Kuopio, Finland) and Erja Nikunen (Nokia Telecommunications, Finland). In HTML format: http://www.cs.uku.fi/~kuikka/systems.html, and published also as a technical report of the Department of Computer Science and Applied Mathematics, University of Kuopio, Finland. [local archive copy]

Kuikka, Eila; Nikunen, Erja. [Systems for Structured Text] RAKENTEISTEN TEKSTIEN KÄSITTELYJÄRJESTELMISTÄ. Report A/1994/4. Kuopio, Finland: University of Kuopio, Department of Computer Science and Applied Mathematics, January 11, 1994. iii + 82 pages. Authors' affiliation: [Eila Kuikka] Department of Computer Science and Applied Mathematics, University of Kuopio, Finland, [PO Box 1627, FIN-70211 Kuopio, FINLAND]; Email: kuikka@cs.uku.fi; [Erja Nikunen] Research Centre for Domestic Languages, Finland; Email: enikunen@domlang.fi.

See the preceding bibliography entry for the 1998 update.

The authors have prepared an overview ["Systems for structured text"], a summary and a 3-part software-systems listing from the longer report. The abstract for the list: "This list [3 parts in 3 disk files: A-E, F-M, N-Z] [was] a part of a report published in Finnish as a technical report of the Department of Computer Science and Applied Mathematics, University of Kuopio, Finland. The aim of the report was to give a brief overview of electronic text and its processing by computers. The main part of the report is a section that contains a short description and typical features of 89 systems. This English summary contains only that part of the report and our aim is not to update this list later."

The overview page describes the extracted portions of the report (now updated). The three main documents are still available here, for historical value, as mirror copies [copied April 12, 1995]: the summary; part1 A-E, part2 F-M, and part3 N-Z.

[CR: 19961018]

Kumar, Vijay; Furuta, Richard; Allen, Robert B. "Interactive Interfaces for Knowledge-Rich Domains." Pages 235-246 (with 28 references) in EP '96. Proceedings of the Sixth International Conference on Electronic Publishing, Document Manipulation and Typography. [ = Journal Special Issue: Electronic Publishing - Origination, Dissemination and Design (EPODD), June & September 1995, Volume 8, Issues 2-3.] Sixth International Conference on Electronic Publishing, Document Manipulation and Typography, Palo Alto, California. September 24-26, 1996. Sponsored by Adobe Systems Incorporated; School of Information Management and Systems, University of California at Berkeley; Xerox Corporation. [Proceedings Volume] Edited by Allen Brown, Anne Brüggemann-Klein, and An Feng; [Journal] Editors David F. Brailsford and Richard K. Furuta. Chichester/ New York: John Wiley & Sons, 1996. ISSN: 0894-3982. Authors' affiliation: [Kumar, Furuta]: Center for the Study of Digital Libraries, and Department of Computer Science; Texas A&M University, College Station, TX 77843-3112, USA. Email: vijayk@cs.tamu.edu; furuta@cs.tamu.edu; [Allen]: Bellcore, 1A-352R, 445 South Street, Morristown, NJ 07960, USA. Email: rba@bellcore.com.

Abstract: "Timelines represent a familiar means for representing the relationship among historical events. When incorporated into the context of electronic documents, the timeline provides the basis for implementing an interface into an event space, relying particularly on hypertextual-style links. Generalizing timelines also permits the flexible representation of many different kinds of relationships beyond the temporal. This paper includes examples of such representations, showing examples from prototype implementations."

[CR: 19961226]

Kumpf, Dave. "Re-engineering Your Company's Knowledge Infrastructure: Standard Tools vs. Standard Data Representations." Pages 501-506 in SGML '96 Conference Proceedings. Celebrating a Decade of SGML. SGML '96 Conference, Boston, MA, November 18-21, 1996. Sponsored by The Graphic Communications Association (GCA). [Edited by] Conference Co-Chairs: B. Tommie Usdin and Deborah A. Lapeyre. Alexandria, VA: GCA, 1996. Extent: 711 pages. Author's affiliation: President, Lexicon Systems, Inc., 6165 Lehman Drive, Suite 204, Colorado Springs, Colorado 80918, USA; Tel: 719-593-8971; FAX: 719-593-9268; Email: davek@lexisys.com; WWW: http://www.lexisys.com/.

Abstract: "Ten years after SGML was adopted as an international standard, more organizations than ever before are investigating its possibilities. The reason is simple. The problems addressed by Total Quality Management in the manufacturing and general service industries are magnified enormously in knowledge work and are much more difficult to address. Accessibility and reusability of information are important, and so are the relevance and applicability of information in a particular problem-solving context. Redundant knowledge creation and information rework waste organizational effort and dollars and have a profoundly negative effect on programs, processes, and systems. To combat redundancy and rework, organizations are seeking solutions in standard tools and standard data representations."

Note: The above presentation was part of the "SGML Business Management" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.

Laan, C. G. van der. "SGML (, TeX and . . .)." TUGboat: The Communications of the TeX Users Group 12/1 (March 1991) 90-104. 49 references. Author affiliation: Rekencentrum RUG, Groningen.

[CR: 19951206]

Lancashire, Ian. Early Books, RET Encoding Guidelines, and the Limits of SGML. Paper presented at The Electric Scriptorium. Approaches to the Electronic Imaging, Transcription, Editing and Analysis of Medieval Manuscript Texts: A Physical & Virtual Conference. The University of Calgary, Calgary, Alberta [physical conference]. November 10-12, 1995. Sponsored by The University of Calgary, Calgary Institute for the Humanities, and SEENET. Conference coordinated by Dr. Murray McGillivray, Thomas Wharton, Blair McNaughton, and Robert McLean. Extent: approximately 13 pages. Author's affiliation: University of Toronto.

"Standard Generalized Markup Language (SGML) encodes medieval and Renaissance manuscripts and printed books with difficulty. This computer language is an ISO standard, but one acknowledged more in the breach than in the observance. Here I argue that the humanities should follow the originators of the World Wide Web, who made HTML (Hypertext Markup Language), an encoding standard using SGML syntax but serving purposes alien to the intentions of SGML's creators. The Text Encoding Initiative (TEI) SGML document-type definition is unusable for my kind of scholarly editing, and for the editing of early texts generally. However, the TEI Guidelines is an excellent discussion of tagging, principles and practice, and its system of over 400 tags is the starting point for anyone interested in text encoding." [from the document Introduction]

The document is available on the Internet as part of the official conference record: see http://www.ucalgary.ca/~scriptor/papers/lanc.html [mirror copy, December 1995]. For further details on the Electric Scriptorium conference, see Electric Scriptorium Home Page.

[CR: 19961226]

Landis, Susan E.; Pearson, Troy J. "From Existing Paper Documents to a Tailored Electronic Information Base." Pages 589-596 in SGML '96 Conference Proceedings. Celebrating a Decade of SGML. SGML '96 Conference, Boston, MA, November 18-21, 1996. Sponsored by The Graphic Communications Association (GCA). [Edited by] Conference Co-Chairs: B. Tommie Usdin and Deborah A. Lapeyre. Alexandria, VA: GCA, 1996. Extent: 711 pages. Authors' affiliation: [Landis]: Principal SGML Analyst, Technical and Management Services Corporation, 3070 Presidential Drive, Suite 300, Fairborn, Ohio 45324, USA; Tel: +1 513.427.9050; FAX: +1 513.427.9058; Email: landiss@tamsco.com; WWW: http://www.tamsco.com/; [Pearson]: Senior SGML Analyst, Technical and Management Services Corporation, 3070 Presidential Drive, Suite 300, Fairborn, Ohio 45324, USA; Tel: +1 513.427.9050; FAX: +1 513.427.9058; Email: landiss@tamsco.com.

Abstract: "Technical and Management Services Corporation (TAMSCO) and Warner Robins Air Logistics Center/LB/LU Directorate recently began a cooperative effort to develop a more efficient way to manage the data for the C-130 flight manuals. WR ALC/LB/LU recognized the tremendous cost and inefficiencies in managing the existing C-130 data. With the assistance of TAMSCO, this cooperative effort is currently reengineering the existing process for creating, distributing, accessing, and reusing the technical information. By using Standard Generalized Markup Language (SGML), this effort will realize the ability to store and reuse technical procedures more efficiently. The SGML data will be accessible to the end users through an electronic information base both digitally and hard-copy. Using SGML and the AF Standards will bring many benefits and lower maintenance costs. The future success of the USAF's C-130 Technical Manual program depends on how effectively and efficiently the existing data is identified, maintained, managed, and used."

Note: The above presentation was part of the "SGML Case Studies" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.

[CR: 19971227]

Landis, Susan E.; Pearson, Troy J. "IETMs: Technology That Supports The Future." Pages 489-492 in SGML/XML '97 Conference Proceedings. SGML/XML '97. "SGML is Alive, Growing, Evolving!" The Washington Sheraton Hotel, Washington, D.C., USA. December 7 - 12, 1997. Sponsored by the Graphic Communications Association (GCA) and Co-sponsored by SGML Open. Conference Chairs: Tommie Usdin (Chair, Mulberry Technologies), Debbie Lapeyre (Co-Chair, Mulberry Technologies); Michael Sperberg-McQueen (Co-Chair, University of Illinois). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 691 pages, CDROM; print volume contains author and title indexes, keyword and acronym lists. Authors' affiliation: [Susan E. Landis]: Principal SGML Analyst, Technical and Management Services Corporation (TAMSCO), 3070 Presidential Drive, Suite 300, Fairborn, Ohio 45324 USA; Phone: +1 937.427.9050; FAX: +1 937.427.9058; Email: landiss@tamsco.com; WWW: http://www.tamsco.com; [Troy J. Pearson]: Senior SGML Analyst, (TAMSCO); Email: pearsont@tamsco.com.

Abstract: "An Interactive Electronic Technical Manual (IETM), as defined in the DoD IETM specifications, is a package of information required for the diagnosis and maintenance of a weapons system, arranged and formatted for interactive screen presentation to the end-user. Technical and Management Services Corporation (TAMSCO) has been assisting the military develop IETMs using commercial off the shelf (COTS) products with open ended software interfaces. IETMs provide many benefits over traditional paper manuals, as will be discussed. TAMSCO recognizes that while the concept of IETM is still a new technology, it is only an application of finding a more efficient and effective way to provide support and maintenance to existing military weapon systems."

This paper was delivered as part of the "IETM" track in the SGML/XML '97 Conference.

[CR: 19951010]

Langendoen, D. Terence; Simons, Gary. "Rationale for the TEI Recommendations for Feature-Structure Markup." The Text Encoding Initiative: Background and Contents, Guest Editors Nancy Ide and Jean Véronis = Computers and the Humanities 29/3 (1995) 191-209. ISSN: 0010-4817.

Abstract: "In this paper, we concentrate on justifying the decisions we made in developing the TEI recommendations for feature structure markup. The first four sections of this paper present the justification for the recommended treatment of feature structures, of features and their values, and of combinations of features or values and of alternations and negations of features and their values. Section 5 departs briefly from the linguistic focus to argue that the markup scheme developed for feature structures is in fact a general-purpose mechanism that can be used for a wide range of applications. Section 6 describes an auxiliary document called a 'feature system declaration' that is used to document and validate a system of feature-structure markup. The seventh and final section illustrates the use of the recommended markup scheme with two examples, lexical tagging and interlinear text analysis."

[CR: 19961201]

Lapeyre, Deborah Aleyne. "[Review of] Developing SGML DTDs: From Text to Model to Markup, by Eve Maler and Jeanne El Andaloussi." <TAG> 9/11 (November 1996) 5-8. ISSN: 1067-9197. Author's affiliation: Mulberry Technologies, Inc., Rockville, MD.

See the bibliographic reference for the Maler / El Andaloussi book.

[CR: 19990519]

Lapeyre, Deborah Aleyne. "Annotated Table of Contents. Eve Maler and Jeanne El Andaloussi, Developing SGML DTDs: From Text To Model To Markup." Markup Languages: Theory & Practice 1/1 (Winter 1999) 113-115. ISSN: 1099-6622 [MIT Press]. Author's affiliation: Vice President, Mulberry Technologies Inc.; Email: dlapeyre@mulberrytech.com; WWW: http://www.mulberrytech.com.

The annotated Table of Contents for Developing SGML DTDs complements the corresponding book review article by Chet Ensign, also published in this issue of Markup Languages: Theory & Practice.

[CR: 19990519]

Lapeyre, Deborah Aleyne. "Annotated Table of Contents. David Megginson, Structuring XML Documents." Markup Languages: Theory & Practice 1/1 (Winter 1999) 116-118. ISSN: 1099-6622 [MIT Press]. Author's affiliation: Vice President, Mulberry Technologies Inc.; Email: dlapeyre@mulberrytech.com; WWW: http://www.mulberrytech.com.

The annotated Table of Contents for Megginson's Structuring XML Documents complements the corresponding book review article by Chet Ensign, also published in this issue of Markup Languages: Theory & Practice.

[CR: 19971227]

Lapeyre, Deborah Aleyne. "SGML/XML'97: The Extraordinary Conference. What to See and Do at SGML/XML '97." Pages 13-18 in SGML/XML '97 Conference Proceedings. SGML/XML '97. "SGML is Alive, Growing, Evolving!" The Washington Sheraton Hotel, Washington, D.C., USA. December 7 - 12, 1997. Sponsored by the Graphic Communications Association (GCA) and Co-sponsored by SGML Open. Conference Chairs: Tommie Usdin (Chair, Mulberry Technologies), Debbie Lapeyre (Co-Chair, Mulberry Technologies); Michael Sperberg-McQueen (Co-Chair, University of Illinois). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 691 pages, CDROM; print volume contains author and title indexes, keyword and acronym lists. Author's affiliation: [Deborah Aleyne Lapeyre]: Co-chair, SGML/XML '97; Vice President, Mulberry Technologies, Inc., 17 West Jefferson Street, Suite 207, Rockville, MD 20850 USA; Phone: +1 (301) 315-9631 FAX: +1 (301) 315-8285; Email: dalapeyre@mulberrytech.com; WWW: http://www.mulberrytech.com.

Summary: This presentation by the program Co-chair provides an overview of the structure of SGML/XML '97 conference. Lapeyre explains the various technical tracks, vertical tracks, vendor demonstration theatre, exhibit hall, poster sessions, special user group meetings, tutorials, BOF meetings, the SGML/XML '97 bookstore, and other important conference events.

"Welcome to SGML/XML'97, the conference that is both the largest SGML Conference ever and the largest XML conference ever. We're all here to have a good time: to learn new things, to consolidate our positions, to expand our minds, and to make technical progress happen. The first conference in this series, back in 1988, only dealt with SGML and had 52 attendees with no exhibit hall. As this conference has been able to say every year for the last 11 years, this year we have more attendees, more sessions, more tracks, more vendors, and more night events than any SGML conference has ever had before! This year we've expanded; XML has joined SGML as a major technical focus."

This presentation was part of the "Introductions" track at the SGML/XML '97 Conference.

[CR: 19960730]

Lapeyre, Deborah A.; Usdin, Tommie. TEI and the American Memory Project at the Library of Congress. Paper presented at Digital Libraries Workshop 1996, Organized by Nancy Ide and Judith Klavans, Held in conjunction with the First ACM International Conference on Digital Libraries, Bethesda, Maryland. Poughkeepsie, New York / New York, NY: Vassar College, Department of Computer Science / Columbia University, Department of Information Services, 1996. Authors' affiliation: ATLIS Consulting Group.

"The American Memory DTD is in use to capture a variety of materials, and to re-tag some documents that were previously tagged using a non-SGML generic tagging scheme. The DTD has proven useful for tagging a variety of texts. American Memory has digitized a variety of Library of Congress collections. They are currently interested in talking to potential partners who may be interested in publishing some of these collections. . . While there is no easy way to measure the relative accuracy or retrieval system precision using SGML as compared to non-SGML encoding, the SGML option, and selection of the TEI model for American Memory seem to be working well." [extract]

The document is available online: http://www.cs.vassar.edu/~ide/DL96/Lapeyre; [mirror copy]. See the main workshop entry or the program listing for other workshop details.

Laplante, Mary Fletcher. "Copy of Letter to NIST [Conformance Testing Program] Dated February 2, 1994." SGML Users' Group Bulletin Newsletter 26 (February 1994) 6-7. ISSN: 0952-8008.

The open letter was sent by SGML Open's Executive Director to Ron Wilson of NIST on behalf of SGML Open, GCA, and the SGML Users' Group. It expresses concerns about NIST's proposed testing procedures for the SGML community. This letter is a supporting document to the article by Pamela Gennusa on NIST's proposed Conformance Testing program.

[CR: 19951010]

Laplante, Mary Fletcher. "SGML Moves into the Database and Document Management Markets." Imaging World 4/10 (October 1, 1995) 19. Author's Affiliation: Mary Fletcher is Executive Director of SGML Open.

Laplante, Mary Fletcher. "SGML Open Update: Progress on All Fronts." <TAG> 7/4 (April 1994) 7-8. ISSN: 1067-9197.

Report on the activities of SGML Open (the SGML consortium), especially following meetings of the marketing and technical committees after the Documation '94 conference.

[CR: 19951122]

Laplante, Mary Fletcher. Standards for Interoperability. SGML Open White Paper, Number 1001-SO. Coraopolis, Pennsylvania: SGML Open, [January-February] 1995. Extent: approximately 5 pages. Author's affiliation: Mary Laplante is Executive Director, SGML Open.

"SGML is one of the most important document management investments that an organization can make because it ensures the interoperability of its information. Technology changes every eighteen months, which is one of the reasons why we invest in open systems that are based on de jure and de facto standards. We want to be sure that the hardware and software that we buy today will work with other systems we have now or will have in the future. Investments in open systems platforms and architecture may increase the value of a corporation's physical assets, but those systems are guaranteed to be replaced eventually. The best investment that we can make in open systems is the development and maintenance of open information, which is what SGML enables.

"Besides strategic benefits, SGML also offers the tactical advantages of allowing an organization to share the same data across multiple document repositories, thereby supporting enterprise-wide document management. Organizations can choose the products and technologies that are best suited to their needs, while knowing that their documents are interchangeable and accessible to anyone, even across repositories." [extract]

A related document under the title "Standards for Interoperability" also appeared in the January/February issue of The Gilbane Report (CAPV Publications - Gilbane Report). As one of the SGML Open White Papers [WHITE PAPER #1001-SO, see also the printed abstract, mirror], the document is available on the Internet as "Standards for Interoperability" [mirror copy, partial links, November 1995].

Laplante, Mary Fletcher. "Viewpoint: The Value of SGML Conformance Testing." <TAG> 7/4 (April 1994) 1, 12. ISSN: 1067-9197.

[CR: 19960716]

Larson, Ray R.; Moon, Ralph; McDonough, Jerome; O'Leary, Paul; Kuntz, Lucy. "Cheshire II: Designing a Next-Generation Online Catalog." Journal of the American Society for Information Science 47/7 (July 1996) 555-567 (with 31 references). ISSN: . Author's affiliation: [Larson] School of Information Management and Systems, University of California, Berkeley. Email: ray@sherlock.berkeley.edu.

"Abstract: The Cheshire II online catalog system was designed to provide a bridge between the realms of purely bibliographical information and the rapidly expanding full text and multimedia collections available online. It is based on a number of national and international standards for data description, communication, and interface technology. The system uses a client server architecture with X window client communication with an SGML based probabilistic search engine using the Z39.50 information retrieval protocol."

"The Cheshire II system is being made available for public use in the UC Berkeley Astronomy-Mathematics-Statistics Library (a medium-scale academic branch library, circa 75,000 volumes) using modern workstations, and to the national mathematics research community via network access. Use and acceptance of the system and its features will be evaluated using transaction monitoring and questionnaires."

A version of the paper is also available online: http://sherlock.berkeley.edu/asis_paper/paper.html. See also the main entry for the Cheshire II Project.

[CR: 19950925]

Lauritsen, M. "Knowing Documents." Pages 184-191 (with 11 references) in Proceedings of Fourth International Conference on Artificial Intelligence and Law. Fourth International Conference on Artificial Intelligence and Law, Amsterdam, Netherlands, 15-18 June 1993. New York, NY: ACM Press, 1993. Author's affiliation: Harvard Law School, Cambridge, MA.

"Abstract: Drawing upon scholarship on legal drafting, current document assembly technology, and aspects of the Standard Generalized Markup Language (SGML), this article discusses the forms of knowledge at play in the creation of legal documents. It also examines the notion of self-describing documents and their potential role in new modes of expression and knowledge pertinent to legal drafting."

Lavagnino, John. "Simultaneous Electronic and Paper Publication." TUGboat: The Communications of the TeX Users Group [= Proceedings of the 1991 Annual Meeting] 12/3 (December 1991) 401-405. Author affiliation: Brandeis University.

The author concludes that SGML is the "best choice" for creating a multiform text. [full abstract/summary needed]

[CR: 19971202]

Lavagnino, John. "What Not to Tag." Pages 93-97 in TEI 10: A Conference in Celebration of the Tenth Anniversary of the Text Encoding Initiative. Abstracts. TEI 10: Text Encoding Initiative, Tenth Anniversary User Conference, Brown University, Providence, Rhode Island. November 14-16, 1997. Sponsored by Martin Hensel Corporation, Kluwer Academic Publishers, and MIT Press. Hosted by Brown University Library, and Computing and Information Services. Providence, RI: Brown University, 1997. Author's affiliation: Women Writers Project, Brown University; Email: john.lavagnino@brown.edu.

Summary: "My aim in this paper is to talk about our choices in encoding texts, and, in particular, to focus on decisions about things not to do. One well-known reaction to the sight of the imposing bulk of the TEI Guidelines is the cry of despair at the thought that every word must be mastered and applied wherever and whenever the appropriate text features crop up. Of course, this is not the intention at all; but the decision about what to use is still a real problem, particularly for projects---the most common sort, I think---that do not have a specific use for their texts in mind, but instead aim to provide a generally useful digital collection. [...] If I had to sum up what I have to say in one sentence, it would be: Don't tag what you don't understand. But a somewhat more Wagnerian statement of my point would be: Don't tag things that aren't fully worked out or elaborated, and don't tag the random, the occasional, the unique, or---to use an Aristotelean term that I'm going to be adopting---the accidental."

The extended abstract for the document is available online: http://www.stg.brown.edu/webs/tei10/tei10.papers/whatnot.html; [local archive copy]. See the main database entry for additional information about the conference, or the Brown University web site.

[CR: 19950922]

Lavagnino, John; Mylonas, Elli. "The Show Must Go On: Problems of Tagging Performance Texts." The Text Encoding Initiative: Background and Contents, Guest Editors Nancy Ide and Jean Véronis = Computers and the Humanities 29/2 (1995) 113-121.

Abstract: "A dramatic work may be seen either as an event or as a text; the TEI Guidelines make it possible to encode a dramatic work in either way, but do not attempt to solve the difficult problem of doing both at once. The basic element of a dramatic work, when seen as a text, is the speech; the Guidelines also provide elements for encoding other familiar parts of dramatic texts (such as stage directions and cast lists), as well as for encoding analytic information on various aspects of texts and performances that is not normally included in printed dramatic texts. There are often other formal structures in dramatic works that intersect with the structure of speeches -- metrical structures, for example; we discuss approaches for encoding these structures."

[CR: 19971227]

Lécluse, Christophe. "Building Client-Side Web Architectures Today Using XML, Balise and a Standard HTML Browser." Pages 461-466 in SGML/XML '97 Conference Proceedings. SGML/XML '97. "SGML is Alive, Growing, Evolving!" The Washington Sheraton Hotel, Washington, D.C., USA. December 7 - 12, 1997. Sponsored by the Graphic Communications Association (GCA) and Co-sponsored by SGML Open. Conference Chairs: Tommie Usdin (Chair, Mulberry Technologies), Debbie Lapeyre (Co-Chair, Mulberry Technologies); Michael Sperberg-McQueen (Co-Chair, University of Illinois). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 691 pages, CDROM; print volume contains author and title indexes, keyword and acronym lists. Author's affiliation: [Christophe Lécluse]: Advanced Information Systems, 17 Rue Remy Dumoncel, Paris, France F-75014; Email: clec@ais.berger-levrault.fr.

Abstract: "Among the major trends in WWW applications today are (1) client side applications and (2) use of SGML and now XML for richer information modeling. Most Web applications today exclusively rely on servers to perform all computations and data manipulation requested through the Web/HTML browser. The limit of this design model have been clearly reached and new models are considered where more intelligence is brought back in the client, that is in the Web browser area.

"At the same time, the limits of the HTML modeling capabilities become more and more obvious as Web applications develop and XML, as a specialized profile of SGML, is now recognized as a major break through in the domain of advanced WWW applications."

"In this article, we present an application in the domain of technical documentation for the automotive industry that requires such Web and client-side architectures. This application provides consultation of Illustrated Parts Catalog (IPC) modules with real-time configuration management. Configuration management here consists in presenting to a user the exact documentation for the vehicle he/she has to repair."

"We present the application itself and how/why it requires such a web and client-side architecture. Because XML browsers are not yet a reality, we also propose a short term integration solution that can implement this architecture with today's HTML browsers coupled with an external XML engine."

This paper was delivered as part of the "Expert" track in the SGML/XML '97 Conference.

[CR: 19961226]

Lécluse, Christophe. "Event Driven or Tree Manipulation Approaches to SGML Transformation - You Should Not Have to Choose." Pages 373-380 in SGML '96 Conference Proceedings. Celebrating a Decade of SGML. SGML '96 Conference, Boston, MA, November 18-21, 1996. Sponsored by The Graphic Communications Association (GCA). [Edited by] Conference Co-Chairs: B. Tommie Usdin and Deborah A. Lapeyre. Alexandria, VA: GCA, 1996. Extent: 711 pages. Author's affiliation: Technical Director, Advanced Information Systems (AIS), 35 Rue du Pont, Neuilly sur Seine, France F-92200. Email: clec@ais.berger-levrault.fr; WWW: http://www.ais.berger-levrault.fr/.

Abstract: "Two approaches are available for specifying transformation processes on SGML documents: a declarative approach, based on context-sensitive rules triggered on SGML parsing events, and a procedural approach, based on explicit manipulation of the document tree."

"This paper shows that each approach is optimal for a certain class of problems, but that both are actually needed and that maximum expressive power is achieved when both can be combined in a same program."

The document is available online in HTML format: http://www.balise.com/current/articles/lecluse.htm; [mirror copy].

An alternative source for information presented in this paper is the Proceedings of SGML Finland '96; see the paper by François Chahuneau, "Event driven or Tree Manipulation Approaches to SGML Transformation - You Should Not Have to Choose."

Note: The above presentation was part of the "SGML Expert" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.

[CR: 19960803]

Lee, J. C.; Swietlik, C. E. "Toward Automated Document Reformatting: An SGML Markup System." Pages 137-141 (with 7 references) in Proceedings of the Twenty-Eighth Southeastern Symposium on System Theory (SSST '96). Twenty-Eighth Southeastern Symposium on System Theory, Baton Rouge, LA, USA. March 31 - April 2, 1996. Sponsored by Louisiana State University, Department of Electrical & Computer Engineering and College of Engineering, Baton Rouge Section of IEEE; IEEE Computer Society; IEEE Control Systems Society. Edited by M. Naraghi-Pour. Los Alamitos, CA: IEEE Computer Society Press, 1996. ISBN: 0-8186-7352-4 [IEEE Catalog # PR07352]. Authors' affiliation: Argonne National Laboratory, IL, USA.

"Abstract: An SGML markup system is presented. One major obstacle for the SGML to gain more application is the prohibitively high cost of the markup process. The system the authors present adopts an incremental design approach. This approach helps to "divide and conquer" each specific problem encountered during the markup process and ensures that the system converges to a almost-fully automated markup system. The major software components of the system are described. Some selective algorithms are also introduced."

Conference: [cit].

[CR: 19971125]

Leenheer, Paula; Mackenzie, Colin. "Ajaib - A Case Study of An SGML/Intranet Development. Getting Documentation Off the Ground." Page(s) 269-272 in SGML '97 Conference Proceedings. SGML Europe '97. "The Next Decade - Pushing the Envelope." Princesa Sofia Intercontinental, Barcelona, Spain. 11-15 May, 1997. Sponsored by Graphic Communications Association (GCA) and SGML Open. Conference Chair: Pamela L. Gennusa (Director, Database Publishing Systems Ltd). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 342 pages, CDROM. Authors' affiliation: [Paula Leenheer]: EP Advisor on EDM, Shell International Exploration and Production B.V., The Netherlands; Email: p.leenheer@siep.shell.com; [Colin Mackenzie]: Consultant, Database Publishing Systems Ltd, United Kingdom; Email: crm@dpsl.co.uk.

Abstract: "Brunei Shell Petroleum (BSP) has developed a system to support the production operators in their day-to-day activities on the Platform. This system (named 'Ajaib' which is Malay/Arabic for Miracle) breaks away from the traditional Operations Manual and instead delivers all information required by the Operator in support of his day-to-day activities from a single, commodity desktop Web browser. The information is managed in its native format (e.g., SGML, AutoCad) and is presented in a variety of formats including animation and graphics; this session aims to provide insight into the development and acceptance of a corporate Intranet solution."

"BSP decided that the core information for the system should utilise SGML to manage the various information content types and relationships. BSP chose to re-use DTDs specifically developed for Shell Expro to capture the information. This information consisted of asset information (e.g., equipment descriptions for specific platforms, pipelines and systems), organisation information (e.g., description of BSP personnel and their responsibilities), and activity information (e.g., descriptions of maintenance tasks that operators perform each day). Furthermore, the system should also contain additional explanatory information, as is usually contained in training manuals. The new system had to provide all the information that operators require from a single point of delivery, in a format that would be appealing to the operators. The decision was made to use Web technology and standard products to deliver the specially created content in a textual and graphical form. The textual information would be converted from SGML to HTML prior to delivery of the final system."

[CR: 19971216]

Le Maitre, Jacques; Murisasco, Elisabeth; Rolbert, Monique. "From Annotated Corpora to Databases: The SgmlQL Language." Pages 37-58 in Linguistic Databases. [Conference on] Linguistic Databases. Centre for Language and Cognition and Centre for Behavioral and Cognitive Neuroscience, University of Groningen, Groningen, The Netherlands. March 23-24, 1995. Sponsored by the Dutch National Science Foundation (NWO), Royal Dutch Academy of Science (KNAW), et al. Edited by John Nerbonne (Computational Linguistics, and Humanities Computing, University of Groningen). CSLI Lecture Notes, Number 77. Stanford, CA: Center for the Study of Language and Information, 1998. ISBN: 1-57586-093-7 (hardback), 1-57586-092-9 (paper). Authors' affiliation: [Le Maitre and Murisasco]: University of Toulon, France; [Rolbert]: University of Aix-Marseille III, France.

For more information, see the database entry for MtSgmlQL, or documentation on 'the SgmlQL interpreter': http://www.lpl.univ-aix.fr:81/.www/projects/SgmlQL/MQL1.html.

[CR: 19951113]

Le Noel, Bernard; Biezunski, Michel (translators). SGML - ODA: presentation des concepts et comparaison fonctionnelle. Paris: Afnor, 1991. Extent: ix + 87 pages. ISBN: 2-12-488011-X.

Apparently a translation of the English document SGML and ODA: Standards for Document Processing and Interchange, published by the Danish national standards body in 1989. ['Traduction de l'ouvrage publie en anglais par Dansk Standardiseringsrad sous le titre "SGML and ODA. Standards for document processing and interchange" en 1989.']

[CR: 19961210]

Le van, Huu. "Describing Formatting Directives for SGML Documents." SGML Users' Group Bulletin 4/1 (1989) 35-38 (with 8 references). ISSN: 0269-2538. Author's affiliation: Department of Computer Science, University of Milan, Via Moretto da Brescia, 9 Milano, Italy.

Abstract: "SGML is an international standard defined by ISO, oriented to provide a markup language to describe the logical structure of documents. One of SGML's objectives is to allow every formatting system to process a document described by SGML elements without modifying the manuscript, i.e., it is not necessary for document authors to know which formatting system will process their documents. An approach to achieve this objective could consist of transforming the SGML document into a source file for every formatter wanted, with the support of a special map table which associates every SGML element with formatting control sequences of the formatter itself. To this end we define a language, called METAFORM, which at run time is capable of selecting the appropriate set of formatting commands to be inserted into the formatter source file we are generating, on the basis of the current status of every element in the SGML document. The paper describes the main characteristics of the METAFORM language and its application to SGML documents."
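
[Illustration, not from the paper: the map-table idea described in the abstract can be sketched in a few lines of Python. This is not METAFORM syntax. The element names (title, para, emph), the two tables, and the function names are hypothetical, and a small well-formed XML fragment stands in for a parsed SGML document. The sketch maps each element to a fixed pair of control sequences and does not model METAFORM's run-time selection based on element status.]

```python
import xml.etree.ElementTree as ET

# Hypothetical map tables, one per target formatter. Each element name is
# associated with the control sequences emitted before and after its content.
MAP_TABLES = {
    "tex": {
        "title": ("\\section{", "}\n"),
        "para": ("", "\n\n"),
        "emph": ("{\\em ", "}"),
    },
    "troff": {
        "title": (".SH\n", "\n"),
        "para": (".PP\n", "\n"),
        "emph": ("\\fI", "\\fR"),
    },
}


def render(elem, table):
    """Recursively emit formatter source for one element and its children."""
    # Elements missing from the table contribute no control sequences.
    start, end = table.get(elem.tag, ("", ""))
    out = [start, elem.text or ""]
    for child in elem:
        out.append(render(child, table))
        out.append(child.tail or "")  # text that follows the child element
    out.append(end)
    return "".join(out)


def to_formatter(source, formatter):
    """Transform a marked-up document into source text for the named formatter."""
    return render(ET.fromstring(source), MAP_TABLES[formatter])
```

[The same manuscript can then be passed to to_formatter with either "tex" or "troff"; only the table changes, which is the property the abstract emphasises.]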

Note: The volume editor for SGML Users' Group Bulletin 4/1 is David W. Penfold (Edgerton Publishing Services, Huddersfield, UK).

[CR: 19951113]

Le van, Huu. "SGML: A Standard Language for Text Description." Pages 198-212 (with 9 references) in PROTEXT II. Proceedings of the Second International Conference on Text Processing Systems. International Conference on Text Processing Systems, Dublin, Ireland 23-25 October 1985. Edited by John J. H. Miller. Dublin, Ireland: Dún Laoghaire, Boole Press, Ltd., 1987. vii + 215 pages. ISBN: 0-906783-50-X (hardback); 0-906783-53-4 (paperback). Author's affiliation: Institute of Cybernetics, Milan University, Italy.

"Abstract: The standard SGML, proposed by ISO, is a declarative markup language. It provides a coherent and unambiguous syntax for describing document elements. A document described by SGML can be submitted to the treatment of all applications (data base, editor, formatter . . .) because it does not contain any particular processing instructions. This paper refers to some proposal for the application of SGML concepts in the formatting environment. First, it analyzes the possibility of integrating SGML with the TEX formatting system. Subsequently, it describes an environment for the document preparation, where the user, even inexperienced, is able to define the logical structure and the text of documents interactively and graphically, respecting the semantic meaning of SGML elements. The document defined is processed by the SGML parser producing an intermediate and system independent file. Subsequent interpretation of this file and the support of a special map table will allow a SGML document to be processed by all systems; i.e. it is not necessary for document authors to know which formatting system will process their documents."

[CR: 19951113]

Le Van, Huu; Terreni, Elisa. "A Language to Describe Formatting Directives for SGML Documents." Pages 98-119 (with 18 references) in TeX for Scientific Documentation. Proceedings of the Second European Conference. (The Second European Conference on TeX for Scientific Documentation, Strasbourg, France, June 19-21, 1986. Sponsored by: CNRS (Centre National de la Recherche Scientifique), SMF (Société Mathématique de France), Université Louis-Pasteur de Strasbourg). Edited by Jacques Désarménien. Lecture Notes in Computer Science, Number 236. Berlin/New York: Springer-Verlag, 1986. ISBN: 0387168079 (New York); ISBN: 3540168079 (Berlin). Authors' affiliation: Department of Computer Science, University of Milan, Italy.

Abstract: "SGML is a standard proposed by ISO [International Organization for Standardization, Geneva] for documents description based on a generalized markup technique. The formatting process of a SGML document could consist in singling out markup elements and inserting formatting directives into the document in accordance with the class of the markup elements themselves, using a suitable map table.

"This paper will present an implementation of an environment of SGML documents production, emphasizing a special language, METAFORM, for the map table construction."

[CR: 19951113]

Le Van, Huu. "An Environment for SGML Document Preparation." Pages 43-52 in Conference Proceedings. Summer 1987 USENIX Technical Conference and Exhibition. USENIX Conference, Phoenix, Arizona, June 8-12, 1987. Berkeley, California: The USENIX Association, 1987. Author's affiliation: Department of Computer Science, Milan University, Italy.

"Abstract: SGML (Standard Generalized Markup Language) is an International Standard defined by ISO for documents description based on generalized markup technique. The author refers to some proposals for the application of SGML concepts in the formatting environment. He describes an environment for SGML documents preparation, where the user, even inexperienced, is able to define the logical structure and the text of documents interactively and graphically, and where the document so defined can be processed by all formatting systems."

[Note: pagination may be "110-126" in a variant publication of the proceedings.]

[CR: 19971107]

Leventhal, Michael. "XML. Can the Desperate Perl Hacker Do It?" Pages 153-163 in XML: Principles, Tools, and Techniques. Guest Edited by Dan Connolly. World Wide Web Journal [edited by Rohit Khare] Volume 2, Issue 4. Sebastopol, CA: O'Reilly & Associates, Fall 1997. Extent: xxii + 248 pages. ISBN: 1-56592-349-9. ISSN: 1085-2301. Author's affiliation: Grif S.A.

Abstract: "Is Perl a suitable language for programming XML? The use of Perl with XML is illustrated in this article with a program that checks to see if an XML document is well-formed. The relative simplicity of the program demonstrates that lightweight Perl programs may be used with XML, although Unicode and the use of entities make it difficult for Perl programmers to handle some XML files."
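
[Illustration, not from the article: a minimal sketch of a well-formedness check, written in Python rather than Perl and relying on the standard library's Expat binding instead of a hand-written scanner. The function name is_well_formed is hypothetical. It checks well-formedness only and does not validate against a DTD.]

```python
import xml.parsers.expat


def is_well_formed(text):
    """Return (True, None) if text is well-formed XML, else (False, message)."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        # The second argument marks this as the final chunk of input,
        # so an unclosed element is reported as an error.
        parser.Parse(text, True)
    except xml.parsers.expat.ExpatError as err:
        return False, str(err)
    return True, None
```

[On failure the message includes the line and column reported by the parser, which is usually enough to locate a mismatched tag.]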

[CR: 19960410]

Leventhal, Michael; Benfield, Rebecca. REENGINEERING LEGAL PUBLISHING: AN EVOLUTIONARY APPROACH. Word 6 for SGML Authoring? Technical paper presented at the SGML '94 Poster Session. Oakland, CA: Text Science, 1994. Extent: approximately 6 pages. Authors' affiliation: [Leventhal] Text Science, Inc., Text Science Tower, 1800 Lake Shore Ave., No. 14, Oakland, CA 94606. (V) 510-444-2962 (F) 510-444-1672. Email: michael@textscience.com. WWW: http://www.textscience.com/, or http://www.textscience.com/homets; [Benfield] Continuing Education of the Bar, University of California. Email: becky@cory.berkeley.edu.

Summary: "Many organizations are considering using Word 6 for SGML authoring now that Microsoft's SGML Author is on the immediate horizon. We'd like to offer some reflections on our decision to use Word 6 and our experience with conversion to SGML with respect to our overall philosophy of incremental, evolutionary project engineering."

Available on the Internet: http://www.textscience.com/w6paper.html; [mirror copy of paper, partial links]. See also the accompanying poster which describes various strategies for mapping Word 6 styles to SGML. Or: RTF STYLES TO SGML UTILITIES IN PERL - http://www.textscience.com/stylecod.html.

[CR: 19971206]

Leventhal, Michael; Kohl, Jeffrey S.; Lewis, David R.; Smith, Ann K. Large-scale Electronic Document Distribution at Pacific Bell. Pacific Bell Technical Report. [Los Alamitos]: IEEE, 1997. Author's affiliation: Text Science, Inc.

Abstract: "At Pacific Bell we have developed a document distribution system which leverages a number of Internet technologies to solve extreme scaling requirements while staying cost-effective and meeting demanding business needs. The system, currently a fully operational prototype, is a combination of off-the-shelf and custom software. We are in the process of customizing our solution for specific internal organizations while introducing open document standards which enable the production of information compatible with our delivery system."

[SuperBook's companion and successor:] "In an effort to move to open standards and commercial products and to take advantage of the quickly developing Internet technologies, Pacific Bell issued a request for proposal (RFP) in October 1995 for a large-scale SGML browser. This paper provides an overview of the requirements and describes how various commercial components have been integrated to meet our business needs..."

"The SGML markup in our system creates, as Douglas Engelbart expressed it 'Explicitly Structured Documents -- where the objects comprising a document are arranged in an explicit hierarchical structure, and compound-object substructures may be explicitly addressed for access or manipulation of the structural relationships.' Open Text's LLS provides us with the ability to index these hierarchical objects and to retrieve them using "region expressions", the general principle of which is described in [x].

"Our documents originate from a variety of sources including Microsoft Word, native HTML, and SGML. In all three cases we require the authors or providers to, at a minimum, encode hierarchical divisions into the documents. Those hierarchical divisions are used to generate the TOC database and thus become basic units of content search and retrieval. The authors may also identify other units, or containers, of content (by use of tags or styles depending upon their authoring tools), and when they do, these become additional units of search and retrieval in the delivery system. These additional containers may be smaller (finer-grained), or larger than the hierarchical units, and may be based in traditional document technology, 'example' or 'paragraph', for example, or may contain units meaningful within an application or knowledge domain.[...] Our current SGML markup encompasses HTML; that is, both HTML and non-HTML tags can exist in a document at the same time. The non-HTML tags provide container and the addressing information while the HTML markup describes content formatting. Currently, we are able to deliver these 'SGML+HTML' documents directly to the Netscape client as it, and other current generation browsers, simply ignores the unrecognized SGML markup."

Available online in HTML format: http://yuri.stanford.edu/ic1q97/final.htm; [local archive copy].

[CR: 19980608]

Leventhal, Michael; Lewis, David; Fuchs, Matthew; with contributions by Stuart Culshaw and Gene Kan. Designing XML Internet Applications. The Charles F. Goldfarb Series on Open Information Management. [Subseries:] The Definitive XML Series from Charles F. Goldfarb. Upper Saddle River, NJ: Prentice Hall PTR, [May] 1998. Extent: xxxii + 584 pages, CD-ROM. ISBN: 0-13-616822-1. Authors' affiliation: [Leventhal:] Text Science, Inc.; Email: michael@textscience.com; [Lewis:] Principal Webmaster, Pacific Bell; Email: drlewi1@pacbell.com; [Fuchs:] Senior Software Designer, Walt Disney Imagineering; WWW.

See the online description and Table of Contents.

[CR: 19960408 MD: 19980606]

Levinson, Edward. "Exchanging SGML Documents Using Internet Mail and MIME." Computer Standards & Interfaces 18/1 (January 1996) 93-102 (with 11 references). ISSN: 0920-5489. Author's affiliation: Accurate Information Systems, Inc. Email: elevinson@accurate.com.

Abstract: "The Multipurpose Internet Mail Extensions (MIME) provides an extensible capability to receive, via electronic mail, more than just plain text. Its capabilities include audio graphics and video. As an Internet Draft Standard it is being widely deployed in commercial and public domain mail systems. The extensibility makes it an attractive vehicle for the exchange of documents that use the Standard Generalized Markup Language (SGML). This paper reports on work in integrating the MIME and SGML standards."

"MIME is the result of an Internet Engineering Task Force effort to expand the capabilities of Internet mail without disturbing the existing base of text mail systems. It provides mechanisms for content labelling and multiple content parts within the Internet message body. There are seven basic MIME content types, text, image, video, audio, application, message, and multipart. Multipart indicates a message that contains multiple body parts each of which is an independent message part; the others represent atomic message units. Each content type has associated with it a set of subtypes that the MUA uses to precisely identify the contents and then to invoke the appropriate software to process that content."

"MIME can also be used to create an ad hoc encoding for the SGML Document Interchange Format (SDIF) data stream, allowing a MIME-capable mail user agent (MUA) to directly display the encoded documents. For SGML and SDIF processing, new subtypes are proposed that identify the documents and allow for the appropriate processing.

"When exchanging an SGML document, the document's internal entity and process structure must be transferred along with the files. Maximum utility occurs when these structures are represented in a system independent manner. The MIME approach exposes those structures and transports them by providing them with a canonical representation. An experimental implementation was built using the publicly-available software packages, sgmls (an SGML parser) and mh (a mail user agent). They were modified to generate and read MIME encoded SGML documents."

This article was published in an SGML special issue of Computer Standards & Interfaces [The International Journal on the Development and Application of Standards for Computers, Data Communications and Interfaces], under the issue title SGML Into the Nineties. It was edited by Ian A. Macleod, of Queen's University.

See now: XML Media/MIME Types.

Levinson, Ed. Encapsulating SGML Documents Using the Multipart/Related Content-Type. Work item of the MIME Content-Type for SGML Documents Working Group of the IETF. Eatontown, NJ: Accurate Information Systems, Inc., May 31, 1995. 20 pages. Email address: elevinso@accurate.com.

"Abstract: This draft describes the encapsulation of a Standard Generalized Markup Language (SGML) document within a MIME message. It proposes new content sub-types of Text/SGML, Application/SGML, and Application/SGML-notation, and a new header, Content-SGML-Entity. This specification uses the proposed Multipart/Related Content-Type [RFC-REL] and access-type=content-id [RFC-ACTI] specifications. Multipart/Related provides the mechanism for treating the entire document as a single object and access-type=content-id allows a single MIME entity to appear several times without replicating the body of that MIME entity."
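
[Illustration, not from the draft: a rough sketch of the general shape of such an encapsulation, built with Python's standard email package. The function name build_related, the example.invalid Content-ID values, and the choice of Application/SGML for every external entity are assumptions made for the example. The draft's Content-SGML-Entity header and the access-type=content-id mechanism are not modelled here.]

```python
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_related(document, entities):
    """Wrap an SGML document and its external entities in one MIME object.

    document: the document entity as a string.
    entities: dict mapping a hypothetical entity name to its bytes.
    """
    msg = MIMEMultipart("related")

    # The document entity travels as Text/SGML, one of the proposed subtypes.
    root = MIMEText(document, "sgml")
    root["Content-ID"] = "<root@example.invalid>"
    msg.attach(root)

    # Each external entity becomes its own body part, labelled with a
    # Content-ID so that other parts can refer to it.
    for name, data in entities.items():
        part = MIMEApplication(data, "sgml")
        part["Content-ID"] = f"<{name}@example.invalid>"
        msg.attach(part)
    return msg
```

[Calling msg.as_string() on the result shows the boundary-delimited parts as they would appear on the wire.]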

The filename is draft-ietf-mimesgml-encap-00.txt. See the cover letter which explains this document in relationship to two others, and which provides Internet access points. A lightly HTMLized version is available here.

Levinson, Ed. The MIME Multipart/Related Content-type. Work item of the MIME Content-Type for SGML Documents Working Group of the IETF. Eatontown, NJ: Accurate Information Systems, Inc., May 31, 1995. 7 pages. Email address: elevinso@accurate.com.

"Abstract: The Multipart/Related content-type provides a common mechanism for representing objects that are aggregates of related MIME body parts. This document defines the Multipart/Related content-type and provides examples of its use."

The filename is draft-ietf-mimesgml-multipart-rel-01.txt. See the cover letter which explains this document in relationship to two others, and which provides Internet access points. A lightly HTMLized version is available here.

Levinson, Ed; Clark, James. Message/External-Body Content-ID Access Type. Work item of the MIME Content-Type for SGML Documents Working Group of the IETF. Eatontown, NJ: Accurate Information Systems, Inc., May 31, 1995. 4 pages. Email address: elevinso@accurate.com.

"Abstract: When using MIME [MIME] to encapsulate a structured object that consists of many elements, for example an SGML [SGML] document, a single element may occur several times. An encapsulation normally maps each of the structured object's elements to a MIME entity. It is useful to include elements that occur multiple times exactly once. To accomplish that and to preserve the object structure it is desirable to unambiguously refer to another body part of the same message. The existing MIME Content-Type Message/External-Body access-types allow a MIME entity (body-part) to refer to an object that is not in the message by specifying how to access that object. The Content-ID access method described in this document provides the capability to refer to an object within the message."

The filename is draft-ietf-mimesgml-access-cid-00.txt. See the cover letter which explains this document in relationship to two others, and which provides Internet access points. A lightly HTMLized version is available here.

[CR: 19970522]

Levitt, Jason. "XML Is The Future Of HTML [Internet View]." Information Week (May 19, 1997) 88. ISSN: 8750-6974.

Intro: "XML offers all the power of Dynamic HTML, Style Sheets, and other HTML extensions within the confines of an extensible framework. Unlike HTML, XML documents are specified in two parts. One is the XML document itself, which may look like an HTML document except that it will probably have a lot of new tags. The other part is a Document Type Definition that explains what the new tags mean and how they should be interpreted. The separation of the DTD from the document's contents lets Web developers extend the language simply by creating new DTD files."

The document is available online: "XML Is The Future Of HTML", [mirror copy]

Levy, D. M. "Document Reuse and Document Systems." Electronic Publishing: Origination, Dissemination, and Design (EPODD) EP '94. Fifth International Conference on Electronic Publishing, Document Manipulation, and Typography, Darmstadt, Germany, 13-15 April 1994. 6/4 (December 1993) 339-348. 26 references. Author affiliation: Xerox Palo Alto Res. Center, CA, USA.

Abstract: While reuse is currently the focus of much attention in the programming language community, it is also a central, but less noticed, issue in the creation and use of documents, and therefore in the design of document systems. To a great extent, the work of producing new documents, and new versions of old documents, involves reusing pieces of previously existing documents, where reuse involves finding the relevant material, modifying it as needed, and stitching the pieces together. The objective of this paper is to demonstrate how a focus on reuse can shed light on current efforts to build structured document systems and to design and use standards, such as SGML, ODA, and OLE, that address structured and compound documents.

[CR: 19980508]

Lewis, John D. "XML. An Introduction." OCLC Systems and Services [Journal] 14/1 (1998) 51-52. ISSN: 1065-075X. Author's affiliation: Director of Product Development, PubList.com; Email: JLewis@PubList.com.

"Abstract: The Extensible Markup Language (XML) is a new language specification submitted to the World Wide Web Consortium (W3C). The specification (available online at http://www.w3.org/TR/PR-xml-971208) defines this new language in terms of both SGML and HTML, and is specifically designed for the Internet. In the era of online electronic journals, currently wrapped in HTML, this has significant repercussions for electronic publishing." [Note: see the subsequent W3C Recommendation of 10 February 1998: http://www.w3.org/TR/1998/REC-xml-19980210.]

Possibly online: see http://www.mcb.co.uk/oclc.htm.

[CR: 19961210]

Lewis, Sheila. "European Conformance Testing Service for SGML." SGML Users' Group Bulletin 4/1 (1989) 43-44. ISSN: 0269-2538. Author's affiliation: Testing Services, The National Computing Centre Limited, Oxford Road, Manchester M1 7ED, UK.

Abstract: "The Commission of the European Communities (CEC) is keen to promote harmonization throughout European testing services in the areas of Information Technology and Telecommunications, including the availability of test technology and equivalence of test results between test laboratories. To this end, in 1985 the CEC launched the first Conformance Testing Services (CTS-1) programme covering various topics including compilers, OSI protocols, and computer graphics. In 1988 the CEC launched their second CTS programme (CTS-2) to cover fresh areas, one of which was a three-year project to establish harmonized conformance testing services for SGML systems throughout Europe. This paper will explain the conformance testing process in general terms and outline how this is applied to validating SGML parsers. The paper will also consider the benefits of using the service, and its availability."

Note: The volume editor for SGML Users' Group Bulletin 4/1 is David W. Penfold (Edgerton Publishing Services, Huddersfield, UK).

Lieberman, I.; Geer, R. "Interactive Authoring and Display System-a PC Based Document Access/Reference Environment." Pages 95-101 in Conference Proceedings. AUTOTESTCON '94. IEEE Systems Readiness Technology Conference. 'Cost Effective Support Into the Next Century'. New York, NY, USA: IEEE, 1994. 1 reference. Author's affiliation: Test Autom. Inc., Valencia, CA, USA.

"Abstract: Interactive Authoring and Display System (IADS), a Microsoft Windows application, provides an environment to develop (author) and read (display) documents via a PC. IADS provides both a textual and graphic environment. IADS uses Standard Generalized Markup Language, ISO 8879 (SGML) as its internal file format for textual data. Both vector and raster graphics are supported through CALS standard data formats MIL-D-28003 CGM and MIL-R-28002 Raster Type I as well as Windows .BMP, .PCX and other industry standard formats. IADS was chosen by the Naval Air Warfare Center Weapons Division, Point Mugu as the environment for all Technical Manual publications for the Sparrow Missile Test Set (AN/DPM22-12(V)). Technical Manual development on IADS has been straight forward, requiring a minimal amount of self training. This paper presents development and operational features of IADS, encouraging others to develop/maintain manuals (of any complexity level) in the same manner."

The IADS package (version 2.0, March/April 1995) is available on the Exeter FTP server and elsewhere.

[CR: 19951022]

Light, Richard. Getting a Handle on Exhibition Catalogues: the Project CHIO DTD. CIMI Project Paper. Nova Scotia: Consortium for Interchange of Museum Information, Summer, 1995. Author's affiliation: Consultant, Consortium for Interchange of Museum Information.

"Abstract: This paper describes work being carried out by CIMI (the Consortium for Interchange of Museum Information) on the analysis of exhibition catalogues. This is being undertaken as part of Project CHIO (Cultural Heritage Information Online). The project plans to use the SGML (Standard Generalized Markup Language) standard to express the structure and content of source materials, including exhibition catalogues. The analysis that was undertaken led to a particular view on how exhibition catalogues (and by extension, any text-based museum information sources) could be marked up to support retrieval of extracts relevant to a wide range of queries. The process of analysis is described, and the resulting design decisions outlined. The paper concludes with an assessment of the possibilities for information retrieval offered by this approach."

Available on the Internet: Richard Light: Getting a handle on exhibition catalogues: the Project CHIO DTD; [mirror copy]. A text version is also available. See also the main entry for the Consortium for Interchange of Museum Information.

[CR: 19980428]

Light, Richard; North, Simon; Allen, Charles; Bray, Tim [Foreword]. Presenting XML. Indianapolis, IN: Sams Publishing [Macmillan Publishing USA], 1997. Extent: xxxi + 415 pages. ISBN: 1-57521-334-6. Author's affiliation: [Richard Light Consulting].

Presenting XML was probably the first published book written entirely on XML. It will "help readers learn the fundamentals of the Extensible Markup Language (XML); help them understand the relationship between XML, SGML, and HTML; and enable them to write their own XML applications to deliver structured information to the World Wide Web." Richard Light is a well-known authority in the SGML/XML world, and Tim Bray is co-editor of the XML specification. One reviewer writes that the book, through no fault of its own, "suffers from being a snapshot of a moving target, but [is] a worthy first volume in the soon-to-be-large XML library."

Description: ". . . this reference takes you on an introductory tour of this robust technology, showing you how the technology can work to your advantage. You'll learn to create XML documents, separate style from content, and create power links with XML. In addition, you'll find out how XML is being used today and what impact it will have in the future. With Presenting XML, you'll get a quick, efficient introduction to XML and everything it has to offer, and you'll learn why this dynamic markup language is the wave of the future." [publisher's blurb] See provisionally the description from Macmillan's superlibrary.com server, or the announcement from Simon North. Alternately, check the companion web site for the volume.

A review of Presenting XML was published in XML Files: see the bibliography entry, or the source at http://www.gca.org/memonly/xmlfiles/issue4/book.htm. See also the review on the XMLxperts site.

[CR: 19961106]

Light, Richard. "The SGML Tagger and OUP." SGML Users' Group Newsletter 27 (May 1994) 17. ISSN: 0952-8008.

Description of the Oxford University Press SGML tool called "The SGML Tagger," developed by Richard Light. The tagger is designed to be loaded on top of word-processing software so that tagging may be done without a special editor. The tool is compatible only with text-based DOS word-processors, and thus not with Microsoft Word under Windows. Contact: +44-865-267979. Or see: Richard Light: The SGML Tagger; [mirror copy]

[CR: 19971107]

Lincoln, Thomas L. "Codifying Medical Records in XML: Philosophy and Engineering." Pages 149-152 in XML: Principles, Tools, and Techniques. Guest Edited by Dan Connolly. World Wide Web Journal [edited by Rohit Khare] Volume 2, Issue 4. Sebastopol, CA: O'Reilly & Associates, Fall 1997. Extent: xxii + 248 pages. ISBN: 1-56592-349-9. ISSN: 1085-2301. Author's affiliation: RAND Corporation.

Abstract: "The following paper was given as a talk at the 'XML Mixer' in La Jolla, California in late July '97, before a combined audience of clinicians, computing professionals, and vendors of document processing software. What brought the group together was an ongoing effort to introduce markup technology into the processing of healthcare information in an ISO standard manner, using SGML (Standard Generalized Markup Language) and SGML's strict subset, XML (Extensible Markup Language). Other speakers spoke more specifically to processing topics, work flow, or business issues in the use of information systems in medicine, but the emphasis here is on some long perceived, but often overlooked problems in the semantics of communication. Both the general and the specific are important ingredients in this area, which indirectly indicates why the document format offers the appropriate middle ground between free text and excessively rigid (but easy to process) data structures."

Note: further information on the role of SGML/XML in medical informatics is found in the database section for the SGML Initiative in Health Care (HL7 Health Level-7 and SGML).

[CR: 19950903]

Lincoln, Thomas L.; Essin, Daniel J.; Anderson, Robert; Ware, Willis H. THE INTRODUCTION OF A NEW DOCUMENT PROCESSING PARADIGM INTO HEALTH CARE COMPUTING -- A CAIT WHITE PAPER. CAIT Technical Report. Santa Monica and Los Angeles (CA): Rand and Los Angeles County + University of Southern California Medical Center Department of Medical Administration, 1994 [1993?].

"The main goal of this CAIT White Paper is to address a very general problem in automated record keeping as it applies to Health Care -- thereby resolving the long standing unmet need for an Electronic Clinical Chart (ECC) for on-line clinical use [Institute of Medicine 1991, Ball 1992]. In clinical medicine (and in other similar venues) documentation must be responsive to real world circumstances. As a consequence, the information components are typically highly variable in both form and content, complicating their management and use. Today's technologies offer an opportunity to develop and introduce an effective new systems architecture based on the concept of "document processing" that can markedly improve processing effectiveness by anticipating such variability and making its management a part of the underlying logic. Here the notion of the document as the object to be stored and processed is in contradistinction to the common computing view in which data, records, and fields are the fundamental items. Electronic documents, properly enhanced with additional labels, can form the archive from which data can be extracted from various viewpoints for classic processing, providing greater flexibility to end-user applications and enhanced results."

"The new approach considers each component of the medical chart as a loosely structured document in which the components can be uniquely delimited in some uniform manner by tags or labels [Essin 1993]. To do this, the (ISO) Standard Generalized Markup Language (SGML), which has been designed for this purpose with respect to data display and formatting, is extended to organize medical content. Here appropriate new content related tagging conventions are introduced that delimit each specific item and section for subsequent retrieval and processing."

Available online via FTP: FTP Remote file dumccss.mc.duke.edu/standards/SGML/proposals/CAIT-white-paper.txt, [or mirror copy].

[CR: 19960826]

Lindberg, Donald A. B.; Humphreys, Betsy L. "Medical informatics." Journal of the American Medical Association (JAMA) 275/23 (June 19, 1996) 1821-1822.

"Abstract: Improvements in computer technology, the Internet and the development of wireless and satellite communications have led to several innovations in medical informatics. Telemedicine involves transmitting images and other information to and from medical centers. It could lower costs substantially by allowing physicians and nurses to participate in a patient's treatment without having to travel to the site. Many medical journal publishers are supplementing their traditional printed product with a site on the Internet. This is facilitated by their adoption of the Standard Generalized Markup Language (SGML), which uses standard tags imbedded in the text that allows the integration of text supplied by different publishers. The Digital Imaging and Communications in Medicine (DICOM) and Health Level 7 standards can be used in telemedical applications. The development of the electronic patient record has generated concerns about confidentiality."

"Perhaps influenced by the success of the abbreviated hypertext markup language (HTML) version used in Web applications, commercial publishers are moving to adopt the standard generalized markup language (SGML) for electronic publications -- some 10 years after its introduction. The SGML standard uses embedded tags instead of local, nonstandard printing instructions to identify various publication elements, thus improving prospects for integrated access to information generated by different publishers."

[CR: 19980423]

Lindén, Greger. Structured Document Transformations. PhD Thesis. Report [Series of Publications] A-1997-2. Helsinki, Finland: Department of Computer Science, University of Helsinki, June 1997. Extent: 122 pages (bibliography: pages 109-122). ISBN: 951-45-7766-3. ISSN: 1238-8645. Author's affiliation: Department of Computer Science, P. O. Box 26 (Industrigatan 23), FIN-00014 University of Helsinki, FINLAND; Tel: +358 9 708 44164; FAX: +358 9 708 44441; Email: Greger.Linden@cs.helsinki.fi; WWW: http://www.cs.helsinki.fi/~linden/.

Abstract: "We present two techniques for transforming structured documents. The first technique, called TT-grammars, is based on earlier work by Keller et al., and has been extended to fit structured documents. TT-grammars assure that the constructed transformation will produce only syntactically correct output even if the source and the target representations may be specified with two unrelated context-free grammars. We present a transformation generator called ALCHEMIST which is based on TT-grammars. ALCHEMIST has been extended with semantic actions in order to make it possible to build full scale transformations. ALCHEMIST has been extensively used in a large software project for building a bridge between two development environments.

The second technique is a tree transformation method especially targeted at SGML documents. The technique employs a transformation language called TranSID, which is a declarative, high-level tree transformation language. TranSID does not require the user to specify a grammar for the target representation but instead gives full programming power for arbitrary tree modifications. Both ALCHEMIST and TranSID are fully operational on UNIX platforms."

The dissertation was presented on June 18, 1997. It is available online in Postscript format, via FTP or HTTP; [local archive copy]. See also a list of the author's publications, and some publications of the University of Helsinki SID [Structured and Intelligent Documents] project.

[CR: 19971125]

Lindgren, Lars-Olof. "Information Modeling for Document Management: the Key to Successful System Selection and Deployment." Page(s) 145-146 in SGML '97 Conference Proceedings. SGML Europe '97. "The Next Decade - Pushing the Envelope." Princesa Sofia Intercontinental, Barcelona, Spain. 11-15 May, 1997. Sponsored by Graphic Communications Association (GCA) and SGML Open. Conference Chair: Pamela L. Gennusa (Director, Database Publishing Systems Ltd). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 342 pages, CDROM. Author's affiliation: Managing Director, Texcel International.

Summary: "This session will address the central role that identification and analysis of document components plays in the selection, design, and deployment of document management systems. The key to success in installing such a system is a thorough analysis of your information model. The SGML document analysis process is the cornerstone of this effort. The total effort should be no less rigorous than that used for designing and deploying any database management system. Additionally, information analysis should be independent of the deliverables such as paper documents that traditionally inform the design of a document management system."

"SGML is today's most powerful comprehensive object model for document information, and as such is an ideal mechanism to migrate the underlying document structure need to change to move from a file based to a component based DBMS system. The SGML document analysis process is the starting place for the information analysis required for successful component management."

[CR: 19960403]

Linkins, Kim Fulcher. "Sun, Berkeley Library Provide Materials over Internet." The Sun Observer 10/4 (April 1996) 1, 8. ISSN: [?]. Author's affiliation: The Sun Observer, Editor.

The article is based upon an interview with Roy Tennant, Sunsite Project Manager. According to the Sun Observer, Sun Microsystems "provides the hardware to enable Berkeley's Digital Library Project to make photos, literature, and other artwork available to any user for downloading."

See provisionally: "Finding Aids for Archival Collections: SGML Translated on-the-fly Into HTML". [need also document URL]

[CR: 19961226]

Lloyd, Chris; Craven, Bruce. "Building an Object Oriented Database Management System for SGML." Pages 609-614 in SGML '96 Conference Proceedings. Celebrating a Decade of SGML. SGML '96 Conference, Boston, MA, November 18-21, 1996. Sponsored by The Graphic Communications Association (GCA). [Edited by] Conference Co-Chairs: B. Tommie Usdin and Deborah A. Lapeyre. Alexandria, VA: GCA, 1996. Extent: 711 pages. Authors' affiliation: F. A. Davis Company.

Abstract: "Maintaining large amounts of SGML data in separate files on a file system has always been a difficult proposition. Trying to coordinate a distributed workgroup environment is even more difficult. Simple mechanisms such as ID and IDREF can become a nightmare on even small projects. A database environment offers many exciting possibilities for features such as version control, sharing, validation, and distribution. The challenge is to develop a system that is capable of accepting any SGML document and flexible enough to support many different SGML database applications."

Note: The above presentation was part of the "SGML Case Studies" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.

[CR: 19950716]

Lo, C. Y. "Integrating links and versioning in document management." Australian Computer Science Communications. [Eighteenth Australasian Computer Science Conference. ACSC'95, Glenelg, SA, Australia, 1-3 February, 1995.] 17/1 (1995) 339-346 (with 14 references). Author's affiliation: Department of Computer Science, R. Melbourne Institute of Technology, Vic., Australia.

"Abstract: Links and versioning are two important aspects of document management. Supports in these two areas could certainly enhance the functionality of a document management system. However, in addition to the problems posed by individual areas, the integration of the two generates more considerations. The situation is further complicated by the extensive scope and many possibilities in this context. This paper thus attempts to describe a specific set of link versioning behaviours to provide a platform to explore the various issues of link versioning. Based on this model and the SGML environment, two methods to handle link versioning are presented and analysed."

[CR: 19990312]

Lobin, Henning. Informationsmodellierung in XML und SGML. Berlin/Heidelberg: Springer-Verlag, [forthcoming] 1999. Extent: 250 pages. ISBN: 3-540-65356-2. Author's affiliation: Universität Bielefeld - Fakultät für Linguistik und Literaturwissenschaft; WWW: http://coli.lili.uni-bielefeld.de/~lobin/; Email: lobin@lili.uni-bielefeld.de.

In this book, Lobin endeavors "to describe the field from a somewhat distinct perspective, with a strong emphasis on architectures and some other SGML extended facilities defined in the HyTime standard. There are chapters dealing with the grammatical restriction of PCDATA and CDATA content using architectures or the use of LINK for a flexible architectural mapping."

Summary [translated from the German]: "The Extensible Markup Language (XML), a simplified version of the Standard Generalized Markup Language (SGML), was developed for the exchange of structured data on the Internet. With it, information can not only be structured in a uniform, media-independent format; the structuring principles themselves can also be described by a formal set of rules, a grammar. Only this makes possible further processing such as guided data entry, data conversion, and flexible navigation and viewing of the data. In addition to elementary information modeling, meta-structuring by means of so-called architectures has added a new aspect: the object-oriented layering of structure grammars. This book presents both elementary and architectural structuring techniques in a coherent form for the first time. It is aimed at readers who want to engage with the subject in detail and with a practical orientation."

Contents [translated from the German]: "Introduction. - Part I. Primary Structuring - Structure Grammars: Elements. - Attributes. - Documents. - SGML Versions. - Part II. Secondary Structuring - Architectures: Secondary Structuring through Architectures. - Declaration of Architectures. - Architecture Definition and Link Process Declarations. - Applications. - Appendix: A. Standardized Information Models. - B. XML Syntax Rules with SGML Extensions. - C. Architectural Processing in SP. - D. SGML Declarations for XML. - E. List of Figures. - F. List of Definitions and Examples. - G. Index. - H. Materials." [from Springer Verlag]

Keywords: Standard Generalized Markup Language (SGML), Extensible Markup Language (XML), HyTime, information modeling, text structuring.

[CR: 19950716]

Loeffen, Arjan. Statements on objects in SGML based document design. LET.RUU Research Document. June 27 1995. Extent: approximately 12 pages.

This document emerges from the author's thesis research. The "statements" in this document should be of interest and use to designers of object-oriented databases. The document is available online at http://www.let.ruu.nl/departments/C+L/loeffen/phdthes/statemen.htm. [mirror copy, July 21, 1995, text only].

Loeffen, Arjan. "Text Databases: A Survey of Text Models and Systems." Work paper, [no number]. Utrecht: University of Utrecht, [1994]. 10 pages. Author's address: Arjan Loeffen, Faculty of Arts, University of Utrecht; Achter de Dom 22-24; 3512JP Utrecht; The Netherlands; ++31+30536417 (voice work); ++31+206656463 (voice home); ++31+30539221 (fax work); Email: Arjan.Loeffen@LET.RUU.NL

Available in Postscript format as "sigmod.uue" [encodes "sigmod.ps"] through anonymous FTP. Note that other valuable research papers on SGML from Arjan Loeffen are available from the same FTP server: see the subdirectory "models" and the subdirectory "sgml-model".

Abstract: "Text models focus on the manipulation of textual data. They describe texts by their structure, operations on the texts, and constraints on both structure and operations. In this article common characteristics of machine readable texts in general are outlined. Subsequently, ten text models are introduced. They are described in terms of the datatypes that they support, and the operations defined by these datatypes. Finally, the models are compared." [The text models discussed include: TDM (relational model based upon nonfirst normal form), P-string model, PAT (University of Waterloo), TOMS ("textual object management system" - an indexing toolkit), the containment model, MdF ("Monads-dot-Features"), the Banyan system, Extended MAESTRO, Grif, and Multos.]

Loeffen, Arjan. "Text databases: a survey of text models and systems." SIGMOD Record 23/1 (March 1994) 97-106. 23 references. Author affiliation: Faculty of Arts, Utrecht University, Netherlands.

Abstract: Text models focus on the manipulation of textual data. They describe texts by their structure, operations on the texts, and constraints on both structure and operations. In this article common characteristics of machine readable texts in general are outlined. Subsequently, ten text models are introduced. They are described in terms of the datatypes that they support, and the operations defined by these datatypes. Finally, the models are compared. The models include the TDM text data model based on nonfirst normal form, p-string model, PAT text model, TOMS textual object management system and the containment model.

[CR: 19970403]

Loeffen, Arjan. Toward Semantic Specifications for SGML Encoded Documents. Utrecht University Technical Report. Utrecht, NL: Utrecht University, 1996. Extent: approximately 18 pages. Author's affiliation: Utrecht University. Email: Arjan.Loeffen@let.ruu.nl; WWW: http://CandL.let.ruu.nl/staff/loeffen.htm.

Abstract: "In this article I intend to show that the current mechanisms for specifying how SGML encoded documents are to be processed may not be adapted to express the intent, or meaning, of the encoding strategy applied. First, a short survey of SGML essentials is given. SGML document processing is introduced, and common approaches for specifying such processes are described. These processes concern a small application domain. Some inherent restrictions of SGML and current processing techniques are discussed. Next, an object-oriented view on the document is given, and its application as a processing framework is outlined. Finally, semantic specifications are introduced, that allow for validation and processing specifications to be recorded and exchanged in the form of semantic specification sheets."

Also: in Interdisciplinaire Onderzoeksconferentie Informatiewetenschap 1996, Delft, 1996.

Available online: http://CandL.let.ruu.nl/preprint/stinfon/stinfon.htm, or in Postscript format; [mirror copy, Postscript].

[CR: 19970817]

Logan, Elisabeth; Pollard, Marvin. "[Special Issue Volume] Introduction." Pages 581-582 in Structured Information/Standards for Document Architectures. Edited by Elisabeth Logan and Marvin Pollard. = Journal of the American Society for Information Science, Special Issue. Volume 48, Number 7 (July 1997). New York: John Wiley & Sons Inc., 1997. ISSN: 0002-8231. Authors' affiliation: [Logan]: School of Library and Information Studies, Florida State University, Tallahassee, FL 32306-2048; Email: logan@mailer.fsu.edu; [Pollard]: College Center for Library Automation, 1238 Blountstown Road, Tallahassee, FL 32204, Email: mpollard@calstate.edu.

Abstract: "The need for organizations and industries to increase the efficiency of using document information has led to the development and adoption of standards for document architectures. The use of networked computers to author, exchange, manipulate, store, retrieve, present, use, and re-use information has simultaneously created the possibility and the need for adopting standards for interchanging digital document information. Structured document information systems require the attention of producers and users of information today because growing document repositories are recognized as valuable information assets. Implementing standards-conforming, structured information systems increases the value of these document repositories, but doing so requires serious rethinking of the ways document information is produced, stored, and distributed. This Special Issue of JASIS addresses the standards of structured information and document architectures, the issues surrounding the implementation of these standards for organizations and persons working towards the goal of using document information more efficiently, and explores the future of structured document information systems."

See other details concerning the original call for papers and the significance of this special issue, appearing as the ninth in a series of several special topics issues of JASIS, following the announcement by Donald H. Kraft (April, 1992 issue of JASIS: "A Call to Action in Response to Happy Days," Editorial, Journal of the American Society for Information Science 43/3, April 1992, page 302).

[CR: 19970817]

Logan, Elisabeth; Pollard, Marvin E. (special issue guest editors). Structured Information / Standards for Document Architectures. Journal of the American Society for Information Science [Special Issue] = Volume 48, Number 7 (July 1997). New York, NY: American Society for Information Science, 1997. ISSN: 0002-8231. Author address: [Logan] Elisabeth Logan, School of Library and Information Studies, Florida State University, Tallahassee, Florida 32306-2048 USA; Voice: + 1 904 644-8106; Fax: + 1 904 644-9763; E-mail: logan@mailer.fsu.edu; [Pollard]: Project Manager, Unified Information Access System, California State University, P.O. Box 3842, Seal Beach, Ca 90740-7842. Tel: +1 562 985-9492, FAX: 562 985-9414, Email: mpollard@calstate.edu. Or contact Wiley: http://www.wiley.com/compbooks/compjournals/jasis.html.

Abstract [from the call for papers]: "The need for organizations and industries to increase the efficiency of using document information has led to the development and adoption of international standards for document architectures. The use of networked computers to author, exchange, manipulate, store, retrieve, present, use, and re-use information has simultaneously created a need for and the possibility of adopting standards for interchanging digital information. Structured document information systems require the attention of producers and users of information today because growing document repositories are recognized as valuable information assets. Implementing standards-conforming, structured information systems can increase the value of these document repositories, but doing so requires serious re-thinking of the ways information is produced and distributed. Papers are solicited on topics which will address research and development issues that will: (a) introduce the concepts underlying structured information, (b) address the evolution of the standards of document architectures and (c) address the issues surrounding the implementation of these standards for organizations and persons working towards the goal of using information more efficiently. Papers exploring the future of structured information systems are welcome.

Although research articles and empirical studies will be favored, state of the art reviews or position papers on SGML and other international standards as well as DTD's from government or industry will be considered. These might include, for instance, HTML (Internet WWW), CALS (Department of Defense), ICADD (Publishing), or TEI (Academic Community) as well as many others."

This ninth special topics issue of JASIS is scheduled to appear in mid-1996. It will cover the topic of Standards for Document Architectures: SGML (Standard Generalized Markup Language), HyTime (Hypermedia/Time-based Structuring Language), DSSSL (Document Style Semantics and Specification Language), and SPDL (Standard Page Description Language). The guest editor for this special issue is Elisabeth Logan of Florida State University. This special issue appears in a sequence of several issues, following the announcement by Donald H. Kraft (April, 1992 issue of JASIS: "A Call to Action in Response to Happy Days," Editorial, Journal of the American Society for Information Science 43/3, April 1992, page 302).

The collection of articles is presented in bibliographic summary within a dedicated document. See also the individual entries: (1) Logan, Elisabeth; Pollard, Marvin. "[Special Issue Volume] Introduction." (2) Weibel, Stuart. "In Memoriam: A Tribute to Yuri Rubinsky, August 2, 1952 -- January 21, 1996." (3) Marcoux, Yves; Sévigny, Martin. "Why SGML? Why Now?" (4) Mason, James David. "SGML and Related Standards: New Directions as the Second Decade Begins." (5) Adler, Sharon C. "The ``ABCs'' of DSSSL." (6) Kimber, W. Eliot; Woods, Julia A. "Application of HyTime Hyperlinks and Finite Coordinate Spaces to Historical Writing, Analysis, and Presentation." (7) Flynn, Peter. "W[h]ither the Web? The Extension or Replacement of HTML." (8) Barnard, David T; Ide, Nancy M. "The Text Encoding Initiative: Flexible and Extensible Document Encoding." (9) Sengupta, Arijit; Dillon, Andrew. "Extending SGML to Accommodate Database Functions: A Methodological Overview." (10) Fausey, Jon; Shafer, Keith. "All My Data Is in SGML. Now What?" (11) Salminen, Airi; Kauppinen, Katri; Lehtovaara, Merja. "Towards a Methodology for Document Analysis." (12) Goldfarb, Charles F. "SGML: The Reason Why and the First Published Hint."

See: [July 1997] the online Table of Contents, [archive copy]; or "Call for Papers" - http://www.asis.org/Publications/JASIS/structure.html; [mirror copy].

Logan, Harry M. "Report on a New OED Project: A Study of the History of New Words in the New OED." Computers and the Humanities [issue = Proceedings of the Eighth International Conference on Computers and the Humanities] 23/4-5 (1989) 385-395.

The article explains the use of the PAT retrieval program, which scans the OED text using the descriptive tags in the dictionary, in conjunction with GOEDEL (Generalized OED Extracting Language). Both PAT and GOEDEL were developed at the University of Waterloo Centre for the NOED, and are being used as generalized retrieval software facilities at other institutions. Tables III and IV provide examples of the SGML-style tagged text of the New OED.

This issue of CHUM contains the Proceedings of the Eighth International Conference on Computers and the Humanities (9-11 April 1987, Columbia, South Carolina), and is edited by Robert Oakman.

Logan, Robert. "Role of Standards Growing." Computing Canada 21/2 (January 18, 1995) 32.

"Abstract: Document processing systems help organizations get a handle on their masses of paper. A Gartner Group study estimates that professionals spend as much as half their time searching for documents, but only 15 percent of their time reading them. Document management comprises such technologies as workflow, full text retrieval and RDBMS. Though its roots are in image processing, the latest document management approach is to treat documents as dynamic information objects that drive enterprise decision-making and business processes, rather than simply text and pictures on a page. A successful document management system will metamorphize legacy information into a knowledge repository. Organizations should only choose solutions that conform to international standards, such as Standard Generalized Markup Language (SGML) and the Continuous Acquisition and Life-cycle Support (CALS), developed by the US Defense Dept.

[CR: 19951113]

Louarn, Philippe. "Documents électroniques: une application." Cahiers GUTenberg Number 19 (janvier 1995) 121-126. Author's affiliation: Irisa/INRIA [Institut National de Recherche en Informatique et en Automatique], Rennes, Campus de Beaulieu, F-35042 Rennes; E-mail: Philippe.Louarn@irisa.fr.

Résumé: Bien que saisi sous une forme électronique, le rapport d'activité de l'Inria n'avait jamais été traité sous cette forme. Cet article décrit la procédure mise en place, s'appuyant sur la norme SGML, pour exploiter par divers vecteurs (www, Minitel, ftp,...) l'important volume d'information contenu dans ce rapport. Nous évoquerons les problèmes rencontrés, les apports de ce nouveau système et concluerons sur les perspectives ouvertes par ce processus.

Abstract: Each year, Inria produces an activity report. Although this report is typeset in an electronic form, it was never exploited in this way. This paper describes a new process, based on SGML, which allows users to access to the report by different ways (WWW, Minitel, ftp,...). Advantages and disadvantages of this process will be shown and future developments will be presented.

Available on the Internet in Postscript format: ftp://ftp.irisa.fr/opera/doc/ra.ps.gz [mirrored copy, November 1995].

[CR: 19961018]

Lovegrove, William S.; Brailsford, David F. "Document Analysis of PDF Files: Methods, Results and Implications." Pages 207-220 (with 17 references) in EP '96. Proceedings of the Sixth International Conference on Electronic Publishing, Document Manipulation and Typography. [ = Journal Special Issue: Electronic Publishing - Origination, Dissemination and Design (EPODD), June & September 1995, Volume 8, Issues 2-3.] Sixth International Conference on Electronic Publishing, Document Manipulation and Typography, Palo Alto, California. September 24-26, 1996. Sponsored by Adobe Systems Incorporated; School of Information Management and Systems, University of California at Berkeley; Xerox Corporation. [Proceedings Volume] Edited by Allen Brown, Anne Brüggemann-Klein, and An Feng; [Journal] Editors David F. Brailsford and Richard K. Furuta. Chichester/ New York: John Wiley & Sons, 1996. ISSN: 0894-3982. Authors' affiliation: Electronic Publishing Research Group, Department of Computer Science, University of Nottingham, Nottingham NG7 2RD, United Kingdom. Email: wsl@cs.nott.ac.uk; dfb@cs.nott.ac.uk.

Abstract: "A strategy for document analysis is presented which uses Portable Document Format (PDF - the underlying file structure for Adobe Acrobat software) as its starting point. This strategy examines the appearance and geometric position of text and image blocks distributed over an entire document. A blackboard system is used to tag the blocks as a first stage in deducing the fundamental relationships existing between them. PDF is shown to be a useful intermediate stage in the bottom-up analysis of document structure. Its information on line spacing and font usage gives important clues in bridging the 'semantic gap' between the scanned bitmap page and its fully analysed, block-structured form. Analysis of PDF can yield not only accurate page decomposition but also sufficient document information for the later stages of structural analysis and document understanding."

[CR: 19970518]

Lowe, Brian; Zobel, Justin; Sacks-Davis, R. "A Formal Model for Representation and Querying of Structured Documents." Journal of Systems Integration 7/1 (January 1997) 31-46 (with 32 references). Authors' affiliation: Department of Computer Science, RMIT, Melbourne, Victoria, Australia. [Zobel] Email: jz@cs.rmit.edu.au.

"Abstract: Most documents have a hierarchical structure, which can be made explicit by markup languages such as SGML. W e propose a formal model for representation of hierarchically structured documents, to be used as the basis for document query languages. The model uses a redundant representation of the document elements to simplify the expression of common queries. As an illustration of the power of the model we show how queries might be expressed, both as set theoretic expressions and in a simple algebra, and outline how queries might be evaluated in a practical system."

