The OASIS Cover Pages: The Online Resource for Markup Language Technologies
Last modified: July 15, 1998
SGML News. What Was New, Relatively New, or New in the 'SGML Web Page' in 1996?

Related News:   [SGML News for 1995] -   [SGML News for 1997] -   [SGML/XML News for 1998]

  • December 17, 1996. New entry for the California Heritage Digital Image Access Project. "The California Heritage collection is a 'digital' archive containing photographs, pictures, and manuscripts from the collections of the Bancroft Library. . . The project's central objective has been to build a prototype demonstration database that, by the project's end, will provide collection-level access to 25,000 digital representations of primarily source materials documenting California history, which have been selected from the collections of The Bancroft Library. When fully developed, the prototype will also use USMARC collection-level records to provide access to its EAD encoded finding aids and digital images. While the USMARC access component of this project is still under development, direct access to the encoded finding aids is currently provided here through a WWW interface (DynaWeb) that lets users directly search and navigate the SGML encoded finding aids and digitized primary source materials. . . The online finding aid collection can be searched using a rich and powerful query language. In addition to the full range of standard search tools--wildcards, proximity searching, boolean searching--users can search the underlying SGML with which all of the finding aids in this database are marked up."

  • December 17, 1996. New entry for SSML: A Speech Synthesis Markup Language. "SSML is an application of the Standard Generalized Markup Language (SGML). The input is text-based and unconstrained in its use of words, but allows a large amount of extra information to be included with the text to guide the synthesis in its performance. The text is annotated with markers to specify features such as emphasis, particular speech styles, or the beginning of new topics. . . SSML version 1.0 is now fully incorporated into the Festival speech synthesis system, which is available for general use."

  • December 16, 1996. New entry for the ETAP - Uppsala University Parallel Corpus Project. ETAP is "one of the research projects at the department of Linguistics in Uppsala University, Sweden: 'Etablering och annotering av parallelkorpus för igenkänning av översättningsekvivalenter' (in English: 'Creating and annotating a parallel corpus for the recognition of translation equivalents'). This project is a part of The Stockholm-Uppsala Research Programme 'Translation and Interpreting - A Meeting between Languages and Cultures' financed by the National Bank of Sweden (Riksbankens Jubileumsfond). . . the project has resulted in two parallel, aligned, subcorpora, the Scania Corpus and the Swedish Statement of Government Policy Corpus. Text structure in the documents in the two corpora has been automatically marked up with TEI Lite conformant SGML by means of software developed in the project."

  • December 13, 1996. Announcement for a preliminary DSSSL stylesheet for TEI-Lite, contributed by Richard Light. The stylesheet is available from the UNC Sunsite FTP server in ZIP format or in UNIX tar/gzip format. The UNC Sunsite server also has a DSSSL stylesheet for print output of HTML 3.2 documents, created by Jon Bosak. The stylesheets can be used with James Clark's 'Jade' DSSSL engine to generate print output from SGML (TEI-Lite) and HTML 3.2 documents.

  • December 12, 1996. Update of the Text Encoding Initiative's "Project Descriptions" Page to include LE (Language Engineering)-PAROLE. "LE-PAROLE will produce and make available a harmonised set of corpora and lexica for all the Union languages. The corpora and lexica produced in LE-PAROLE will constitute the initial nucleus of a set of written language resources. In subsequent phases this will be gradually enlarged in coverage and size, and enriched with additional linguistic information. Emphasis is placed on the production of written language resources for LE, but the resources will also be extremely useful for applications in other telematics areas and, in general, in the 'Information society' framework. . . The different lexicons built for each language will be conformant to the PAROLE model and will be delivered in the reference format, i.e. an SGML file conformant to the PAROLE DTD instantiation for the language concerned." [extracted from the TEI description] For further information, see the PAROLE Web site (under construction as of December 11, 1996), or the LE-PAROLE entry in the TEI Application Page, where some fifty-seven (57) projects using TEI-SGML are now referenced.

  • December 12, 1996. New entry for the FAST (Finding Aids SGML Training) Track project of the Research Libraries Group (RLG). The RLG FAST Track project is designed to "enhance national and international access to primary sources through digitized finding aids linked to their RLIN collection-level records. Finding aids are guides that provide detailed descriptions of the content of archival collections; they can form a valuable bridge between collection-level cataloging and whole information objects. The FAST (Finding Aids SGML Training) Track is a series of workshops designed to train RLG members in encoding their finding aids with Standard Generalized Markup Language (SGML) according to the Encoded Archival Description (EAD) standard and guidelines." The finding aids documents will become part of the RLG digital collections, a current example of which is Arches (Archival Server and Test Bed).

  • December 09 [10], 1996. Seybold Publications' "Story of the Week" is a feature article by Mark Walter on XML (Extensible Markup Language), published in the December 1996 issue of Seybold Report on Internet Publishing. The article title: "W3C Publishes Draft of Simplified SGML. At Last a Sensible Way to Extend HTML." See the database entry on XML for more information on the Extensible Markup Language, and the Seybold entry for Seybold contact addresses. Further bibliographic detail, including a link to the online PDF version, is provided in the bibliography entry. [Note: We extend our appreciation to Mark Walter and others at Seybold Publications for their excellent coverage of SGML and related standards in the Seybold Report series.]

  • December 04, 1996. New entry for Harvard/Radcliffe Digital Finding Aids Project (DFAP). "In response to the recommendations of the Special Collections Task Force report (1994), the Harvard University Library Automation Planning Committee established the Digital Finding Aids Project (DFAP) in February 1995. DFAP's charge is to plan and oversee the design and deployment of a new computer application system to store, search, and retrieve digital finding aids in Standard Generalized Markup Language (SGML) format for all of [the forty-nine (49)] Harvard/Radcliffe repositories in a shared database. Presently, eight repositories are participating in the project: Baker Library (Business School), Design School, Divinity School, the Gray Herbarium and Houghton Library (Harvard College Library), Law School, Schlesinger Library on the History of Women in America (Radcliffe), and the Harvard University Archives."

    "The DFAP Web site includes a history of the project to date; Harvard guidelines for using SGML for finding aids, based on the Encoded Archival Description (EAD), a proposed national standard; repository-specific versions of the guidelines; and a growing number of SGML-encoded finding aids. . . The Digital Finding Aids Project is the first step towards establishing a strong administrative structure for the creation and technical support of SGML-encoded finding aids at Harvard. The project is a collaborative effort involving curators, archivists, catalogers, electronic text specialists, and library systems people, engaging participants from across Harvard."

  • December 03, 1996. New entry for the Railroad Industry Forum: Electronic Parts Catalog Exchange Standard (EPCES). "The Railroad Industry Forum (RIF) is a task team of the National Association of Purchasing Managers who were tasked to develop a standard for the exchange of electronic parts catalog data within the North American railroad industry. The RIF members are comprised of major railroads and railroad manufacturers. SoftQuad and Applied Image Technology (AIT) provided the RIF technical expertise during the two-year effort for completion of the standard. Mary McCarthy and Betty Harvey, on behalf of SoftQuad, Inc. developed the EPCES DTD. SGML -- Standard Generalized Markup Language -- is utilized by the EPCES to encode text, data and provide for the linking of information elements. There is a high level of correlation between ATA Illustrated Parts Catalog (IPC) structures and the RIF sample parts catalogs. However, the ATA IPC standard incorporates more advanced effectivity and revision control information than evidenced by the RIF samples and has richer markup capability due to a more stringent style guide for ATA parts catalogs." [extracted]

  • December 02, 1996. New entry for a project sponsored by the University of North Carolina at Chapel Hill: Documenting The American South: The Southern Experience in 19th Century America. According to the project description, the online database "presents primary sources documenting the culture of the American South from the viewpoint of Southerners. We plan to scan and encode texts and digitize images so that faculty and students at colleges, universities, and even secondary schools throughout the South - and the world - can use them. This database is the first stage of a larger project to document the cultural history of the American South. It will offer diaries, autobiographies, travel accounts, titles on slavery and regional literature drawn from the splendid Southern holdings of the UNC--CH Academic Affairs Library. We have begun with testimonial materials because students use them heavily, and we believe they are of interest to a larger audience." All the selected materials are encoded according to the Text Encoding Initiative (TEI P3) SGML-based Guidelines, using TEILite.DTD (version 1.6).

  • November 30, 1996. New entry for the SGML Initiative in Health Care - an initiative of HL7 (Health Level-7). "The HL7-SGML Initiative is a special interest group of HL7 formed to create the standard for the use of SGML in all domains of health care. This standard will comply with ISO 8879 (SGML) and SGML-related standards and complement other appropriate standards." "HL7 was founded in 1987 to develop standards for the electronic interchange of clinical, financial and administrative information among independent health care oriented computer systems; e.g., hospital information systems, clinical laboratory systems, enterprise systems and pharmacy systems." In August 1996, the HL7 Technical Steering Committee authorized the creation of an SGML SIG as part of a larger initiative to integrate SGML into medical informatics standards. "HCML" is a proposed abbreviation for the evolving markup language: "Health Care Markup Language."

  • November 30, 1996. Announcements for two new SGML User Group initiatives: (1) In Poland, a communique from Rafal Ksiezyk of Warsaw University about a proposed SGML Users' Forum in Poland (provisional WWW site, link possibly broken); see also "SGML in Poland, Users' Meeting Point". (2) In Italy, a communique from Maurizio Vianello about the founding meeting of the SGML Users' Group Italia (SUGI), December 4, 1996. See the SGML User Groups main entry for other information on the national SUGs.

  • November 30, 1996. Announcement for a GCA-sponsored XML Conference, "Selling SGML: Using XML on the Web." March 10-12, 1997, San Diego, California. See the main conference entry for details.

  • November 24, 1996. For the benefit of many who have been heard to ask themselves (usually more than once in the past several years), "Who is this James Clark, anyway?" -- an online biographical sketch.

  • November 22, 1996. Further summary descriptions of XML (Extensible Markup Language), presented by members of the W3C SGML Editorial Review Board at the SGML '96 Conference in Boston, November 17 - 21, 1996. The proposed XML standard took center stage at the conference. See the summaries section in the XML entry.

  • November 22, 1996. Announcement for the HyBrowse HyTime Browser, from TechnoTeacher, Inc. "HyBrowse is a true HyTime (ISO/IEC 10744) hyperdocument browser for Windows 95 and Windows NT. It is useful for developing electronic document architectures that employ HyTime's strongly typed location-independent linking mechanisms." HyBrowse is publicly available (free) for a trial period of 45 days. In addition to standard features one would expect, it supports: (1) True HyTime independent hyperlinking; (2) User-defined strong hyperlink typing with [a] icons assignable to anchor roles over entire bounded object set (BOS), [b] rendering styles assignable to anchor roles over entire BOS; (3) HyTime-conforming address elements; (4) Aggregate location and hyperlink traversal handling; (5) Arbitrary BOS awareness allows users to add (import) a document into the current BOS; (6) Re-open browsing sessions without reparsing or reprocessing.

    Eliot Kimber writes: "NOTE: HyBrowse is intended as a tool for creating prototypes and demos of HyTime features. It is not intended to be a production-quality information delivery system. The formatting features are minimal compared to Panorama or DynaText but sufficient to demonstrate the very interesting things you can do with independent links and anchors thereof. If you've been thinking of ways that HyTime hyperlinking could solve some of your information management problems but never had a way to realize or test those ideas, now you do, for free." See: the description and download instructions on the TechnoTeacher server, and supporting documentation from W. Eliot Kimber.

  • November 20, 1996. New entry for the EUROMATH project. "The EUROMATH project was funded through the SCIENCE programme of the European Union and administered through the European Mathematical Trust based at the University of Kent at Canterbury, UK. A major goal of the EuroMath Project is the design and development of the EUROMATH Editor, part of the EUROMATH system. The EUROMATH system. . .is designed to incorporate an editor capable of handling mathematical documents, accessing and creating mathematical databases, an electronic mail interface and computer algebra capability. The EUROMATH system is built upon the WYSIWYG approach and the full SGML-compatibility of the Grif SGML structured editor developed by Grif SA, St. Quentin. Additional functionality (also EUROMATH applications) has been developed by a number of partners."

  • November 19, 1996. Announcement from Jon Bosak for new version of a DSSSL stylesheet for HTML 3.2 printouts. Using a DSSSL engine such as James Clark's Jade, this DSSSL stylesheet may be used to generate customized print views of HTML 3.2 documents (via RTF, TeX, etc.). The stylesheet has been designed for easy modification, and tutorial instructions illustrate how the output can be modified -- including customizations based upon a modified DTD. Although the stylesheet is lacking support for a few extended HTML features, it supports "features missing from HTML 3.2 such as headers, footers, optional autonumbering of heads and table captions, automatic TOC (Table of Contents) generation, and the correct and completely extensible interpretation of named units in size and length attributes."

    The DSSSL stylesheet was created by Jon Bosak of Sun Microsystems, with assistance from Anders Berglund (EBT) and from James Clark. The distribution of the stylesheet includes the HTML 3.2 DTD, the DSSSL stylesheet, ISO Latin-1 entities, and an appropriate CATALOG. It also includes a thought-provoking article by Jon Bosak, "SGML, Java, and the future of the Web." The article discusses SGML in relation to HTML and XML, and includes a section "Advanced linking and stylesheet mechanisms." Available in HTML format (mirror copy). The source for the package is available in .ZIP format or in tar-gzip format from the UNC Sunsite FTP server; [mirror copy].

  • November 19, 1996. New database entry for SGML and Math. Status: provisional, incomplete.

  • November 17, 1996. Announcement for an update of the WG8 Web Site, maintained by Dr. James D. Mason. A recent document provides an index for the "ISO/IEC JTC1/SC18/WG8 Document Collection from the Boston Meeting, November 1996." Some twenty-nine (29) documents are referenced by title and author, and cover a wide range of topics: SMSL, ISO HTML, Topic Navigation Maps, Technical Corrigendum for ISO/IEC 10179: DSSSL, "Module" Structures in SGML, U.S. Contribution on SGML Review [mirror copy], SGML TC for Extended Naming Rules, Font Services, etc.

  • November 17, 1996. Notice from Bart Roozendaal about work in progress on a project called 'Sgml2Xml'. Sgml2xml is "a package of perl-scripts that enables you to convert SGML-documents to a HTML-publication of these documents on the fly. The converter can also be used as a off-line converter for creating more static documents." See the main URL: [Note: I have not been able to examine this package. -rcc]

  • November 17, 1996. An updated listing of the International SGML Users' Group Executive Council Members, with names and (contact) addresses for SUG Officers, Members, Elected Members, Chapters, and Special Interest Groups. For further information, see the main entry for the International SGML Users' Group, with National and Regional Affiliates.

  • November 17 [24], 1996. Availability of Seybold Reports online (Seybold Report on Publishing Systems, Seybold Special Report, Seybold Report on Desktop Publishing). Seybold Publications is offering a three-year archive of back issues in a searchable text (HTML) format; dates are September 1993 - August 1996. The Seybold Reports historically have given excellent coverage of SGML as it relates to the printing and electronic publishing industries. The online back issues can be accessed via top-level Table of Contents pages (e.g., SRPS Volume 25 [Sept. 1995 - Aug. 1996]), or through full-text searches, supported by a full-text search engine from Verity. Sample articles: (1) "Canadian Government Sinks Its Teeth Into Herculean SGML Effort", Liora Alschuler, from Seybold Report on Publishing Systems Vol 25, No 14; (2) "Making an Internet Newspaper: HTML and Signposts from Wyoming", by Mark Walter, from Seybold Report on Publishing Systems Vol 24, No 10, pages 1, 3-6. See also the bibliographic entry; (3) Applied Physics Letters Online: A Case Study in Online Journal Publishing [Physics Society Takes Its Journal Online], from Seybold Report on Publishing Systems Vol 25, No 8. See the Seybold entry in the Contacts section for further information.

  • November 12 [14], 1996. Announcement for the publication of Fragment Interchange. SGML Open Technical Resolution 9601:1996. The document is authored by the co-chairs of the SGML Open Fragment Interchange Subcommittee, Steve DeRose and Paul Grosso. "This is the Resolution defining the SGML Open fragment context specification language allowing for the interoperable interchange of fragments of valid SGML documents." See the bibliographic entry for complete bibliographic details, an extended abstract, and availability on the Internet; or see the announcement from Paul Grosso.

  • November 11, 1996. Announcement for the first beta release of Jade (James' DSSSL Engine). Jade is an implementation of the DSSSL style language developed by James Clark, author of the SGMLS and SP parser tools. "Jade is freely available, with source code, with no restrictions on commercial use. The development platforms are Windows 95 and Windows NT, but it also works on Unix. Jade allows you to display and print SGML documents; you control it by specifying a DSSSL style sheet. It has a modular design that allows you to add support for new output formats by adding a new "backend". At the moment the most mature backend is for RTF (as supported by Microsoft Word for Windows 95). A TeX backend has been contributed by David Megginson." See the Jade main entry on the Clark WWW server, or the Jade entry in the SGML/XML Web Page.

  • November 08, 1996. New entry for the SGML-based (DynaText) version of the Arden Shakespeare. The Arden Shakespeare is a major SGML-based reference resource for Shakespeare scholars, originally developed as an electronic text project by Database Publishing Systems Ltd of Swindon, UK in conjunction with Routledge, where Brad Scott was the Electronic Project Manager. The project's consulting editor was Jonathan Bate of University of Liverpool, UK. Thomas Nelson will soon be distributing the electronic text as The Arden Shakespeare CD-ROM.

    The Arden Shakespeare CD-ROM contains the complete works of Shakespeare in the Arden edition (plays as well as poems and sonnets), together with commentary and variant notes. Related texts are synchronized in a scrolling multiple-window display based upon a customized DynaText SGML browser application. SGML encoding permits searching within collections and tagged units: within all plays, within specific plays (acts, scenes), within groups of plays selected by the user, within prose or verse sections, within songs, asides, stage directions and speech prefixes, etc.
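    To illustrate the kind of structure-scoped searching described in this entry, here is a minimal sketch in Python. It is not the DynaText mechanism and does not use the Arden markup: the element names (play, act, scene, stagedir, speech, verse), the sample text, and the search helper are all hypothetical, and the sample is written as well-formed XML only so that the Python standard library can parse it.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified markup (assumption): the actual Arden DTD
# is not reproduced in this entry, so these element names are illustrative only.
SAMPLE = """
<play title="Hamlet">
  <act n="1">
    <scene n="1">
      <stagedir>Enter Barnardo and Francisco, two sentinels.</stagedir>
      <speech speaker="Barnardo"><verse>Who's there?</verse></speech>
      <speech speaker="Francisco"><verse>Nay, answer me. Stand and unfold yourself.</verse></speech>
    </scene>
  </act>
</play>
"""

def search(root, word, tag):
    """Return the text of every <tag> element whose text contains word (case-insensitive)."""
    word = word.lower()
    return [el.text for el in root.iter(tag) if el.text and word in el.text.lower()]

root = ET.fromstring(SAMPLE)

# The same word is looked up in two different kinds of tagged unit.
stage_hits = search(root, "enter", "stagedir")  # stage directions only
verse_hits = search(root, "enter", "verse")     # verse lines only

print(stage_hits)
print(verse_hits)
```

    Because the query is restricted by element type, the same word can be found in stage directions and excluded from spoken verse, which a search over unmarked plain text could not distinguish.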

  • November 08, 1996. New entry for ASRL. The Army SGML Registry and Library (ASRL) is a site for "the U.S. Army Publications and Printing Command (USAPPC) Standard Generalized Markup Language (SGML) Registry and Library (ASRL). The ASRL is part of the Digital Publications Development (DPD) Program, and is the Army operational site for the DOD CALS SGML Registry (CSR) and CALS SGML Library (CSL). The ASRL is the central SGML data repository and single-point source for Army-approved SGML objects and constructs for publications developers. USAPPC is the approving authority for all Army standard SGML objects and constructs."

  • November 04 [05], 1996. New entry for XML (Extensible Markup Language), based upon a publicly available description of XML in the form of a W3C Working Draft, for review by W3C members and other interested parties. A summary of the recent activities of the W3C SGML Editorial Review Board (ERB), chaired by Jon Bosak, is given below. Extensible Markup Language (XML) is descriptively identified as "an extremely simple dialect of SGML" the goal of which "is to enable generic SGML to be served, received, and processed on the Web in the way that is now possible with HTML," for which reason "XML has been designed for ease of implementation, and for interoperability with both SGML and HTML." XML is being "developed by a W3C Generic SGML Editorial Review Board formed under the auspices of the W3 Consortium in 1996 and chaired by Jon Bosak of Sun Microsystems, with the very active participation of a Generic SGML Working Group also organized by the W3C." [from the document Abstract]

    The official W3C document is: "Extensible Markup Language. W3C Working Draft 1-Nov-96, [WD-xml-961101]" edited by Tim Bray [Textuality] and C.M. Sperberg-McQueen [University of Illinois at Chicago]. This particular version is available via the W3C WWW server: The latest "working" version of the draft document is available from the Textuality WWW server:

  • November 01, 1996. Announcement from Geir O. Grønmo (Falch Infotek as) for version 0.2. " is a Perl5-script which converts Panorama stylesheets to DSSSL specifications. The script has been rewritten in order to be used with Paragraph-, Content-, Before- and After-properties are implemented. . . The generated DSSSL-specification can be seen more as a skeleton for creating complete DSSSL-specifications." See the link to the index page, or more information in the database section "DSSSL Software Tools" of the SGML/XML Web Page.

  • October 30, 1996. Announcement from Guy Teasdale for the online availability of two new books "in SGML." The volumes were created as part of an experiment in SGML-based electronic publication sponsored jointly by the École de bibliothéconomie et des sciences de l'information de l'Université de Montréal and the Presses de l'Université Laval. The volumes are accessible in SGML and HTML format. The books were encoded using the ISO 12083, and the HTML version was generated from the SGML source using a Panorama style sheet. A Web server at l'Université Laval is managing the online document delivery. See the link to the ULaval Web site, or the main entry for EBSI and GRDS (Groupe départemental de Recherche sur les Documents Structurés) at the University of Montreal.

    Details: "These two books are in French and were published to honour two distinguished professors of the Universite Laval (Quebec, Canada). The first one discusses the contributions of Marc-Adelard Tremblay to the field of Anthropology in the province of Quebec. The second one explains the work of an eminent sociologist, Fernand Dumont, in the understanding of the culture of Quebec. Seventy (70) scholars have contributed to these books."

  • October 28 [29], 1996. Announcement from Steve Pepper (Falch Infotek a.s), author of The Whirlwind Guide to SGML Tools and Vendors since 1992, for an enhanced version of the Guide that is being prepared for the SGML '96 Conference. The database uses a revised taxonomy that "now operates with 30 different categories, which makes it possible to express some of the finer distinctions between tools. Tools can now appear in more than one category, which solves at least some of the 'pigeonholing' problems [the Guide] has had in the past." Preliminary version: See the main bibliographic entry for the Whirlwind Guide for a document abstract and detailed information about its contents. Vendors are asked to check for accuracy in the description of their tools, since the information in this Guide will be printed in the proceedings of the SGML '96 conference.

  • October 28, 1996. Announcement from O. Rademakers of NICE technologies for "an inventory of all SGML related products." The listing is based upon questionnaires and a survey of similar databases. The survey has attempted to ascertain whether the products are still sold and supported, and to determine the user needs for SGML products. The database concentrates on vendors of products rather than on consultants and systems integrators. The listing is available in an HTML version and in RTF format. According to this particular survey, "there are around 130 SGML products on the market, of which about 50 are obsolete, withdrawn or not sold as off the shelf products."

  • October 25 [26], 1996. Announcement from David Allen (MD IPTC) concerning release of a production version 2 of the News Industry Text Format (NITF) DTD. The DTD design and development is sponsored by The International Press Telecommunications Council and the Newspaper Association of America. The NITF DTD is "harmonized with HTML and designed to allow news information to be transferred with markup and be easily transformed into an electronically publishable format." "This DTD is designed for news information distribution and has little structural, but significant content mark-up. During development we have attempted to include the HyTime ILINK capability as this is important to news providers who cannot afford to delay the news in order to enter links. As Panorama Pro [...] can display and resolve at least some HyTime link functions this was used for testing the DTD." See the NITF main entry in this database for further information on the NITF DTD, or see the main IPTC site. Eliot Kimber has supplied an evaluation of the HyTime aspects of the NITF 2.0 DTD.

  • October 19, 1996. Publication of the proceedings of the Sixth International Conference on Electronic Publishing, Document Manipulation and Typography held in Palo Alto, California, September 24-26, 1996. The proceedings volume is edited by Allen Brown, Anne Brüggemann-Klein, and An Fenn. It contains 240 pages, indexes, and 16 major articles; it is published by John Wiley & Sons: EP '96. Proceedings of the Sixth International Conference on Electronic Publishing, Document Manipulation and Typography. [ = Journal Special Issue: Electronic Publishing - Origination, Dissemination and Design (EPODD), June & September 1995, Volume 8, Issues 2-3.]

    I have created entries in the bibliographic database (with abstracts) for some of the articles most relevant to SGML and structured documents: Eila Kuikka and Airi Salminen, "Filtering Structured Documents in the SYNDOC Environment" [compare: Eila Kuikka, Jouni Mykkänen, Arto Ryynänen, and Airi Salminen, "Implementation of Two-dimensional Filters for Structured Documents in SYNDOC Environment"]; Jacco van Ossenbruggen, Anton Eliëns, and Bastiaan Schönhage, "Web Applications and SGML"; Patricia François, Philippe Futtersack, and Christophe Espert, "SGML/HyTime Repositories and Object Paradigms"; Philip N. Smith and David F. Brailsford, "Towards Structured, Block-Based PDF"; Ethan V. Munson, "A New Presentation Language for Structured Documents"; Xinxin Wang and Derick Wood, "XTABLE - A Tabular Editor and Formatter"; Helena Ahonen, "Automatic Generation of SGML Content Models"; Hélène Richy and Jacques André, "Typographic Sheets and Structured Documents"; P. R. King, "Modelling Multimedia Documents"; Anne Brüggemann-Klein, Rolf Klein, and Stefan Wohlfeil, "Pagination Reconsidered"; William S. Lovegrove and David F. Brailsford, "Document Analysis of PDF Files: Methods, Results and Implications"; Vijay Kumar, Richard Furuta, and Robert B. Allen, "Interactive Interfaces for Knowledge-Rich Domains".

  • October 19, 1996. Announcement from David Megginson for a utility 'psgml-dsssl.el', which works with the free 'psgml' editor under Gnu Emacs. Call it 'alpha software or a prototype'. "This small package works together with the PSGML editor in XEmacs or Gnu Emacs to produce a skeleton DSSSL style spec automatically for the current document's DTD." See the DSSSL Software Tools section for other details.

  • October 16 [November 04], 1996. Update of the document "Generic SGML over the Web" on the W3C Web server. This particular document is maintained by Dan Connolly (SGML activity contact) and Jon Bosak (SGML ERB and WG chair). Note specifically updated information on the work of the SGML Editorial Review Board: "The W3C has formed an SGML Working Group made up of SGML experts and an SGML Editorial Review Board made up of SGML experts who also have special standards and implementation experience to coordinate with existing related standards efforts and to provide specifications where needed to form a complete SGML Internet solution. Specific deliverables under development by the SGML WG/ERB include: (1) A specification for a simplified version of SGML suitable for Internet applications. Target delivery: draft by the SGML 96 Conference (November 1996); (2) A specification of standard hypertext mechanisms for SGML applications. Target delivery: draft by the WWW6 Conference (April 1997); (3) Public text and extensions needed to apply the DSSSL stylesheet language (ISO/IEC 10179) to Web browsers. Target delivery: draft by the SGML 97 Conference (December 1997)." See now: the database entry for XML (Extensible Markup Language).

  • October 12, 1996. Announcement from Earl Hood for a new release of the perlSGML tools -- a collection of perl software for processing SGML data. These SGML software tools run under Perl versions 4 and 5. "The most visible change is the new content hierarchy tree output of elements in a DTD. The new format preserves all the content model information." See the main entry for perlSGML for links and further description of the tools.

  • October 12, 1996. New entry for the (US) National Library of Medicine (NLM). The National Library of Medicine in Bethesda, Maryland, together with the William H. Welch Medical Library of Johns Hopkins University, has been building medical databases for full-text information delivery using SGML encoding since about 1988. The Online Reference Works (ORW) project and the more recent HSTAT database (Health Services/Technology Assessment Text) are examples of SGML-based electronic resources which continue this effort.

  • October 12, 1996. New entry for the University of Waterloo English Department - Technical Writing Course Using SGML. English 210E is a second year technical writing course that uses three DTDs, and requires written assignments encoded in SGML.

  • October 09 [15, 22, 25, 28], 1996. Communiques from Don Thieme and Debbie Lapeyre summarizing the highlights of the upcoming SGML '96 Conference, "Celebrating a Decade of SGML," sponsored by GCA. Press releases: #1, #2, #3, #4, #5. Dates: November 18-21, 1996, Sheraton Boston Hotel and Towers, Boston, MA. For other information on this important conference, see the SGML '96 entry in the Conference section of the SGML/XML Web Page.

  • October 06, 1996. New entry for Earth Interactions: An Electronic Journal in SGML. Earth Interactions ("An Electronic Journal Serving the Earth System Science Community") represents "a collaborative effort in which the American Geophysical Union (AGU) and the Association of American Geographers (AAG) are joining with the American Meteorological Society (AMS) as copublishers. The Oceanography Society (TOS) and the Ecological Society of America (ESA) have cooperated in the planning of this journal, as well. . . Individual articles within the journal will be coded in Standard Generalized Markup Language (SGML) rather than HTML and will be viewable by an SGML Web viewer that can be launched from the standard Web browser."

  • October 06, 1996. Leonard Norrgard reports that on Friday, October 04, 1996 at SGML Finland '96, Citec announced its new Multidoc Pro SGML browser. It is characterized as an "SGML browser on-the-fly," and is based upon Synex Viewport. A fuller description of Multidoc Pro is available in a press release from Citec. See also the entry for CITEC Information Technology in the Contacts section of the SGML/XML Web Page.

  • October 03, 1996. Announcement from Henry S. Thompson (Human Communication Research Centre, University of Edinburgh) for the public availability of DSC --- DSSSL Syntax Checker version 0.7. "This tool, implemented in Scheme, is designed both to provide an offline syntax checker for all DSSSL expression, style and transformation language programs, and to serve as a preprocessor for any Scheme-embedded DSSSL implementation. Virtually the entire language as specified in chapters 8 through 12 of the standard is supported." See also the main DSSSL entry.

  • October 03, 1996. Announcement from Michel Biezunski for the availability of the online Conference Proceedings from The 3rd GCA International HyTime Conference, August 20 and 21, 1996. The proceedings have been generated using EnLIGHTeN, an application being developed at High Text. See the main conference entry, or the link to the proceedings.

  • September 23, 1996. Announcement from Mike Petree and Carla Corkern (ISOGEN) for a Document Developer Workshop - DDW (October 7 - 11, 1996) and a Workshop on "Stylesheet Development in DynaText - SDD" (October 14 - 17, 1996). "DDW workshop attendees will learn what SGML is all about, who is using SGML, what SGML can do for your organization, and how to implement SGML." The SDD workshop "consists of three and a half days of hands-on experience starting with a quick introduction to the DynaText browser and carrying through to the creation of a suite of stylesheets for a representative DTD."

  • September 22, 1996. Announcement from Nancy Ide for a special issue of Cahiers GUTenberg dedicated to the Text Encoding Initiative. Number 24 of Cahiers GUTenberg is a 251-page issue edited by François Role, containing eleven articles on TEI (in French). The full text of the issue is now available at the following web site: For other journal special issues and monographs dedicated to the Text Encoding Initiative, see the relevant subentry for TEI.

  • September 22, 1996. Announcement from Harvey Bingham on the passing of William W. Tunnicliffe, an early participant in the development of SGML. Appended: a longer tribute to William Tunnicliffe written by Charles Goldfarb.

  • September 22, 1996. Announcement from Steven R. Newcomb for support of James Clark's SP Parser. "In response to market demand, TechnoTeacher, Inc. now offers commercial support for the SP SGML Parser. Publicly licensed and available without charge in source code form (at, SP is the most modern SGML parser available today. We [TechnoTeacher, Inc.] believe in it; TechnoTeacher's own 'MarkMinder' and 'HyMinder' SGML and HyTime engines have been completely rewritten for use with SP." See the full text of the announcement for further information. James Clark has no commercial connection with TechnoTeacher, Inc.

  • September 12, 1996. Announcement from Paul Hermans for the SGML BeLux '96 Conference Programme. Date: Thursday, October 31, 1996. The keynote speech is to be delivered by European TEI editor Lou Burnard (also of Oxford University Computing Services): "SGML on the Web: Too little too late, or too much too soon?" For other details, see the full conference entry.

  • September 12, 1996. Added entry for the prototype SGML Finding Aids database established by The Mandeville Special Collections Library at the University of California, San Diego. The library "has mounted a subset of the library's finding aids on the WWW in SGML and HTML coded versions. . . The database currently includes approximately 65 listings for manuscripts, personal papers, and UCSD records. Each listing contains a link to the SGML-coded finding aid, to the HTML-coded finding aid, and to the corresponding catalog record in ROGER WEB, the WWW version of UCSD's library catalog. In turn, each catalog record for items having a finding aid includes a link to the SGML finding aid, to the HTML finding aid, and to the "homepage" for the database, where one can search the entire database of SGML finding aids. The SGML part of the database is indexed and searchable by means of Verity Query Language." See the main database entry (link above), or the announcement from Bradley Westbrook.

  • September 07, 1996. Added entry for The Electronic Archive of Early American Fiction, a project centered at the University of Virginia Library. A grant of $400,000 has been received by the University of Virginia Library from the Andrew W. Mellon Foundation for the digitizing of a major collection of American fiction: "As part of the study, 582 first editions of the most important novels and short stories will be digitized and put on the World Wide Web. Called the Electronic Archive of Early American Fiction, the online collection will include books published between 1775 and 1850." The full text versions of the works will be prepared using TEI/SGML encoding; all images will be "24-bit colour, circa 400 dpi tiffs with jpeg derivatives on-line."

  • September 07, 1996. Added entry for the GATE Project, sponsored by the Natural Language Processing Research Group, Department of Computer Science, University of Sheffield. GATE is a "General Architecture for Text Engineering," and incorporates "SGML input and output for compatibility with, for example, MULTEX and TEI initiatives."

  • September 04, 1996. Made arrangements to mirror the SGML Syntax Summary, by Harvey Bingham. Having had occasion recently to use this resource, I hereby remind readers of the existence of this immensely useful tool which provides indexed and linked access to SGML grammar productions. Canonical URL: Separate listings are given for: (1) SGML Syntactic Variables; (2) SGML Keyword Syntactic Literals; (3) SGML Terminal Variables; (4) SGML Terminal Constants; (5) SGML Reference Delimiter Roles. The document will assist users in the "study [of] the syntax of ISO 8879-1986 Standard Generalized Markup Language, aided by hypertext links for the syntax productions, their names, objects in their definitions, where used and where defined, and cross-references to containing clause and page:line pairs in 'The SGML Handbook', by Charles Goldfarb." The resource is mirrored here by permission in the SGML Grammar section of the Topics Page.

  • September 03, 1996. Announcement for the availability of James Clark's Jade: "James' DSSSL Engine" - an implementation of the DSSSL style language. Jade is "available for testing by qualified individuals." "Jade is an application of DSSSL (ISO/IEC 10179:1996), the international standard language for specifying processing semantics to be applied to documents marked up in languages conforming to SGML (ISO 8879:1986). Jade takes as input an SGML DTD, a document marked up in the language specified by the DTD, and a DSSSL stylesheet. It produces as output an SGML representation of the resulting DSSSL flow object tree or, alternatively, an RTF file of the formatted document." Currently, Jade "includes the following components: (1) An abstract interface to groves; (2) An in-memory implementation of this interface built with SP; (3) A style engine that implements the DSSSL style language; (4) A command-line application, jade, that combines the style engine with the spgrove grove interface and two backends . . ." For other information on DSSSL, see the DSSSL main entry.

  • September 03, 1996. I have hastily compiled a document to answer the frequently-asked (and wearisome) question: whether/why name tokens can(not) be duplicated in an attribute definition list, even within different groups. Thus, it explains why the following is not currently allowed:

    <!ATTLIST candidate constantlyChangesPosition (YES | NO) YES
                        liesWithoutFlinching      (YES | NO) YES >

    It also explains the rationale for the particular design (limitation) in SGML, offers some work-arounds, and speculates on whether/how the infelicity (?) might be addressed in a revision to SGML. Your further contributions to this compilation are welcome.
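
    The restriction can be illustrated with a small check. The following is a minimal sketch in Python, not part of the compiled document: the function name and the dictionary representation of an attribute definition list are assumptions made for illustration. It reports every name token that is declared more than once within a single attribute definition list. One commonly cited reason for the rule is that SGML's SHORTTAG feature lets an attribute name be omitted when the value comes from a name token group, so the token alone must identify the attribute.

```python
from collections import defaultdict


def duplicated_name_tokens(attlist):
    """Return {token: [attribute names]} for tokens declared more than once.

    `attlist` is a hypothetical representation of one attribute definition
    list: a dict mapping each attribute name to its name token group.
    """
    owners = defaultdict(list)
    for attr_name, tokens in attlist.items():
        for token in tokens:
            # Assumes the reference concrete syntax (NAMECASE GENERAL YES),
            # under which name tokens are folded to upper case.
            owners[token.upper()].append(attr_name)
    return {tok: names for tok, names in owners.items() if len(names) > 1}


# The attribute definition list from the example above.
candidate = {
    "constantlyChangesPosition": ["YES", "NO"],
    "liesWithoutFlinching": ["YES", "NO"],
}

for token, attrs in duplicated_name_tokens(candidate).items():
    print(f"{token} is declared for: {', '.join(attrs)}")
```

    A declaration that uses distinct tokens in each group (for example YES | NO in one and TRUE | FALSE in another) passes this check.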

  • August 31, 1996. Added a new subsection on the "Topics" page for a collection of links on SGML/DSSSL/HyTime "groves".

  • August 31, 1996. New entry for the Hebrew Syntax Encoding Initiative. "An ad hoc committee has been formed to extend the Westminster Morphologically Analyzed Machine-Readable Hebrew Bible (MORPH) to the syntactic level. The committee was formed loosely under the auspices of the Computer-Assisted Research Group (CARG) of the Society of Biblical Literature (SBL). . . A content model of MORPH is also being created, which is preliminary to modifying and applying TEI's DTD . . . Why TEI's DTD? Because it is a standard. SGML DTDs are reconfigurable either by changing the DTD itself (which changes the meaning of the tags already present) or by changing a parser which would actually reconfigure the data files in some arbitrary manner." [from the front page]

  • August 30 [September 22], 1996. Announcement for an SGML course entitled "Implementing an SGML Publishing System," offered by the University of Wisconsin-Madison, College of Engineering, Department of Engineering Professional Development. The course dates are October 22-25, 1996; instructors are Brian Travis and Dale Waldt. This latest offering of the SGML course is its seventh: the longest-running program of its kind at the University of Wisconsin. See the online brochure for details; [mirror copy]. Also: updated announcement September 22, 1996. Or contact: +1 (608) 262-4341; email: (Dick Vacca).

  • August 28 [30], 1996. A welcome conference report from Len Bullard: his observations on the Seattle HyTime conference, to help the rest of us "discover if the recent Technical Corrigendum and alignment of the SGML standards could help sew the Web back together." The appraisal is mixed, but cautiously optimistic: "The alignment is real not cosmetic. After a year of intense work, there is an adequate basis for using DSSSL, HyTime and SGML in application suites. To understand how this works, one must understand the grove and grove plan concepts. . ." Note: See also now the conference report from Eliot Kimber. Watch CTS (Usenet News 'comp.text.sgml') for continuing retrospectives and prospectives (e.g., Matt Moots, Steven R. Newcomb). The online conference proceedings are said to be available on the GCA Web server soon.

  • August 27, 1996. Announcement from Christophe Espert for the availability of YASP ('Yet Another SGML Parser', developed by Pierre G. Richard), on Windows 95 and Windows NT. YASP, an SGML system conforming to ISO 8879:1986, "is a library of functions providing a powerful C API on a lot of platforms. It works on DOS/Windows, Unix (including AIX, Solaris 2.x,...), OS/2, IBM MVS and VM, MacOS and now Windows 95/NT. The API is fully documented in PDF and the library comes with a sample application called SMP00. SMP00 is an SGML validator and inline editor. YASP allows you to get a lot of information from SGML documents. In particular it can be used to build SGML transformation systems, browsers, editors,... YASP provides more than enough information to build GROVEs as defined in ISO/IEC 10179:1996 and ISO/IEC 10744 TC. In particular YASP gives access not only to the SGML document instance, but also to the document's DTD and SGML declaration." See the main entry for YASP for further details.

  • August 22, 1996. Announcement from Wendell Piez for "a new pilot project developed at CETH, the Center for Electronic Texts in the Humanities, at Rutgers and Princeton Universities: A small but scalable example of electronic publishing of archival materials, the Griffis Collection Electronic Access Project serves to demonstrate the potential of standards-based electronic text technologies to provide new kinds of access to rare manuscripts and archival collections. The project uses two complementary implementations of SGML (Standard Generalized Markup Language) to provide networked access to the William Elliot Griffis Collection, a collection (held in Rutgers Special Collections and University Archives) of rare print, MS and photographic materials bearing largely on the history of U.S.-Japan relations in the 19th century. A 'skeleton finding aid' (a prototype of a larger-scale finding aid) is encoded according to the guidelines of the Encoded Archival Description (presently available in an alpha-testing version from the Library of Congress; cf." See further the main entry for CETH.

  • August 17, 1996. Announcement from Michael Sperberg-McQueen for the addition of a new page of TEI "Project Descriptions." The TEI (Text Encoding Initiative) has developed extensive documentation for the SGML encoding of texts for use in humanities research. Many of the projects are currently cited in the "Academic Applications" section of the SGML/XML Web Page, but some are not. The TEI Application Page currently lists fifty-one projects "which report that they use TEI for all or some of their text encoding. The page has brief descriptions, contact information, acknowledgements of funding sources, and links to the projects' own home pages." Projects using TEI/SGML not currently listed are encouraged to contact the TEI editors with relevant project information. See the link (, or the main TEI entry in the SGML/XML Web Page.

  • August 17, 1996. Announcement from Jon Bosak for the completion of an editorial review of the DSSSL Online Application Profile (dsssl-o specification). A subset of DSSSL, "DSSSL Online (informally "dsssl-o"), supports the basic features needed to provide publisher-oriented formatting control of online displays and a minimum set of page-oriented features needed to provide utility printouts from browsers and editors." Both Postscript and HTML 3.2 versions of the document are available -- generated through the use of an HTML 3.2 DSSSL stylesheet (available as: /pub/sun-info/standards/dsssl/stylesheets/html3_2). Changes: "Language has been added to reflect the emerging consensus that lambda, #!key, and let . . . should be part of a minimally conformant dsssl-o application, such as the one used in preparing the PostScript version of the revised document. Also, two new tables have been added to provide an alphabetical list of all the characteristics, inherited and non-inherited, that can be applied to the flow object classes in the dsssl-o subset." See the DSSSL main entry for other information.

  • August 12, 1996. New entry for RMIT - MDS (Royal Melbourne Institute of Technology - Multimedia Database Systems Group), a leading research center in Australia that has sponsored a wide range of development and research related to SGML technologies. SIM (Structured Information Manager) is one of the ongoing research and development efforts. The RMIT - MDS research endeavor is supported by a number of other academic and commercial partners (e.g., CITRI, Ferntree Computer Corporation, University of Melbourne).

  • August 10, 1996. Text for the final program of the Third GCA International HyTime Conference, from Steven R. Newcomb. For other information, see the main conference entry.

  • August 10, 1996. Announcement from David Cooper of Antech Systems, Inc. for a publicly available (sample) MID browser, which can be run under Windows95 or WinNT 3.51. MID (Metafile for Interactive Documents) "is a proposed U.S. Navy standard for scripting of interactive hypermedia documents (such as IETMs), and is an implementation of ISO 10744 (HyTime)." The example MID browser (HyMID) package "uses TechnoTeacher's HyMinder(TM) HyTime engine in the form of a DLL" and contains two sample MID instances. See also the main database entry for Metafile for Interactive Documents.

  • August 08 [10], 1996. New entry for The Corpus Legis Project, sponsored by The Swedish Law & Informatics Research Institute. "The general aim of the Corpus Legis project is to establish a permanent, computerised legal text resource for legal and linguistic studies. . . Questions of legal document management are investigated by means of the international document representation standard SGML - Standard Generalized Markup Language. The corpus is divided into the following three main categories with corresponding DTD's (Document Type Definitions): Public national legal information, Public international legal information, and Historical legal information. The main part of the corpus (Legis.dtd) focuses on documents reflecting the system for lawmaking and individualized case law, e.g., government bills, laws, and cases decided by courts and administrative bodies." [adapted from the project description]

  • August 05, 1996. Announcement from Richard Szary for the availability of a "prototype database for archival and manuscript finding aids" at Yale University, participating in the EAD "Finding Aids" projects. "The Finding Aids Project uses a WWW interface and OpenText search software to search finding aids encoded with the beta version of the Encoded Archival Description Document Type Definition (EAD-DTD), which is an SGML standard being developed by the archival profession. This development [at Yale University Library] is based on work originally done at the University of California at Berkeley. It currently contains finding aids for collections in three Yale repositories: the Beinecke Rare Book and Manuscript Library, Manuscripts and Archives in Sterling Memorial Library, and Special Collections in the Divinity Library." [from the announcement]. See Yale EAD Finding Aids Project, and also the main database entry for EAD.

  • August 02, 1996. Announcement [belated] for another SGML grammar tool from TEI editor Michael Sperberg-McQueen. See the SGML/XML Web Page database entry or the UIC FTP server ( for related SGML grammar tools. The new utility is called "Carthage." "Carthage is a yacc/lex-based parser for SGML DTDs which can delete references to undeclared elements. It can also do a few other things, depending on the run-time flags you give it." Some options include: (1) dropping or keeping marked sections; (2) warning if entities are declared twice; (3) dropping or keeping parameter entity declarations; (4) deleting named GIs from content models; (5) listing of specified classes of elements in the DTD [used, unused, default undeclared, declared]; (6) dropping or keeping comments in the output file, etc. The software is "unsupported" but "users who improve it or fix errors are requested to notify the author so he can also fix them." [extracts from the README file, dated June 17, 1996.]

  • August 02, 1996. Announcement for an experimental/draft version of a meta-DTD for extended pointer syntax, by Arjan Loeffen of Utrecht University. See: Also from Arjan Loeffen: the TEI-L WWW shadow archive (HTML) has been updated through July 29th, 1996. The index is complete for "990 subjects in 1673 contributions." See

  • August 02, 1996. Call for assistance in setting up a working group for "creating DTDs for captioning, audio description, subtitling, and dubbing," by Joe Clark. Further information about the proposed effort is available in a descriptive document: "SGML for captioning, audio description, subtitling, and dubbing: Who needs it? And who cares?". [mirror copy, August 02, 1996]

  • July 30, 1996. Announcement from James Clark for a new version of SP, the "free, object-oriented toolkit for SGML parsing and entity management." Version 1.1.1 represents a minor revision: "The only serious bug 1.1 is [was] the incorrect handling of colons in SGML_CATALOG_FILES on MS-DOS and Windows machines. The only other very significant change [in 1.1.1] is that the Win32 Unicode binaries now work with Windows 95 as well as Windows NT." See the main database entry for further details on SP.

  • July 28, 1996. Announcement from Christophe Espert for a new distribution package for YASP (Yorktown Advanced SGML Parser), for DOS and Windows. "It includes source code, documentation and binaries for Windows. The YASP library is a Dynamic Link Library." The author intends to build YASP on Windows NT in the near future. See the text of the announcement, or the main entry for YASP.

  • July 24 [August 04], 1996. Sample DSSSL stylesheet for HTML 3.2 print output, contributed by Jon Bosak (SunSoft) based on work by Anders Berglund (EBT), with critical assistance from James Clark. It is a "style sheet for producing hardcopy output from documents validated against the HTML 3.2 DTD. It supports most of the features of HTML 3.2 that make any sense in printed form and adds a couple of features of its own, notably headers, footers, and the autonumbering of heads and table captions." [from the CTS posting] **Update August 02: An updated/reworked version of the stylesheet is available as "html32hc.dsl.960802" [or its successor], accessible via FTP:, with a README file. See the main DSSSL entry for other information.

  • July 24, 1996. Announcement from Gary Houston for the release of gf version 0.46. gf is a "general formatter program." This version of the software has "slightly improved html2latex functionality," and support for HTML tables, contributed initially by Mr. Abbey Akalay-Watkin. See the main entry in the database.

  • July 22, 1996. New database entry for Topic Navigation Maps. Topic maps provide a means of designing navigation aids such as indexes, tables of contents, glossaries, thesauri, and related hypertext structures. The Topic Map architecture is being developed as an SGML/HyTime application, largely within the orbit of CApH (Conventions for the Application of HyTime).

  • July 22 [29], 1996. Announcement from Jean-Daniel Fekete (Universite de Paris-Sud) for tei2latex version 0.1, distributed under the GNU General Public License. "tei2latex is a Perl5 Program to Translate TEI Lite Documents into LaTeX2e documents. . . The translation process can be configured in several ways for two reasons: (1) to enhance the default translation in case TEI Lite lacks information about the presentation (as in tables for instance); (2) to personalize the presentation of a document or a set of documents." [Revised version 0.1f was released on July 29, 1996.] See the main entry.

  • July 17, 1996. Bibliography updates (about 36 new entries). Use your Web browser's search facility to find the string "199607" in the bibliography files. Top-level entry to the bibliography files is via its "Quick Access" Table of Contents.

  • July 16, 1996. New entry for Cheshire II Project and SGML, centered at UC Berkeley. "The Cheshire II project is developing a next-generation online catalog and full-text information retrieval system using advanced IR techniques. This system is being deployed in a working library environment and its use and acceptance by local library patrons and remote network users are being evaluated. The Cheshire II system was designed to overcome twin problems of topical searching in online catalogs, search failure and information overload. The system incorporates a client/server architecture with implementations of current information retrieval standards including Z39.50 and SGML."

  • July 16, 1996. New section (designated) for a collection of links on Architectural Forms and SGML Architectures, owing to the increased importance of SGML architectures in the context of the nearly-complete HyTime Technical Corrigendum.

  • July 15, 1996. Updated information from GCA on HyTime '96, the Third International Conference on the Application of HyTime. August 20 - 21, 1996. Westin Hotel, Seattle, Washington, USA. The Technical Corrigendum is the top news item, because it involves SGML Extended Facilities. See details on the conference page.

  • July 14, 1996. Announcement from Jörg Wittenberger for version 1.0 (beta) of the SDC Package. "SDC is a well featured, free system aiming to make SGML suitable for day to day use. SDC compiles SGML documents into representations as PostScript, LaTeX, HTML, man pages, (emacs) info files and is a little RTF aware." See also the main entry in the software page.

  • July 12, 1996. Announcement from Charles F. Goldfarb on the publication of the Third Interim Report of the Project Editor's Review of ISO 8879 (WG8 N1855). The document is available on the author's web site: See also the database entry for ISO 8879 revision.

  • July 12, 1996. Updated entry for SGML and Metadata.

  • July 12, 1996. New entry for Information Mapping and SGML.

  • July 12, 1996. New entry for the Canadian Government Information Finder Technology - GIFT.

  • July 04, 1996. Announcement (7/1/96): Inso Corporation Announces Agreement to Acquire Electronic Book Technologies. See the press release [mirror copy].

  • July 04, 1996. Announcement from Jean Véronis for MtStr - Multilingual string library. "MtStr is a C library for UN*X developed in the context of the MULTEXT project, which extends the usual functions provided in the C character and string ctype and string libraries, in order to accommodate multi-lingual text processing. MtStr is designed especially for texts encoded using SGML." See also the entry for MULTEXT.

  • July 04, 1996. Announcement from Jean Véronis for MtRecode - A character conversion program. "MtRecode is a program for translation between various character sets, developed in the framework of the MULTEXT project. It has some of the functionality of the GNU `recode' tool, but it is based on different principles and is oriented towards SGML text manipulation. ISO 10646 is used internally as a pivot in the character translation process. When exact translation into a character is not possible, MtRecode can use SGML entities as a fallback. Conversely, MtRecode understands SGML entities in the input and can recode them into characters of the target character sets, if they exist. MtRecode is completely customizable: the user can add new character sets and/or entities by providing tables that map characters and entities to ISO 10646." [from the announcement]. See also the entry for MULTEXT.
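
    The pivot-and-fallback principle described in the MtRecode announcement can be sketched as follows. This is an illustrative Python sketch only, not MtRecode's implementation or interface: the function names and the three-entry entity table are hypothetical, and Python's Unicode strings stand in for the ISO 10646 pivot. A character is kept when the target character set can represent it and is otherwise replaced by an SGML entity reference; in the other direction, a known entity is turned back into a character only when the target set contains it.

```python
import re

# Hypothetical, deliberately tiny table: code point -> SGML entity name.
ENTITY_NAMES = {0xE9: "eacute", 0xE7: "ccedil", 0x153: "oelig"}
CODE_POINTS = {name: cp for cp, name in ENTITY_NAMES.items()}


def representable(ch, charset):
    """True if the target character set can encode this character."""
    try:
        ch.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


def encode_with_entities(text, charset):
    """Keep representable characters; fall back to entity references."""
    out = []
    for ch in text:
        if representable(ch, charset):
            out.append(ch)
        else:
            name = ENTITY_NAMES.get(ord(ch))
            # "?" for characters with no known entity is an assumption of
            # this sketch, not documented MtRecode behaviour.
            out.append(f"&{name};" if name else "?")
    return "".join(out)


def decode_entities(text, charset):
    """Replace known entities by characters when the target set has them."""
    def replace(match):
        cp = CODE_POINTS.get(match.group(1))
        if cp is not None and representable(chr(cp), charset):
            return chr(cp)
        return match.group(0)  # leave unknown or unrepresentable entities
    return re.sub(r"&([A-Za-z][A-Za-z0-9.-]*);", replace, text)


print(encode_with_entities("café", "ascii"))
print(decode_entities("caf&eacute;", "latin-1"))
```

    In this sketch, adding a character set or entity amounts to extending the table, which loosely mirrors the announcement's point that users supply mappings to ISO 10646.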

  • June 11, 1996. Announcement from Murray Maloney and Holley Rubinsky concerning The Yuri Rubinsky Insight Foundation. The foundation "is dedicated to commemorating the genius of the late Yuri Rubinsky (1952-1996). . ."

  • June 10 [11], 1996. Announcement by James Clark for the release of SP 1.1. SP is a "free, object-oriented toolkit for SGML parsing and entity management." It supports most of SGML's optional FEATURES (lacking only DATATAG and CONCUR), multi-byte character sets, and is highly portable. SP version 1.1 has a number of new and interesting features, including a separate utility for entity management (spent) and a markup stream editor (spam). SP also now supports Architectural Form Processing (i.e., an "architecture engine"). "If you build an application with SP that works with documents conforming to some DTD, the architecture engine will allow it automatically to work with any document that conforms to the architecture which has that DTD as its meta-DTD." See for more information, or the main database entry.

  • June 10, 1996. Announcement from Tito Orlandi (Accademia dei Lincei and Università di Roma La Sapienza) concerning the new ARTEM project (Archivio Testuale Multimediale). The ARTEM project will create a repository of texts in the Italian language, build a catalogue for existing Italian etexts, and provide links to other repositories. TEI-SGML is to be used in the project. Project email contact:

  • June 03, 1996. Updated information on the program for the ALLC/ACH '96 Conference, June 25 - 29, 1996. The annual ALLC/ACH program normally features an interesting range of papers and panels on SGML and markup theory (especially TEI-SGML), with advocates and detractors, and this year's program is no exception.

  • May 29, 1996. Interesting things to think about from SGML Europe '96. See the online contributions (known to me so far) of Lou Burnard, Tim Bray and Jon Bosak, accessible from the main conference entry.

  • May 23, 1996. Announcement from Henry S. Thompson of the HCRC Language Technology Group, University of Edinburgh, for the public release of LT NSL. "LT NSL is a development environment for SGML-based corpus and document processing, with support for multiple versions and multiple levels of annotation. It consists of a C-based API for accessing and manipulating SGML documents and an integrated set of SGML tools. The LT NSL initial parsing module incorporates v1.0 of James Clark's SP software, arguably the best SGML parser available. The basic architecture is one in which an arbitrary SGML document is parsed once, yielding two results: (1) An optimised representation of the information contained in the document's DTD; (2) A normalised version of the document instance, which can be piped through any tools built using our API for augmentation, extraction, etc.

    "The use of the cached DTD together with the normalisation of SGML to nSGML means that applications processing nSGML streams can be very efficient. LT NSL provides two views of an nSGML file; one as a flat stream of markup elements and text; a second as a sequence of tree-structured SGML elements. The two views can be mixed, allowing great flexibility in the manipulation of SGML documents. It also includes a powerful, yet simple, querying language, which allows the user to quickly and easily select those parts of an SGML document which are of interest. Finally, LT NSL supports SGML output, making it easier to write SGML to SGML conversion programs." [from the Home Page]. See the Language Technology Group main entry for further information.

  • May 23, 1996. Announcement from Paul Hermans for SGML BeLux '96, on October 31, 1996. This is the "Third annual conference on the practical use of SGML" (Brussels, Belgium), sponsored by the Belgian-Luxembourgian Chapter of the International SGML Users' Group. See also the conference entry.

  • May 22, 1996. Announcement for alpha version 1.1 of MtScript, developed under the MULTEXT Project by Malek Boualem (Université de Provence) and Stéphane Harié (CNRS). ["MtScript does not yet enable saving an SGML version of the texts" - optimistically, I take this as a design goal, but the software is noteworthy in any case for its language support.] "MtScript is a multi-lingual text editor that enables using several different writing systems (Latin, Arabic, Cyrillic, Greek, Hebrew, Chinese, Japanese, Korean, etc.) in the same document. MtScript provides typical editing functions such as insertion and deletion, even for text containing portions of writing in opposite directions. In addition, MtScript allows the user to explicitly associate portions of the text with a particular language, and to associate keyboarding rules with any language. Different types of character sets (single byte, multiple-byte) can also be handled. MtScript has been developed on Sun Workstations using X-Windows, Unix, C, and Tcl/Tk. A compiled alpha version (v1.1) is available for Sun Sparc stations under Solaris 1.x or 2.x." [from the Home Page]

  • May 22, 1996. Announcement for an updated version (2.1) of MetaMorphosis-free, for LINUX. "MetaMorphosis is an SGML tree transformer which can be used to transform SGML document instances to various output formats. It requires the installation of nsgmls which is not part of the MM-free distribution. . . a selection of some of the new features in Version 2.1: (a) handling of SDATA entities and PIs as nodes; (b) support of indices; an index can hold anything which can be the result of a MM query (strings, attributes, nodes, ...) indices can be used to access large sets of data very fast; (c) built-in pacing mode allows to interactively step through transformations, move around in the tree, etc.; (d) the script compiler and many functions have been optimized; they perform now twice as fast as before." See the database entry for more information or the main MM URL.

  • May 22, 1996. Announcement from Lou Burnard (and Richard Light) for the availability of Richard Light's utility NORMDTD. "NORMDTD is a DOS (yes!) program that reads a valid SGML DTD, even a TEI-like one that uses marked sections and multiple input files, and generates a single file containing a normalized version of that DTD. The element content models in this normalized DTD will not contain any references to elements that are not declared, and so it can be used by highly-strung SGML packages such as RulesBuilder that refuse to process TEI applications (in particular) for this reason. In fact, having a normalized DTD in a single file can be helpful for a number of reasons, to a variety of SGML applications." See the main entry in the SW database.

  • May 22, 1996. Announcement by Derek Denny-Brown of TechnoTeacher, Inc. for a new online HTML version of Ralph Ferris's HyTime Application Development Guide.

  • May 21, 1996. Announcement for an alpha version of SYNTEXT, by George L. Dillon. "SYNTEXT is an SGML DTD providing elements and attributes to mark up text in English for: (1) syntactic structure, including (a) X-bar based parsing, with Government and Binding-style PRO and t, (b) grammatical relations a la Quirk et al. marked as attributes; (2) cohesion; (3) coreference; (4) conjunctive relations as attributes of sentence specifiers; (5) lexical cohesion as attributes of lexical items; (6) rhetorical figures. Any text marked up for these features and identifying itself as DOCTYPE SYNTEXT is an SGML document and can be browsed in an SGML browser or viewer such as SoftQuad's free Windows browser Panorama or the costwish viewer for X Windows being developed by Peter Murray-Rust. It is an SGML application, the purpose of which is to provide markup for the analysis of syntactic and textual structure; a marked up text can be viewed as a tree and in other modes and can be searched with context sensitive and contingent scans, making it very powerful for stylistic analysis (once a passage is marked up!)." See the text of the announcement, the main URL, or the main database entry.

  • May 21, 1996. New (provisional) entry for the SAE J2008 (and T2008) Automotive and Truck Standard. The Society of Automotive Engineers has developed a DTD for use in information delivery, based upon government requirements under the Clean Air Act of 1990. Your tax dollars at work. The Data Model for SAE J2008 is based upon SGML documents and relational database management.

  • May 21, 1996. New special entry for ISO/IEC 9070:1991 Registration Procedures for Public Text Owner Identifiers. An online database is being set up to assist in the handling of registration of Public Text. The registration is relevant to public text referenced in SGML FPIs (Formal Public Identifiers).
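
    For readers unfamiliar with FPIs, the sketch below splits an identifier into its conventional fields. It is a simplified illustration, not a conforming ISO 8879 or ISO/IEC 9070 parser: it assumes the common four-field form that begins with "+//" (registered owner) or "-//" (unregistered owner), and it does not handle ISO-owned identifiers or an optional display version. The names FPI and parse_fpi are hypothetical.

```python
from typing import NamedTuple


class FPI(NamedTuple):
    registered: bool   # True for "+//" (registered owner), False for "-//"
    owner: str         # owner identifier, e.g. "W3C"
    text_class: str    # public text class, e.g. "DTD" or "ENTITIES"
    description: str   # public text description
    language: str      # public text language, e.g. "EN"


def parse_fpi(fpi):
    """Split a '+//' or '-//' style FPI into its four main fields."""
    parts = fpi.split("//")
    if len(parts) != 4 or parts[0] not in ("+", "-"):
        raise ValueError("unsupported FPI form: %r" % (fpi,))
    prefix, owner, text, language = parts
    # The text field is the class keyword, a space, then the description.
    text_class, _, description = text.partition(" ")
    return FPI(prefix == "+", owner, text_class, description, language)


html = parse_fpi("-//W3C//DTD HTML 4.01//EN")
```

    The registered flag is the field that owner registration bears on: an owner identifier that has been registered is written with the "+//" prefix, while unregistered owners use "-//".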

  • May 20, 1996. Announcement from Harvey Bingham for "two suites of documents for public use" pertaining to SGML and DSSSL. "The SGML Syntax Summary of ISO 8879-1986 Standard Generalized Markup Language provides hypertext links for the syntax productions, their names, objects in their definitions, where they are defined, in what other productions are they used, and cross-reference to containing clause, and page:line in Goldfarb The SGML Handbook. The DSSSL Syntax Summary of ISO/IEC 10179:1996 Document Style Semantics and Specification Language provides hypertext links for the syntax productions, their names, objects in their definitions, where those objects are defined, in what other productions are they used, and cross-references to clauses in the standard."

  • May 16, 1996. Announcement for a provisional set of TEI formal public identifiers, for use with a CATALOG and similar purposes. See also the main TEI entry.

  • May 16, 1996. Announcement from Arjan Loeffen of Utrecht University (Humanities Computing, Faculty of Arts) for an HTML-ized and searchable archive for the TEI-L discussion forum -- logs of the Text Encoding Initiative public discussion list. It's being called the "TEI-L WWW archive" (or 'shadow archive'). The database covers the past 6.5 years of discussion on TEI-L. Reminder: in the same Net-territory one will find Arjan Loeffen's similarly-constructed and highly useful CTS archive.

  • May 16, 1996. Announcement from Chris Powell on the mirroring of texts in the Oxford Text Archive at UMich: "The Humanities Text Initiative at the University of Michigan and the Oxford Text Archive at the University of Oxford are very pleased to announce the opening of a North American mirror site for the Oxford Text Archive. . .The Oxford Text Archive has long provided access to published and unpublished electronic texts in English, Latin, Greek, and many other languages. Its current catalogue contains over 1300 texts, over 250 of which are in the public domain and are freely available. The number of these texts is growing steadily, and will increase rapidly during 1996 as a result of the Archive's recent appointment as textual service provider to the UK's Arts and Humanities Data Service. All public access texts are made available in an SGML format following the recommendations of the ACH-ALLC-ACL Text Encoding Initiative." See also the OTA and HTI main entries.

  • May 06, 1996. Added entry for SGML and its broad role in the Digital Library Initiative, and associated projects, including current development of models for metadata. This is currently a loose collection of links, but provides a basic point of departure.

  • May 06, 1996. Announcement from John Unsworth on the availability of MU (Forms Assisted SGML Markup), a new piece of software from the University of Virginia's Institute for Advanced Technology in the Humanities. "MU is a perl-based program that builds fill-out forms for SGML editing, based on simple templates. It supports lock files (for networked workgroups), and it is distributed with a TEI-lite template. Demonstrations, source code, help files, and an email list for bug reports and developers are available." See also the entry in the public software database.

  • April 30, 1996. Added a small collection of links on the topic of "inclusion exceptions," following the appearance of David Megginson's post to CTS. The dominant [current] wisdom seems to be this: "use them very sparingly, provisionally, selectively, or not at all."

  • April 23 [May 20], 1996. Press release announcing the appointment of Robin A. Tomlin as the new Executive Director of the SGML Open Consortium. Or: see the announcement on the SGML Open WWW server, "SGML Open Appoints New Executive Director." See also "SGML Open Appoints New Executive Director" in <TAG>, May 1996.

  • April 23, 1996. Announcement by Peter Murray-Rust for Costwish 1.0. Costwish is a graphical interface (SGML postprocessor and renderer) for Joe English's CoST-2 tool. From the README: "costwish is a generic graphical interface to Joe English's CoST SGML/ESIS post-processing tool. It is aimed at those who wish to: (1) run sgmls (or other ESIS-based parser) under a graphical interface; (2) browse their documents graphically; (3) customise their postprocessing easily, powerfully and flexibly; (4) construct powerful searches of SGML-based documents; (5) and manage the results interactively; (6) develop interfaces to helper applications (e.g. graphical renderers)." See also the costwish main entry.

  • April 18, 1996. Announcement from David G. Durand for the TclYasp SGML toolkit. Extracts from the announcement: "TclYasp integrates a conforming SGML parser with the TCL scripting language. . . Unlike CoST 1.1, it uses a simplest-possible procedure call interface, rather than an elaborate object-oriented interface. . . TclYasp does have a few unique features: it's based on YASP, which is more easily portable (it's in ANSI C and not C++) and was designed to be integrated with an application. Since Yasp is fully re-entrant, more than one parser can be active at a time. It is not restricted to the information defined by the ESIS, as the full parser data is available. . . TclYasp/Mac includes a command shell, multiple-pane windows, limited on-screen text formatting, and a variety of interface features as well as the SGML processing stuff." See also the database entry.

  • April 18, 1996. New entry for the University of Helsinki Document Management Research Group projects. One such current project is SID - "Structured and Intelligent Documents." "Structured and Intelligent Documents (SID) is a three-year research project, which studies and develops methods for attaching intelligent features to structured documents. . . As a basis for the project we consider structured documents marked up according to the Standard Generalized Markup Language (SGML), which is an ISO standard for defining document markup languages."

  • April 18 [April 30], 1996. Announcement from Jani Jaakkola for version 0.99 of 'sgrep' [earlier announcement]. Extract: "'sgrep'. . .is designed for grep-like searching of structured documents. . . 'sgrep' (structured grep) is a tool for searching text files and filtering text streams using structural criteria. The data model of sgrep is based on regions, which are nonempty substrings of text. Regions are typically occurrences of constant strings or meaningful text elements, which are recognizable through some delimiting strings. Regions can be arbitrarily long, arbitrarily overlapping, and arbitrarily nested. . . Like grep, sgrep can be used for any kind of text files. However it is most useful for text files containing some kind of structured text. A file containing structured text could be defined as a file, which obeys some syntax. Examples of structured text files are SGML, HTML, C, TeX and mail files."
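
    The region model in the extract can be sketched in a few lines. This is an illustration of the idea only, not sgrep's query language or implementation; the function names constant, find_regions and containing are hypothetical. Regions are represented as (start, end) character offsets with the end exclusive.

```python
def constant(text, needle):
    """Regions for every occurrence of a constant string."""
    spans = []
    pos = text.find(needle)
    while pos != -1:
        spans.append((pos, pos + len(needle)))
        pos = text.find(needle, pos + 1)
    return spans


def find_regions(text, open_delim, close_delim):
    """Regions delimited by open/close strings, nested regions included."""
    regions = []
    stack = []  # start offsets of regions that are still open
    i = 0
    while i < len(text):
        if text.startswith(open_delim, i):
            stack.append(i)
            i += len(open_delim)
        elif stack and text.startswith(close_delim, i):
            start = stack.pop()
            i += len(close_delim)
            regions.append((start, i))
        else:
            i += 1
    return sorted(regions)


def containing(outer, inner):
    """Regions from `outer` that fully contain at least one `inner` region."""
    return [o for o in outer
            if any(o[0] <= s and e <= o[1] for s, e in inner)]


text = "<p>one</p><p>two <b>SGML</b></p>"
paragraphs = find_regions(text, "<p>", "</p>")
hits = containing(paragraphs, constant(text, "SGML"))
matched = [text[s:e] for s, e in hits]
```

    Here the query amounts to "paragraphs containing the string SGML": only the second paragraph region survives the containing filter.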

  • April 18, 1996. Link for the WWW pages of the SGML User Group Finland. The pages are maintained by the group's current president.

  • April 14, 1996. Announcement from David L. Gants for the online edition of Studies in Bibliography. "The Bibliographical Society of the University of Virginia and the University Library's Electronic Text Center are pleased to announce their plans to create Studies in Bibliography On-Line. This service -- available free of charge on the Internet -- will include the full text of the nearly one thousand articles in the 49 annual volumes of Studies in Bibliography (1948-1996) in a searchable and browsable database. . . Studies in Bibliography On-Line will be encoded in Standard Generalized Mark-up Language (SGML), following the Text-Encoding Initiative Guidelines (TEI), and will be available from the Electronic Text Center's on-line library." See also the main entry for several University of Virginia SGML projects.

  • April 12, 1996. New entry for the Lingua Parallel Concordancing Project, a TEI-related text project. "The proposal for a parallel concordancing project put to the Lingua bureau of the EU originated in the desire of a group of lecturers from different European universities to enhance the use of concordancing in the process of learning a second language. . . We have seen how natural it was to add alignment information to a text, but in the context of second language teaching, there is always a possibility to add some specific syntactic or rhetorical elements on the basis of the different sets of tags defined within the TEI. It is clear however, that such encodings have to be accompanied by a clearly defined set of tools which will effectively give a semantics to the corresponding marks."

  • April 12, 1996. New entry for Corpus Encoding Standard (CES), a TEI-related corpus project. "The CES has been designed to be optimally suited for use in language engineering research and applications, in order to serve as a widely accepted set of encoding standards for corpus-based work in natural language processing applications. The CES is an application of SGML (ISO 8879:1986, Information Processing--Text and Office Systems--Standard Generalized Markup Language) compliant with the specifications of the TEI Guidelines for Electronic Text Encoding and Interchange of the Text Encoding Initiative."

  • April 11, 1996. Announcement by Terry Allen for the release of DocBook Version 2.4.1, available from the Davenport Group's Web server. See the text of the announcement for further details.

  • April 10, 1996. New entry for Electronic PROTEIN SCIENCE, an undertaking by Cambridge University Press and the University of California, Irvine. ". . .the entire Protein Science editorial production process at Cambridge University Press has been redesigned to accommodate the electronic edition and to incorporate it into the routine production of the printed edition. Underlying both the printed and electronic edition is a single master document that is prepared in the Standard Generalized Markup Language (SGML) that is sent for production of typesetting code and to the Web site for the production of Hypertext Markup Language (HTML) documents used for the delivery of the electronic edition. Underlying the SGML document is another document called the Document Type Definition (DTD) which describes the information content of the document and makes possible sophisticated indexing. A great deal of innovation has gone into the design of the DTD to make it serve simultaneously the requirement of electronic and print media."

  • April 08, 1996. Complete bibliographic information for articles in the SGML Special Issue of Computer Standards & Interfaces, edited by Ian A. MacLeod, under the issue title SGML Into the Nineties, as promised earlier. Articles include: David T. Barnard, Lou Burnard, and C. Michael Sperberg-McQueen, "Lessons from Using SGML in the Text Encoding Initiative"; Bart Bauwens, Filip Evenepoel, and Jan Engelen, "SGML as an Enabling Technology for Access to Digital Information by Print Disabled Readers"; Franz Burger and Sigfried Reich, "Design and Implementation of an Abstract SGML Interface in Smalltalk"; Patricia Francois, "Generalized SGML Repositories: Requirements and Modelling"; Matthew Fuchs, "The User Interface as Document: SGML and Distributed Applications"; Edward Levinson, "Exchanging SGML Documents Using Internet Mail and MIME"; Ian A. Macleod, "SGML into the Nineties"; Hans Holger Rath and Hans-Peter Wiedling, "Making SGML Work: Introducing SGML Into an Enterprise and Using its Possibilities in Advanced Applications"; Darrell R. Raymond, Frank Wm. Tompa, and Derick Wood, "From Data Representation to Data Model: Meta-Semantic Issues in the Evolution of SGML".

  • April 03, 1996. Announcement for the SGMLC-Lite free compiler. "The 16-bit SGMLC-Lite free compiler for MS-Windows is now available. A 32-bit version will be available soon from the same source. . . SGMLC is a language designed for processing SGML documents. It is based upon the C language, with some elements of C++. It recognises events which occur when processing an SGML document. You then provide the code to tell the application how to process the event. . . SGMLC may be used, for example, for writing SGML transformation applications, for converting SGML documents into some other form; extracting selected bits of information from an SGML document. . ." See also the main entry.

  • April 03, 1996. Updated the section SGML and Chemistry: The OCLC CORE Project (Chemistry Online Retrieval Experiment) and other Initiatives. Comments, additions and corrections are welcome.

  • March 30, 1996. Four new entries in the SGML mailing/discussion lists page, including one for the Encoded Archival Description Forum, and one for the QUILL Electronic Mailing List [a forum for users of Chadwyck-Healey's SGML-based full-text databases].

  • March 28, 1996. Announcement for an SGML version (1.2.5) of the "HyTime Application Development Guide", by Ralph E. Ferris. The SGML markup is based on a modified version of the DocBook DTD, Version 2.2.1. See also the README file on the TechnoTeacher WWW server.

  • March 28, 1996. Announcement for "a mailing list for programmer-level discussions of SP," James Clark's SGML Parser. "The purpose of the list is to provide those who are incorporating SP into their own programs with a forum to exchange information." See also the main database entry for James Clark's SP SGML Parser, including the link to a brief introduction to Programming with SP.

  • March 27, 1996. New entry for The Orlando Project: An Integrated History of Women's Writing in the British Isles. "This project's primary objective is to produce, in printed and electronic forms, the first full scholarly history of women's writing in the British Isles. The history of women's writing in the multiple traditions of England, Ireland, Scotland, and Wales will consist of five volumes: a comprehensive, sophisticated, integrated chronology of women's writing in the British Isles, and four period volumes of literary history dealing with British women's writings from the beginnings to the present (the early period to 1830, the nineteenth century to 1890, the modern period to 1945, and contemporary writing from 1939). An electronic product, based on a custom-designed coding system which is consistent with international standards for text-encoding (SGML), will permit rapid, complex interrogation of the accumulated primary and secondary information produced by the history."

  • March 26, 1996. Announcement for CETH TEI Pilot Projects on the WWW, by Wendell Piez. "The main purpose of these pilot projects in SGML markup is to develop and test applications of SGML (Standard Generalized Markup Language) in its implementation according to the guidelines of the Text Encoding Initiative. These projects have been designed to demonstrate a range of scholarly and educational applications of the TEI: we have produced an edition of a text rendered in both print and networked versions ('The Child in the House'); an edition with critical commentary which demonstrates content markup and the uses of TEI linking mechanisms in a hypertext rendition ('Their Eyes Were Watching God,' Chapter 1); and a multimedia rendition of a Renaissance manuscript in facsimile and analytical transcription (John Donne's Elegy 'Love's Progress'). In coming months we also expect to offer a demonstration of a hypertext version of fragments of Theophrastus (in the edition of William Fortenbaugh)." The SGML source texts may be browsed on the Internet using Panorama. See also the CETH main entry.

  • March 25, 1996. New database entry for the Joint Electronic Document Interchange (JEDI) project, managed by the Division Of Learning Development Research Group at De Montfort University Leicester, the Computer Science at University College London, and the Document Interchange project at UKERNA. "JEDI is studying the popular formats for word processing that exist in both academic and commercial environments. The project aims to identify format conversion methods for popular de facto standards and their relationship with internationally recognised standards such as SGML and ODA. The work on SGML converters is being performed at De Montfort University (DMU), while the work on ODA, WWW, electronic mail, and database access is being performed at University College London (UCL)." The project has concluded that "SGML is ideally suited for EDI as it is text based and is platform and operating system independent. For SGML to be 'presented' it must have a style sheet mechanism that is also text based. The style sheet approaches we have studied all conform to this criterion."

  • March 23, 1996. New database entry for Stanford University's Academic Text Service. ATS is "now introducing Web-based access to a number of electronic texts, and by the beginning of the 1996-1997 academic year we plan to make all of Stanford University's electronic texts available over the Web. . . The texts that ATS is delivering are encoded in SGML, using a document type definition (DTD) that complies with the TEI."

  • March 23, 1996. New database entry for SGML and Physics: (The American Physical Society and The American Institute of Physics). Several physics journals are being prepared and/or delivered using SGML technologies.

  • March 21, 1996. New entry for the Canadian Strategic Software Consortium (CSSC). "The mandate of the consortium is to undertake pre-competitive research to: (1) create the technology that will permit the extension of database management technology to text-intensive data; (2) produce working prototypes that are based on these new technologies; (3) apply working technology to several large-scale real-world problems; and (4) present the research and the technology in forums that are appropriate to the establishment of technical standards." Several of the research and development efforts work toward the integration of SQL and (SGML) structured text models. A "Hybrid Query Processor" (HQP) being developed at the University of Waterloo "will provide a gateway to a federated database system and will support the construction of 'virtual' tables managed (and updated) solely by the HQP. Tuples in these managed tables can contain TEXT and standard types of relational information stored on one, two or many underlying database engines."

  • March 19, 1996. New database entry for the Linguistic Data Consortium (LDC). "The Linguistic Data Consortium is an open consortium of universities, companies and government research laboratories. It creates, collects and distributes speech and text databases, lexicons, and other resources for research and development purposes." Several of the lexical and text databases distributed through the Linguistic Data Consortium are structured using SGML encoding.

  • March 15, 1996. New database entry for the American Memory Project, from the US Library of Congress. "American Memory consists of collections of primary source and archival material relating to American culture and history. These historical collections are the Library of Congress's key contribution to the national digital library." Several of the collections are primarily textual (as opposed to photographic), and use SGML encoding. The Library of Congress Network Development and MARC Standards Office (LC/NDMSO) is also sponsoring development of an Encoded Archival Description DTD (formerly 'Finding Aid DTD'), together with the ATLIS Consulting Group and the Society of American Archivists. See the EAD database entry for further details about the EAD DTD and EAD Tag Library.

  • March 14 [21], 1996. Updated entry for the TEI Workshop to be held in conjunction with DIGITAL LIBRARIES '96, March 23, 1996. The program organizers are Nancy Ide and Judith Klavans. The full program listing and a workshop description are available online, as well as HTML versions of several papers. For now: (1) David R. Chesnutt and C. M. Sperberg-McQueen, "The Model Editions Partnership: Creating Editions of Historical Documents for the Internet. A Proposal for the TEI Workshop at the ACM Digital Libraries '96"; (2) Marta Pino, "Encoding two large Spanish corpora with the TEI scheme: design and technical aspects of textual markup"; (3) Debbie Lapeyre and Tommie Usdin, "TEI and the American Memory Project at the Library of Congress"; (4) Keith Shafer, "Creating DTDs via Fred"; (5) Julia Flanders, "Some Problems of TEI Markup and Early Printed Books"; (6) Stephen Davis, "SGML-MARC: Incorporating Library Cataloging into the TEI Environment".

  • March 13, 1996. Added entry for the MERS (Multiagency Electronic [Pharmaceutical] Regulatory Submission) Project. "The purpose of the MERS project is to develop and demonstrate an interchange standard for the electronic exchange of pharmaceutical regulatory information based on existing de jure information standards." Several government and industry parties have been collaborating on the development of an SGML DTD to govern regulatory submissions.

  • March 11, 1996. Announcement for "a first draft of documentation on programming SP." SP ('SGML Parser'), now in release version 1.0.1, is the current successor to the sgmls parser, by James Clark. For now: "SP provides two APIs: a native API and a generic API. There is not yet any documentation on the native API. I have written a first draft of documentation on the generic API. The draft applies to the current development version of SP." [James Clark]. See the CTS posting for a few other details. A related matter can be framed as a note of appreciation to Nelson H. F. Beebe (University of Utah), who has provided a collection of binaries for the SP parser, in addition to those provided by James Clark's FTP server.

  • March 11, 1996. Announcement for the conference "SGML Technology 1996, Applications in Government and Industry." March 27, 1996, Ottawa. Sponsored by Centre de recherche en droit public. See the conference entry for further details. Charles F. Goldfarb is a keynote speaker.

  • March 11, 1996. Publication of the SGML Special Issue of Computer Standards & Interfaces, edited by Ian A. MacLeod, Department of Computing and Information Science, Queen's University. Watch this space [see above] or the main bibliographic entry for links to the individual articles in this issue, including authors/titles/abstracts. We report with deep sadness Ian's death in a tragic automobile accident on December 15, 1995; interested parties may contribute to the Ian A. Macleod Memorial Fund.

  • March 11, 1996. Announcement for the public availability of the ATA DTDs, from Matt Moots. The Air Transport Association (ATA) 2100 and related DTDs document "specifications that allow industry participants to achieve major cost savings through the use of common systems and procedures. ATA's role is to facilitate this process by bringing industry members together to reach a consensus that all can support and implement. Thus, ATA specifications ("SPECs") are voluntary industry agreements on accepted means of communicating information, conducting business, performing operations or adhering to accepted practices." See the main ATA entry for links to the sample DTDs.

  • March 11, 1996. Announcement for the online availability of version 2 (May 1995) of the HyTime Application Development Guide, by Ralph Ferris and Victoria Taussig Newcomb, in PDF format. The PDF version of the Guide is hosted on PHOENIX DATA LABS WWW server. Previously, the guide was available only in Postscript. See the main bibliographic entry.

  • March 06, 1996. News about style sheets: "The World Wide Web Consortium (W3C) at INRIA and MIT's Laboratory for Computer Science has announced a major step in building a coherent World Wide Web, the universe of hyperlinked information available on the Internet. As part of a W3C convergence initiative, Consortium members have agreed to develop a common way of integrating style sheets into the Web's hypertext documents. . ." See the announcement [mirror copy], or the database entry on stylesheets.

  • March 05, 1996. New entry for the Japanese Text Initiative (University of Virginia and the University of Pittsburgh). Co-edited by Kendon Stubbs and Sachie Noguchi, the Japanese Text Initiative is a collaborative effort by The University of Virginia Library's Electronic Text Center and the University of Pittsburgh East Asian Library to make searchable SGML texts of classical Japanese literature available on the World Wide Web.

  • February 29, 1996. Updated description of the SARA System (SGML-Aware Retrieval Application), developed primarily for the BNC. SARA "is a client/server software tool allowing a central database of texts with SGML mark-up to be queried by remote clients." The four parts are: "(a) the indexing program, which generates an index of tokens from an SGML marked-up text; (b) the server program, which accepts messages in the Corpus Query Language and returns results from the SGML text; (c) the SARA protocol, a formally defined set of message types which determines legal interactions between the client and server programs; this protocol makes use of a high-level query language known as CQL (for Corpus Query Language); (d) one or more client programs, with which a user interacts in any appropriate platform-specific way, and which communicate with the server program using the protocol."
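
    Part (a) of that description, an index of tokens built from marked-up text, can be illustrated with a toy inverted index. This sketch is not SARA's indexer and does not implement CQL; it assumes markup can be stripped with a simple pattern (no ">" inside attribute values), and the names build_index and lookup are hypothetical.

```python
import re
from collections import defaultdict

TAG = re.compile(r"<[^>]+>")   # naive tag matcher, adequate for the sketch
WORD = re.compile(r"\w+")


def build_index(sgml_text):
    """Map each lower-cased token to the ordinal positions where it occurs."""
    plain = TAG.sub(" ", sgml_text)  # replace markup with spaces, keep the text
    index = defaultdict(list)
    for position, match in enumerate(WORD.finditer(plain)):
        index[match.group().lower()].append(position)
    return dict(index)


def lookup(index, word):
    """Token positions for a word, or an empty list if it never occurs."""
    return index.get(word.lower(), [])


index = build_index("<s>The cat</s> <s>the dog</s>")
positions = lookup(index, "THE")
```

    In a client/server arrangement like the one described, only the server would hold such an index; clients would send queries and receive results over the protocol.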

  • February 29, 1996. New database entry for NOLA - Network of Literary Archives. The NOLA consortium currently has about a dozen corporate members. Its goal is to "focus on what may be called literary archives - the unpublished sources on major novelists, philosophers, musicians and painters. Some of these collections are kept by libraries, others by museums, archives and research institutions, in the private as well as in the public sector." "The TEI recommendations for encoding and meta-description of manuscript materials will be the particular focus of NOLA, which will assess their suitability and recommend their extension and modification as necessary. A major emphasis will be placed on development of procedures for the integration and use of TEI-aware tools for the creation, management, documentation, analysis and dissemination of archival resources. Such tools are already being developed by several of the participants, and other TEI users."

  • February 28, 1996. New entry for the Electronic New Testament Manuscript Project. "Transcriptions will be done under the Standard Generalized Markup Language (SGML) application produced by the Text Encoding Initiative (TEI). This scheme, devised by Humanities scholars in Europe and North America, provides a standard means for the transcription of primary sources and textual variants. An increasing amount of software works with TEI encoded documents, including software for collating manuscript variants automatically. Because the manuscript transcriptions will be based on an SGML application, they will be platform-independent and will not become obsolete as software technology advances."

  • February 28, 1996. Request for comments on a HyTime lecture, by Arjan Loeffen. See the text of the announcement, or access the document directly: VIA FTP [mirror]. For more information on HyTime, see the main HyTime entry.

  • February 24 [26], 1996. Announcement for a beta release of Henry Thompson's Normalised SGML Library API. This work comes out of the MULTEXT project. See the links to NSL (documentation and distribution) from Henry Thompson's Home Page. Or see the bibliographic entry for the documentation.

    "In pursuit of a development environment for SGML-based corpus and document processing, with support for multiple versions and multiple levels of annotation, LTG have developed an integrated set of SGML tools and a developers tool-kit, including a C-based API. This software described here contains everything required to process a very wide range of conformant SGML documents. Its initial parsing module incorporates v1.0.1 of James Clark's SP software, arguably the broadest coverage SGML parser available anywhere, commercial or not.

    "The basic architecture is one in which an arbitrary SGML document is processed on the way in, as it were, yielding two results: 1) An optimised representation of the information contained in the document's DOCTYPE; 2) A normalised version of the document instance, which can be piped through any tools built using our API for augmentation, extraction, etc. The use of the cached DOCTYPE together with the normalisation of the SGML to nSGML means that applications processing nSGML streams can be very efficient." [from the announcement]

  • February 24, 1996. A new entry for the University of Cincinnati College of Law, Center for Electronic Text in the Law. CETL makes use of DynaWeb (from EBT) for management and delivery of DIANA documents from an SGML database. Documents themselves use the (abridged) TEI Header for bibliographic control. By clicking on the "TEI" icon for a given document, the SGML version is sent by DynaWeb instead of the HTML version. DynaWeb is Web server software that, in addition to supporting standard HTTPD Web server protocol, "converts DynaText electronic books stored in SGML into HTML on-the-fly for rapid navigating and searching by any Web browser."

  • February 24 [March 25], 1996. A Dutch SGML Web Page is now available, thanks to Jan Grootenhuis. The page contains: an introduction to SGML, a page identifying SGML's critical features, a glossary, a literature list, and a set of links to further information. [According to a Newswire posting: information in "Dutch, Flemish, Afrikaans, and Low-Saxon."]

  • February 19, 1996. Updated announcement for the Humanities Text Initiative American Verse Project, from John Price-Wilkin (University of Michigan). "The Humanities Text Initiative (HTI) is assembling an electronic archive of volumes of American verse. Most of the archive is made up of 19th century poetry, although a few early 20th century texts are included. The full text of each volume is being converted into digital form and coded in Standard Generalized Mark-up Language (SGML) using the TEI Guidelines." See also the main entry.

  • February 19, 1996. Announcement for a comprehensive set of SGML SDATA entities based upon UNICODE, from Rick Jelliffe. Overview: "All the ISO 10646 (Unicode) characters as SGML SDATA entities, arranged into convenient subsets according to the script of the characters. From SPREAD, the Standardization Project Regarding East Asian Documents." The set is available in WWW-viewable form and as a gzipped tar file. See more on character sets in the main entry.
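
    To show what a set of SDATA entity declarations for a range of characters might look like, here is a minimal Python sketch that generates them. The entity naming scheme (`U` plus four hex digits) and the function names are assumptions made for this example; the naming actually used in the SPREAD entity sets is not stated in the announcement.

```python
import unicodedata


def sdata_declaration(code_point):
    """Build one SGML SDATA entity declaration for a code point.

    Assumption: the entity name is 'U' plus the four-digit hex value.
    """
    name = "U%04X" % code_point
    description = unicodedata.name(chr(code_point), "unnamed character")
    return '<!ENTITY %s SDATA "[%s]" -- %s -->' % (name, name, description)


def declarations_for_range(first, last):
    """Declarations for every assigned character from first to last inclusive."""
    return [
        sdata_declaration(cp)
        for cp in range(first, last + 1)
        if unicodedata.name(chr(cp), None) is not None  # skip unassigned
    ]


# Hypothetical subset: the first three lowercase Greek letters.
for line in declarations_for_range(0x03B1, 0x03B3):
    print(line)
```

    Grouping declarations by code point range is one simple way to approximate the per-script subsets mentioned in the overview.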

  • February 13, 1996. Announcement for the online availability of two valuable white papers, from Exoterica Corporation. The first is Understanding The SGML Declaration, which will guide users through the process of writing or modifying an SGML declaration; see the main bibliography for other details. The second document bears the title Content Model Algebra; see the main bibliography entry for further summary.

  • February 03 [April 01], 1996. Update on the Berkeley Finding Aid Project, from Daniel Pitti. "The Library at the University of California, Berkeley is pleased to announce WWW access to sample HTML and SGML encoded finding aids. The source finding aids for both kinds of network delivery are encoded using the FindAid DTD developed in the Berkeley Finding Aid Project. The HTML version of the finding aids are being converted 'on the fly' from SGML to HTML through Electronic Book Technologies DynaWeb. The full SGML finding aids are available through SoftQuad's Panorama." From the announcement, or link to the site.

  • February 03, 1996. Announcement concerning the availability of a report by The SGML Project: "An Assessment of SGML-aware Software", from Paul Ellison. For other information on the SGML Project, see the database entries under archive sites or SGML groups.

  • February 03, 1996. Announcement for the availability of a new book, by Eve Maler and Jeanne El Andaloussi: Developing SGML DTDs: From Text to Model to Markup. The book is now published and available in the stores. More information on the book is accessible via the Prentice Hall WWW server. See also the main bibliographic entry for the book. The author's new address: [Eve Maler]

  • February 03, 1996. Announcement for a HyTime Workshop, organised by SGML BeLux (the Belgian-Luxembourgian Chapter of the International SGML Users' Group). Date and venue: Wednesday 28 February, 14.00h to 17.30h; Sema, Stallestraat 96, 1180 Brussels, Belgium. See the full text of the announcement.

  • January 23 [21], 1996. A day of immense loss for the SGML community: the passing of Yuri Rubinsky. By all accounts -- many are now being written -- Yuri was to us a model of generosity and extraordinary vision. Some strive to build empires and gain personal wealth; Yuri followed neither as his passion, but gave much and risked much to help others envision a better world. He shared his dreams, and with constant encouragement helped others achieve their highest goals. ICADD was but one domain in which he worked selflessly for the public good. ICADD (International Committee for Accessible Document Design) is an effort dedicated to making printed materials accessible to persons with print disabilities, using SGML encoding for the automatic generation of Braille, large print, and voice-synthesized texts. Some of Yuri's publications are referenced in the SGML bibliography.

    May Yuri Rubinsky be remembered for this: by enthusiastically sharing his personal visions for a more sane information and publishing world, he inspired others with confidence to articulate and elaborate their own visions. Thus he gave of himself so that others might see more than they ever thought they could. Will we be remembered, as Yuri is now, for what we have given our world?

    Tributes and eulogies are being collected, and later will be formally edited. Information about the collection of tributes, services, donations, and other details has been provided by Murray Maloney of SoftQuad. Obituaries have been provided by SoftQuad and by Val Ross (The Globe and Mail, Toronto). Here is a picture of the one whose life we honor [credits to OCLC]:

    Photo of Yuri Rubinsky

  • January 21, 1996. Announcement for a TEI workshop to be held at the University of Tübingen, Zentrum für Datenverarbeitung, March 11-13, 1996. See the text of the announcement from Winfried Bader, or the main conference entry.

  • January 21, 1996. Announcement for the "Fifth Annual CETH Summer Seminar on Methods and Tools for Electronic Texts in the Humanities." It will be held at Princeton University, New Jersey on July 14-26, 1996, under the direction of Susan Hockey (CETH) and Willard McCarty (University of Toronto). "The six parallel tracks will cover textual analysis, TEI/SGML, scholarly editing, hypertext, tools for historical analysis, and the design and planning of an electronic text center. Each track will allow for intensive work on participants' own projects, opportunities for both hands-on experience with current software and extensive discussion." See the text of the announcement, or the conference entry.

  • January 14, 1996. New entry for the Model Editions Partnership (MEP). The Model Editions Partnership is a consortium of seven historical editions projects which has been funded to work closely with the Text Encoding Initiative (TEI) and the Center for Electronic Text in the Humanities (CETH) in the production of digital text editions. The Model Editions Partnership will use the Text Encoding Initiative's markup Guidelines to create an SGML archive for samples from each edition.

  • January 14 [Feb 08], 1996. Call for papers for a proposed session "Text Encoding and Textual Theory" at MLA in December [27-30], 1996. The organizer is C. M. Sperberg-McQueen. See HTML version of the call, or the text version of the announcement, or the conference entry.

  • January 14, 1996. Announcement for a free program for converting TEI-tagged bibliographic headers into MARC format. The program 'tei2marc' (based upon perl scripts) was developed by Jeff Herrin and Jackie Shieh of The University of Virginia Library. "Tei2marc is designed to read in headers created on TEILITE.DTD and give the output in a bare-bones setup for the MARC computer file format. Because the University of Virginia Library is participating in OCLC's Internet Cataloging Project (known as InterCat), the converted records will be sent to OCLC's union catalog via FTP."
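
    The general idea of mapping TEI header elements to MARC fields can be sketched in a few lines of Python. This is not the tei2marc program (which is based on perl scripts) and does not reproduce its field mapping, which the announcement does not give. The mapping below (author to field 100, title to field 245, each in subfield $a), the function name, and the sample header are assumptions for illustration only.

```python
import re

# Assumed mapping from TEI header elements to MARC tags:
# 100 = main entry, personal name; 245 = title statement.
FIELD_MAP = [("author", "100"), ("title", "245")]


def header_to_marc_lines(header):
    """Return bare-bones 'TAG $a value' lines for elements found in the header."""
    lines = []
    for element, tag in FIELD_MAP:
        # Match <element> or <element attr=...>, but not longer names
        # such as <titleStmt>.
        pattern = r"<%s(?:\s[^>]*)?>(.*?)</%s>" % (element, element)
        match = re.search(pattern, header, re.S | re.I)
        if match:
            value = " ".join(match.group(1).split())  # collapse whitespace
            lines.append("%s $a %s" % (tag, value))
    return lines


# Hypothetical TEI Lite style header fragment.
sample_header = """
<teiHeader>
 <fileDesc>
  <titleStmt>
   <title>Poems: an electronic edition</title>
   <author>Dickinson, Emily</author>
  </titleStmt>
 </fileDesc>
</teiHeader>
"""

for line in header_to_marc_lines(sample_header):
    print(line)
```

    A production converter would also need to handle indicators, additional subfields, and the MARC record leader, none of which are attempted here.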

  • January 14, 1996. New entry (provisional) for the IEEE, which uses SGML in the creation of standards documents. "The purpose of the SPAsystem Authoring DTD Suite (SPA Z-30) is to create an environment that allows authors to write an IEEE standards document in SGML in a simple and intuitive manner. This is accomplished through a series of DTD modules. Each module is a small, highly structured DTD that defines a particular portion of an IEEE standards document." [see the links in the main entry]

  • January 14, 1996. Entry for the EAGLES (Expert Advisory Group for Language Engineering Standards) Initiative. The EAGLES Corpus Encoding Standard (CES) is being "formulated as a Text Encoding Initiative (TEI)-conformant application of the Standard Generalized Markup Language (SGML) ISO 8879. . . Documents encoded using the CES will be processable using any SGML-aware software. Because the CES also conforms to TEI recommendations, corpora encoded using the CES should also be processable by any TEI-conformant software."

  • January 10, 1996. Announcement from James David Mason (ISO/IEC JTC1/SC18/WG8 Convenor) that "ISO/IEC 10179, Document Style Semantics and Specification Language (DSSSL) has been transmitted to the ITTF for publication." Links to the materials have been placed on the WG8 home page. Dr. Mason asks that the SGML community join him "in congratulating the DSSSL project team for this effort and thanking them for all their hard work." Electronic versions of DSSSL are available in various formats, including searchable HTML (from Novell), Postscript, PDF, SGML, and DynaText (browser format). See the main DSSSL entry for links and other information.

  • January 10, 1996. Pointer to a tutorial introduction to SGML, available on the WWW. The "SGML Introduction - An Introduction to the Standard Generalized Markup Language (SGML)" is sponsored by the English Department of the University of Waterloo.

  • January 09, 1996. New entry for the University of Pittsburgh Electronic Text Project. "The University of Pittsburgh Electronic Text Project is a research and development effort investigating the technology and policy issues involved in producing, collecting, and serving richly marked-up scholarly texts over the University and wide-area network. The project was chartered in July 1994, and commenced in September 1994. The Electronic Text Project is composed of librarians and faculty from the University of Pittsburgh. An initial and large component of the project is the SGML TEI encoding of the texts, and a subset of the Project Team is working exclusively on those aspects."

  • January 09, 1996. Announcement for SENG, a free transformation engine [SENG = Scheme engine add-in for SP]. "SENG is a transformation engine based on SP 1.0. It executes an SGML document as scheme code. SENG provides some basic procedures (some DSSSL-like) to manipulate and access information from an SGML Instance. SENG was developed as a testbed for DSSSL experiments as well as an interim transformation engine for SGML. Features: (1) Cross document transformations; (2) Access to element context and left-siblings; (3) R4RS Scheme programming environment; (4) Simple syntax for style semantics (style sheets)." See also the database entry for SENG.

  • January 07, 1996. Announcement for TEI workshop: "The Text Encoding Initiative Guidelines, Application to Building Digital Libraries, Held in conjunction with Digital Libraries'96: First ACM International Conference On Digital Libraries." See the text of the call for papers and participation.

  • January 07, 1996 [March 20, 1997]. Clarification concerning the canonical and current version of the TEI's "Gentle Introduction to SGML", edited by C. M. Sperberg-McQueen and Lou Burnard. Here's the [again corrected 1997] URL: For anyone who has missed this resource, it provides an excellent SGML introduction. It is Chapter 2 of the TEI P3 Guidelines.

What Was New in 1995-1998

Other SGML/XML news items recorded for 1995 and later may be found in separate online documents.

Robin Cover, Editor: