The OASIS Cover Pages: The Online Resource for Markup Language Technologies
Last modified: April 11, 2002
Public SGML/XML Software

Publicly Available Software for SGML/XML/DSSSL

Introduction

Priority is given to "public" SGML/XML software in this document database since the scope of interest is mainly the Internet, where the ethic of public gift is highly esteemed. The wealth of SGML software made freely available for public use is evidence of that ethos. As a supplement to the links and information provided on public SGML software below, readers should consult Steve Pepper's "Whirlwind Guide to SGML Tools and Vendors." See the main bibliographic entry for the Whirlwind Guide for a document abstract and detailed information about its contents.

See also the detailed software summary for 207 products extracted from the technical report of Eila Kuikka and Erja Nikunen [updated January 1998]: (a) the full bibliographic entry, or (b) the overview in the "Commercial SGML Software" page. NICE Technologies [November 1996] also has an online database of SGML vendors and products (local archive copy).

Primary sections in this document include the following -- however infelicitous the taxonomy for software categories. See the Contents listing to link directly to a particular description.


Public SGML Software: Table of Contents


SGML Parsers


SP: James Clark's SGML Parser

[CR: 20001011]

James Clark's SP parser toolkit is the successor to his SGMLS parser. Formally, SP is "An SGML System Conforming to International Standard ISO 8879 -- Standard Generalized Markup Language" [and] "A free, object-oriented toolkit for SGML parsing and entity management."

[October 11, 2000] SP development (OpenSP) continues in the OpenJade project; see the OpenJade Source Control Repository Home Page and the project summary page. Contact Matthias Clasen. OpenSP-1.4, cache. See also OpenSP-1.5 pre-release in CVS.

[March 2000] New Version of OpenSP from the OpenJade Team. Matthias Clasen (Mathematisches Institut, Albert-Ludwigs-Universität Freiburg) has announced the availability of a new version of OpenSP (OpenSP-1.5pre1). OpenSP is a variant of James Clark's SP SGML parser, maintained by the OpenJade team. "The OpenJade team has made a prerelease of OpenSP-1.5 available at ftp://openjade.sourceforge.net/pub/openjade/OpenSP-1.5pre1.tar.gz. Changes in version 1.5 include: (1) More of Annex K supported: Common data attributes can now be specified in external entity declarations. (2) The architecture engine supports #MAPTOKEN. (3) The multibyte version of OpenSP now uses 32bit chars and supports the full UTF-16 range 0x0000-0x10ffff." Bugs in the release should be sent to the development team at jade-bugs@infomansol.com. OpenJade "is a project undertaken by the DSSSL community to maintain and extend Jade. OpenJade is distributed under the same license as Jade. Jade is James Clark's implementation of DSSSL -- Document Style Semantics and Specification Language -- an ISO standard for formatting SGML (and XML) documents."
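
The move to 32-bit characters noted in item (3) matters because code points above 0xFFFF do not fit in a single 16-bit unit; in UTF-16 they are carried as a pair of surrogate units. The following is a minimal Python sketch of that arithmetic only. It is not taken from OpenSP, and the function name utf16_units is hypothetical.

```python
from typing import List


def utf16_units(code_point: int) -> List[int]:
    """Return the UTF-16 code units that encode one Unicode code point."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("outside the range 0x0000-0x10FFFF")
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("surrogate code points are not characters")
    if code_point < 0x10000:
        # Fits in a single 16-bit unit.
        return [code_point]
    # Above 0xFFFF: split the 20-bit offset into a high and a low surrogate.
    offset = code_point - 0x10000
    return [0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)]


if __name__ == "__main__":
    for cp in (0x0041, 0xFFFD, 0x10000, 0x10FFFF):
        units = ", ".join(f"0x{u:04X}" for u in utf16_units(cp))
        print(f"U+{cp:04X} -> {units}")
```

A parser that stores each character in 16 bits cannot represent the two-unit cases as single characters, which is presumably why the wider internal character type was needed to cover the full range.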

[March 10, 1998] See the announcement from James Clark for the public availability of SP version 1.3 and Jade version 1.1. "The main change in SP 1.3 is better support for XML based on the Web SGML TC. In Jade 1.1 the main changes are the experimental extensions for XSL (documented in dsssl2.htm), and the use of XML for the FOT backend's output." See Clark's Web site for detailed information. Note to SP and Jade users who depend upon the architectural processing support: the appropriate ArcBase processing instruction is now <?IS10744 ArcBase DSSSL>, and no longer <?ArcBase DSSSL>; SP and Jade will now require the former, on penalty of an error message (ca.) "jade:E: specification document does not have the DSSSL architecture as a base architecture. . ." or similarly. Thanks to Eliot Kimber (ISOGEN International) for clarification on this point. Also: Jade 1.1 and sp 1.3 for OS/2 provided by David J. Birnbaum.
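
For users with many stylesheets or documents that still carry the old declaration, the change described above can be applied mechanically. The sketch below is an illustration in Python, not a tool from the SP or Jade distributions; the function name upgrade_arcbase is hypothetical, and it assumes the old processing instruction appears literally as <?ArcBase ...> in the text.

```python
import re

# Old form:  <?ArcBase DSSSL>
# New form:  <?IS10744 ArcBase DSSSL>
# The pattern only matches when "ArcBase" directly follows "<?", so
# instructions already in the new form are left untouched.
OLD_PI = re.compile(r"<\?ArcBase\s+([^>]*?)\s*>")


def upgrade_arcbase(text: str) -> str:
    """Rewrite old-style ArcBase processing instructions to the IS10744 form."""
    return OLD_PI.sub(lambda m: f"<?IS10744 ArcBase {m.group(1)}>", text)


if __name__ == "__main__":
    sample = "<?ArcBase DSSSL>\n<!DOCTYPE style-sheet SYSTEM 'style-sheet.dtd'>"
    print(upgrade_arcbase(sample))
```

Running the function a second time on its own output changes nothing, so it is safe to apply to a mixed collection of old and already-updated files.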

[February 16, 1998] An announcement from James Clark for a new test release of SP (version 1.2.92) and Jade (version 1.0.93). The main changes in Clark's SP package since version 1.2.91 are enhanced support for XML based on the final WebSGML Adaptations Annex (ISO 8879 Annex K) and the inclusion of the SX application (for converting SGML to normalized XML). [SP version 1.2.92 and Jade version 1.0.93, sources, archive copy]; [SP version 1.2.92 and Jade version 1.0.93, Win32 binaries, archive copy]

[October 17, 1997] An announcement from James Clark describes a test release of SP with improved XML support. This test/experimental version is available via FTP as part of a Jade test release: source, or Win 32 binaries. In this distribution, SP supports "a number of key features from the WebSGML SGML TC," including: unbundling of SHORTTAG, feature to allow elements declared EMPTY to have end-tags, duplicate enumerated attribute tokens are allowed, support for multiple ATTLIST declarations for a single element type, relaxation of rules on use of parameter entity references inside groups, feature that turns off SGML's traditional record end rules, NESTC (net-enabling start tag close) delimiter, support for predefined single character entities in the SGML declaration (lt, amp etc), etc. See the text of the announcement for full details about this SP test release.

[September 03, 1997] As of this time, the most recent version of SP is also available as part of James Clark's Jade package.

[October 28, 1997] Announcement from James Clark for a "very preliminary release of SX, an application built with the SP library for converting SGML to XML." This tool will eventually be included in the standard SP distribution. SX (the provisional name) "parses and validates the SGML document contained in sysid... and writes an equivalent XML document to the standard output. SX will warn about SGML constructs which have no XML equivalent." The distribution includes both source and Win 32 binaries (the sp120u.dll file included in the SP 1.2.1 Win32 Unicode binary distribution is required). Note that the program "does not yet provide enough to handle the situation where you want to migrate your document source from SGML to XML. In particular it doesn't try to preserve entity references; all entities are expanded."

Note: this paragraph is not up-to-date for SP version 1.2, released in September 1997; see the official documentation, and/or the links in the description of SP version 1.2. . . The current version is SP 1.1.1 (July 30, 1996). SP is a "free, object-oriented toolkit for SGML parsing and entity management." SP is written in C++, supports the LINK feature, is reentrant (a single process can use multiple parsers at the same time), is command-line compatible with SGMLS, includes an application [nsgmls] to generate sgmls-style output format, and an application [rast] to generate RAST output format (like SGMLS) conforming to ISO/IEC 13673. Other parser tools include [sgmlnorm], a simple SGML tag normalizer, and [spent], a facility for printing an SGML entity on standard output. SP supports any concrete syntax allowed by ISO 8879, and supports large character sets (can be compiled to use 16-bit characters internally; supported systems include UTF-8, Unicode/UCS-2, UJIS/EUC, and Shift-JIS). It is said to be fast for large documents. In addition to the C++ source code, binaries [nsgmls and rast] are available for MS-DOS (SP version 0.2) and several UNIX systems. The MS-DOS binaries use a 32-bit DOS extender (included in the distribution), so that the MS-DOS 640K conventional memory barrier should not be a limiting factor in the use of SP.

In the most recent releases of SP, James Clark has also issued some very useful tools that handle entities and "normalize" SGML documents in various ways, as specified in command line options. For example, SPAM (SP Add Markup) will provide canonical SGML when SHORTTAG and OMITTAG have been used in the SGML source. The output SGML is determined by the user's specification, so SPAM serves as a markup stream editor. See the documentation from the official site for complete details. Version 1.1 also supports Architectural Form Processing [mirror copy], on which see the following "toy example".

[April 10, 2000] XML Base Architectures in SP. Steve Newcomb writes: "You can now use SP to validate the conformance of XML documents to base architectures (meta-DTDs). TechnoTeacher has created a version of SP with full industrial-strength support for the alternative PI-based "Base Architecture Declaration" syntax. The enhancement builds on pioneering work done by Luis Martinez while he was working at TechnoTeacher, and it has recently been brought up to industrial strength by Peter Newcomb. Because of urgent need in certain industrial quarters (mortgage, healthcare, etc.), we've placed binaries of this version of SP at our FTP site: ftp://ftp.techno.com/TechnoTeacher/SPt..." [cache]

[September 1996] Commercial support for SP is provided by TechnoTeacher, Inc. - NB, James Clark himself has no commercial connection with TechnoTeacher, Inc. See the support announcement.

[November 25, 1997] See the announcement for a GC-enabled spgrove application, from Vladimir V. Tsychevski.

Other links:

Pointers to the latest released version of the SP parser (version 1.0.1: October 21, 1995) and its description:


parseDTD - DTD parser package for SP

[CR: 19980612]

[February 06, 1998] From Peter Newcomb, of TechnoTeacher Inc.: parseDtd. It parses an SGML declaration set in the absence of a document (e.g., can parse a DTD and spit out information about the elements and attributes defined in it). It is based on the SP SGML parser, version 1.2.1, written by James Clark. Peter's description: "I recently put together a small SP-based package that parses declaration sets irrespective of particular documents, returning the result as an SP DTD object."



Graphical Front Ends for SP

[CR: 19971028]

Probably there are several such front ends. [Please let me know what's missing in the list below.]

  • SP Wizard: Advertised functionality: ". . . a freeware 32 or 16 bit Windows interface using OLE Automation wrappers around NSGMLS and SPAM. (1) Allows you to interactively change settings of all command line parameters and environment variables. (2) Allows multiple files to be parsed at the press of a button. (3) Displays clickable error messages which puts the cursor in front of the offset within the line that was in error. (4) Allows you to correct errors as you find them. (5) Search and Replace. (6) Undo up to 32000 characters at multiple levels. (7) Prints reports of error messages and files that parsed with no errors. (8) OLE Automation for NSGMLS, SPAM and execution of DOS programs which can be used from Visual Basic and Visual C++. (9) All SP files were taken from the SP 1.1.1 distribution."
  • Apropos of the above: Announcement from Larry Robertson for "a web page with a sample program and some notes on the Grove OLE Automation class. . . The Grove OLE Automation Class is basically intended for parsing and fully supports the 9401 catalog; it is extremely fast and easy to use." Title: How to use the Grove OLE Automation Class in Visual Basic 5.0. "The sample program will batch parse sgml and html files. It will print reports [and] has a very simple editor." [September 13, 1997]
  • CSW Parser Plus. "CSW Parser Plus is a graphical front end for the popular SP parser, running under Windows NT/95. With CSW Parser Plus, it's easy to set up options for the SP parser and process SGML files one at a time, or in batches. . . CSW Parser Plus is packed with useful features to help set up and run the SP parser, including: (1) set the SGML Declaration and DTD; (2) process one document file, or a batch of files; (3) view errors on screen, or redirect to a file; (4) set warning and output options; (5) define locations for multiple catalog files; (6) launch editors and processing tools."
  • RUNSP2: a user-friendly Windows shell for NSGMLS, from Richard Light. "RUNSP2 is designed to let you run the NSGMLS parser in a Windows environment. It provides standard Windows facilities for opening a file to be parsed and running the parser, but goes beyond that by 'reading' the error messages, and providing a helpful editing environment in which the user can correct the errors found. The original idea was to support all the command-line options of NSGMLS via menu options or a dialog box, and I will go on to do this if the basic idea works well enough to justify the effort. At present this program just runs the parser (NSGMLS) and the simple normalizer (SGMLNORM). Later, I may extend it to run all the programs in the SP suite." source, and local archive copy [September 18, 1997].
  • See also Groves and Grove Plans in SGML/DSSSL/HyTime
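
Several of the front ends above work by "reading" the parser's error messages and jumping to the reported position. A rough Python sketch of that step follows. The message layout it assumes (program, optional file:line:column, a one-letter severity code, then the text) is modelled on the "jade:E: ..." message quoted earlier on this page and should be checked against the SP documentation; the names MESSAGE and parse_message are hypothetical.

```python
import re
from typing import Dict, Optional

# Assumed layout:  program:[file:line:column:]S: message text
# where S is a single capital letter such as E (error) or W (warning).
MESSAGE = re.compile(
    r"^(?P<program>[^:]+):"
    r"(?:(?P<file>.+?):(?P<line>\d+):(?P<column>\d+):)?"
    r"(?P<severity>[A-Z]):\s*(?P<text>.*)$"
)


def parse_message(line: str) -> Optional[Dict[str, object]]:
    """Split one diagnostic line into fields, or return None if it does not fit."""
    match = MESSAGE.match(line.strip())
    if match is None:
        return None
    fields: Dict[str, object] = dict(match.groupdict())
    # Convert the position to integers when a location was reported.
    for key in ("line", "column"):
        if fields[key] is not None:
            fields[key] = int(fields[key])
    return fields


if __name__ == "__main__":
    samples = [
        'nsgmls:doc.sgml:12:5:E: element "X" undefined',
        "jade:E: specification document does not have the DSSSL architecture",
    ]
    for sample in samples:
        print(parse_message(sample))
```

An editor shell could use the file, line and column fields to place the cursor, and fall back to showing the raw line when parse_message returns None.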

ARC-SGML: Charles Goldfarb's Almaden Research Center SGML Parser

ARC-SGML was one of the first SGML parsers to be made publicly available, and it provided the basis for the development of SGMLS by James Clark.


SGMLS: James Clark's SGMLS parser

[CR: 19970909]

SGMLS is probably the most widely used "public domain" parser as of late 1994. It has been incorporated as a validating parser into several commercial products as well. It is superseded now in part by James Clark's "SP" parser (and perhaps by the YASP and YAO parser materials) though for many simple validation tasks, SGMLS remains quite useful. SGMLS is also very fast. Its output is intended for a structure-oriented application, and this output is trivially parsable. SGMLS has been ported to many platforms, including OS/2.


YASP: Pierre Richard's Yorktown Advanced SGML Parser (or: 'Yet Another SGML Parser')

[CR: 19970405]

  • [April 1997.] Announcement from Christophe Espert (Electricité de France, Direction des Etudes et Recherches) for a new release of the YASP SGML parser interface. YASP has been implemented as a DLL for Windows NT and Windows 95, but the source code may also be compiled on Unix and other systems. The new version of YASP (1.36) has functionality "that will help enhance GROVE building in applications. YASP now reports ELEMENT, ATTLIST, NOTATION and ENTITY declarations as it parses them. YASP still gives access to the fully resolved DTD after the document prolog has been parsed. Therefore objects of classes in the PRLGABS0, PRLGABS1 and PRLGSDS modules can be built."
  • Announcement from Christophe Espert (Electricité de France, Direction des Etudes et Recherches) for the availability of YASP ('Yet Another SGML Parser', developed by Pierre G. Richard), on Windows 95 and Windows NT. August 27, 1996. URL: ftp://ftp.edf.fr/pub/SGML/YASP.
  • Announcement from Christophe Espert for a new distribution package for YASP, for DOS and Windows (July 1996); [winyasp.zip, 1258734 bytes] "It includes source code, documentation and binaries for Windows. The YASP library is a Dynamic Link Library. It has been built with Visual C++. . ."
  • April 1997 sources: ftp://ftp.edf.fr/pub/SGML/YASP; archive copy
  • April 1997: documentation in PDF format
  • FTP YASP from the SGML Repository
  • FTP YASP from Exeter
  • FTP: ftp://ftp.edf.fr/pub/SGML/YASP (A new package for the YASP parser, available for UNIX; from Christophe Espert [Christophe.Espert@der.edf.fr], February 1996)
  • See also the TclYasp SGML toolkit

YAO (Yuan-Ze--Almaden--Oslo project) Parser Materials


PSGML, by Lennart Staflin

[CR: 20001201]

PSGML is described as "a major mode for editing SGML and XML documents. It works with GNU Emacs 19.34, 20.3 and later or with XEmacs 19.9 and later [perhaps also Lucid Emacs 19.9, OEmacs, NTEmacs]. PSGML contains a simple SGML parser and can work with any DTD. Functions provided includes menus and commands for inserting tags with only the contextually valid tags, identification of structural errors, editing of attribute values in a separate window with information about types and defaults, and structure based editing." David Megginson's personal testimonial: "XEmacs+PSGML is my editor of choice for all of my XML and SGML work. I've used it to create probably close to 10,000 printed pages of documentation over the last few years, and have used XEmacs's regular-expression facilities for adding complex markup to e-texts. It's probably not suitable for naive users (give 'em XMetaL or WordPerfect, or maybe XED), but for the tech-savvy, it's great." [XML-DEV]

[December 06, 2001] "Using Emacs for XML Documents. Install add-ons to the powerful Emacs text editor to build a platform-independent (and free) environment for working with XML." By Brian Gillan (Software engineer, ID Technology and Design Group, IBM). From IBM developerWorks XML Zone. December 2001. ['Emacs, best known as a powerful text editor for UNIX developers, can be an ideal XML editor for MS-DOS, Windows, and MacOS. The author describes how to install the right add-on packages and modify settings to create a powerful XML/SGML editing-and-validation environment in Emacs with extensions such as PSGML and OpenSP. Most of the work involved in setting up this environment ends with downloading and installing Emacs and the individual packages, but you must also configure Emacs properly and enable the DTDs you plan to work with. The article includes sample configuration files and XHTML DTDs.'] "Though it's best known as a powerful text editor favored by UNIX developers, Emacs can be used to work with XML in non-UNIX platforms such as Windows, MS-DOS, and MacOS. Emacs works as a full-blown development environment for processing text, writing applications, and, as I'll discuss, creating structured information like XML and SGML. I use it as a general-purpose editor for creating and managing some of my programming projects, and for writing XHTML and playing around with SGML and XML. In fact, I used it to write this article. This article tells how to install Emacs and the extensions PSGML and OpenSP. It also outlines how to customize Emacs to make it function with a variety of DTDs. I present many of the Emacs customizations one piece at a time. However, you can download a zip file with sample DTDs and all of the Emacs customizations. My intent is to get you started using Emacs by providing you with just enough information for you understand what's going on. Then you'll be able to add DTDs and customize Emacs based on your needs and preferences..." 
PSGML version 1.2.3 was released on SourceForge November 8, 2001; see the download. [PSGML version 1.2.3, November 8, 2001, cache]

[December 01, 2000] Update notice 2000-10-27. "The future of PSGML: It is currently not in active development. I plan to put out one or two bug fix releases and the move the sources to source forge (possibly after restructuring the code a bit and merging in various patches and additions that has been send to me.) I will then invite others to take an active part in the future development of PSGML. To start this I have created two mailing lists on source forge. A psgml-user for general discussion and questions about PSGML and psgml-devel for discussion about the future development of PSGML. Visit the SourceForge: Mailing Lists for PSGML page for subscription information..."

  • Description HTML version of PSGML
  • [March 2001] See the source for PSGML version 1.2.2, from SourceForge.
  • [October 14, 1999] Staflin released a beta version (1.2.0) with XML editing support. [local archive copy]
  • [1999-10-14] Kai Grossjohann described a problem with incompatible system identifiers when using psgml to edit XML documents; David Megginson supplied the lisp code for a provisional fix.
  • See also David Megginson's enhancements for XML Editing Mode in PSGML and psgml-dsssl (DSSSL editing mode). Updated 980223 and possibly later.
  • Miyashita Hisashi has reportedly implemented a version of PSGML-XML that works on Meadow. Meadow ('Multilingual enhancement to gnu Emacs with ADvantages Over Windows') is a fully internationalized version of Emacs20 on MS Windows.
  • Version 1.0.1 (November 20, 1996); [archive copy]
  • [December 16, 1998] Bob DuCharme posted an announcement for the online availability of Chapter 2 of his book, SGML CD: "Editing SGML Documents with the Emacs Text Editor." This Adobe Acrobat version of Chapter 2 (99 pages) "assumes no initial knowledge of Emacs and provides a basic introduction to creating and navigating simple text files before it covers PSGML - Lennart Staflin's add-in that turns Emacs into a menu-driven, validating, SGML/XML editor." Bob says: "The SGML CD book is a tutorial and user's guide to free SGML/XML software, and you can link to all the software from the web page whether you want to buy the book or not. I have my own time- and keystroke-saving PSGML tricks (mostly in the form of .emacs lines) and I'm curious about those of other PSGML users, so I'll be posting a Web page of my own and soliciting those of others to add in a few weeks. Feel free to send them to me anytime; I'll credit all contributors."
  • See Markus Hoenicka's SGML/DSSSL Setup for Windows NT - including PSGML
  • Editing SGML with Emacs and PSGML - Manual
  • PSGML and Fonts. David Megginson explains how to map font faces to any or all of the symbols 'comment', 'doctype', 'end-tag', 'entity', 'ignored', 'ms-end', 'ms-start', 'pi','sgml', 'short-ref', 'start-tag' and so forth. This works! [June 1997]
  • Another discussion (TEI-L) on fontifying/colorizing with PSGML; see also (in greater detail) David Megginson's recipe above.
  • SGML: Lysator PSGML (Remote file ftp.lysator.liu.se/pub/sgml)
  • FTP PSGML from the SGML Repository
  • FTP PSGML from Exeter
  • Setting up PSGML and sgmls for HTML, or try: this link; (courtesy of Martijn Koster, m.koster@nexor.co.uk)
  • [October 14, 1998] PSGML setup instructions, provided by Peter Flynn
  • [August 09, 1997] Announcement from David Megginson (Microstar Software Ltd.) for initial enhancements of PSGML to enable an XML editing mode: ". . . I patched PSGML to add an XML mode that enables XML-specific delimiters, parsing, and error-reporting -- in other words, it's a real, native XML DTD-driven editor." The new code for XML support has not yet been incorporated into the main psgml distribution, but Megginson is requesting assistance from qualified alpha testers to help debug the code.


Emacs LISP Mode - sgml-mode.el

From James Clark et al.:


tdtd - Emacs Macro Package for Editing SGML/XML DTDs

[CR: 20011102]

[June 09, 2001] The web site URL for 'tdtd -- Emacs Major Mode for SGML and XML DTDs' is http://www.menteith.com/tdtd/. The latest version is 0.7.1. Features of tdtd revision 0.7.1 include: (1) Standalone mode for editing DTDs; (2) "Goto" menu for locating declarations within the current buffer; (3) dtd-etags function for creating Emacs TAGS files for easy lookup of any element, parameter entity, or notation's definition using Emacs's built-in tag-lookup functions; (4) dtd-grep function for searching files that shares a file history with dtd-etags for easy searching of the same files with both functions; (5) Specific font lock highlighting of declarations in XML DTDs, SGML DTDs, SGML Declarations, and System Declarations so that the important information stands out; (6) XML-specific behaviour that, at user option, is triggered by automatic detection of the XML Declaration; (7) Functions for writing and editing element, attribute, internal parameter entity and external parameter entity declarations and comments to ease creating and keeping a consistent style; and (8) Elements and parameter entity names referenced in declarations are stored in minibuffer history to minimise retyping in new declarations. [cache version 0.7.1]

In March 1999, Tony Graham (Mulberry Technologies, Inc.) released an updated version of his tdtd 'Emacs Major Mode for SGML and XML DTDs'. Features in revision 0.7: (1) Standalone mode for editing DTDs; (2) dtd-etags function for creating Emacs TAGS files for easy lookup of any element, parameter entity, or notation's definition using Emacs's built-in tag-lookup functions; (3) dtd-grep function for searching files that shares a file history with dtd-etags for easy searching of the same files with both functions; (4) Specific font lock highlighting of declarations in XML DTDs, SGML DTDs, SGML Declarations, and System Declarations so that the important information stands out; (5) XML-specific behaviour that, at user option, is triggered by automatic detection of the XML Declaration; (6) Functions for writing and editing element, attribute, internal parameter entity and external parameter entity declarations and comments to ease creating and keeping a consistent style; (7) Elements and parameter entity names referenced in declarations are stored in minibuffer history to minimise retyping in new declarations.

[August 03, 1998] Update of the tdtd emacs macro package for editing SGML/XML DTDs.

[May 27, 1998] The tdtd Emacs Macro Package for editing SGML/XML DTDs was updated by Tony Graham on May 24, 1998. Version 0.5.1 features: "1) dtd-etags function for creating Emacs TAGS files for easy lookup of any element, parameter entity, or notation's definition using Emacs's built-in tag-lookup functions; 2) Font lock highlighting of declarations so that the important information stands out; 3) XML-specific behaviour that, at user option, is triggered by automatic detection of the XML Declaration; 4) Functions for writing and editing declarations and comments to ease both creating and keeping a consistent style."

Previously: Tony Graham (Mulberry Technologies, Inc.) announced the availability of a tdtd Emacs Macro Package for editing DTDs (revision 3, December 14, 1997). The macro package was presented in a poster session at SGML/XML '97. The macros have been developed "intermittently over the last two years." Tony says: "The tdtd macro package for an Emacs major mode for editing DTDs is available at ftp://ftp.mulberrytech.com/pub/tdtd. The package includes font lock keywords for colour highlighting of declarations and reserved words plus a collection of macros that help when writing DTDs. The dtd-mode is a derived mode that builds on sgml-mode, and the features of sgml-mode are still available." The author will gladly accept bug reports and/or enhancements.



Panorama: SoftQuad's SGML Viewer for WWW

[CR: 19980408]

SoftQuad Panorama is a free version of SoftQuad Panorama PRO. It supports browsing (and searching?) of fully compliant SGML documents on the WWW.


HoTMetaL: SoftQuad's HoTMetaL editor for HTML

HoTMetaL is an unsupported version of the commercial product HoTMetaL Pro. It provides an editor/browser for (extended) HTML documents. HoTMetaL is available on a number of platforms (UNIX, MS-Windows, etc.). A tutorial for HoTMetaL Pro teaches HTML basics, supported by an HTML Quick Reference guide. The most recent [March 1995] Windows version of HoTMetaL supports some of the Netscape extensions (e.g., <CENTER>, <BLINK>), displays graphics inline, uses a stylesheet configured to look like a standard HTML browser, and supports a filter for loading plain text files and invalid HTML documents. See the posted public announcement or the fuller description on the SoftQuad server, including FTP location. Try the FTP directory ftp://ftp.ncsa.uiuc.edu/Web/html/hotmetal/Windows (specifically, the binary file ftp://ftp.ncsa.uiuc.edu/Web/html/hotmetal/Windows/hotm1new.exe).

Other mirror FTP sites list for HoTMetaL

Connect to the SoftQuad server for a recent list of FTP sites in the US, Canada, and Europe that host HoTMetaL. The FTP links below are older, but may still be alive:

  • ftp.ncsa.uiuc.edu:/Mosaic/contrib/SoftQuad
  • ftp.ifi.uio.no:/pub/SGML/HoTMetaL
  • sgml1.ex.ac.uk:SoftQuad
  • doc.ic.ac.uk:/pub/packages/WWW/ncsa/contrib/SoftQuad
  • askhp.ask.uni-karlsruhe.de: /pub/infosystems/mosaic/contrib/SoftQuad
  • ftp.cs.concordia.ca:/pub/www
  • ftp.cc.gatech.edu:/pub/gvu/www/pitkow/misc
  • ftp.sunet.se:/pub/www/Mosaic/contrib/SoftQuad
  • ftp.uco.es:/www
  • olymp.wu-wien.ac.at:/pub/sgml/exeter/SoftQuad
  • ftp.germany.eu.net: /pub/infosystems/www/ncsa/Web/contrib/SoftQuad
  • ftp.informatik.uni-freiburg.de: /pub/WWW/editors/HoTMetaL
  • gatekeeper.dec.com: /pub/net/infosys/Mosaic/contrib/SoftQuad
  • Email to: webmaster@sq.com


HyBrick - SGML/XML Browser

[CR: 19990304]

[March 04, 1999] Ralph E. Ferris (Fujitsu Software Corporation) has announced a new release of Fujitsu's HyBrick SGML/XML browser, with expanded support for XLink/XPointer. It is available from the Fujitsu Software Corporation's Web site. New features in HyBrick V0.82 related to XLink and XPointer include: "1) XLink/XPointer error/warning info is shown in the error list dialog; 2) A 'Document Group' sub-menu has been added in the 'XLink/XPointer' menu; users can now navigate between inter-linked documents by using Document Groups as well as through individual links; 3) In the 'select link' dialog, link element 'role' values are displayed instead of GIs. This feature, as well as the 'Document Group' display feature, are particularly useful for creating and navigating 'Topic Maps.'; 4) The mouse cursor now changes its shape over links." Also new in HyBrick 0.82 are multiple stylesheet support (if multiple stylesheet PIs are present, users are presented with a dialog box to select the stylesheet they want to use), 'Reload hubdocument' function and 'Close window' function. 'HyBrick' is "an advanced SGML/XML browser developed by Fujitsu Laboratories, the research arm of Fujitsu. 'HyBrick' is based on an architecture that supports advanced linking and formatting capabilities. HyBrick includes a DSSSL renderer and XLink/XPointer engine running on top of James Clark's SP and Jade. It supports both valid and well-formed XML documents, XLink and XPointer (XLink implemented as a subset of the HyTime property set), SGML (ISO 8879), DSSSL (ISO 10179) online specification, printing and print previewing based on DSSSL stylesheets." See more on HyBrick Support for XPointer in a posting of March 4, 1999.

[February 15, 1999] Ralph E. Ferris (Fujitsu Software Corporation) posted an update on the HyBrick V0.80 support for XLink and XPointer. HyBrick is an advanced SGML/XML browser developed by Fujitsu Laboratories, the research arm of Fujitsu. HyBrick is based on an architecture that supports advanced linking and formatting capabilities. HyBrick includes a DSSSL renderer and XLink/XPointer engine running on top of James Clark's SP and Jade. It supports "both valid and well-formed XML documents, XLink and XPointer, SGML (ISO 8879), DSSSL (ISO 10179) online specification, printing and print previewing based on DSSSL stylesheets." "To make the point [about HyBrick XLink/XPointer support, Ralph has] put some files with XLink/XPointer declarations in them up on the HyBrick Web site at http://www.fsc.fujitsu.com/hybrick/. These files are intended to be accessed over the Web. If your network access environment allows you to though, you can see XLink and XPointer at work over the Web by downloading HyBrick and pointing it at: http://www.fsc.fujitsu.com/hybrick/hubdoc-1.xml . . ." [See the posting for caveats and full details.] HyBrick Version 0.8 with XLink/XPointer support is now available for download.

[Earlier description:] "HyBrick" is 'an advanced SGML/XML browser developed by Fujitsu Laboratories, the research arm of Fujitsu. "HyBrick" is based on an architecture that supports advanced linking and formatting capabilities. HyBrick includes a DSSSL renderer and XLink/XPointer engine running on top of James Clark's SP and Jade. HyBrick supports: 1) Both valid and well-formed XML documents; 2) XLink/XPointer on the local file system [XPointer is implemented as a subset of the HyTime property set; Link traversal can use either "New" or "Replace" to display a new page]; 3) SGML (ISO 8879); 4) DSSSL (ISO 10179) online specification; 5) Printing and print previewing based on DSSSL stylesheets.'

[November 03, 1998] Ralph E. Ferris of Fujitsu Software Corporation has announced that HyBrick V0.8 with XLink/XPointer is now available for download.


The Wurd [was: WP] Project

"Wurd is an SGML capable Wurd Processor and publishing tool for multiple operating systems/platforms - although at the moment the only operating system supported is Linux." [June 1997]

[Work in progress only] WP is "a word processor being built by linux enthusiasts. . . with a native file format based on the SGML model. . .The use of SGML as the file format means that wp has an open interchange format. It will be possible to maintain World-Wide Web pages directly with wp."


GRIF Symposia: "A Collaborative Authoring Tool for the World Wide Web" (HTML and XML)

[CR: 19970827]


HyBrowse HyTime Browser

[CR: 19961126]

HyBrowse, a HyMinder application, is a HyTime browser from TechnoTeacher, Inc. "HyBrowse is a true HyTime (ISO/IEC 10744) hyperdocument browser for Windows 95 and Windows NT. It is useful for developing electronic document architectures that employ HyTime's strongly typed location-independent linking mechanisms." HyBrowse is publicly available (free) [as of November 22, 1996] for a trial period of 45 days. In addition to the standard features one would expect, it supports: "(1) True HyTime independent hyperlinking; (2) User-defined strong hyperlink typing with [a] icons assignable to anchor roles over entire bounded object set (BOS), [b] rendering styles assignable to anchor roles over entire BOS; (3) HyTime-conforming address elements; (4) Aggregate location and hyperlink traversal handling; (5) Arbitrary BOS awareness allows users to add (import) a document into the current BOS; (6) Re-open browsing sessions without reparsing or reprocessing."

Eliot Kimber writes: "NOTE: HyBrowse is intended as a tool for creating prototypes and demos of HyTime features. It is not intended to be a production-quality information delivery system. The formatting features are minimal compared to Panorama or DynaText but sufficient to demonstrate the very interesting things you can do with independent links and anchors thereof. If you've been thinking of ways that HyTime hyperlinking could solve some of your information management problems but never had a way to realize or test those ideas, now you do, for free."


perlSGML - Perl programs and libraries (Earl Hood)

[CR: 19970918]

perlSGML is a collection of Perl programs and libraries written by Earl Hood for processing SGML documents. The following software is available in the perlSGML distribution: dtd.pl (A Perl library to parse SGML DTDs), dtd2html (An SGML DTD documentation/navigation tool), dtddiff (a utility to list changes in a DTD), dtdtree (Generate content hierarchy trees of SGML elements), dtdview (Interactively query a DTD), sgml.pl (A Perl library to parse SGML instances), stripsgml (utility to remove SGML markup).

The 'dtd2html' tool is widely used. "What is dtd2html: dtd2html is part of the perlSGML package. dtd2html is a program that generates an HTML document (composed of several files) that documents and allows hypertext navigation of an SGML DTD."

  • [September 18, 1997] Announcement from Earl Hood (University of California, Irvine) for a new release of the perlSGML toolkit. perlSGML is a collection of Perl programs and libraries for processing SGML DTDs and documents. "This release mainly includes a new set of Perl 5 modules. A new stripsgml is available and some corrections to dtd.pl are included in the release."
  • perlSGML Main Page
  • Documentation for perlSGML
  • October 09, 1996: Announcement from Earl Hood for a new release of the perlSGML tools -- a collection of Perl software for processing SGML data. These SGML software tools run under Perl versions 4 and 5. Most important changes: (a) "Hierarchical tree output of DTDprint_tree of dtd.pl modified to preserve the content model in the output. New tree format utilized by dtd2html, dtdtree, and dtdview; (b) sgml.pl rewritten to be more efficient and be usable for large files. Still more suited for simple tasks. stripsgml rewritten to utilize new sgml.pl." Available in .gz or .zip distribution format.
  • December 09, 1995: Announcement for a new version of Earl Hood's perlSGML. perlSGML is a collection of Perl programs and libraries for processing SGML documents: dtd.pl (2.2.0) -- A Perl library to parse SGML DTDs; dtd2html (1.4.0) -- An SGML DTD documentation/navigation tool; dtddiff (1.1.0) -- List changes in a DTD; dtdtree (1.2.0) -- Generate content hierarchy trees of SGML elements; sgml.pl (0.1.0) -- A Perl library to parse SGML instances; stripsgml (0.1.1) -- Remove SGML markup. Changes: (1) Fixed code so it will run under Perl 4 and 5; (2) MS-DOS usage support; (3) Entity map file syntax has changed to the SGML Open catalog format; (4) Support for the envariables SGML_SEARCH_PATH, SGML_CATALOG_FILES; (5) New functions added; (6) Speed improvement; (7) Bug fixes. See the text of the announcement, or link to the WWW page.
  • Links on Earl Hood's page, including demos for DTDs processed (TEI, HTML 2.0, HTML 3.0).
  • FTP from the SGML Repository
  • FTP from Exeter
  • documentation for dtd2html (Earl Hood) via CETHMAC
  • documentation for dtd2html (etc) on Earl Hood's (OAC) Home Page
  • FTP to Darmstadt
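
To give a rough idea of what a content hierarchy tree of the kind dtdtree generates might look like, here is a minimal Python sketch. It is not Earl Hood's Perl code: the regular expressions, the function names `children_map` and `tree_lines`, and the sample DTD are all assumptions made for illustration. The sketch only handles simple element declarations; it makes no attempt at parameter entities, marked sections, name groups, or inclusion/exclusion exceptions.

```python
import re

# Simplified pattern for declarations such as: <!ELEMENT book - - (title, chapter+)>
# The optional group skips SGML tag-minimization flags ("- -", "- O", ...).
ELEMENT_DECL = re.compile(
    r"<!ELEMENT\s+([A-Za-z][\w.-]*)\s+(?:[-Oo]\s+[-Oo]\s+)?([^>]*)>"
)
NAME = re.compile(r"[A-Za-z][\w.-]*")
KEYWORDS = {"EMPTY", "ANY", "CDATA", "RCDATA", "PCDATA"}

# Hypothetical sample DTD, used only to exercise the sketch.
SAMPLE_DTD = """
<!ELEMENT book    - - (title, chapter+)>
<!ELEMENT chapter - O (title, para*)>
<!ELEMENT title   - - (#PCDATA)>
<!ELEMENT para    - O (#PCDATA | emph)*>
<!ELEMENT emph    - - (#PCDATA)>
"""


def children_map(dtd_text):
    """Map each declared element to the element names in its content model."""
    children = {}
    for name, model in ELEMENT_DECL.findall(dtd_text):
        names = (n.lower() for n in NAME.findall(model) if n.upper() not in KEYWORDS)
        # dict.fromkeys removes duplicates while keeping first-seen order.
        children[name.lower()] = list(dict.fromkeys(names))
    return children


def tree_lines(children, root, depth=0, seen=()):
    """Return an indented listing of the content hierarchy below root."""
    lines = ["  " * depth + root]
    if root in seen:  # do not expand recursive content models again
        return lines
    for kid in children.get(root, []):
        lines.extend(tree_lines(children, kid, depth + 1, seen + (root,)))
    return lines


print("\n".join(tree_lines(children_map(SAMPLE_DTD), "book")))
```

Inverting the same dictionary gives a "parents of an element" view of the DTD.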

Carthage, dpp, and Bison tools by Michael Sperberg-McQueen

[CR: 19970122]

Several SGML grammar tools have been created and made publicly available by TEI editor Michael Sperberg-McQueen. DPP: "DPP is a parser for SGML document type declarations, intended for use as a front end for filters which modify DTDs (e.g. filters to expand all or some parameter entity references, or to rename elements, etc.). Since DPP uses the same output format as sgmls. . .many existing tools for writing filters for SGML document instance . . . can be used with DPP to make filters for DTDs." Bison tools: "The subdirectory pub/tei/grammar/bison contains files with Bison grammars and Flex scanners for SGML document type definitions, SGML document instances, and SGML declarations." See ftp://ftp-tei.uic.edu/pub/tei/sgml/grammar for a fuller description of these grammar tools.

Another of the tools is a utility called Carthage. "Carthage is a yacc/lex-based parser for SGML DTDs which can delete references to undeclared elements. It can also do a few other things, depending on the run-time flags you give it." Some options include: (1) dropping or keeping marked sections; (2) warning if entities are declared twice; (3) dropping or keeping parameter entity declarations; (4) deleting named GIs from content models; (5) listing of specified classes of elements in the DTD [used, unused, default undeclared, declared]; (6) dropping or keeping comments in the output file, etc. The software is "unsupported" but "users who improve it or fix errors are requested to notify the author so he can also fix them." [extracts from the README file, dated June 17, 1996]


DTDParse, by Norman Walsh

[CR: 19980409]

"DTDparse reads an SGML DTD and constructs a simple, easily parsed database of its content. This database can be examined to construct other views of the DTD. The DTDparse distribution contains several scripts which use the database to extract useful information about the DTD: (1) parents lists the parents of a particular element; (2) children lists the children of a particular element; (3) dtd2man produces DocBook RefEntry pages ('man' pages in common UNIX parlance) for the components of the DTD; (4) dtd2html [unrelated to Earl Hood's program of the same name] builds an HTML web of the components of the DTD." The documentation page provides sample output for DTDs such as DocBook 3.0, HTML 3.2, ISO 12083 DTDs, TEI Lite 1.6, and the CALS Table DTD.


Fred - The SGML Grammar Builder

[CR: 19980508]

"Fred is an ongoing research project at OCLC Online Computer Library Center, Inc. (OCLC) studying the manipulation of tagged text. As a service to the community, OCLC has decided to make several portions of Fred freely available via a WWW server." These services include (subject to documented limitations): automatic SGML DTD creation from tagged text, grammar reduction (BNF, DTD, and Four-Tuple output formats), and arbitrary transformations.


NORMDTD (by Richard Light)

[May 1996] "NORMDTD is a DOS (yes!) program that reads a valid SGML DTD, even a TEI-like one that uses marked sections and multiple input files, and generates a single file containing a normalized version of that DTD. The element content models in this normalized DTD will not contain any references to elements that are not declared, and so it can be used by highly-strung SGML packages such as RulesBuilder that refuse to process TEI applications (in particular) for this reason. In fact, having a normalized DTD in a single file can be helpful for a number of reasons, to a variety of SGML applications."

NORMDTD is written in Borland Pascal and runs only under DOS.


Babble - Synoptic Text Browsing/Searching Tool

[CR: 19970628]

"Babble, under development by Robert Bingler at the Institute for Advanced Technology in the Humanities (University of Virginia in Charlottesville), is an SGML-capable synoptic text tool that can display multiple texts in parallel windows. It uses Unicode, an ISO 16-bit character set standard, which allows multilingual texts, using mixed character sets, to be displayed simultaneously. Babble also allows users to search for strings in text or in tags, and to link open texts for scrolling and searching. Currently, Babble runs as an application, and not as an applet . . . Babble was originally prototyped in C++ and Motif++ for AIX 3.25 by Pete Yadlowsky. The current version is written in Java." [from the Home Page]

Note: Babble has been described to me as nominally but usefully SGML-aware. For example: "The search function allows you to search for strings, either in text or--if the file you're searching is marked up in SGML--within tags. When you click on the search button, a dialogue box appears, offering two choices: search in text or in tags, and a character set for the search. It is assumed that SGML tagging will be done in the Latin alphabet, but Babble will allow you to search for a non-Latin string within tags." [from the online documentation]


IADS: Interactive Authoring and Display System

[CR: 20011019]

"Interactive Authoring and Display System (IADS) was developed as a U.S. Army Missile Command initiative to reduce or eliminate paper documentation. IADS utilizes standard generalized markup language (SGML) to manipulate the text and graphics. The author can choose to display graphics within the text and/or in separate windows." [from the Home Page]


SARA (SGML-Aware Retrieval Application)

SARA (SGML-Aware Retrieval Application) is a client/server software tool allowing a central database of texts with SGML mark-up to be queried by remote clients. The system was developed at Oxford University Computing Services, with funding from the British Library Research and Development Department (1993-4) and the British Academy. The original motivation for its development was the need to provide a robust low-cost search-engine for use with the 100 million word British National Corpus, and several features of the system design necessarily reflect this.

The SARA system has four key parts:

  • the indexing program, which generates an index of tokens from an SGML marked-up text
  • the server program, which accepts messages in the Corpus Query Language (see below) and returns results from the SGML text
  • the SARA protocol, a formally defined set of message types which determines legal interactions between the client and server programs; this protocol makes use of a high-level query language known as CQL (for Corpus Query Language)
  • one or more client programs, with which a user interacts in any appropriate platform-specific way, and which communicate with the server program using the protocol
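
The division of labour among the parts listed above (an indexer, a server that answers query messages, and a client that sends them) can be sketched in a few lines of Python. This is only a toy under stated assumptions: the `FIND <word>` message format, the `HITS` reply, and the function names are invented for illustration and have nothing to do with the real SARA protocol or with CQL, and the "index" here is simply a dictionary of word positions.

```python
import re
from collections import defaultdict

TAG = re.compile(r"<[^>]*>")          # crude markup matcher, enough for the sample
WORD = re.compile(r"[A-Za-z0-9']+")

# Hypothetical marked-up text used to exercise the sketch.
SAMPLE = "<s>The cat sat.</s><s>The dog sat too.</s>"


def build_index(text):
    """Indexer: map each lower-cased token to the list of its word positions."""
    plain = TAG.sub(" ", text)
    index = defaultdict(list)
    for position, match in enumerate(WORD.finditer(plain)):
        index[match.group().lower()].append(position)
    return dict(index)


def handle_query(index, message):
    """Toy server: answer an invented 'FIND <word>' message with positions."""
    verb, _, word = message.partition(" ")
    if verb != "FIND" or not word:
        return "ERROR unsupported message"
    hits = index.get(word.lower(), [])
    return " ".join(["HITS", *map(str, hits)])


index = build_index(SAMPLE)
print(handle_query(index, "FIND sat"))
```

A client in this picture would only ever see the message strings, which is the point of defining the protocol separately from the server.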


Ispell for SGML

[CR: 19970225]

  • Announcement from R. Alexander Milowski of Copernican Solutions Incorporated for a utility that 'spell-checks' SGML documents: Ispell for SGML. Sources are available as a patch to the standard distribution; binaries are also available for Solaris 2.5, and a WIN32 port will be provided in the future. The brief description on the COPSOL WWW site says [970225]: "Ispell for SGML is a version of the ispell spell checker distribution that has been patched to understand and ignore SGML markup. This version is a simple markup scanner that does not assume any further knowledge of the DTD. It purely relies on markup mode scanning as specified in the SGML standard."
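
The idea of spell-checking by scanning past markup, with no knowledge of the DTD, can be illustrated with a short Python sketch. This is not the ispell patch itself: the regular expressions are a rough approximation of markup recognition (a `>` inside an attribute value would confuse them), and the word list and the function name `misspelled` are hypothetical.

```python
import re

# Markup to skip: comments, then tags/declarations/processing instructions,
# then entity and character references.
MARKUP = re.compile(r"<!--.*?-->|<[^>]*>|&#?[\w.-]+;?", re.DOTALL)
WORD = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)?")


def misspelled(text, dictionary):
    """Return words found outside markup that are not in the dictionary."""
    plain = MARKUP.sub(" ", text)
    return [w for w in WORD.findall(plain) if w.lower() not in dictionary]


# Hypothetical input: element names, attribute values, the entity reference,
# and the comment text are all ignored; only character data is checked.
sample = '<para id="intro">Teh quick &amp; <emph>brown</emph> fox<!-- recieve --></para>'
print(misspelled(sample, {"the", "quick", "brown", "fox"}))
```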


Syntext -- the SGML Grammar Grapher

[CR: 19960521]

"SYNTEXT is an SGML DTD providing elements and attributes to mark up text in English for: (1) syntactic structure, including (a) X-bar based parsing, with Government and Binding-style PRO and t, (b) grammatical relations a la Quirk et al. marked as attributes; (2) cohesion; (3) coreference; (4) conjunctive relations as attributes of sentence specifiers; (5) lexical cohesion as attributes of lexical items; (6) rhetorical figures. Any text marked up for these features and identifying itself as DOCTYPE SYNTEXT is an SGML document and can be browsed in an SGML browser or viewer such as SoftQuad's free Windows browser Panorama or the costwish viewer for X Windows being developed by Peter Murray-Rust. It is an SGML application, the purpose of which is to provide markup for the analysis of syntactic and textual structure; a marked up text can be viewed as a tree and in other modes and can be searched with context sensitive and contingent scans, making it very powerful for stylistic analysis (once a passage is marked up!)."


MtSgmlQL, the SgmlQL interpreter

[CR: 19971216]

"The SGML query language SgmlQL was developed in the context of the MULTEXT project. It is a functional language based on SQL, which enables complex operations on SGML documents, for instance: (1) extraction of parts of an SGML document that satisfy given criteria; (2) tests, counts, and various other computations on SGML elements in a document; (3) construction of new elements and documents using the result of queries. Because SgmlQL is a functional language, all data and program statements are expressions, or queries, which are recursively evaluated. It allows for manipulation of numbers, strings, (SGML) names, elements, attribute-value sets, documents, and (mixed content) lists. A free alpha version for UN*X of MtSgmlQL, the SgmlQL interpreter, can be downloaded to your system for non-commercial, non-military purposes (see the user agreement)."


'sgrep' grep-like searching of structured documents

[CR: 19981210]

Description: 'sgrep' (structured grep) "is a tool for searching text files and filtering text streams using structural criteria. The data model of sgrep is based on regions, which are nonempty substrings of text. Regions are typically occurrences of constant strings or meaningful text elements, which are recognizable through some delimiting strings. Regions can be arbitrarily long, arbitrarily overlapping, and arbitrarily nested. Sgrep is a convenient tool for making queries to almost any kind of text files with some well known structure. These include programs, mail folders, news folders, HTML, SGML, etc... With relatively simple queries you can display mail messages by their subject or sender, extract titles or links or any regions from HTML files, function prototypes from C or make complex queries to SGML files based on the DTD of the file." Sgrep is distributed under GNU General Public License.
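
The region-based data model described above lends itself to a small illustration. The following Python sketch treats a region as a pair of string offsets, finds regions by their delimiting strings, and filters them by containment. It is only a sketch of the idea: the function names `find_regions` and `containing` are chosen for this example, the matching is non-nested, and none of this reproduces sgrep's own query language or implementation.

```python
def find_regions(text, start, end):
    """Return (begin, stop) offsets of non-nested regions delimited by start/end."""
    regions = []
    pos = 0
    while True:
        begin = text.find(start, pos)
        if begin == -1:
            break
        close = text.find(end, begin + len(start))
        if close == -1:
            break
        stop = close + len(end)
        regions.append((begin, stop))
        pos = stop
    return regions


def containing(outer, inner):
    """Keep the outer regions that contain at least one inner region."""
    return [o for o in outer
            if any(o[0] <= i[0] and i[1] <= o[1] for i in inner)]


# Hypothetical HTML-like input: select the list items that contain a link.
text = "<li>plain</li><li><a>link one</a></li><li><a>two</a></li>"
items = find_regions(text, "<li>", "</li>")
links = find_regions(text, "<a>", "</a>")
for begin, stop in containing(items, links):
    print(text[begin:stop])
```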

[December 10, 1998] Jani Jaakkola has announced the availability of "sgrep-1.90a - An SGML and XML Search and Indexing Tool." Sgrep is a tool to search and index text, SGML, XML and HTML files using structured patterns. New features in Sgrep version 1.90a include: 1) query operators that support direct containment, so that one may query children and parents of given elements; 2) the sources are available under GPL-license for those interested in compiling sgrep; 3) Sgrep now uses GNU autoconf, so compiling sgrep under Unix-systems should be easy; 4) bug fixes. This version of Sgrep contains the sources, Win32 binaries, and binaries for HP-UX, Linux, OSF1 and Solaris. The Win32 binary also includes the m4 macro processor. For more information on Sgrep, see README file or the overview.

[August 29, 1998] Jani Jaakkola (Jani.Jaakkola@cs.helsinki.fi; Department of Computer Science, University of Helsinki) posted an announcement for the release of sgrep version 1.71a as the first prerelease of sgrep-2. Sgrep is a tool to search and index text, SGML, XML and HTML files using structured patterns. Features new in version 1.71a include: "1) Indexing of both structure and content; 2) SGML/XML/HTML scanner; 3) both Win32 and i386-Linux binaries; 4) compatibility with older versions of sgrep; 5) no dependence upon 'sgtool'." Features announced for inclusion in sgrep-2 are: 1) Support for querying notations, element type declarations and attribute list declarations inside SGML/XML document prolog; 2) Parsing of all well-formed XML-documents; 3) Proper documentation.


Inside & Out, from ZGDV

[CR: 19970522]

Inside & Out is a graphical DTD editor created by Hans Holger Rath and Ulrich von Engelberg, of the Computer Graphics Center (ZGDV) in Darmstadt, Germany. It runs under MS-Windows 3.1 (386 PC) with 4 MB RAM. The editor is designed to build SGML DTDs interactively, providing a graphical presentation of the DTD in the shape of a syntax (or railroad) diagram. Every element and parameter entity definition is shown in a single diagram. All definitions are alphabetically sorted (first all entity, second all element definitions).


MU: Forms Assisted SGML Markup

"MU is a perl-based program that builds fill-out forms for SGML editing, based on simple templates. It supports lock files (for networked workgroups), and it is distributed with a TEI-lite template. Demonstrations, source code, help files, and an email list for bug reports and developers are available. . .Features: (1) Helps to automate the SGML markup process; (2) Quite general - works on various types of DTD templates; (3) Version 1.1 deals quite nicely with attributes; (4) Allows for multi-user editorial communication through the use of remarks; (5) Supports internet workgroups via lockfiles."


Markus Hoenicka's SGML/DSSSL Setup for Windows NT

[CR: 19981014]

"These pages describe how to set up a free integrated SGML editing and publishing system running under Windows NT - and, with a few modifications of the installation procedure, also on Windows 95/98 boxes." The documentation provides instructions for the installation of Emacs, Jade, PSGML, Ghostscript, Acrobat, MiKTeX, AucTeX, Jadetex, DocBook, etc.


SGML Data Conversion, Transformation, and Manipulation

At SGML'96, Boston, November 1996, Tony Graham (Mulberry Technologies, Inc.) presented "Free SGML Transformation Tools." "The criteria for selecting an SGML transformation processing tool are discussed, and the details and SGML-processing features of several free SGML transformation tools are listed."

Rainbow

Several companies have collaborated on the design of an SGML interchange language for word-processing formats. Rainbow makers produce SGML from the supported word-processing formats, preserving as much information about document structure as can be deduced reliably. The Rainbow SGML format can then be used as input to other applications. See further explanation on EBT's server or on the mirrors in the file 'rainbow.why'. Rainbow makers are now available (free) for FrameMaker/FrameBuilder MIF, RTF, Interleaf, and (possibly) Ventura. Authoritative files for the Rainbow distribution are located on EBT's FTP server (SGML Rainbow via ftp.ebt.com/pub/nv/dtd/rainbow/).


ICA: Integrated Chameleon Architecture

The ICA (Release 1.6, February 1994) is a toolset for generating data translators. In particular, the toolset can be used to generate translators to and from a constrained subset of instances of SGML Document Type Definitions (DTDs). There are several example translators included in the distribution. The first is a book DTD and includes specific translators for the LaTeX book documentstyle and a specific troff macro package. The second is a bibliographic DTD and includes specific translators for BibTeX and refer bibliographic database formats. Please note that the ICA is for developing translators and not providing translators. The ICA runs in the Unix environment, using the X Window System for the basis of the graphical user interfaces.

A new user's manual for ICA is also available. Published by Prentice Hall, the book is entitled The Integrated Chameleon Architecture: Translating Documents with Style, by Sandra Mamrak, Conleth S. O'Connell, and Julie Barnes. ISBN 0-13-056418-4. This book contains much new and revised material over the previously available online documentation, including a chapter on the ICA and SGML. See also description in excerpts from the release notes.

See further description in the ICA toolkit announcement, and see network addresses for the supporting mailing list.


STIL - `SGML Transformations in Lisp'

STIL is a stylesheet language developed by Joachim Schrod (Computer Science Department, Technical University of Darmstadt, Germany). "STIL (`SGML Transformations in Lisp') is a style sheet language to create structure-controlled SGML applications. In these applications you have neither access to the DTD nor to the original document source, instead you operate on a tree representation of the document. If you know CoST (the tree mode version) or SGMLSpm, STIL uses the same concept as these style sheet languages. The most obvious difference is the use of Common Lisp instead of Tcl or Perl5.

You define classes for elements that appear in a document, instances of these classes are the inner nodes of the tree. Automatic transformation of attributes to data structures more appropriate in your task domain than simple strings is available. Elaborate handling of PCDATA is supported, too.

The document tree is traversed, you can specify operations (`callbacks') that are triggered at certain points in that traversal. Within these callbacks, you have access to the full tree." [from the README, 1995/09/09]
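
The traversal-with-callbacks concept described in the README can be shown in miniature. The sketch below is written in Python rather than Common Lisp, and everything in it (the `Node` class, the `traverse` function, the callback tables, and the sample document) is an assumption made for illustration; it does not reflect STIL's actual class or callback definitions.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One element of the document tree, with optional character data."""
    name: str
    text: str = ""
    children: list = field(default_factory=list)


def traverse(node, on_enter, on_exit, out):
    """Depth-first walk that fires per-element callbacks on entry and exit."""
    if node.name in on_enter:
        out.append(on_enter[node.name](node))
    if node.text:
        out.append(node.text)
    for child in node.children:
        traverse(child, on_enter, on_exit, out)
    if node.name in on_exit:
        out.append(on_exit[node.name](node))
    return out


# Hypothetical document and style: titles are wrapped in "== ... ==",
# paragraphs end with a newline, and "section" has no callbacks at all.
doc = Node("section", children=[
    Node("title", "Tools"),
    Node("para", "Free software."),
])
on_enter = {"title": lambda n: "== "}
on_exit = {"title": lambda n: " ==\n", "para": lambda n: "\n"}

rendered = "".join(traverse(doc, on_enter, on_exit, []))
print(rendered, end="")
```

Because each callback receives the node itself, it can inspect children or text before deciding what to emit, which is the sense in which a callback sees the tree rather than a flat stream of tags.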


CoST (Copenhagen SGML Tool, UNIX)

[CR: 19990628]

[June 28, 1999] Joe English has announced the release of Cost version 2.2, which now provides 'preliminary support for XML'. Cost is a free "structure-controlled SGML application programming tool. It is implemented as a Tcl extension, and works in conjunction with James Clark's nsgmls and/or sgmls parsers. Cost provides a flexible set of low-level primitives upon which sophisticated applications can be built. These include: (1) A powerful query language for navigating the document tree and extracting ESIS information; (2) An event-driven programming interface; (3) A specification mechanism which binds properties to nodes based on queries. Cost can be dynamically loaded into a Tcl application with the usual package mechanism, or it can be statically linked into a custom Tcl interpreter. There is also a command-line interface, costsh, which can be used interactively or as part of a command pipeline. A windowing interface, costwish, is also available for building GUI applications with Cost and Tk. New features in Cost version 2.2 include: (1) It should compile and install out-of-the-box on most Unix platforms, with any Tcl release from 7.5 through 8.1.1 - courtesy autoconf; (2) One can load more than one document at a time, and switch between them with the new 'selectDocument' and 'withDocument' commands; (3) It allows comments at certain places in specifications. (4) It provides preliminary support for XML, courtesy expat by James Clark. Note: XML support is largely untested and has a few known deficiencies (and probably several unknown ones!); I'd appreciate any feedback/bug reports. (5) It is released under a Tcl-style license instead of the 'Artistic' license. (6) Cost can now be loaded as an extension into multiple Tcl interpreters without conflicts. (7) Many minor bugfixes, enhancements, and cleanups."

[1997] "What is CoST? CoST (Copenhagen SGML Tool) is a structure-controlled SGML application programming tool. It is built on top of a public domain SGML tool: the SGMLS parser made by James Clark. With CoST you can write translation specifications for SGML document instances. CoST is purely structure driven, i.e. it gives you access to the structure of the SGML document instance. It won't, however, let you access the lexical and syntactical details in the SGML entities that represent the document instance in storage. You can write CoST programs that will translate SGML document instances or perform other processing in response to SGML documents. You program CoST using TCL - Tool Command Language." [from the Manual Introduction, March 1995]

CoST was written by Klaus Harbo (Klaus.Harbo@euromath.dk) and is maintained by Joe English (joe@flightlab.com).


costwish - SGML postprocessor and renderer

Costwish is a graphical interface (SGML postprocessor and renderer) for Joe English's CoST-2 tool. "costwish is a generic graphical interface to Joe English's CoST SGML/ESIS post-processing tool. It is aimed at those who wish to: (1) run sgmls (or other ESIS-based parser) under a graphical interface; (2) browse their documents graphically; (3) customise their postprocessing easily, powerfully and flexibly; (4) construct powerful searches of SGML-based documents; (5) and manage the results interactively; (6) develop interfaces to helper applications (e.g. graphical renderers)." [from the README, April 1996]


SGMLS.pm and sgmlspl: A Simple Post-Processor for SGMLS and NSGMLS

[CR: 19980423]

SGMLS.pm and sgmlspl were written by David Megginson, and were maintained by him through 1995. The current maintainer [1998] of the SGMLS.pm Perl package is Ingo Macherius (Ingo.Macherius@tu-clausthal.de).

David's description: "SGMLSpm is a free perl5 object-oriented postprocessor for James Clark's SGMLS and NSGMLS parsers. The main part of this release is a library, SGMLS.pm, which repackages the ESIS output of (N)SGMLS into perl5 objects. On top of this, I have built a script, sgmls.pl, for formatting or processing SGML documents quickly using event patterns. Like CoST (which is several times slower), and unlike QWERTZ (etc.), SGMLSpm is a general-purpose package which can be used with any DTD. It even includes a script, skel.pl, which will write a skeleton conversion script for your document automatically!"
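
To make the phrase "repackages the ESIS output" more concrete, the following Python sketch reads a few lines in the style of the line-oriented output of sgmls/nsgmls and turns them into events. It rests on a simplified reading of that format: a leading `(` opens an element, `)` closes it, `-` carries data, and `A` lines give attributes for the next start tag. Only the `\n` and `\\` escapes are decoded, and all other line types are ignored. The function names, the event tuples, and the sample input are assumptions for illustration and are not the SGMLS.pm API.

```python
import re

ESCAPE = re.compile(r"\\(n|\\)")


def esis_events(lines):
    """Yield (kind, value, attributes) tuples from ESIS-style output lines."""
    pending = {}  # attributes seen since the last start tag
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        code, rest = line[0], line[1:]
        if code == "A":
            name, _, value = rest.partition(" ")
            kind, _, data = value.partition(" ")
            pending[name] = None if kind == "IMPLIED" else data
        elif code == "(":
            yield ("start", rest, pending)
            pending = {}
        elif code == ")":
            yield ("end", rest, None)
        elif code == "-":
            text = ESCAPE.sub(lambda m: "\n" if m.group(1) == "n" else "\\", rest)
            yield ("data", text, None)
        # Other line types (processing instructions, the final "C", ...) are skipped.


def to_text(events):
    """Tiny event-pattern style converter: emphasis becomes *...*."""
    open_mark = {"EMPH": "*"}
    close_mark = {"EMPH": "*", "PARA": "\n"}
    out = []
    for kind, value, _attrs in events:
        if kind == "start":
            out.append(open_mark.get(value, ""))
        elif kind == "end":
            out.append(close_mark.get(value, ""))
        else:
            out.append(value)
    return "".join(out)


# Hypothetical parser output for <PARA TYPE="note">Hello, <EMPH>world</EMPH></PARA>
SAMPLE = ["ATYPE CDATA note", "(PARA", "-Hello, ", "(EMPH", "-world", ")EMPH", ")PARA", "C"]
print(to_text(esis_events(SAMPLE)), end="")
```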

"sgmlspl is a sample application distributed with the SGMLS.pm perl5 class library -- you can use it to convert SGML documents to other formats by providing a specification file detailing exactly how you want to handle each element, external data entity, subdocument entity, CDATA string, record end, SDATA string, and processing instruction. sgmlspl also uses the Output.pm library (included in this distribution) to allow you to redirect or capture output."


OmniMark LE

[CR: 19970923]

[September 23, 1997] Announcement for OmniMark LE, available "at no charge for a limited time." OmniMark is a flagship industry software product -- a leading SGML-based "hypertext programming language for development of on-line, Web, CD-ROM and print-on-demand publishing applications, being used for SGML conversion by a wide range of industry-leaders, including over 700 companies in 34 countries." OmniMark LE is a free product which runs utility-sized OmniMark programs. It is described as useful for: "(a) small-sized utility programs; (b) program development on the road away from your commercial licenses (since OmniMark LE will compile a large program -- it won't just run it); (c) evaluating OmniMark V3's capabilities before licensing V3." OmniMark LE is available on many platforms, including Windows NT/95 and many varieties of UNIX. See the program description for other information, or the main database entry.

"OmniMark LE will compile and execute programs that contain 200 or fewer actions in the program source. An action is a statement that OmniMark executes, distinguished from a "rule header" (e.g. an element rule) which describes an event. Within each element rule, one action is not counted towards the 200-action limit. The action count is performed at compile time, not run time; this means that any given action in a 200-action program could execute millions of times."


LT NSL and NSL (Normalised SGML Library)

[CR: 19970128]

From the Language Technology Group, Human Communication Research Centre, University of Edinburgh: the "Normalised SGML Library (NSL version 2.0 ) . . .consists of a set of C programs for manipulating SGML files and a C application program interface (API) designed to ease the writing of C programs which manipulate SGML documents."

"LT NSL is a development environment for SGML-based corpus and document processing, with support for multiple versions and multiple levels of annotation. It consists of a C-based API for accessing and manipulating SGML documents and an integrated set of SGML tools. The LT NSL initial parsing module incorporates v1.1.1 of James Clark's SP software, arguably the best SGML parser available. The basic architecture is one in which an arbitrary SGML document is parsed once, yielding two results: (1) An optimised representation of the information contained in the document's DTD; (2) A normalised version of the document instance, which can be piped through any tools built using our API for augmentation, extraction, etc."

Links:

  • The main entry for the LTG in the "Academic Applications" area of this database
  • January 28, 1997: Announcement from David McKelvie for the HCRC Language Technology Group's public release of LT NSL --- Normalised SGML Library, version 1.4.6. The toolkit offers significant enhancements over version 1.4.4. "LT NSL is an integrated set of SGML querying/manipulation tools and a C-language application program interface (API) designed to ease the writing of C programs which manipulate SGML documents. Its API is based on the idea of using 'normalised' SGML (i.e. an expanded, easily parsable subset of SGML) as a data format for inter-program communication of structured textual information. The API defines a powerful query language which makes it easy to access (either from the shell or in a program) those parts of an SGML document which you are interested in. Both event based and (sub-)tree based views of SGML documents are supported."
  • LTG Home Page
  • LT NSL main page
  • The LT NSL documentation


TclYasp SGML toolkit

Extracts from the announcement by David Durand: "TclYasp integrates a conforming SGML parser with the TCL scripting language. . . Unlike CoST 1.1, it uses a simplest-possible procedure call interface, rather than an elaborate object-oriented interface. . . TclYasp does have a few unique features: it's based on YASP, which is more easily portable (it's in ANSI C and not C++) and was designed to be integrated with an application. Since Yasp is fully re-entrant, more than one parser can be active at a time. It is not restricted to the information defined by the ESIS, as the full parser data is available. . . TclYasp/Mac includes a command shell, multiple-pane windows, limited on-screen text formatting, and a variety of interface features as well as the SGML processing stuff."


Python for XML/SGML Processing

[CR: 19981103]

A few people (at least) believe that Python is well suited for SGML text processing. Sean McGrath wrote that it "beats any other language I know for SGML processing hands down", and Paul Prescod said: "Python is a really easy, incredibly powerful scripting language. . . [it] combines the best features of other scripting languages and borrows many neat features from the Great Languages from history (Simula, SmallTalk, Lisp, Algol)."

Links [provisionally]:

  • Documents on Paul Prescod's Home page: "SGML Processing in Python"; "Using SGML Groves from Python, Visual Basic and other OLE client scripting languages"; "PySgml: A Module (under development) for SGML Processing in Python"; "An Introduction to Groves for Python Programmers."
  • Announcement from Paul Prescod for a series of documents on SGML processing using Python
  • XML and Python - Database section in the XML page.
  • See ParseMe.1st, by Sean McGrath: several chapters illustrate the Python framework for processing SGML information objects; [bibliographic entry].
  • Python module for XML. "Extensible Markup Language Scanner, Checker, and Utilities," from Dan Connolly, May 1997; [local archive copy]
  • Python XML SIG. As of March 17, 1998, a mailing list "has been created for discussing XML and Python, with the goal of developing a set of Python tools for processing XML documents."
  • [November 03, 1998] "Python and SGML" - By W. Eliot Kimber. ". . .Its easy-to-use object orientation, its built-in list semantics, and the fact that it's interpreted make it really easy to create the same sorts of programs you might use DSSSL or Balise for, but with a general-purpose programming language that is easy to learn and much more familiar than DSSSL or Omnimark. Python is a free, publicly-developed language, not a commercial product. . ."
  • [February 12, 1998] "XML Programming in Python," by Sean McGrath. In Dr. Dobb's Journal February 1998 [Scripting Languages]. Abstract: "XML brings to the document world what the database world has had for a long time -- interoperability via open systems. Sean shows how you can use Python as a development platform for XML programming. Additional resources include the Python web page, and PXML.TXT (listings from DDJ)." See also the bibliography entry for "XML Programming in Python."
  • XMLParser class in the Python [1.5] distribution. 11.10 Standard Module xmllib: "This module defines a class XMLParser which serves as the basis for parsing text files formatted in XML (eXtended Markup Language)." [from lmg]
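
As a rough illustration of the handler-style parsing that a class such as XMLParser supports, here is a minimal sketch. It does not use `xmllib` itself, which was deprecated in Python 2.0 and is absent from Python 3; it substitutes the standard `xml.parsers.expat` module instead. The class name `TagCounter`, its method names, and the sample input in the test are hypothetical, chosen for this example only.

```python
import xml.parsers.expat

class TagCounter:
    """Hypothetical handler class: counts start tags and collects character data."""

    def __init__(self):
        self.counts = {}
        self.text = []
        # expat calls back into this object as it meets markup and text.
        self._parser = xml.parsers.expat.ParserCreate()
        self._parser.StartElementHandler = self.start
        self._parser.CharacterDataHandler = self.data

    def start(self, name, attrs):
        # Invoked once per start tag; attrs is a dict of attribute values.
        self.counts[name] = self.counts.get(name, 0) + 1

    def data(self, chunk):
        # Text may arrive in several pieces, so it is accumulated in a list.
        self.text.append(chunk)

    def feed(self, document):
        # The second argument marks this as the final piece of input.
        self._parser.Parse(document, True)
        return self
```

An application supplies one method per kind of markup event and lets the parser drive the calls, which is the general pattern the module description above points to.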


I4I S4-Desktop V2.1 SGML middleware

[CR: 19970212]

Educational Support Program: