Publicly Available Software for SGML/XML/DSSSL
Introduction
Priority is given to "public" SGML/XML software in this document database since the scope of interest is mainly the Internet, where the ethic of public gift is highly esteemed. The wealth of SGML software made freely available for public use is evidence of that ethos. As a supplement to the links and information provided on public SGML software below, readers should consult Steve Pepper's "Whirlwind Guide to SGML Tools and Vendors." See the main bibliographic entry for the Whirlwind Guide for a document abstract and detailed information about its contents.
See also the detailed software summary for 207 products extracted from the technical report of Eila Kuikka and Erja Nikunen [updated January 1998]: (a) the full bibliographic entry, or (b) the overview in the "Commercial SGML Software" page. NICE Technologies [November 1996] also has an online database of SGML vendors and products (local archive copy).
Primary sections in this document include the following -- however infelicitous the taxonomy for software categories. See the Contents listing to link directly to a particular description.
- SGML Parsers
- SGML/HyTime Editing, Browsing, and Searching Tools
- SGML Data Conversion, Transformation, and Manipulation
- SGML Formatting Tools
- DSSSL Software Tools
- XML/XSL/XLL Software Tools
Public SGML Software: Table of Contents
- SGML Parsers
- SP: James Clark's SGML Parser Toolkit: SP
- parseDTD - DTD parser package for SP
- Graphical Front Ends for SP
- ARC-SGML: Charles Goldfarb's Almaden Research Center SGML Parser
- ASP-SGML: Jos Warmer's Amsterdam SGML Parser
- SGMLS: James Clark's SGMLS parser
- YASP: Pierre Richard's Yorktown Advanced SGML Parser (or: 'Yet Another SGML Parser')
- YAO (Yuan-Ze--Almaden--Oslo project) Parser Materials
- SGML/HyTime Editing, Browsing, and Searching Tools - For DTDs and instances
- Lennart Staflin's PSGML
- Emacs LISP Mode - sgml-mode.el
- tdtd - Emacs Macro Package for Editing SGML/XML DTDs
- Panorama: SoftQuad's SGML Viewer for WWW
- HoTMetaL: SoftQuad's HoTMetaL editor for HTML
- HyBrick - SGML/XML Browser
- The WP Project
- GRIF Symposia: "A Collaborative Authoring Tool for the World Wide Web" (HTML and XML)
- perlSGML - Perl programs and libraries (Earl Hood)
- Carthage, dpp, and Bison tools by Michael Sperberg-McQueen
- DTDParse, by Norman Walsh
- Fred - The SGML DTD/Grammar Builder
- NORMDTD (by Richard Light)
- Babble - Synoptic Text Browsing/Searching Tool
- IADS: Integrated Authoring and Display System
- SARA (SGML-Aware Retrieval Application)
- Ispell for SGML
- Syntext -- the SGML Grammar Grapher
- MtSgmlQL, the SgmlQL interpreter
- 'sgrep' grep-like searching of structured documents
- Inside & Out, from ZGDV
- MU: Forms Assisted SGML Markup
- Markus Hoenicka's SGML/DSSSL Setup for Windows NT
- SGML Data Conversion, Transformation, and Manipulation
- Rainbow
- ICA: Integrated Chameleon Architecture
- STIL - `SGML Transformations in Lisp'
- CoST (Copenhagen SGML Tool, UNIX)
- costwish - SGML postprocessor and renderer based upon CoST
- SGMLS.pm and sgmlspl: A Simple Post-Processor for SGMLS and NSGMLS
- OmniMark LE
- LT NSL and NSL (Normalised SGML Library)
- TclYasp SGML toolkit
- Python for XML/SGML Processing
- I4I S4-Desktop V2.1 SGML middleware
- SENG: SGML/Scheme Transformation Engine
- SGML-SPGrove
- SGMLC (-Lite) products for MS-Windows
- SGML Formatting Tools
- format: Thomas Gordon's QWERTZ SGML -> LaTeX formatting package
- gf: Gary Houston's general formatter program
- Jörg Wittenberger's Typeset Package
- Jörg Wittenberger's SDC Package
- SGML-Tools [Was: Linuxdoc-SGML]
- TEItools
- MetaMorphosis - SGML/XML Tree Transformer
- gmat: an SGML Publishing System
- SGML2TeX - SGML-to-TeX converter
- Ken MacLeod's Generalized Document Objects (GDO)
- tei2latex - TEILITE to LaTeX2e
- DSSSL Software Tools
- Jade - James [Clark]'s DSSSL Engine
- Jade MIF Backend
- YADE (Yet Another DSSSL Engine)
- DSC---DSSSL Syntax Checker
- DSSSL Developer's Toolkit
- Kawa - Java-based Scheme system (SENG)
- psgml-dsssl
- panodssl
- psgml-jade
- Jadetex Package
- DSSSL editing under emacs (dsssl/scheme mode)
- SGML/DSSSL Presentation Development Application
- XML/XLink/XSL Software Tools
- Lark, an XML processor
- DXP - DataChannel XML Parser
- [NXP - Norbert's XML Parser]
- Microsoft XML parser in Java (MSXML)
- XP, an XML parser in Java (James Clark)
- expat - XML parser in C (James Clark)
- [XMLTok - XML parser in C (James Clark)]
- SX - An SP application for SGML to normalized XML
- SAX - the Simple API for XML
- FREE-DOM - W3C DOM API using SAX (formerly: SAXDOM)
- Saxon: An Open-Source XSLT Processor
- XAF - an XML Architectural Forms Processor
- XML Testbed - Java XML application environment
- DAE SDK and DAE Server SDK (Copernican Solutions)
- IBM XML for Java - validating XML processor in Java
- JUMBO - XML browser/editor
- LT XML - XML toolset
- RXP XML (SGML) parser program
- XED - A WYSIWYG XML instance editor
- Ælfred XML Parser
- DataChannel XML Development Environment (DXDE)
- Tcl XML Parsing Package
- XML Editing Mode in PSGML
- XSLJ: Jade-compatible XSL-to-DSSSL translator
- docproc - an XML + XSL document processor
- DTDGenerator - XML DTD Generator
- Near & Far Designer - DTD Design Tool
- The Ace Scripting Language
- HXA/HXP - Hubick's XML Analyzer, Parser
- Microsoft XML Notepad
- xmlproc: A Python XML parser
- xmlarch.py: An XML architectural forms processor
- DB2XML
SGML Parsers
SP: James Clark's SGML Parser
[CR: 20001011]
James Clark's SP parser toolkit is the successor to his SGMLS parser. Formally, SP is "An SGML System Conforming to International Standard ISO 8879 -- Standard Generalized Markup Language" [and] "A free, object-oriented toolkit for SGML parsing and entity management."
[October 11, 2000] SP development (OpenSP) continues in the OpenJade project; see the OpenJade Source Control Repository Home Page and the project summary page. Contact Matthias Clasen. OpenSP-1.4, cache. See also OpenSP-1.5 pre-release in CVS.
[March 2000] New Version of OpenSP from the OpenJade Team. Matthias Clasen (Mathematisches Institut, Albert-Ludwigs-Universität Freiburg) has announced the availability of a new version of OpenSP (OpenSP-1.5pre1). OpenSP is a variant of James Clark's SP SGML parser, maintained by the OpenJade team. "The OpenJade team has made a prerelease of OpenSP-1.5 available at ftp://openjade.sourceforge.net/pub/openjade/OpenSP-1.5pre1.tar.gz. Changes in version 1.5 include: (1) More of Annex K supported: Common data attributes can now be specified in external entity declarations. (2) The architecture engine supports #MAPTOKEN. (3) The multibyte version of OpenSP now uses 32bit chars and supports the full UTF-16 range 0x0000-0x10ffff." Bugs in the release should be sent to the development team at jade-bugs@infomansol.com. OpenJade "is a project undertaken by the DSSSL community to maintain and extend Jade. OpenJade is distributed under the same license as Jade. Jade is James Clark's implementation of DSSSL -- Document Style Semantics and Specification Language -- an ISO standard for formatting SGML (and XML) documents."
[March 10, 1998] See the announcement from James Clark for the public availability of SP version 1.3 and Jade version 1.1. "The main change in SP 1.3 is better support for XML based on the Web SGML TC. In Jade 1.1 the main changes are the experimental extensions for XSL (documented in dsssl2.htm), and the use of XML for the FOT backend's output." See Clark's Web site for detailed information. Note to SP and Jade users who depend upon the architectural processing support: the appropriate ArcBase processing instruction is now <?IS10744 ArcBase DSSSL>, and no longer <?ArcBase DSSSL>; SP and Jade will now require the former, on penalty of an error message (ca.) "jade:E: specification document does not have the DSSSL architecture as a base architecture. . ." or similarly. Thanks to Eliot Kimber (ISOGEN International) for clarification on this point. Also: Jade 1.1 and sp 1.3 for OS/2 provided by David J. Birnbaum.
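The change of processing instruction described above lends itself to a mechanical fix. The following is only a minimal sketch in Python, assuming the SGML-style form of the instruction (closed by ">" alone) and a single architecture name after ArcBase; the function name upgrade_arcbase_pis is hypothetical and is not part of SP or Jade.

```python
import re

# Matches the old-style instruction, e.g. <?ArcBase DSSSL>.
# Assumption: SGML-style PI close (">") and exactly one architecture name.
OLD_ARCBASE_PI = re.compile(r"<\?ArcBase\s+(?P<arch>[^\s?>]+)\s*>")


def upgrade_arcbase_pis(text):
    """Rewrite <?ArcBase X> as <?IS10744 ArcBase X>.

    Returns a (new_text, number_of_replacements) tuple. Instructions that
    already use the IS10744 form are left alone, because the pattern
    requires "ArcBase" to follow "<?" directly.
    """
    return OLD_ARCBASE_PI.subn(
        lambda m: "<?IS10744 ArcBase %s>" % m.group("arch"), text
    )


if __name__ == "__main__":
    spec = "<?ArcBase DSSSL>\n<!DOCTYPE style-sheet SYSTEM 'style.dtd'>\n"
    updated, count = upgrade_arcbase_pis(spec)
    print(count)
    print(updated)
```

Running the function a second time on its own output makes no further changes, so it can be applied safely to a directory of style sheets that are only partly converted.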
[February 16, 1998] An announcement from James Clark for a new test release of SP (version 1.2.92) and Jade (version 1.0.93). The main changes in Clark's SP package since version 1.2.91 are enhanced support for XML based on the final WebSGML Adaptations Annex (ISO 8879 Annex K) and the inclusion of the SX application (for converting SGML to normalized XML). [SP version 1.2.92 and Jade version 1.0.93, sources, archive copy]; [SP version 1.2.92 and Jade version 1.0.93, Win32 binaries, archive copy]
[October 17, 1997] An announcement from James Clark describes a test release of SP with improved XML support. This test/experimental version is available via FTP as part of a Jade test release: source, or Win 32 binaries. In this distribution, SP supports "a number of key features from the WebSGML SGML TC," including: unbundling of SHORTTAG, feature to allow elements declared EMPTY to have end-tags, duplicate enumerated attribute tokens are allowed, support for multiple ATTLIST declarations for a single element type, relaxation of rules on use of parameter entity references inside groups, feature that turns off SGML's traditional record end rules, NESTC (net-enabling start tag close) delimiter, support for predefined single character entities in the SGML declaration (lt, amp etc), etc. See the text of the announcement for full details about this SP test release.
[September 03, 1997] As of this time, the most recent version of SP is also available as part of James Clark's Jade package.
[October 28, 1997] Announcement from James Clark for a "very preliminary release of SX, an application built with the SP library for converting SGML to XML." This tool will eventually be included in the standard SP distribution. SX (the provisional name) "parses and validates the SGML document contained in sysid... and writes an equivalent XML document to the standard output. SX will warn about SGML constructs which have no XML equivalent." The distribution includes both source and Win 32 binaries (the sp120u.dll file included in the SP 1.2.1 Win32 Unicode binary distribution is required). Note that the program "does not yet provide enough to handle the situation where you want to migrate your document source from SGML to XML. In particular it doesn't try to preserve entity references; all entities are expanded."
Note: this paragraph is not up-to-date for SP version 1.2, released in September 1997; see the official documentation, and/or the links in the description of SP version 1.2. . . The current version is SP 1.1.1 (July 30, 1996). SP is a "free, object-oriented toolkit for SGML parsing and entity management." SP is written in C++, supports the LINK feature, is reentrant (a single process can use multiple parsers at the same time), is command-line compatible with SGMLS, includes an application [nsgmls] to generate sgmls-style output format, and an application [rast] to generate RAST output format (like SGMLS) conforming to ISO/IEC 13673:1994. Other parser tools include [sgmlnorm], a simple SGML tag normalizer, and [spent], a facility for printing an SGML entity on standard output. SP supports any concrete syntax allowed by ISO 8879, and supports large character sets (can be compiled to use 16-bit characters internally; supported systems include UTF-8, Unicode/UCS-2, UJIS/EUC, and Shift-JIS). It is said to be fast for large documents. In addition to the C++ source code, binaries [nsgmls and rast] are available for MS-DOS (SP version 0.2) and several UNIX systems. The MS-DOS binaries use a 32-bit DOS extender (included in the distribution), so that the MS-DOS 640K conventional memory barrier should not be a limiting factor in the use of SP.
In the most recent releases of SP, James Clark has also issued some very useful tools that handle entities and "normalize" SGML documents in various ways, as specified in command line options. For example, SPAM (SP Add Markup) will provide canonical SGML when SHORTTAG and OMITTAG have been used in the SGML source; the form of the output SGML is determined by the user's specification, so SPAM serves in effect as a markup stream editor. See the documentation from the official site for complete details. Version 1.1 also supports Architectural Form Processing [mirror copy]; on this, see the following "toy example".
[April 10, 2000] XML Base Architectures in SP. Steve Newcomb writes: "You can now use SP to validate the conformance of XML documents to base architectures (meta-DTDs). TechnoTeacher has created a version of SP with full industrial-strength support for the alternative PI-based "Base Architecture Declaration" syntax. The enhancement builds on pioneering work done by Luis Martinez while he was working at TechnoTeacher, and it has recently been brought up to industrial strength by Peter Newcomb. Because of urgent need in certain industrial quarters (mortgage, healthcare, etc.), we've placed binaries of this version of SP at our FTP site: ftp://ftp.techno.com/TechnoTeacher/SPt..." [cache]
[September 1996] Commercial support for SP is provided by TechnoTeacher, Inc. - NB, James Clark himself has no commercial connection with TechnoTeacher, Inc. See the support announcement.
[November 25, 1997] See the announcement for a GC-enabled spgrove application, from Vladimir V. Tsychevski.
Other links:
- [September 03 [09], 1997] Announcement from James Clark for the release of SP version 1.2 -- the version of SP included with Jade version 1.0. New features in SP version 1.2 (other than bug fixes) are as follows: (1) "The Extended Naming Rules TC is supported. The extensions supported in external concrete syntaxes have been changed for compatibility with this [Extended Naming Rules were specified in Annex J of ISO 8879:1986, added by the 1996 TC = TC for Extended Naming Rules for SGML: N1896Rev]; (2) The handling of character sets in the multi-byte version is more sophisticated. The character sets HTML page gives more information; (3) SP has built-in knowledge of many more base character sets; (4) nsgmls will report empty elements if the -oempty option is used." SP 1.2 etc. "adds support for (XML) documents that are merely well-formed. This is enabled by using -wno-valid. There's also an undocumented -wxml switch that warns about various things that are legal SGML but not XML." See the main SP page on James Clark's WWW server for the full documentation. SP 1.2 is available in several packages, including source code and binaries with Unicode support for Windows 95 and Windows NT.
- Hints about enhancements possibly in SP version 1.2, from test version 1.1.2; see summary on the "What's New" page (February 18, 1997). Note the update from James Clark: "A new release of SP is available as part of Jade 0.5 from ftp://ftp.jclark.com/pub/jade/jade0_5.zip. This fixes the compilation problems with gcc as well as a couple of other minor glitches. This SP release should be considered a beta release." [February 21, 1997]
- Announcement from James Clark for version 1.1.1 of SP. Version 1.1.1 represents a minor revision: "The only serious bug 1.1 is [was] the incorrect handling of colons in SGML_CATALOG_FILES on MS-DOS and Windows machines."
- For configuration of SP and Jade, note that Henry S. Thompson (HCRC Language Technology Group, University of Edinburgh) also has a 'configure' file (uses 'install' - from X11R5, mit/util/scripts/install.sh); it has been tested for Jade 1.1. [local archive copy, 1998-09-25]; [local archive copy, earlier version]
- Compilation notes: Notes on compiling and installing SP 1.0.1 for several systems, by Nelson H. F. Beebe (Email: beebe@math.utah.edu) [ mirror December 26, but use the canonical version if possible]. As of 15-November-95, Nelson Beebe had successfully compiled SP 1.0.1 on these systems: DEC Alpha OSF/1 3.0, DECstation 3100 and 5000 ULTRIX 4.3, Hewlett-Packard 9000/735 HP-UX 10.0.1, IBM RS/6000 AIX 3.2.5, Silicon Graphics Indigo/2 IRIX 5.3, Sun SPARCstation SunOS 4.1.3, and Sun SPARCstation Solaris 2.3 and 2.4.
- Programming with SP
- A mailing list for programmer-level discussions of SP. See also the entry in the lists page. Mail subscription requests to sp-prog-request@jclark.com. Messages for the list should go to sp-prog@jclark.com.
- Nelson Beebe's collection of binaries for various Unix machines
- [*NB June 10, 1996. The following links are somewhat out-of-date; see the main site] Some very accessible written description and documentation (HTML format) from a recent [June 10, 1996] version (test release 1.1). See the current and test releases (JClark FTP server) for more current and more complete information. See here provisionally (with some incomplete linking):
- Summary of SP's features
- What's new in SP?
- How to get SP
- nsgmls, a replacement for sgmls [incompletely linked here]
- spam, a sophisticated normalizer, perhaps better thought of as a markup stream editor [incompletely linked here]
- Generic API to SP
- Catalogs: Using SGML Open catalogs to generate system identifiers
- sp-1.1.1 with gcc 2.7.2 under Solaris 2.5 (binaries)
Pointers to the latest released version of the SP parser (version 1.0.1: October 21, 1995) and its description:
- SP - an SGML Parser (Official WWW Page for SP)
- SP source-code changes for "port of James Clark's SP version 1.1.1 to Mac PowerPC as a set of MPW tools, including Open Transport support for HTTP" December 1996. [Ashley Colin Yakeley]
- FTP to JClark. Data is mirrored on the SGML Repository FTP server
- FTP to Darmstadt
- WWW link to JClark
- Overview of SP (described by James Clark) with links to FTP server
- Formatted man pages for (version 0.2) nsgmls and rast applications. [Note: get version 0.3 now]
- See also: David Megginson's SGMLS.pm: A Post-Processor for SGMLS and NSGMLS
- [July 16, 1998] Porting SP 1.3 to Macintosh, by Peter Robinson
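The SP 1.2 announcement listed above names three nsgmls switches: -oempty (report empty elements), -wno-valid (accept documents that are merely well-formed), and -wxml (warn about constructs that are legal SGML but not XML). The sketch below, in Python, shows one way a wrapper script might assemble such a command line. The function name and keyword arguments are hypothetical, the program name "nsgmls" is assumed to be on the search path, and the code only builds the argument list; it does not run the parser.

```python
def nsgmls_command(document, report_empty=False, well_formed_only=False,
                   warn_xml=False, program="nsgmls"):
    """Build an argument list for nsgmls using switches named in the
    SP 1.2 announcement. Nothing is executed here."""
    argv = [program]
    if report_empty:
        argv.append("-oempty")      # report empty elements in the output
    if well_formed_only:
        argv.append("-wno-valid")   # do not require validity
    if warn_xml:
        argv.append("-wxml")        # warn about SGML constructs that are not XML
    argv.append(document)
    return argv


if __name__ == "__main__":
    # The list can be handed to subprocess.run() on a system with SP installed.
    print(nsgmls_command("memo.xml", well_formed_only=True, warn_xml=True))
```

Keeping the command as a list rather than a single string avoids shell quoting problems when document paths contain spaces.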
parseDTD - DTD parser package for SP
[CR: 19980612]
[February 06, 1998] From Peter Newcomb, of TechnoTeacher Inc.: parseDtd. It parses an SGML declaration set in the absence of a document (e.g., can parse a DTD and spit out information about the elements and attributes defined in it). It is based on the SP SGML parser, version 1.2.1, written by James Clark. Peter's description: "I recently put together a small SP-based package that parses declaration sets irrespective of particular documents, returning the result as an SP DTD object."
Links:
- Information: (FTP Directory)
- Sources: ftp://ftp.techno.com/TechnoTeacher/parseDtd/parseDtd.zip
- [June 12, 1998] Patch to update parseDtd for SP 1.3
- The README file; [local archive copy, 980508]
- Some discussion about parseDTD in email
- Source, local archive copy, 980508
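To give a rough idea of the kind of report a DTD-only tool can produce, here is a deliberately naive Python sketch that lists the names appearing in ELEMENT and ATTLIST declarations. It is not how parseDtd works: parseDtd uses the SP parser and returns an SP DTD object, whereas this toy uses regular expressions, does not expand parameter entities, and treats a name group such as (a|b) as a single opaque token. The function name summarize_declarations is hypothetical.

```python
import re

ELEMENT_RE = re.compile(r"<!ELEMENT\s+(\S+)", re.IGNORECASE)
ATTLIST_RE = re.compile(r"<!ATTLIST\s+(\S+)", re.IGNORECASE)
COMMENT_RE = re.compile(r"<!--.*?-->", re.DOTALL)


def summarize_declarations(dtd_text):
    """Return the declared element names and the element names that
    have attribute lists, in document order."""
    # Remove comment declarations so commented-out markup is not counted.
    stripped = COMMENT_RE.sub("", dtd_text)
    return {
        "elements": ELEMENT_RE.findall(stripped),
        "attlists": ATTLIST_RE.findall(stripped),
    }


if __name__ == "__main__":
    sample = """
    <!-- <!ELEMENT ghost EMPTY> -->
    <!ELEMENT memo - - (to, body)>
    <!ELEMENT to   - - (#PCDATA)>
    <!ATTLIST memo status CDATA #IMPLIED>
    """
    print(summarize_declarations(sample))
```

For anything beyond a quick inventory, a real parser is needed, since only a parser can resolve parameter entities and marked sections correctly.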
Graphical Front Ends for SP
[CR: 19971028]
Probably there are several such front ends. [Please let me know what's missing in the list below.]
- SP Wizard: Advertised functionality: ". . . a freeware 32 or 16 bit Windows interface using OLE Automation wrappers around NSGMLS and SPAM. (1) Allows you to interactively change settings of all command line parameters and environment variables. (2) Allows multiple files to be parsed at the press of a button. (3) Displays clickable error messages which puts the cursor in front of the offset within the line that was in error. (4) Allows you to correct errors as you find them. (5) Search and Replace. (6) Undo up to 32000 characters at multiple levels. (7) Prints reports of error messages and files that parsed with no errors. (8) OLE Automation for NSGMLS, SPAM and execution of DOS programs which can be used from Visual Basic and Visual C++. (9) All SP files were taken from the SP 1.1.1 distribution."
- Apropos of the above: Announcement from Larry Robertson for "a web page with a sample program and some notes on the Grove OLE Automation class. . . The Grove OLE Automation Class is basically intended for parsing and fully supports the 9401 catalog; it is extremely fast and easy to use." Title: How to use the Grove OLE Automation Class in Visual Basic 5.0. "The sample program will batch parse sgml and html files. It will print reports [and] has a very simple editor." [September 13, 1997]
- CSW Parser Plus. "CSW Parser Plus is a graphical front end for the popular SP parser, running under Windows NT/95. With CSW Parser Plus, it's easy to set up options for the SP parser and process SGML files one at a time, or in batches. . . CSW Parser Plus is packed with useful features to help set up and run the SP parser, including: (1) set the SGML Declaration and DTD; (2) process one document file, or a batch of files; (3) view errors on screen, or redirect to a file; (4) set warning and output options; (5) define locations for multiple catalog files; (6) launch editors and processing tools"
- RUNSP2: a user-friendly Windows shell for NSGMLS, from Richard Light. "RUNSP2 is designed to let you run the NSGMLS parser in a Windows environment. It provides standard Windows facilities for opening a file to be parsed and running the parser, but goes beyond that by 'reading' the error messages, and providing a helpful editing environment in which the user can correct the errors found. The original idea was to support all the command-line options of NSGMLS via menu options or a dialog box, and I will go on to do this if the basic idea works well enough to justify the effort. At present this program just runs the parser (NSGMLS) and the simple normalizer (SGMLNORM). Later, I may extend it to run all the programs in the SP suite." source, and local archive copy [September 18, 1997].
- See also Groves and Grove Plans in SGML/DSSSL/HyTime
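The front ends above all depend on "reading" the parser's error messages in order to place the cursor at the offending line and offset. A minimal Python sketch of that step follows. It assumes messages of the colon-separated form program:file:line:column:severity: text (for example, a line beginning "nsgmls:memo.sgm:12:5:E:"); that layout is an assumption here and should be checked against the SP documentation for the version in use. The names ParserMessage and parse_message are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class ParserMessage(NamedTuple):
    program: str
    file: str
    line: int
    column: int
    severity: str
    text: str


# The file part is matched lazily so that a drive letter such as "C:" is
# kept with the path instead of being taken for a field separator.
MESSAGE_RE = re.compile(
    r"^(?P<program>[^:]+):(?P<file>.+?):(?P<line>\d+):(?P<column>\d+):"
    r"(?P<severity>[A-Z]):\s*(?P<text>.*)$"
)


def parse_message(line: str) -> Optional[ParserMessage]:
    """Split one message line into fields, or return None when the line
    carries no file location (or has some other layout)."""
    match = MESSAGE_RE.match(line.rstrip("\n"))
    if match is None:
        return None
    return ParserMessage(
        match["program"],
        match["file"],
        int(match["line"]),
        int(match["column"]),
        match["severity"],
        match["text"],
    )


if __name__ == "__main__":
    print(parse_message('nsgmls:memo.sgm:12:5:E: element "FOO" undefined'))
```

A front end would then open the named file and move to the reported line and column; messages without a location, which yield None here, can simply be shown as plain text.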
ARC-SGML: Charles Goldfarb's Almaden Research Center SGML Parser
ARC-SGML was one of the first SGML parsers to be made publicly available, and it provided the basis for the development of SGMLS by James Clark.
- ARC-SGML from the SGML Repository
- ARC-SGML from Exeter
SGMLS: James Clark's SGMLS parser
[CR: 19970909]
SGMLS is probably the most widely used "public domain" parser as of late 1994. It has been incorporated as a validating parser into several commercial products as well. It is superseded now in part by James Clark's "SP" parser (and perhaps by the YASP and YAO parser materials) though for many simple validation tasks, SGMLS remains quite useful. SGMLS is also very fast. Its output is intended for a structure-oriented application, and this output is trivially parsable. SGMLS has been ported to many platforms, including OS/2.
- Get SGMLS Source (James Clark): Remote file ftp.jclark.com/pub/sgmls/
- SGMLS sources mirrored at SGML Repository
- SGMLS sources mirrored at Exeter
- [September 09, 1997] Macintosh versions of the SGMLS parser are available from the Brown University Scholarly Technology Group SGML Archives, maintained [September 1997] by David G. Durand. URLs: (1) 68K version: ftp://ftp.stg.brown.edu/pub/sgml/sgmls_68K.hqx; [local archive copy]; (2) FAT version: ftp://ftp.stg.brown.edu/pub/sgml/sgmls_FAT.hqx; [local archive copy]; (3) Power PC version: ftp://ftp.stg.brown.edu/pub/sgml/sgmls_PPC.hqx; [local archive copy]. [thanks to Elli Mylonas for the URLs]
- An SGMLS help file prepared by Michael Sperberg-McQueen [September 1993, revised January 1994] explains the SGMLS entity manager's use of the environment variable SGML_PATH and other strategies for locating entities. The document is available in HTML format: "Notes on sgmls handling of search for entities" [mirror copy, January 1996]. Or obtain it via FTP from the SGML Repository or via email from the TEI/UICVM Listserver. In the latter case, send email with the command GET SGMLSENT DOC TEI-L in the body of the email message to listserv@uicvm.uic.edu. Or get a text version of the help file from the local WWW server.
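The remark above that SGMLS output is "trivially parsable" can be illustrated with a short Python sketch. It assumes the line-oriented output in which the first character of each line is a command: "(" for an element start, ")" for an element end, "-" for data, "A" for an attribute that belongs to the next start line, and "C" for a conforming document. Other commands are ignored and escape sequences inside data are left undecoded; the function name parse_sgmls_output and the tuple layout are hypothetical choices for this example.

```python
def parse_sgmls_output(lines):
    """Build a nested tree of (name, attributes, children) tuples from
    sgmls-style output lines. Returns (root, conforming)."""
    root = ("#document", {}, [])
    stack = [root]
    pending = {}          # attributes waiting for their start line
    conforming = False
    for raw in lines:
        line = raw.rstrip("\n")
        if not line:
            continue
        code, rest = line[0], line[1:]
        if code == "A":
            # Layout assumed: "Aname TYPE value" or "Aname IMPLIED".
            name, _, remainder = rest.partition(" ")
            kind, _, value = remainder.partition(" ")
            pending[name] = None if kind == "IMPLIED" else value
        elif code == "(":
            node = (rest, pending, [])
            pending = {}
            stack[-1][2].append(node)
            stack.append(node)
        elif code == ")":
            stack.pop()
        elif code == "-":
            stack[-1][2].append(rest)   # escapes are not decoded here
        elif code == "C":
            conforming = True
        # Any other command character is skipped in this sketch.
    return root, conforming


if __name__ == "__main__":
    sample = ["ASTATUS CDATA draft", "(MEMO", "(TO", "-Alice", ")TO", ")MEMO", "C"]
    tree, ok = parse_sgmls_output(sample)
    print(ok)
    print(tree)
```

Because every line is self-describing, the same loop works equally well as a streaming event handler, which is the approach taken by post-processors such as SGMLS.pm.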
YASP: Pierre Richard's Yorktown Advanced SGML Parser (or: 'Yet Another SGML Parser')
[CR: 19970405]
- [April 1997.] Announcement from Christophe Espert (Electricité de France, Direction des Etudes et Recherches) for a new release of the YASP SGML parser interface. YASP has been implemented as a DLL for Windows NT and Windows 95, but the source code may also be compiled on Unix and other systems. The new version of YASP (1.36) has functionality "that will help enhance GROVE building in applications. YASP now reports ELEMENT, ATTLIST, NOTATION and ENTITY declarations as it parses them. YASP still gives access to the fully resolved DTD after the document prolog has been parsed. Therefore objects of classes in the PRLGABS0, PRLGABS1 and PRLGSDS modules can be built."
- Announcement from Christophe Espert (Electricité de France, Direction des Etudes et Recherches) for the availability of YASP ('Yet Another SGML Parser', developed by Pierre G. Richard), on Windows 95 and Windows NT. August 27, 1996. URL: ftp://ftp.edf.fr/pub/SGML/YASP.
- Announcement from Christophe Espert for a new distribution package for YASP, for DOS and Windows (July 1996); [winyasp.zip, 1258734 bytes] "It includes source code, documentation and binaries for Windows. The YASP library is a Dynamic Link Library. It has been built with Visual C++. . ."
- April 1997 sources: ftp://ftp.edf.fr/pub/SGML/YASP; archive copy
- April 1997: documentation in PDF format
- FTP YASP from the SGML Repository
- FTP YASP from Exeter
- FTP: ftp://ftp.edf.fr/pub/SGML/YASP (A new package for the YASP parser, available for UNIX; from Christophe ESPERT [Christophe.Espert@der.edf.fr], February 1996)
- See also the TclYasp SGML toolkit
YAO (Yuan-Ze--Almaden--Oslo project) Parser Materials
- FTP YAO from the SGML Repository
- See the description of the Project YAO: "Project YAO Announced [December 7, 1993]," <TAG> 7/1 (January 1994) 20.
- Pekka Kataja's UNIX port of YAO
PSGML, by Lennart Staflin
[CR: 20001201]
PSGML is described as "a major mode for editing SGML and XML documents. It works with GNU Emacs 19.34, 20.3 and later or with XEmacs 19.9 and later [perhaps also Lucid Emacs 19.9, OEmacs, NTEmacs]. PSGML contains a simple SGML parser and can work with any DTD. Functions provided includes menus and commands for inserting tags with only the contextually valid tags, identification of structural errors, editing of attribute values in a separate window with information about types and defaults, and structure based editing." David Megginson's personal testimonial: "XEmacs+PSGML is my editor of choice for all of my XML and SGML work. I've used it to create probably close to 10,000 printed pages of documentation over the last few years, and have used XEmacs's regular-expression facilities for adding complex markup to e-texts. It's probably not suitable for naive users (give 'em XMetaL or WordPerfect, or maybe XED), but for the tech-savvy, it's great." [XML-DEV]
[December 06, 2001] "Using Emacs for XML Documents. Install add-ons to the powerful Emacs text editor to build a platform-independent (and free) environment for working with XML." By Brian Gillan (Software engineer, ID Technology and Design Group, IBM). From IBM developerWorks XML Zone. December 2001. ['Emacs, best known as a powerful text editor for UNIX developers, can be an ideal XML editor for MS-DOS, Windows, and MacOS. The author describes how to install the right add-on packages and modify settings to create a powerful XML/SGML editing-and-validation environment in Emacs with extensions such as PSGML and OpenSP. Most of the work involved in setting up this environment ends with downloading and installing Emacs and the individual packages, but you must also configure Emacs properly and enable the DTDs you plan to work with. The article includes sample configuration files and XHTML DTDs.'] "Though it's best known as a powerful text editor favored by UNIX developers, Emacs can be used to work with XML in non-UNIX platforms such as Windows, MS-DOS, and MacOS. Emacs works as a full-blown development environment for processing text, writing applications, and, as I'll discuss, creating structured information like XML and SGML. I use it as a general-purpose editor for creating and managing some of my programming projects, and for writing XHTML and playing around with SGML and XML. In fact, I used it to write this article. This article tells how to install Emacs and the extensions PSGML and OpenSP. It also outlines how to customize Emacs to make it function with a variety of DTDs. I present many of the Emacs customizations one piece at a time. However, you can download a zip file with sample DTDs and all of the Emacs customizations. My intent is to get you started using Emacs by providing you with just enough information for you to understand what's going on. Then you'll be able to add DTDs and customize Emacs based on your needs and preferences..."
PSGML version 1.2.3 was released on SourceForge November 8, 2001; see the download. [PSGML version 1.2.3, November 8, 2001, cache]
[December 01, 2000] Update notice 2000-10-27. "The future of PSGML: It is currently not in active development. I plan to put out one or two bug fix releases and the move the sources to source forge (possibly after restructuring the code a bit and merging in various patches and additions that has been send to me.) I will then invite others to take an active part in the future development of PSGML. To start this I have created two mailing lists on source forge. A psgml-user for general discussion and questions about PSGML and psgml-devel for discussion about the future development of PSGML. Visit the SourceForge: Mailing Lists for PSGML page for subscription information..."
- Description HTML version of PSGML
- [March 2001] See the source for PSGML version 1.2.2, from SourceForge.
- [October 14, 1999] Staflin released a beta version (1.2.0) with XML editing support. [local archive copy]
- [1999-10-14] Kai Grossjohann described a problem with incompatible system identifiers when using psgml to edit XML documents; David Megginson supplied the lisp code for a provisional fix.
- See also David Megginson's enhancements for XML Editing Mode in PSGML and psgml-dsssl (DSSSL editing mode). Updated 980223 and possibly later.
- Miyashita Hisashi has reportedly implemented a version of PSGML-XML that works on Meadow. Meadow ('Multilingual enhancement to gnu Emacs with ADvantages Over Windows') is a fully internationalized version of Emacs20 on MS Windows.
- Version 1.0.1 (November 20, 1996); [archive copy]
- [December 16, 1998] Bob DuCharme posted an announcement for the online availability of Chapter 2 of his book, SGML CD: "Editing SGML Documents with the Emacs Text Editor." This Adobe Acrobat version of Chapter 2 (99 pages) "assumes no initial knowledge of Emacs and provides a basic introduction to creating and navigating simple text files before it covers PSGML - Lennart Staflin's add-in that turns Emacs into a menu-driven, validating, SGML/XML editor." Bob says: "The SGML CD book is a tutorial and user's guide to free SGML/XML software, and you can link to all the software from the web page whether you want to buy the book or not. I have my own time- and keystroke-saving PSGML tricks (mostly in the form of .emacs lines) and I'm curious about those of other PSGML users, so I'll be posting a Web page of my own and soliciting those of others to add in a few weeks. Feel free to send them to me anytime; I'll credit all contributors."
- See Markus Hoenicka's SGML/DSSSL Setup for Windows NT - including PSGML
- Editing SGML with Emacs and PSGML - Manual
- PSGML and Fonts. David Megginson explains how to map font faces to any or all of the symbols 'comment', 'doctype', 'end-tag', 'entity', 'ignored', 'ms-end', 'ms-start', 'pi','sgml', 'short-ref', 'start-tag' and so forth. This works! [June 1997]
- Another discussion (TEI-L) on fontifying/colorizing with PSGML; see also (in greater detail) David Megginson's recipe above.
- SGML: Lysator PSGML (Remote file ftp.lysator.liu.se/pub/sgml)
- FTP PSGML from the SGML Repository
- FTP PSGML from Exeter
- Setting up PSGML and sgmls for HTML, or try: this link; (courtesy of Martijn Koster, m.koster@nexor.co.uk)
- [October 14, 1998] PSGML setup instructions, provided by Peter Flynn
- [August 09, 1997] Announcement from David Megginson (Microstar Software Ltd.) for initial enhancements of PSGML to enable an XML editing mode: ". . . I patched PSGML to add an XML mode that enables XML-specific delimiters, parsing, and error-reporting -- in other words, it's a real, native XML DTD-driven editor." The new code for XML support has not yet been incorporated into the main psgml distribution, but Megginson is requesting assistance from qualified alpha testers to help debug the code.
tdtd - Emacs Macro Package for Editing SGML/XML DTDs
[CR: 20011102]
[June 09, 2001] The web site URL for 'tdtd -- Emacs Major Mode for SGML and XML DTDs' is http://www.menteith.com/tdtd/. The latest version is 0.7.1. Features of tdtd revision 0.7.1 include: "(1) Standalone mode for editing DTDs; (2) "Goto" menu for locating declarations within the current buffer; (3) dtd-etags function for creating Emacs TAGS files for easy lookup of any element, parameter entity, or notation's definition using Emacs's built-in tag-lookup functions; (4) dtd-grep function for searching files that shares a file history with dtd-etags for easy searching of the same files with both functions; (5) Specific font lock highlighting of declarations in XML DTDs, SGML DTDs, SGML Declarations, and System Declarations so that the important information stands out; (6) XML-specific behaviour that, at user option, is triggered by automatic detection of the XML Declaration; (7) Functions for writing and editing element, attribute, internal parameter entity and external parameter entity declarations and comments to ease creating and keeping a consistent style; and (8) Elements and parameter entity names referenced in declarations are stored in minibuffer history to minimise retyping in new declarations..." [cache version 0.7.1]
In March 1999, Tony Graham (Mulberry Technologies, Inc.) released an updated version of his tdtd 'Emacs Major Mode for SGML and XML DTDs'. Features in revision 0.7: "(1) Standalone mode for editing DTDs; (2) dtd-etags function for creating Emacs TAGS files for easy lookup of any element, parameter entity, or notation's definition using Emacs's built-in tag-lookup functions; (3) dtd-grep function for searching files that shares a file history with dtd-etags for easy searching of the same files with both functions; (4) Specific font lock highlighting of declarations in XML DTDs, SGML DTDs, SGML Declarations, and System Declarations so that the important information stands out; (5) XML-specific behaviour that, at user option, is triggered by automatic detection of the XML Declaration; (6) Functions for writing and editing element, attribute, internal parameter entity and external parameter entity declarations and comments to ease creating and keeping a consistent style; (7) Elements and parameter entity names referenced in declarations are stored in minibuffer history to minimise retyping in new declarations."
[August 03, 1998] Update of the tdtd emacs macro package for editing SGML/XML DTDs.
[May 27, 1998] The tdtd Emacs Macro Package for editing SGML/XML DTDs was updated by Tony Graham on May 24, 1998. Version 0.5.1 features: "1) dtd-etags function for creating Emacs TAGS files for easy lookup of any element, parameter entity, or notation's definition using Emacs's built-in tag-lookup functions; 2) Font lock highlighting of declarations so that the important information stands out; 3) XML-specific behaviour that, at user option, is triggered by automatic detection of the XML Declaration; 4) Functions for writing and editing declarations and comments to ease both creating and keeping a consistent style."
Previously: Tony Graham (Mulberry Technologies, Inc.) announced the availability of a tdtd Emacs Macro Package for editing DTDs (revision 3, December 14, 1997). The macro package was presented in a poster session at SGML/XML '97. The macros have been developed "intermittently over the last two years." Tony says: "The tdtd macro package for an Emacs major mode for editing DTDs is available at ftp://ftp.mulberrytech.com/pub/tdtd. The package includes font lock keywords for colour highlighting of declarations and reserved words plus a collection of macros that help when writing DTDs. The dtd-mode is a derived mode that builds on sgml-mode, and the features of sgml-mode are still available." The author will gladly accept bug reports and/or enhancements.
Links:
- [March 22, 1999] tdtd Version 0.7, March 15, 1999. [local archive copy] See also the 0.7 README document.
- [August 03, 1998] Announcement for the 0.6 release of tdtd. - The current revision is 0.6, dated August 1, 1998 [or later].
- Sources, version 0.6, archive copy
- Version 0.6 README
- [May 27, 1998] Announcement for the 0.5 release.
- [April 22, 1998] Update of the macro package to version 0.4. Changes to 'tdtd-font.el' include the addition of '(WWW)' and 'xml' as reserved words.
- Sources via FTP: ftp://ftp.mulberrytech.com/pub/tdtd
- README document
- Local archive copy, version 0.5
- Local archive copy, revision 4; April 21, 1998.
- Local archive copy, revision 3; December 1997.
- Also by Tony K. Graham of Mulberry Technologies, Inc.: xslide. The xslide package features an Emacs major mode for editing XSL stylesheets.
Panorama: SoftQuad's SGML Viewer for WWW
[CR: 19980408]
SoftQuad Panorama is a free version of SoftQuad Panorama PRO. It supports browsing (and searching?) of fully compliant SGML documents on the WWW.
- Panorama is now released (May 1995) as the "First Freeware SGML Viewer for the World Wide Web". See the public announcement and the Information Page: The Wider World of SGML on the Web. If you already have Panorama, link here
- PanoramaFree for Windows 3.1; [mirror copy]
- Register/Download Panorama Viewer [May 1997]
- The software and documentation are also available from sites in Sweden
- A list of links and resources from the University of Michigan Humanities Text Initiative: HTI Resources in support of Panorama (DTDs, style sheets, navigators, SDATA mapping files, etc.)
- University of Michigan Press ISO 12083 stylesheet (Panorama); [mirror copy]
- See provisionally the description of the commercial version, called Panorama PRO [announcement of the pre-release edition of Panorama PRO, 8-April-95]
- See a brief overview of features in Panorama's style sheet language
- Notes on Panorama for users of the EAD DTD [principles applicable to other complex DTDs]. From Stephen D. Miller. [mirror copy of help document, .ZIP file with help, catalog, and entityrc]
- See help information for use of Panorama with the TEI (Lite) DTD
- See the description of SoftQuad Panorama on SoftQuad's WWW server [from a press release, October 19, 1994]; see also the announcement and feature list in mirror copy here.
- See "SoftQuad Panorama -- A Companion for Mosaic," <TAG> 7/11 (November 1994) 9.
- Eliot Kimber: Nifty...Panorama
- Scholar's Press public domain Greek font (SPIonic), and an sdata.map for adding support for SPIonic to SoftQuad's Panorama
HoTMetaL: SoftQuad's HoTMetaL editor for HTML
HoTMetaL is an unsupported version of the commercial product HoTMetaL Pro. It provides an editor/browser for (extended) HTML documents. HoTMetaL is available on a number of platforms (UNIX, MS-Windows, etc.). A tutorial for HoTMetaL Pro teaches HTML basics, supported by an HTML Quick Reference guide. The most recent [March 1995] Windows version of HoTMetaL supports some of the Netscape extensions (e.g., <CENTER>, <BLINK>), displays graphics inline, uses a stylesheet configured to look like a standard HTML browser, and supports a filter for loading plain text files and invalid HTML documents. See the posted public announcement or the fuller description on the SoftQuad server, including FTP location. Try the FTP directory ftp://ftp.ncsa.uiuc.edu/Web/html/hotmetal/Windows, and specifically the binary file ftp://ftp.ncsa.uiuc.edu/Web/html/hotmetal/Windows/hotm1new.exe.
- FTP from SGML Repository
- FTP from Exeter
- HoTMetaL executables (Remote file ftp.ncsa.uiuc.edu/Web/contrib/SoftQuad/hotmetal)
Other mirror FTP sites list for HoTMetaL
Connect to the SoftQuad server for a recent list of FTP sites in the US, Canada, and Europe that host HoTMetaL. The FTP links below are older, but may still be alive:
- ftp.ncsa.uiuc.edu:/Mosaic/contrib/SoftQuad
- ftp.ifi.uio.no:/pub/SGML/HoTMetaL
- sgml1.ex.ac.uk:SoftQuad
- doc.ic.ac.uk:/pub/packages/WWW/ncsa/contrib/SoftQuad
- askhp.ask.uni-karlsruhe.de: /pub/infosystems/mosaic/contrib/SoftQuad
- ftp.cs.concordia.ca:/pub/www
- ftp.cc.gatech.edu:/pub/gvu/www/pitkow/misc
- ftp.sunet.se:/pub/www/Mosaic/contrib/SoftQuad
- ftp.uco.es:/www
- olymp.wu-wien.ac.at:/pub/sgml/exeter/SoftQuad
- ftp.germany.eu.net: /pub/infosystems/www/ncsa/Web/contrib/SoftQuad
- ftp.informatik.uni-freiburg.de: /pub/WWW/editors/HoTMetaL
- gatekeeper.dec.com: /pub/net/infosys/Mosaic/contrib/SoftQuad
- Email to: webmaster@sq.com
HyBrick - SGML/XML Browser
[CR: 19990304]
[March 04, 1999] Ralph E. Ferris (Fujitsu Software Corporation) has announced a new release of Fujitsu's HyBrick SGML/XML browser, with expanded support for XLink/XPointer. It is available from the Fujitsu Software Corporation's Web site. New features in HyBrick V0.82 related to XLink and XPointer include: "1) XLink/XPointer error/warning info is shown in the error list dialog; 2) A 'Document Group' sub-menu has been added in the 'XLink/XPointer' menu; users can now navigate between inter-linked documents by using Document Groups as well as through individual links; 3) In the 'select link' dialog, link element 'role' values are displayed instead of GIs. This feature, as well as the 'Document Group' display feature, are particularly useful for creating and navigating 'Topic Maps.'; 4) The mouse cursor now changes its shape over links." Also new in HyBrick 0.82 are multiple stylesheet support (if multiple stylesheet PIs are present, users are presented with a dialog box to select the stylesheet they want to use), 'Reload hubdocument' function and 'Close window' function. 'HyBrick' is "an advanced SGML/XML browser developed by Fujitsu Laboratories, the research arm of Fujitsu. 'HyBrick' is based on an architecture that supports advanced linking and formatting capabilities. HyBrick includes a DSSSL renderer and XLink/XPointer engine running on top of James Clark's SP and Jade. It supports both valid and well-formed XML documents, XLink and XPointer (XLink implemented as a subset of the HyTime property set), SGML (ISO 8879), DSSSL (ISO 10179) online specification, printing and print previewing based on DSSSL stylesheets." See more on HyBrick Support for XPointer in a posting of March 4, 1999.
[February 15, 1999] Ralph E. Ferris (Fujitsu Software Corporation) posted an update on the HyBrick V0.80 support for XLink and XPointer. HyBrick is an advanced SGML/XML browser developed by Fujitsu Laboratories, the research arm of Fujitsu. HyBrick is based on an architecture that supports advanced linking and formatting capabilities. HyBrick includes a DSSSL renderer and XLink/XPointer engine running on top of James Clark's SP and Jade. It supports "both valid and well-formed XML documents, XLink and XPointer, SGML (ISO 8879), DSSSL (ISO 10179) online specification, printing and print previewing based on DSSSL stylesheets." "To make the point [about HyBrick XLink/XPointer support, Ralph has] put some files with XLink/XPointer declarations in them up on the HyBrick Web site at http://www.fsc.fujitsu.com/hybrick/. These files are intended to be accessed over the Web. If your network access environment allows you to though, you can see XLink and XPointer at work over the Web by downloading HyBrick and pointing it at: http://www.fsc.fujitsu.com/hybrick/hubdoc-1.xml . . ." [see the posting for caveats and full details.] HyBrick Version 0.8 with XLink/XPointer support is now available for download.
[Earlier description:] "HyBrick" is 'an advanced SGML/XML browser developed by Fujitsu Laboratories, the research arm of Fujitsu. "HyBrick" is based on an architecture that supports advanced linking and formatting capabilities. HyBrick includes a DSSSL renderer and XLink/XPointer engine running on top of James Clark's SP and Jade. HyBrick supports: 1) Both valid and well-formed XML documents; 2) XLink/XPointer on the local file system [XPointer is implemented as a subset of the HyTime property set; Link traversal can use either "New" or "Replace" to display a new page]; 3) SGML (ISO 8879); 4) DSSSL (ISO 10179) online specification; 5) Printing and print previewing based on DSSSL stylesheets.'
[November 03, 1998] Ralph E. Ferris of Fujitsu Software Corporation has announced that HyBrick V0.8 with XLink/XPointer is Now Available for download.
Links:
- Main web site
- [Another Information Page] [English]
- Version 0.8 Announcement
- Download from CO.JP: http://www.fujitsu.co.jp/hypertext/free/HyBrick/download2.html
- Send questions or comments to: hb-staff@ml.flab.fujitsu.co.jp
The Wurd [was: WP] Project
"Wurd is an SGML capable Wurd Processor and publishing tool for multiple operating systems/platforms - although at the moment the only operating system supported is Linux." [June 1997]
[Work in progress only] WP is "a word processor being built by linux enthusiasts. . . with a native file format based on the SGML model. . .The use of SGML as the file format means that wp has an open interchange format. It will be possible to maintain World-Wide Web pages directly with wp."
- Home Page
- Home Page (http://wpprj.home.ml.org)
- Trevor Jenkins' development site
- Call for participation issued by Paul Colclough
GRIF Symposia: "A Collaborative Authoring Tool for the World Wide Web" (HTML and XML)
[CR: 19970827]
Links:
- Symposia Home Page
- Demo Version
- "Symposia doc+ - is a complete intranet publishing solution that combines a powerful WYSIWYG authoring tool, a database publishing mode and a graphical site manager in a single, easy-to-use package."
- Authoring and Formatting XML Documents
- Symposia- Welcome
- http://symposia.inria.fr/symposia/userdoc/put/writable-server.html
- Grif FAQ Document
HyBrowse HyTime Browser
[CR: 19961126]
HyBrowse is a HyTime Browser from TechnoTeacher, Inc. - a HyMinder application. "HyBrowse is a true HyTime (ISO/IEC 10744) hyperdocument browser for Windows 95 and Windows NT. It is useful for developing electronic document architectures that employ HyTime's strongly typed location-independent linking mechanisms." HyBrowse is publicly available (free) [as of November 22, 1996] for a trial period of 45 days. In addition to standard features one would expect, it supports: "(1) True HyTime independent hyperlinking; (2) User-defined strong hyperlink typing with [a] icons assignable to anchor roles over entire bounded object set (BOS), [b] rendering styles assignable to anchor roles over entire BOS; (3) HyTime-conforming address elements; (4) Aggregate location and hyperlink traversal handling; (5) Arbitrary BOS awareness allows users to add (import) a document into the current BOS; (6) Re-open browsing sessions without reparsing or reprocessing."
Eliot Kimber writes: "NOTE: HyBrowse is intended as a tool for creating prototypes and demos of HyTime features. It is not intended to be a production-quality information delivery system. The formatting features are minimal compared to Panorama or DynaText but sufficient to demonstrate the very interesting things you can do with independent links and anchors thereof. If you've been thinking of ways that HyTime hyperlinking could solve some of your information management problems but never had a way to realize or test those ideas, now you do, for free."
Links:
- Announcement for the HyBrowse HyTime Browser, from Eliot Kimber
- Announcement for HyBrowse 1.0.1, from Steven R. Newcomb
- HyBrowse description and download instructions on the TechnoTeacher server
- Supporting documentation from W. Eliot Kimber
perlSGML - Perl programs and libraries (Earl Hood)
[CR: 19970918]
perlSGML is a collection of Perl programs and libraries written by Earl Hood for processing SGML documents. The following software is available in the perlSGML distribution: dtd.pl (A Perl library to parse SGML DTDs), dtd2html (An SGML DTD documentation/navigation tool), dtddiff (a utility to list changes in a DTD), dtdtree (Generate content hierarchy trees of SGML elements), dtdview (Interactively query a DTD), sgml.pl (A Perl library to parse SGML instances), stripsgml (utility to remove SGML markup).
The 'dtd2html' tool is widely used. "What is dtd2html: dtd2html is part of the perlSGML package. dtd2html is a program that generates an HTML document (composed of several files) that documents and allows hypertext navigation of an SGML DTD."
- [September 18, 1997] Announcement from Earl Hood (University of California, Irvine) for a new release of the perlSGML toolkit. perlSGML is a collection of Perl programs and libraries for processing SGML DTDs and documents. "This release mainly includes a new set of Perl 5 modules. A new stripsgml is available and some corrections to dtd.pl are included in the release."
- perlSGML Main Page
- Documentation for perlSGML
- October 09, 1996: Announcement from Earl Hood for a new release of the perlSGML tools -- a collection of perl software for processing SGML data. These SGML software tools run under Perl versions 4 and 5. Most important changes: (a) "Hierarchical tree output of DTDprint_tree of dtd.pl modified to preserve the content model in the output. New tree format utilized by dtd2html, dtdtree, and dtdview; (b) sgml.pl rewritten to be more efficient and be useable for large files. Still more suited for simple tasks. stripsgml rewritten to utilize new sgml.pl." Available in .gz or .zip distribution format.
- December 09, 1995: Announcement for a new version of Earl Hood's perlSGML. perlSGML is a collection of Perl programs and libraries for processing SGML documents: dtd.pl (2.2.0) -- A Perl library to parse SGML DTDs; dtd2html (1.4.0) -- An SGML DTD documentation/navigation tool; dtddiff (1.1.0) -- List changes in a DTD; dtdtree (1.2.0) -- Generate content hierarchy trees of SGML elements; sgml.pl (0.1.0) -- A Perl library to parse SGML instances; stripsgml (0.1.1) -- Remove SGML markup. Changes: (1) Fixed code so it will run under Perl 4 and 5; (2) MS-DOS usage support; (3) Entity map file syntax has changed to the SGML Open catalog format; (4) Support for the envariables SGML_SEARCH_PATH, SGML_CATALOG_FILES; (5) New functions added; (6) Speed improvement; (7) Bug fixes. See the text of the announcement, or link to the WWW page.
- Links on Earl Hood's page, including demos for DTDs processed (TEI, HTML 2.0, HTML 3.0).
- FTP from the SGML Repository
- FTP from Exeter
- documentation for dtd2html (Earl Hood) via CETHMAC
- documentation for dtd2html (etc) on Earl Hood's (OAC) Home Page
- FTP to Darmstadt
Carthage, dpp, and Bison tools by Michael Sperberg-McQueen
[CR: 19970122]
Several SGML grammar tools have been created and made publicly available by TEI editor Michael Sperberg-McQueen. DPP: "DPP is a parser for SGML document type declarations, intended for use as a front end for filters which modify DTDs (e.g. filters to expand all or some parameter entity references, or to rename elements, etc.). Since DPP uses the same output format as sgmls. . .many existing tools for writing filters for SGML document instance . . . can be used with DPP to make filters for DTDs." Bison tools: "The subdirectory pub/tei/grammar/bison contains files with Bison grammars and Flex scanners for SGML document type definitions, SGML document instances, and SGML declarations." See ftp://ftp-tei.uic.edu/pub/tei/sgml/grammar for fuller description of these grammar tools.
Another of the tools is a utility called Carthage. "Carthage is a yacc/lex-based parser for SGML DTDs which can delete references to undeclared elements. It can also do a few other things, depending on the run-time flags you give it." Some options include: (1) dropping or keeping marked sections; (2) warning if entities are declared twice; (3) dropping or keeping parameter entity declarations; (4) deleting named GIs from content models; (5) listing of specified classes of elements in the DTD [used, unused, default undeclared, declared]; (6) dropping or keeping comments in the output file, etc. The software is "unsupported" but "users who improve it or fix errors are requested to notify the author so he can also fix them." [extracts from the README file, dated June 17, 1996.]
- The Carthage README file; [mirror copy, made August 02, 1996]
- See the database entry in the SGML/XML Web Page for dpp and other SGML grammar tools by Sperberg-McQueen
- Main FTP directory for Sperberg-McQueen SGML grammar tools
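The element classes that Carthage can list (declared, unused, undeclared) can be illustrated with a small sketch. This is not Carthage itself, which is a yacc/lex-based parser; it is a rough Python approximation that assumes simple `<!ELEMENT ...>` declarations with no parameter entities, name groups, or marked sections. The function name `classify_elements` and the regular expressions are hypothetical.

```python
import re

# Simplified pattern for <!ELEMENT name [minimization] content-model>.
# Assumption: one element name per declaration, no parameter entities.
ELEMENT_DECL = re.compile(
    r"<!ELEMENT\s+([A-Za-z][\w.-]*)\s+(?:[-oO]\s+[-oO]\s+)?([^>]*)>"
)
NAME = re.compile(r"#?[A-Za-z][\w.-]*")
DECLARED_CONTENT = {"EMPTY", "ANY", "CDATA", "RCDATA"}


def model_names(model):
    """Return the element names mentioned in one content model."""
    if model.strip().upper() in DECLARED_CONTENT:
        return set()
    # Tokens such as #PCDATA are reserved names, not elements.
    return {t for t in NAME.findall(model) if not t.startswith("#")}


def classify_elements(dtd_text):
    """Sort element names into declared, undeclared, and unused classes."""
    declared, referenced = set(), set()
    for name, model in ELEMENT_DECL.findall(dtd_text):
        declared.add(name)
        referenced |= model_names(model)
    return {
        "declared": declared,
        "undeclared": referenced - declared,  # used in a model, never declared
        "unused": declared - referenced,      # declared, never used in a model
    }
```

Note that the document element is never referenced from another content model, so in this sketch it always shows up in the "unused" class.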
DTDParse, by Norman Walsh
[CR: 19980409]
"DTDparse reads an SGML DTD and constructs a simple, easily parsed database of its content. This database can be examined to construct other views of the DTD. The DTDparse distribution contains several scripts which use the database to extract useful information about the DTD: (1) parents lists the parents of a particular element; (2) children lists the children of a particular element; (3) dtd2man produces DocBook RefEntry pages ('man' pages in common UNIX parlance) for the components of the DTD; (4) dtd2html [unrelated to Earl Hood's program of the same name] builds an HTML web of the components of the DTD." The documentation page provides sample output for DTDs such as DocBook 3.0, HTML 3.2, ISO 12083 DTDs, TEI Lite 1.6, and the CALS Table DTD.
- DTDParse Home Page
- Version 0.97 sources; [local mirror copy]
- Sample output for the DocBook 3.0 DTD
- Sources (will require Perl)
- Contact the author (Norman Walsh, Technical Director, Online Publishing, O'Reilly & Associates, Inc.)
- Links to some other tools that (sort of) generate DTD documentation from DTDs
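The "parents" and "children" views described for DTDParse can be approximated in a few lines. The sketch below is only an illustration in Python, not Norman Walsh's code or database format; it assumes simple element declarations without parameter entities or inclusion/exclusion exceptions, and the names `children_map` and `parents_map` are made up for this example.

```python
import re
from collections import defaultdict

# Simplified pattern for <!ELEMENT name [minimization] content-model>.
ELEMENT_DECL = re.compile(
    r"<!ELEMENT\s+([A-Za-z][\w.-]*)\s+(?:[-oO]\s+[-oO]\s+)?([^>]*)>"
)
NAME = re.compile(r"#?[A-Za-z][\w.-]*")
DECLARED_CONTENT = {"EMPTY", "ANY", "CDATA", "RCDATA"}


def children_map(dtd_text):
    """Map each declared element to the element names in its content model."""
    children = {}
    for name, model in ELEMENT_DECL.findall(dtd_text):
        if model.strip().upper() in DECLARED_CONTENT:
            children[name] = set()
        else:
            # Skip reserved names such as #PCDATA.
            children[name] = {
                t for t in NAME.findall(model) if not t.startswith("#")
            }
    return children


def parents_map(children):
    """Invert the children map: element name -> elements that may contain it."""
    parents = defaultdict(set)
    for parent, kids in children.items():
        for kid in kids:
            parents[kid].add(parent)
    return dict(parents)
```

Keeping the children map as the stored form and deriving parents from it mirrors the idea of building one simple database and computing other views from it.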
Fred - The SGML Grammar Builder
[CR: 19980508]
"Fred is an ongoing research project at OCLC Online Computer Library Center, Inc. (OCLC) studying the manipulation of tagged text. As a service to the community, OCLC has decided to make several portions of Fred freely available via a WWW server." These services include (subject to documented limitations): automatic SGML DTD creation from tagged text, grammar reduction (BNF, DTD, and Four-Tuple output formats), and arbitrary transformations.
Links:
- The OCLC Fred Home Page
- Automatic DTD Creation from a URL or Sample Text - Description of the online Fred service
- Automatic DTD Creation from a URL
- Automatic DTD Creation from Sample Text
- Fred Translation Services
- OCLC SGML Grammar Builder Project - DTD and document grammar. Additional links and background.
- See also: XML DTD generation using DTDGenerator - XML DTD Generator
NORMDTD (by Richard Light)
[May 1996] "NORMDTD is a DOS (yes!) program that reads a valid SGML DTD, even a TEI-like one that uses marked sections and multiple input files, and generates a single file containing a normalized version of that DTD. The element content models in this normalized DTD will not contain any references to elements that are not declared, and so it can be used by highly-strung SGML packages such as RulesBuilder that refuse to process TEI applications (in particular) for this reason. In fact, having a normalized DTD in a single file can be helpful for a number of reasons, to a variety of SGML applications."
NORMDTD is written in Borland Pascal and runs only under DOS.
- The text of the announcement, with brief documentation
- Source for self-executing utility on OTA FTP server: ftp://ota.ox.ac.uk/pub/ota/TEI/software/normdtd1.exe
- Mirror copy, May 1996
- Also by Richard Light: The SGML Tagger, from Oxford University Press; [mirror copy]. "SGML Tagger is loaded on top of word-processing software, and allows users to insert SGML markup accurately and efficiently, without the need to learn a specialized SGML editor." See the bibliographic entry.
Babble - Synoptic Text Browsing/Searching Tool
[CR: 19970628]
"Babble, under development by Robert Bingler at the Institute for Advanced Technology in the Humanities (University of Virginia in Charlottesville), is an SGML-capable synoptic text tool that can display multiple texts in parallel windows. It uses Unicode, an ISO 16-bit character set standard, which allows multilingual texts, using mixed character sets, to be displayed simultaneously. Babble also allows users to search for strings in text or in tags, and to link open texts for scrolling and searching. Currently, Babble runs as an application, and not as an applet . . . Babble was originally prototyped in C++ and Motif++ for AIX 3.25 by Pete Yadlowsky. The current version is written in Java." [from the Home Page]
Note: Babble has been described to me as nominally but usefully SGML-aware. For example: "The search function allows you to search for strings, either in text or--if the file you're searching is marked up in SGML--within tags. When you click on the search button, a dialogue box appears, offering two choices: search in text or in tags, and a character set for the search. It is assumed that SGML tagging will be done in the Latin alphabet, but Babble will allow you to search for a non-Latin string within tags." [from the online documentation]
Links:
- [June 28, 1997] Announcement from David L. Gants and John Unsworth for Babble version 1.1.1, with several significant enhancements
- Babble Home Page
- Help for Babble: A Synoptic Unicode Browser
- Unicode and the Web
IADS: Interactive Authoring and Display System
[CR: 20011019]
"Interactive Authoring and Display System (IADS) was developed as a U.S. Army Missile Command initiative to reduce or eliminate paper documentation. IADS utilizes standard generalized markup language (SGML) to manipulate the text and graphics. The author can choose to display graphics within the text and/or in separate windows." [from the Home Page]
- New URL: iads.redstone.army.mil
- IADS version 3.0 feature list
Interactive Authoring and Display System (IADS). The IADS program distribution includes an XML DTD. The IADS Software is classified as a Class 3 IETM package; however, IADS has the capability of producing Class 4 and 5 IETMs. IADS uses SGML as its underlying text format. WYSIWYG editing is now provided which allows text entry, graphic manipulation, tag insertion, and modification within the context of the formatted display. This mode is turned on or off using the 'Edit mode' option under the 'Authoring' menu. The DTD (if specified in the DOCTYPE) is loaded, processed, and its rules stored for use when inserting or editing tags in the document. The tag editor dialog box will only allow tags and tag attributes to be inserted that are defined in the DTD. 'Currently [2001-10], IADS is the only software able to parse and display IETMs meeting MIL-STD-40051A.' Contact: iads@redstone.army.mil ([Neil Frazier] IADS / Publications Services, US Army AMCOM).
- IADS Users Group
- IADS Software Main Page
- U.S. Army Missile Command (Sponsor Site)
- Author and Editor modes in IADS
- FTP from the SGML Repository [probably an out-of-date version by now]
- FTP from Exeter
- IADS Version 2.0, available from the Exeter FTP Server or from the SGML Repository.
SARA (SGML-Aware Retrieval Application)
The SARA system. SARA (SGML-Aware Retrieval Application) is a client/server software tool allowing a central database of texts with SGML mark-up to be queried by remote clients. The system was developed at Oxford University Computing Services, with funding from the British Library Research and Development Department (1993-4) and the British Academy. The original motivation for its development was the need to provide a robust low-cost search-engine for use with the 100 million word British National Corpus, and several features of the system design necessarily reflect this.
The SARA system has four key parts:
- the indexing program, which generates an index of tokens from an SGML marked-up text
- the server program, which accepts messages in the Corpus Query Language (see below) and returns results from the SGML text
- the SARA protocol, a formally defined set of message types which determines legal interactions between the client and server programs; this protocol makes use of a high-level query language known as CQL (for Corpus Query Language)
- one or more client programs, with which a user interacts in any appropriate platform-specific way, and which communicate with the server program using the protocol
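To make the role of the indexing program more concrete, here is a minimal sketch of a token index built over SGML-tagged text. It is not SARA's index format or the Corpus Query Language, neither of which is described here; the function name `build_index` and the choice of character offsets as index entries are assumptions made purely for illustration.

```python
import re
from collections import defaultdict

TAG = re.compile(r"<[^>]*>")   # assumption: no '>' inside attribute values
TOKEN = re.compile(r"\w+")


def build_index(sgml_text):
    """Map each lower-cased token to the offsets where it occurs."""
    # Blank out tags with spaces of equal length, so that offsets
    # still point into the original marked-up text.
    masked = TAG.sub(lambda m: " " * len(m.group()), sgml_text)
    index = defaultdict(list)
    for match in TOKEN.finditer(masked):
        index[match.group().lower()].append(match.start())
    return dict(index)
```

A server in the spirit of the design above would answer a query by looking a token up in such an index and returning the surrounding stretch of the original SGML text, rather than scanning the whole corpus for each request.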
Links:
- See the main BNC entry
- SARA Documentation
- SARA (SGML Aware Retrieval Application) Workshop 29th June 1994 [mirror copy]
Ispell for SGML
[CR: 19970225]
- Announcement from R. Alexander Milowski of Copernican Solutions Incorporated for a utility that 'spell-checks' SGML documents: Ispell for SGML. Sources are available as a patch to the standard distribution; binaries are also available for Solaris 2.5, and a WIN32 port will be provided in the future. The brief description on the COPSOL WWW site says [970225]: "Ispell for SGML is a version of the ispell spell checker distribution that has been patched to understand and ignore SGML markup. This version is a simple markup scanner that does not assume any further knowledge of the DTD. It purely relies on markup mode scanning as specified in the SGML standard."
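The idea of a spell checker that scans past markup without any knowledge of the DTD can be sketched as follows. This is not the ispell patch itself; it is a rough Python approximation in which the markup pattern, the word pattern, and the function name `misspelled` are all assumptions. It does not handle marked sections or a '>' inside attribute values.

```python
import re

# Comments, tags/declarations/processing instructions, and entity references.
MARKUP = re.compile(r"<!--.*?-->|<[^>]*>|&#?[\w.-]+;?", re.DOTALL)
WORD = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)?")


def misspelled(sgml_text, dictionary):
    """Return words outside markup that are not in the given word set."""
    # Replace markup with a space so adjacent words are not fused together.
    text = MARKUP.sub(" ", sgml_text)
    return [w for w in WORD.findall(text) if w.lower() not in dictionary]
```

Because the scanner only recognizes delimiters, attribute values and comment text are skipped along with the tags, which is usually what one wants from a spell check of document content.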
Syntext -- the SGML Grammar Grapher
[CR: 19960521]
"SYNTEXT is an SGML DTD providing elements and attributes to mark up text in English for: (1) syntactic structure, including (a) X-bar based parsing, with Government and Binding-style PRO and t, (b) grammatical relations a la Quirk et al. marked as attributes; (2) cohesion; (3) coreference; (4) conjunctive relations as attributes of sentence specifiers; (5) lexical cohesion as attributes of lexical items; (6) rhetorical figures. Any text marked up for these features and identifying itself as DOCTYPE SYNTEXT is an SGML document and can be browsed in an SGML browser or viewer such as SoftQuad's free Windows browser Panorama or the costwish viewer for X Windows being developed by Peter Murray-Rust. It is an SGML application, the purpose of which is to provide markup for the analysis of syntactic and textual structure; a marked up text can be viewed as a tree and in other modes and can be searched with context sensitive and contingent scans, making it very powerful for stylistic analysis (once a passage is marked up!)."
Links:
- the text of the announcement
- Home Page: http://weber.u.washington.edu/~syntext/; [mirror copy, incomplete missing graphics causa]
- SYNTEXT Documentation, including DTD
- syntext DTD Quick Reference; [mirror copy]
MtSgmlQL, the SgmlQL interpreter
[CR: 19971216]
"The SGML query language SgmlQL was developed in the context of the MULTEXT project. It is a functional language based on SQL, which enables complex operations on SGML documents, for instance: (1) extraction of parts of an SGML document that satisfy given criteria; (2) tests, counts, and various other computations on SGML elements in a document; (3) construction of new elements and documents using the result of queries. Because SgmlQL is a functional language, all data and program statements are expressions, or queries, which are recursively evaluated. It allows for manipulation of numbers, strings, (SGML) names, elements, attribute-value sets, documents, and (mixed content) lists. A free alpha version for UN*X of MtSgmlQL, the SgmlQL interpreter, can be downloaded to your system for non-commercial, non-military purposes (see the user agreement)."
Links:
- Announcement for alpha release of the SGML/HTML query language
- SgmlQL - SGML Query Language
- Examples of SgmlQL usage
- The Multext SGML Query Language interpreter - reference; [archive copy, reference documents in HTML format]
- MtSgmlQL manual (including downloading instructions)
- SgmlQL reference: The Multext SGML Query Language
- See: Le Maitre, Murisasco, and Rolbert, "From Annotated Corpora to Databases: The SgmlQL Language." Papers presented at a conference held March 23-24, 1995, University of Groningen, 1995.
- Multext Home Page
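The three kinds of operation named in the SgmlQL description above (extraction, counting, construction) can be illustrated by analogy in Python. This is a sketch under stated assumptions: it uses the standard library's `xml.etree.ElementTree` on a small well-formed sample, not SgmlQL syntax or the MtSgmlQL interpreter, and the element names (`corpus`, `s`, `w`), the `pos` attribute, and the helper functions are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the markup vocabulary is invented for illustration.
SAMPLE = """<corpus>
  <s id="s1"><w pos="DET">the</w><w pos="NOUN">parser</w></s>
  <s id="s2"><w pos="VERB">runs</w></s>
</corpus>"""

def nouns(root):
    """(1) Extraction: every <w> element whose pos attribute is NOUN."""
    return [w for w in root.iter("w") if w.get("pos") == "NOUN"]

def words_per_sentence(root):
    """(2) Counting: number of <w> children in each <s>, keyed by id."""
    return {s.get("id"): len(s.findall("w")) for s in root.iter("s")}

def build_report(root):
    """(3) Construction: a new element built from the query results."""
    report = ET.Element("report")
    for sid, n in words_per_sentence(root).items():
        ET.SubElement(report, "count", {"s": sid}).text = str(n)
    return report

root = ET.fromstring(SAMPLE)
print([w.text for w in nouns(root)])
print(words_per_sentence(root))
print(ET.tostring(build_report(root), encoding="unicode"))
```

Each helper returns a value that can feed the next one, which loosely mirrors the description of queries as expressions that are evaluated recursively.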
'sgrep' grep-like searching of structured documents
[CR: 19981210]
Description: 'sgrep' (structured grep) "is a tool for searching text files and filtering text streams using structural criteria. The data model of sgrep is based on regions, which are nonempty substrings of text. Regions are typically occurrences of constant strings or meaningful text elements, which are recognizable through some delimiting strings. Regions can be arbitrarily long, arbitrarily overlapping, and arbitrarily nested. Sgrep is a convenient tool for making queries to almost any kind of text files with some well known structure. These include programs, mail folders, news folders, HTML, SGML, etc... With relatively simple queries you can display mail messages by their subject or sender, extract titles or links or any regions from HTML files, function prototypes from C or make complex queries to SGML files based on the DTD of the file." Sgrep is distributed under the GNU General Public License.
[December 10, 1998] Jani Jaakkola has announced the availability of "sgrep-1.90a - An SGML and XML Search and Indexing Tool." Sgrep is a tool to search and index text, SGML, XML and HTML files using structured patterns. New features in Sgrep version 1.90a include: 1) query operators that support direct containment, so that one may query children and parents of given elements; 2) the sources are available under GPL-license for those interested in compiling sgrep; 3) Sgrep now uses GNU autoconf, so compiling sgrep under Unix-systems should be easy; 4) bug fixes. This version of Sgrep contains the sources, Win32 binaries, and binaries for HP-UX, Linux, OSF1 and Solaris. The Win32 binary also includes the m4 macro processor. For more information on Sgrep, see the README file or the overview.
[August 29, 1998] Jani.Jaakkola@cs.helsinki.fi (Department of Computer Science, University of Helsinki) posted an announcement for the release of sgrep version 1.71a as the first prerelease of sgrep-2. Sgrep is a tool to search and index text, SGML, XML and HTML files using structured patterns. Features new in version 1.71a include: "1) Indexing of both structure and content; 2) SGML/XML/HTML scanner; 3) both Win32 and i386-Linux binaries; 4) compatibility with older versions of sgrep; 5) no dependence upon 'sgtool'." Features announced for inclusion in sgrep-2 are: 1) Support for querying notations, element type declarations and attribute list declarations inside SGML/XML document prolog; 2) Parsing of all well-formed XML-documents; 3) Proper documentation.
Links:
- Announcement for version 1.90a
- Announcement for first prerelease of sgrep-2 [1998-08-28]
- HTML version of sgrep manual page
- Latest version of sgrep via FTP
- sgrep example queries
- "Using sgrep for querying structured text files" (Technical Report)
- README document; [local archive copy]
- Announcement for version 0.99 of 'sgrep' (April 30, 1996)
- Comments on sgrep by users (positive)
- Announcement for preliminary/test version of 'sgrep' (April 18, 1996)
- Home Page [Jani Jaakkola] also: Pekka Kilpeläinen
- DocMan - The Document Management Research Group (the Department of Computer Science at the University of Helsinki)
- Tarred and gzipped sgrep distribution version "sgrep-0.99.tar.gz"; [mirror copy]
- See the bibliographic entry for a technical report: Jani Jaakkola and Pekka Kilpeläinen. "Using sgrep for querying structured text files," In J. Saarela, editor, Proceedings of SGML Finland 1996, Espoo, Finland, October 1996. SGML Users Group Finland, pages 56-67.
- Email contact: Pekka.Kilpelainen@cc.helsinki.fi or Jani.Jaakkola@cc.helsinki
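The region model in the sgrep description above can be sketched in Python. This is an illustration only, assuming non-nested delimiters; the names `delimited_regions`, `containing`, and `within` are hypothetical and represent neither sgrep's query syntax nor its algorithm.

```python
from typing import List, Tuple

Region = Tuple[int, int]  # half-open interval [start, end) into the text

def delimited_regions(text: str, start: str, end: str) -> List[Region]:
    """Regions running from each `start` string to the next `end` string.

    Simplification: delimiters of the same kind are assumed not to nest.
    """
    regions = []
    pos = 0
    while True:
        s = text.find(start, pos)
        if s == -1:
            break
        e = text.find(end, s + len(start))
        if e == -1:
            break
        regions.append((s, e + len(end)))
        pos = e + len(end)
    return regions

def contains(outer: Region, inner: Region) -> bool:
    """True if `inner` lies entirely inside `outer`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]

def containing(outers: List[Region], inners: List[Region]) -> List[Region]:
    """Outer regions that contain at least one inner region."""
    return [o for o in outers if any(contains(o, i) for i in inners)]

def within(inners: List[Region], outers: List[Region]) -> List[Region]:
    """Inner regions that lie inside at least one outer region."""
    return [i for i in inners if any(contains(o, i) for o in outers)]

TEXT = "<doc><sec><title>A</title><p>x</p></sec><sec><p>y</p></sec></doc>"
secs = delimited_regions(TEXT, "<sec>", "</sec>")
titles = delimited_regions(TEXT, "<title>", "</title>")
paras = delimited_regions(TEXT, "<p>", "</p>")

titled = containing(secs, titles)        # sections that have a title
paras_in_titled = within(paras, titled)  # paragraphs inside those sections
print([TEXT[s:e] for s, e in titled])
print([TEXT[s:e] for s, e in paras_in_titled])
```

Because regions are plain character offsets, the same operators apply to mail folders, source code, or markup alike, which is the generality the description claims for the model.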
Inside & Out, from ZGDV
[CR: 19970522]
Inside & Out is a graphical DTD editor created by Hans Holger Rath and Ulrich von Engelberg, of the Computer Graphics Center (ZGDV) in Darmstadt, Germany. It runs under MS-Windows 3.1 (386 PC) with 4 MB RAM. "The editor is designed to build SGML DTDs interactively, providing a graphical presentation of the DTD in the shape of a syntax (or railroad) diagram. Every element and parameter entity definition is shown in a single diagram. All definitions are alphabetically sorted (first all entity, second all element definitions)."
Links:
- Main Page (FTP directory)
- The README document; [mirror copy]
- Sample document DTD
- Binary Executable; [mirror copy, entire package]
- Contact: iout@igd.fhg.de
MU: Forms Assisted SGML Markup
"MU is a perl-based program that builds fill-out forms for SGML editing, based on simple templates. It supports lock files (for networked workgroups), and it is distributed with a TEI-lite template. Demonstrations, source code, help files, and an email list for bug reports and developers are available. . .Features: (1) Helps to automate the SGML markup process; (2) Quite general - works on various types of DTD templates; (3) Version 1.1 deals quite nicely with attributes; (4) Allows for multi-user editorial communication through the use of remarks; (5) Supports internet workgroups via lockfiles."
Markus Hoenicka's SGML/DSSSL Setup for Windows NT
[CR: 19981014]
"These pages describe how to set up a free integrated SGML editing and publishing system running under Windows NT - and, with a few modifications of the installation procedure, also on Windows 95/98 boxes." The documentation provides instructions for the installation of Emacs, Jade, PSGML, Ghostscript, Acrobat, MiKTeX, AucTeX, Jadetex, DocBook, etc.
Links:
- Main Page
- Introduction
- Email contact: Markus Hoenicka
SGML Data Conversion, Transformation, and Manipulation
At SGML'96, Boston, November 1996, Tony Graham (Mulberry Technologies, Inc.) presented "Free SGML Transformation Tools." "The criteria for selecting an SGML transformation processing tool are discussed, and the details and SGML-processing features of several free SGML transformation tools are listed."
Rainbow
Several companies have collaborated on the design of an SGML interchange language for word-processing formats. Rainbow makers produce SGML from the supported word-processing formats, preserving as much information about document structure as can be deduced reliably. The Rainbow SGML format can then be used as input to other applications. See further explanation on EBT's server or on the mirrors in the file 'rainbow.why'. Rainbow makers are now available (free) for FrameMaker/FrameBuilder MIF, RTF, Interleaf, and (possibly) Ventura. Authoritative files for the Rainbow distribution are located on EBT's FTP server (SGML Rainbow via ftp.ebt.com/pub/nv/dtd/rainbow/).
Other sources for Rainbow makers include:
- Announcement for Rainbow 2.01
- [mirror copy]
- Information: rainbow@ebt.com
- FTP from the SGML Repository
- FTP from Exeter
ICA: Integrated Chameleon Architecture
The ICA (Release 1.6, February 1994) is a toolset for generating data translators. In particular, the toolset can be used to generate translators to and from a constrained subset of instances of SGML Document Type Definitions (DTDs). There are several example translators included in the distribution. The first is a book DTD and includes specific translators for the LaTeX book documentstyle and a specific troff macro package. The second is a bibliographic DTD and includes specific translators for BibTeX and refer bibliographic database formats. Please note that the ICA is for developing translators and not providing translators. The ICA runs in the Unix environment, using the X Window System for the basis of the graphical user interfaces.
A new user's manual for ICA is also available. Published by Prentice Hall, the book is entitled The Integrated Chameleon Architecture: Translating Documents with Style, by Sandra Mamrak, Conleth S. O'Connell, and Julie Barnes. ISBN 0-13-056418-4. This book contains much new and revised material over the previously available online documentation, including a chapter on the ICA and SGML. See also description in excerpts from the release notes.
See further description in the ICA toolkit announcement, and see the network addresses for the supporting mailing list. The sources for ICA on the Internet are:
- FTP from SGML: ICA Chameleon: Remote file archive.cis.ohio-state.edu/pub/chameleon/
- FTP from the SGML Repository
- FTP from Exeter
STIL - `SGML Transformations in Lisp'
STIL is a stylesheet language developed by Joachim Schrod (Computer Science Department, Technical University of Darmstadt, Germany). "STIL (`SGML Transformations in Lisp') is a style sheet language to create structure-controlled SGML applications. In these applications you have neither access to the DTD nor to the original document source, instead you operate on a tree representation of the document. If you know CoST (the tree mode version) or SGMLSpm, STIL uses the same concept as these style sheet languages. The most obvious difference is the use of Common Lisp instead of Tcl or Perl5.
You define classes for elements that appear in a document, instances of these classes are the inner nodes of the tree. Automatic transformation of attributes to data structures more appropriate in your task domain than simple strings is available. Elaborate handling of PCDATA is supported, too.
The document tree is traversed, you can specify operations (`callbacks') that are triggered at certain points in that traversal. Within these callbacks, you have access to the full tree." [from the README, 1995/09/09]
Links:
- README [mirror copy, November 1995]
- STIL 1.0 Manual, by Joachim Schrod and Christine Detig [mirror copy, November 1995]
- stil-1.0.tar.gz
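The tree-with-callbacks model in the STIL README excerpt above can be sketched as follows. The sketch is in Python rather than Common Lisp, and every name in it (`Node`, `depth`, `traverse`, and the `enter`/`leave`/`text` callbacks) is hypothetical; it is not STIL's API.

```python
class Node:
    """An inner node of the document tree: one instance per element."""

    def __init__(self, name, **attrs):
        self.name = name
        self.attrs = attrs
        self.children = []   # child Nodes and plain strings (character data)
        self.parent = None

    def add(self, child):
        if isinstance(child, Node):
            child.parent = self
        self.children.append(child)
        return child

def depth(node):
    """Number of ancestors, found by walking the parent links."""
    d = 0
    while node.parent is not None:
        node = node.parent
        d += 1
    return d

def traverse(node, enter, leave, text):
    """Depth-first walk that fires callbacks at fixed points."""
    enter(node)
    for child in node.children:
        if isinstance(child, Node):
            traverse(child, enter, leave, text)
        else:
            text(node, child)
    leave(node)

# Build a small tree by hand (a real tool would build it from parser output).
doc = Node("doc")
sec = doc.add(Node("sec", id="intro"))
title = sec.add(Node("title"))
title.add("Overview")

out = []
traverse(
    doc,
    enter=lambda n: out.append("  " * depth(n) + n.name),
    leave=lambda n: None,
    text=lambda n, s: out.append("  " * (depth(n) + 1) + repr(s)),
)
print("\n".join(out))
```

The `enter` callback here consults each node's ancestors, which is the point of the excerpt's remark that callbacks have access to the full tree and not only to the current event.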
CoST (Copenhagen SGML Tool, UNIX)
[CR: 19990628]
[June 28, 1999] Joe English has announced the release of Cost version 2.2, which now provides 'preliminary support for XML'. Cost is a free "structure-controlled SGML application programming tool. It is implemented as a Tcl extension, and works in conjunction with James Clark's nsgmls and/or sgmls parsers. Cost provides a flexible set of low-level primitives upon which sophisticated applications can be built. These include: (1) A powerful query language for navigating the document tree and extracting ESIS information; (2) An event-driven programming interface; (3) A specification mechanism which binds properties to nodes based on queries. Cost can be dynamically loaded into a Tcl application with the usual package mechanism, or it can be statically linked into a custom Tcl interpreter. There is also a command-line interface, costsh, which can be used interactively or as part of a command pipeline. A windowing interface, costwish, is also available for building GUI applications with Cost and Tk. New features in Cost version 2.2 include: (1) It should compile and install out-of-the-box on most Unix platforms, with any Tcl release from 7.5 through 8.1.1 - courtesy autoconf; (2) One can load more than one document at a time, and switch between them with the new 'selectDocument' and 'withDocument' commands; (3) It allows comments at certain places in specifications. (4) It provides preliminary support for XML, courtesy expat by James Clark. Note: XML support is largely untested and has a few known deficiencies (and probably several unknown ones!); I'd appreciate any feedback/bug reports. (5) It is released under a Tcl-style license instead of the 'Artistic' license. (6) Cost can now be loaded as an extension into multiple Tcl interpreters without conflicts. (7) Many minor bugfixes, enhancements, and cleanups."
[1997] "What is CoST? CoST (Copenhagen SGML Tool) is a structure-controlled SGML application programming tool. It is built on top of a public domain SGML tool: the SGMLS parser made by James Clark. With CoST you can write translation specifications for SGML document instances. CoST is purely structure driven, i.e. it gives you access to the structure of the SGML document instance. It won't, however, let you access the lexical and syntactical details in the SGML entities that represent the document instance in storage. You can write CoST programs that will translate SGML document instances or perform other processing in response to SGML documents. You program CoST using TCL - Tool Command Language." [from the Manual Introduction, March 1995]
CoST was written by Klaus Harbo (Klaus.Harbo@euromath.dk) and is maintained by Joe English (joe@flightlab.com).
Links:
- Copenhagen SGML Tool - Cost Home Page.
- Cost reference manual
- [June 16, 1998] Boris Tobotras posted a CoST patch for multiple document instance support. "Some other fixes available, contact me if you're using CoST."
- README description, '28 May 1998'. [local archive copy]
- Announcement for CoST version 2.0 beta, September 14, 1995, including a draft version of the manual in Postscript format. Version 2.0 [2.0a2, October 13, 1995] contains a new query language.
- Searching SGML structure with CoST 2.0 (Peter Murray-Rust), April 1996
- Frequently Asked Questions about CoST
- CoST 2 Reference Manual
- Sources for the CoST package; [local archive copy, snapshot 980616 for 'cost-2.1a0.tar.gz' of May 30, 1998]
- Or FTP the package from Exeter.
- Related work: ExCost. ExCost stands for 'Expat and Cost'. It uses an extension to TCL that allows it to parse an ESIS file and handle the output in an event-driven or tree-driven manner. It provides about the same functionality as Cost, but for XML.
- See the Documentation of the Copenhagen SGML tool at http://laurel.euromath.dk/test.html.
costwish - SGML postprocessor and renderer
Costwish is a graphical interface (SGML postprocessor and renderer) for Joe English's CoST-2 tool. From the README: "costwish is a generic graphical interface to Joe English's CoST SGML/ESIS post-processing tool. It is aimed at those who wish to: (1) run sgmls (or other ESIS-based parser) under a graphical interface; (2) browse their documents graphically; (3) customise their postprocessing easily, powerfully and flexibly; (4) construct powerful searches of SGML-based documents; (5) and manage the results interactively; (6) develop interfaces to helper applications (e.g. graphical renderers)." [from the README, April 1996]
Links:
- Index Page: http://www.venus.co.uk/omf/costwish/
- Costwish Home Page
- Index of /omf/packages/binaries/
- Overview [mirror copy, April 1996]
- README file
- Documentation
- Review comments (compliments) by Len Bullard
SGMLS.pm and sgmlspl: A Simple Post-Processor for SGMLS and NSGMLS
[CR: 19980423]
SGMLS.pm and sgmlspl were written by David Megginson, and were maintained by him through 1995. The current maintainer [1998] of the SGMLS.pm Perl package is Ingo Macherius (Ingo.Macherius@tu-clausthal.de).
David's description: "SGMLSpm is a free perl5 object-oriented postprocessor for James Clark's SGMLS and NSGMLS parsers. The main part of this release is a library, SGMLS.pm, which repackages the ESIS output of (N)SGMLS into perl5 objects. On top of this, I have built a script, sgmls.pl, for formatting or processing SGML documents quickly using event patterns. Like CoST (which is several times slower), and unlike QWERTZ (etc.), SGMLSpm is a general-purpose package which can be used with any DTD. It even includes a script, skel.pl, which will write a skeleton conversion script for your document automatically!"
"sgmlspl is a sample application distributed with the SGMLS.pm perl5 class library -- you can use it to convert SGML documents to other formats by providing a specification file detailing exactly how you want to handle each element, external data entity, subdocument entity, CDATA string, record end, SDATA string, and processing instruction. sgmlspl also uses the Output.pm library (included in this distribution) to allow you to redirect or capture output."
- SGMLSpm source: http://home.sprynet.com/sprynet/dmeggins/SGMLSpm-1.03ii.tar.gz
- Source: local archive copy 1.03ii
- SGMLSpm documentation - from CPAN module documentation [Comprehensive Perl Archive Network]
- "Developing SGML Applications with Perl. Perl 5 and SGMLS.pm." [Chapter 5 and pages 260-276 in] SGML CD: A Complete SGML Toolkit, by Bob DuCharme.
- SGMLS::Handler - code to modularize SGMLS.pm's commandline driven sgmlspl tool, written by Ingo Macherius
- Home Page: http://home.sprynet.com/sprynet/dmeggins/
- Email: dmeggins@microstar.com (w); ak117@freenet.carleton.ca (pers)
- The original SGMLS.pm announcement (July 1995)
- Comparison of OmniMark and SGMLS-PM, by Lou Burnard (February 1996)
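The ESIS post-processing idea behind SGMLS.pm and sgmlspl, described above, can be sketched in Python. This is not SGMLS.pm or sgmlspl; the names `esis_events`, `unescape`, and `render` are hypothetical. The sketch assumes the line-oriented output of sgmls/nsgmls, in which the first character of each line gives its type (`(` start of element, `)` end of element, `-` data, `A` an attribute belonging to the next start of element), and it handles only the `\n` and `\\` data escapes.

```python
import re

def unescape(data):
    """Decode the two escapes handled in this sketch: \\n and \\\\."""
    return re.sub(r"\\(n|\\)",
                  lambda m: "\n" if m.group(1) == "n" else "\\",
                  data)

def esis_events(lines):
    """Turn ESIS output lines into (kind, value, attrs) event tuples."""
    attrs = {}
    for line in lines:
        line = line.rstrip("\r\n")
        if not line:
            continue
        code, rest = line[0], line[1:]
        if code == "A":
            # "Aname TYPE value" or "Aname IMPLIED" (no value)
            parts = rest.split(" ", 2)
            attrs[parts[0]] = parts[2] if len(parts) == 3 else None
        elif code == "(":
            yield ("start", rest, attrs)
            attrs = {}
        elif code == ")":
            yield ("end", rest, None)
        elif code == "-":
            yield ("data", unescape(rest), None)
        # Other line types (entities, processing instructions, the final
        # "C" line) are ignored in this sketch.

def render(events, spec):
    """Apply a per-element (before, after) text specification."""
    out = []
    for kind, value, _attrs in events:
        if kind == "data":
            out.append(value)
        else:
            before, after = spec.get(value, ("", ""))
            out.append(before if kind == "start" else after)
    return "".join(out)

# Hand-written sample of parser output for a tiny document.
ESIS = ["AID CDATA intro", "(SECTION", "(TITLE", r"-Hello\nworld",
        ")TITLE", "(P", "-Some text.", ")P", ")SECTION", "C"]
SPEC = {"TITLE": ("== ", " ==\n"), "P": ("", "\n")}
print(render(esis_events(ESIS), SPEC))
```

The `SPEC` dictionary plays the role that the description assigns to a specification file: it says how each element is handled, and elements without an entry pass their content through unchanged.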
OmniMark LE
[CR: 19970923]
[September 23, 1997] Announcement for the OmniMark LE, available "at no charge for a limited time." OmniMark is a flagship industry software product -- a leading SGML based "hypertext programming language for development of on-line, Web, CD-ROM and print-on-demand publishing applications, being used for SGML conversion by a wide range of industry-leaders, including over 700 companies in 34 countries." OmniMark LE is a free product which runs utility-sized OmniMark programs. It is described as useful for: "(a) small-sized utility programs; (b) program development on the road away from your commercial licenses (since OmniMark LE will compile a large program -- it won't just run it); (c) evaluating OmniMark V3's capabilities before licensing V3." OmniMark LE is available on many platforms, including Windows NT/95 and many varieties of UNIX. See the program description for other information, or the main database entry.
"OmniMark LE will compile and execute programs that contain 200 or fewer actions in the program source. An action is a statement that OmniMark executes, distinguished from a "rule header" (e.g. an element rule) which describes an event. Within each element rule, one action is not counted towards the 200-action limit. The action count is performed at compile time, not run time; this means that any given action in a 200-action program could execute millions of times."
LT NSL and NSL (Normalised SGML Library)
[CR: 19970128]
From the Language Technology Group, Human Communication Research Centre, University of Edinburgh: the "Normalised SGML Library (NSL version 2.0) . . . consists of a set of C programs for manipulating SGML files and a C application program interface (API) designed to ease the writing of C programs which manipulate SGML documents."
"LT NSL is a development environment for SGML-based corpus and document processing, with support for multiple versions and multiple levels of annotation. It consists of a C-based API for accessing and manipulating SGML documents and an integrated set of SGML tools. The LT NSL initial parsing module incorporates v1.1.1 of James Clark's SP software, arguably the best SGML parser available. The basic architecture is one in which an arbitrary SGML document is parsed once, yielding two results: (1) An optimised representation of the information contained in the document's DTD; (2) A normalised version of the document instance, which can be piped through any tools built using our API for augmentation, extraction, etc."
Links:
- The main entry for the LTG in the "Academic Applications" area of this database
- January 28, 1997: Announcement from David McKelvie for the HCRC Language Technology Group's public release of LT NSL --- Normalised SGML Library, version 1.4.6. The toolkit offers significant enhancements over version 1.4.4. "LT NSL is an integrated set of SGML querying/manipulation tools and a C-language application program interface (API) designed to ease the writing of C programs which manipulate SGML documents. Its API is based on the idea of using 'normalised' SGML (i.e. an expanded, easily parsable subset of SGML) as a data format for inter-program communication of structured textual information. The API defines a powerful query language which makes it easy to access (either from the shell or in a program) those parts of an SGML document which you are interested in. Both event based and (sub-)tree based views of SGML documents are supported."
- LTG Home Page [or, no frames]
- LT NSL main page
- The LT NSL documentation
TclYasp SGML toolkit
Extracts from the announcement by David Durand: "TclYasp integrates a conforming SGML parser with the TCL scripting language. . . Unlike CoST 1.1, it uses a simplest-possible procedure call interface, rather than an elaborate object-oriented interface. . . TclYasp does have a few unique features: it's based on YASP, which is more easily portable (it's in ANSI C and not C++) and was designed to be integrated with an application. Since Yasp is fully re-entrant, more than one parser can be active at a time. It is not restricted to the information defined by the ESIS, as the full parser data is available. . . TclYasp/Mac includes a command shell, multiple-pane windows, limited on-screen text formatting, and a variety of interface features as well as the SGML processing stuff."
Links:
- Announcement, April 17, 1996 by David G. Durand
- Sources: ftp://ftp.stg.brown.edu/pub/sgml [Files: 1394391 Apr 16 21:00 TclYasp-Mac.sit.Hqx OR 2373632 Apr 16 04:24 tcl_yasp-1.0.tar] via the Scholarly Technology Group at Brown University
- Tclyasp-Mac.sit.Hqx [mirror copy]
- tcl_yasp-1.0.tar.gz, archived by the Scholarly Technology Group at Brown University
- Email contact: David G. Durand [dgd@stg.brown.edu], maintainer of the SGML archives
- More information about the use of SGML by the Scholarly Technology Group at Brown University
Python for XML/SGML Processing
[CR: 19981103]
A few people (at least) believe that Python is well suited for SGML text processing. Sean McGrath wrote that it "beats any other language I know for SGML processing hands down", and Paul Prescod said: "Python is a really easy, incredibly powerful scripting language. . . [it] combines the best features of other scripting languages and borrows many neat features from the Great Languages from history (Simula, SmallTalk, Lisp, Algol)."
Links [provisionally]:
- Documents on Paul Prescod's Home page: "SGML Processing in Python"; "Using SGML Groves from Python, Visual Basic and other OLE client scripting languages"; "PySgml: A Module (under development) for SGML Processing in Python"; "An Introduction to Groves for Python Programmers."
- Announcement from Paul Prescod for a series of documents on SGML processing using Python
- XML and Python - Database section in the XML page.
- See ParseMe.1st, by Sean McGrath: several chapters illustrate the Python framework for processing SGML information objects; [bibliographic entry].
- Python module for XML. "Extensible Markup Language Scanner, Checker, and Utilities," from Dan Connolly, May 1997; [local archive copy]
- Python XML SIG. As of March 17, 1998, a mailing list "has been created for discussing XML and Python, with the goal of developing a set of Python tools for processing XML documents."
- [November 03, 1998] "Python and SGML" - By W. Eliot Kimber. ". . .Its easy-to-use object orientation, its built-in list semantics, and the fact that it's interpreted make it really easy to create the same sorts of programs you might use DSSSL or Balise for, but with a general-purpose programming language that is easy to learn and much more familiar than DSSSL or Omnimark. Python is a free, publicly-developed language, not a commercial product. . ."
- [February 12, 1998] "XML Programming in Python," by Sean McGrath. In Dr. Dobb's Journal February 1998 [Scripting Languages]. Abstract: "XML brings to the document world what the database world has had for a long time -- interoperability via open systems. Sean shows how you can use Python as a development platform for XML programming. Additional resources include the Python web page, and PXML.TXT (listings from DDJ)." See also the bibliography entry for "XML Programming in Python."
- XMLParser class in the Python [1.5] distribution. 11.10 Standard Module xmllib: "This module defines a class XMLParser which serves as the basis for parsing text files formatted in XML (eXtended Markup Language)." [from lmg]
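The `xmllib` module named in the last entry no longer ships with current Python releases (it was removed in Python 3). A minimal sketch of the same handler-style parsing follows, using the standard library's `xml.parsers.expat` binding to James Clark's expat instead; the sample document and the `collect_titles` helper are invented for illustration.

```python
import xml.parsers.expat

def collect_titles(xml_text):
    """Return the character data of every <title> element, in document order."""
    titles = []
    state = {"in_title": False, "buf": []}

    def start(name, attrs):
        if name == "title":
            state["in_title"] = True
            state["buf"] = []

    def end(name):
        if name == "title":
            titles.append("".join(state["buf"]))
            state["in_title"] = False

    def chars(data):
        # expat may deliver one text run in several pieces, so buffer them.
        if state["in_title"]:
            state["buf"].append(data)

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.CharacterDataHandler = chars
    parser.Parse(xml_text, True)  # True marks this as the final chunk
    return titles

DOC = ("<book><title>SGML Tools</title>"
       "<chapter><title>Parsers</title></chapter></book>")
print(collect_titles(DOC))
```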
I4I S4-Desktop V2.1 SGML middleware
[CR: 19970212]
Educational Support Program:
- Announcement from Infrastructures for Information Inc. for an Educational Support Program. "Infrastructures for Information Inc., . . . announces the no-cost availab

