Section 2.6: Parsers

`Parsing' is the process of taking an SGML-tagged data stream and (a) confirming that it obeys the structure defined in the associated DTD, and (b) expanding entity references and inserting omitted tags so that a following process can work on the result. Except when it is used simply to confirm that an SGML-tagged document is correct, a parser is not used on its own but as part of a larger processing sequence. For this reason, most commercial SGML-aware products contain their own parser, which may or may not be based on one of the parsers listed below.
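
As a rough illustration of step (b), the following Python sketch expands entity references and restores omitted end tags for a toy document. It is a simplification written for this section and does not reflect how any parser listed here works: the entity table, the set of elements with omissible end tags and the function names are all hypothetical, and a real parser derives this information from the DTD and handles many more cases.

```python
import re

# Hypothetical entity declarations, standing in for <!ENTITY ...> lines in a DTD.
ENTITIES = {"sgml": "Standard Generalized Markup Language", "amp": "&"}

ENTITY_REF = re.compile(r"&([A-Za-z][A-Za-z0-9.-]*);")
TAG = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9.-]*)>")


def expand_entities(text, entities):
    """Replace each &name; reference with its declared replacement text."""
    def lookup(match):
        name = match.group(1)
        if name not in entities:
            raise ValueError(f"undeclared entity: {name}")
        return entities[name]
    return ENTITY_REF.sub(lookup, text)


def insert_omitted_end_tags(text, omissible):
    """Restore end tags for elements named in `omissible` (a toy rule set).

    An open omissible element is closed when a sibling of the same name
    starts, when an enclosing element ends, or at the end of the input.
    """
    out, stack, pos = [], [], 0
    for m in TAG.finditer(text):
        out.append(text[pos:m.start()])
        pos = m.end()
        closing, name = m.group(1) == "/", m.group(2)
        if closing:
            # Close omissible elements still open inside the element being ended.
            while stack and stack[-1] != name and stack[-1] in omissible:
                out.append(f"</{stack.pop()}>")
            if not stack or stack[-1] != name:
                raise ValueError(f"unexpected end tag: {name}")
            stack.pop()
        else:
            # A new sibling of the same name implies the previous one has ended.
            if stack and stack[-1] == name and name in omissible:
                out.append(f"</{stack.pop()}>")
            stack.append(name)
        out.append(m.group(0))
    out.append(text[pos:])
    # Close anything omissible that is still open at the end of the input.
    while stack:
        name = stack.pop()
        if name not in omissible:
            raise ValueError(f"missing end tag: {name}")
        out.append(f"</{name}>")
    return "".join(out)


if __name__ == "__main__":
    source = "<list><item>&sgml;<item>tags &amp; entities</list>"
    print(insert_omitted_end_tags(expand_entities(source, ENTITIES), {"item"}))
```

A following process can then rely on every element being explicitly opened and closed, which is the point of step (b).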

In addition to the parsers described below, other public domain parsers are available from the Exeter ftp server. These are listed in Annex B.


Product:
Mark-It
Associated Products:
Write-It
Developer:
SEMA Group, Belgium
UK Supplier(s):
Apply directly to SEMA Belgium
Price:
n/a
Platforms:
MS-DOS, MS-Windows, Unix, VAX, Siemens
Description:
Mark-It is an SGML-based programmable document processor which uses document models defined in SGML in order to provide context-sensitive conversion and markup facilities.

Mark-It can be used to:

The kernel of Mark-It is an SGML parser developed by SEMA Belgium. It supports all features of ISO 8879, including all kinds of markup minimisation, all link types, concurrent document types, sub-documents, multiple character sets, multi-byte characters and any variant concrete syntax.

To enhance the markup capabilities, the Regular Expression Language allows user-defined rules to identify where implicit structural elements start and end. The declarative Application language allows manipulation of attribute values and content, definition of multiple input and output channels, and the transformation of data for output to other applications (eg, where LINK processes do not offer sufficient functionality).


Product:
sgmls
Associated Products:
Developer:
James Clark
UK Supplier(s):
Available by anonymous ftp from `ftp.ex.ac.uk'
Price:
Public Domain
Platforms:
Unix
Description:
sgmls was developed from ARCSGML, the original SGML parser donated to the community by Dr Charles F. Goldfarb. ARCSGML is still available from the Exeter ftp server (and others), but as sgmls has fixed some of the known deficiencies of ARCSGML, the use of ARCSGML cannot be recommended.

Assessment:
An ideal parser for individuals or organisations who are developing their own products and do not wish to develop a parser of their own. As the software can be run in isolation from any other products, it can also be used to parse SGML-tagged source files that have been created using an SGML-unaware editor and that obey standard Unix file conventions.
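
For developers building on a stand-alone parser of this kind, the following Python sketch shows how a following process might consume its output. It assumes a line-oriented output format in which the first character of each line is a command: `(' and `)' for the start and end of an element, `-' for data, `A' for an attribute of the next element to start, and `C' as a final line indicating a conforming document. This format is an assumption based on the sgmls documentation and should be checked against the version in use. The sample text and the function name are hypothetical, only a subset of commands is handled, and escape sequences in data are not decoded.

```python
def read_parser_output(lines):
    """Convert command lines into a list of events plus a conformance flag.

    Each event is a (kind, value, attributes) tuple. Attribute lines are
    collected and attached to the next element start.
    """
    events, pending_attrs, conforming = [], {}, False
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        code, rest = line[0], line[1:]
        if code == "A":
            # Assumed layout: attribute name, a space, then type and value.
            name, _, value = rest.partition(" ")
            pending_attrs[name] = value
        elif code == "(":
            events.append(("start", rest, pending_attrs))
            pending_attrs = {}
        elif code == ")":
            events.append(("end", rest, None))
        elif code == "-":
            events.append(("data", rest, None))
        elif code == "C":
            conforming = True
        # Any other command is ignored in this sketch.
    return events, conforming


# Hypothetical sample, written by hand in the assumed format.
SAMPLE = """\
AID CDATA intro
(SECTION
(TITLE
-Parsers
)TITLE
)SECTION
C
"""

if __name__ == "__main__":
    events, conforming = read_parser_output(SAMPLE.splitlines())
    for event in events:
        print(event)
    print("conforming:", conforming)
```

Keeping the reader separate from the parser in this way means the same downstream code can be reused whichever parser produces the event stream, provided the output is converted to the same form.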