[Mirrored from: http://www.ltg.hcrc.ed.ac.uk/projects/nsl/sgml-europe/workshop.html - "A software architecture for simple, efficient SGML applications"]
(A workshop presented at SGML Europe '96, Munich)
This workshop expands on the topic presented in session B on the LT NSL SGML API. Here the LT NSL API is presented in more detail, with examples of how it can be used. We will compare it to other approaches to the development of SGML applications, particularly where large volumes of data need to be pipelined through a number of applications.
We define semi-valid SGML to be a sequence of text and/or SGML elements, each of which is a valid subtree of a named cached DTD. For example, if the DTD defines an element <name>, then a file consisting of a sequence of <name> elements is semi-valid SGML, even though it may not be valid SGML. The reason for this increased flexibility is to allow the output of tools which select parts of the input stream to be processed by subsequent NSL programs without requiring an explicit change of DTD.
A file is in Normalised SGML (nSGML) format if it satisfies the
following conditions:
nSGML is used as a format for communicating data
between different NSL programs.
NSL Corpus components can be
hyper-documents, with low-density (i.e. above the token level)
annotation being expressed indirectly in terms of links. In the first
instance, this is constrained to links with an
incorporation semantics, that is, situations where
element content at one level of one document is entirely composed of
elements from another document. Suppose, for example, we had already
segmented the 3rd component of an English corpus, resulting in a single
document d marked up with TEI-compliant headers
and paragraph marking, and with the segmentation marked with <w>
tags:
The output of a phrase-level segmentation might then be stored as
follows:
Linking is specified using one of the available TEI mechanisms;
the details are not relevant here, suffice it to say that doc=file1
resolves to d and establishes a default for
subsequent links. At a minimum, links are able to target single
elements or sequences of contiguous elements. The NSL implements a
textual inclusion semantics for such links, inserting
the referenced material as the content of the element bearing the
linking attributes. Note that although the example above shows links
to only one document, this is an artefact of the example; it is possible to link
to several documents, e.g. to a word document and a lexicon document:
Note that the architecture is recursive, in that
e.g. sentence-level segmentation could be expressed in terms of links
into the phrase-level segmentation as presented above.
The data architecture needs to address not only multiple levels of
annotation but also alternative versions at a given level. Since our
linking mechanism uses the SGML entity mechanism to implement the
identification of target documents, we can use the entity manager's
catalogue as a means of managing versions. For our example above,
this means that the connection between the phrase encoding document
and the segmented document would be made in two steps: the phrase document
would use a PUBLIC identifier, which the catalogue would map to the
particular file d. Since catalogue entries are
interpreted by tools as local to the directory where the catalogue
itself is found, this means that binding together groups of
alternative versions can be easily achieved by storing them under the
same directory.
Subdirectories with catalogue fragments can thus be used to represent
both increasing detail of annotation and alternatives at a given level
of annotation.
Note also that with a modest extension of functionality, it is
possible to use the data architecture described here to implement
patches, e.g. to the tokenisation process. If, alongside the inclusion
semantics, we have a special empty element <repl> which is replaced by the range it points to, we can produce a patch file,
e.g. for a misspelled word, as follows (irrelevant details omitted):
Whether such a patch would have knock-on effects on higher levels of
annotation would depend, inter alia, on whether a change in
tokenisation crossed any higher-level boundaries.
Should entities (especially character entities) be expanded by the
stream interface before they are handed to the tool, and if so, should
they be replaced in the stream before they are output?
Consider the following fragment of SGML marked-up text:
Let us assume further that the character set is 7-bit ASCII, and
therefore that there is no expansion of the entity &ccedil;.
Which, if any, of these entities should be expanded? There are
some properties of these entities that a tool will need to know about
if it is to do its job. For example, it may need to know that
&ccedil; is a lower-case letter, that &amp; is an
ampersand sign, and that &Xerox; is a string which expands to a
word, or a string of words. If markup is to be preserved throughout
the process of, for example, a tokeniser and a word segmenter, then
clearly these are all potential problems.
We therefore propose that tools must know about character entities such as &ccedil; and &amp;.
These will not be expanded, and can be reincorporated into the markup
by the stream interface. All other entities, whether or not they
contain markup, will be expanded and lost. This is necessary since
markup may need to be added within their expansions (e.g. if
&Xerox; expanded to ``The Rank Xerox Corporation'', then word
markup would have to be added within the entity). In general, if a
string entity is expanded, then markup started outside the entity may
have to terminate inside it. This would be obscure, and in any case is
not licensed by SGML under its ``obfuscatory markup'' rules.
The current implementation (version 1.4.4) achieves this by
expanding all entity references except those of type SDATA (assumed to be un-coded
characters in the document character set) or external entities with
explicit NOTATION declarations.
There are two common ways of looking at SGML documents: first, as a
linear stream of text with embedded markup tags, and second, as a
hierarchic tree structure. Both of these views can be useful in some
circumstances, and accordingly the NSL API provides data structures
and access functions to support both.
To view SGML documents as a linear stream of text plus markup, we
use the NSL_Bit data structure. An NSL_Bit is either an SGML start tag, an end tag, an
empty tag, an SGML processing instruction or text with no SGML
elements in it. For example, the SGML text:
would be split into the following NSL_Bits
(one per line).
The NSL function GetNextBit reads the next
NSL_Bit from an nSGML file. The function PrintBit writes NSL_Bits to an
output nSGML file.
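As a minimal illustration of the stream view, the following sketch copies an nSGML stream from stdin to stdout one NSL_Bit at a time. It is built only from the calls mentioned above and those used in simple.c below; in particular, the signature PrintBit(file, bit) is an assumption made by analogy with PrintItem.

  #include "nsl.h"

  /* Sketch: an identity filter over the stream view.  Every NSL_Bit
     (start tag, end tag, empty tag, processing instruction or text)
     is read, written out unchanged and then freed. */
  int main(void) {
    NSL_Bit *bit;
    NSL_File inf, outf;
    NSL_Doctype dct;

    NSLInit(0);
    inf = SFFopen(stdin, NULL, NSL_read, "");   /* doctype read from the <?NSL DDB ...> PI */
    dct = DoctypeFromFile(inf);
    outf = SFFopen(stdout, dct, NSL_write_normal, "");
    while ((bit = GetNextBit(inf))) {
      PrintBit(outf, bit);                      /* assumed signature, by analogy with PrintItem */
      FreeBit(bit);                             /* the caller frees each bit */
    }
    SFclose(outf);
    return 0;
  }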
In order to view an SGML document as a hierarchic structure, the
NSL API constructs a C data structure made up of NSL_Item and NSL_Data data
structures which mirrors the tree structure of the document.
The NSL_Item type describes an SGML element
and all its contents in a document, i.e. it represents a complete
subtree of the document structure. Thus, in the above example,
there would be one NSL_Item for the <P>
element, with a nested NSL_Item for the
<NAME> element.
The NSL_Data data structure represents a chunk
of SGML element content, i.e. either an SGML element or a piece of
text without element structure. NSL_Data
structures for the contents of an NSL_Item are
organised into a linked list: a mixture of NSL_Items
and text for mixed content, or NSL_Items only for
element-only content. In the above example, the NSL_Item corresponding to <P> will point to five
NSL_Data structures corresponding to ``This
... name,'', <NAME>, ``and ... break'', <PB>, and ``in
it.''
The resulting data structure is shown in Figure 1. This model application program (simple) has been
written to demonstrate the use of the NSL API. The simple program reads an nSGML file containing
paragraph and word markup. It assumes that each word element has an
attribute which contains part of speech (POS) information. The program
then outputs a modified version of the input file where the text of
each word element has been replaced by some text which shows the word
and the POS tag associated with the word. For example, if the input
file looks like:
then the output file will look like:
Simple.c is not intended to be a particularly
useful program; rather, it is an example of the use of the NSL API.
The program can be called as follows:
The source of the simple program can be found
in the NSL release file.
The annotated code of the simple program is as
follows:
Include header file for character functions.
Include header file for string functions.
Include the header file for the NSL API.
Main program.
Initialise the NSL SGML API.
Read the command line options.
The optional -d option allows one to specify
a .ddb file externally. If not given, then we assume that
the input file contains a <?NSL DDB ... > processing instruction.
Open the input nSGML file, passing in the doctype declaration
dct if any has been specified by a -d option. If
the doctype declaration was not specified by a -d option, then it will
be set by reading from the input file on opening.
Open the output nSGML stream using the same doctype declaration.
Get the unique names of the elements we care about.
Loop round, reading bits of the SGML input
text. A bit is either a single piece of SGML markup (a start tag, an end tag,
an empty tag, or a processing instruction) or it is text without SGML element
markup.
Now we switch on the type of the NSL_Bit we have just read.
Case 1: We have found the start tag for an SGML element. Note
that the item value of this NSL_Bit is of type NSL_inchoate, meaning that unless you call ItemParse on it, it contains
only the start tag information and no content.
If we are inside a <text> element, then note this fact.
If we are inside a paragraph (<P>) inside text, then note this fact.
We have found a word in a text paragraph. Note this fact and
save the POS tag, which is the value of the SGML attribute
(named by the tagAttr variable) of the <W> element
that we just found.
In any case, we fall through to the next case
to print out the NSL_Bit we just read.
Case 2: We have found an empty SGML element.
There is no point in comparing labels of empty items. Note that
PrintItem only prints a start tag for empty or
inchoate items (not an end tag), but puts the output stream into a
different state depending on which it was.
Case 3: If we are inside the text of a word element, then
strip off trailing whitespace, and write the word and its POS in
a user defined format. Note that we use PrintText, rather
than printf to
actually write the string to the output file. This is in order
to keep the nSGML output state up-to-date.
Case 4: We have found an SGML end tag, so we keep track of where we are
in the SGML markup tree.
Note: processing instructions are not dealt with by the above code. If this
is required, then a new case should be added with the label
NSL_pi_bit.
NSL_Bits are fixed but their contents are not freed
unless we do it ourselves.
At the very end we need to close the output nSGML stream.
It will be noted that in the above program (simple.c), a
fairly large part of the code is devoted to implementing a finite-state
machine which keeps track of where we are in the SGML document
structure; the NSL API query functions described below provide support for tracking such
positions. The reason that this support was not used in simple.c is that, for simple
cases, it is much more efficient to spell out this processing
explicitly rather than to call the API query processing mechanism,
which, by necessity, must be able to deal with the general case. If
however, ease of programming is more important than processing speed
or you need to handle a wider range of document structures, then the
NSL query processing functions may provide what you need.
NSL queries are a way of specifying particular nodes in the
SGML document structure. Queries are coded as strings which
give a (partial) description of a path from the root of the
SGML document (top-level element) to the desired
SGML element(s). For example, the query ".*/TEXT/.*/P"
describes any <P> element which occurs anywhere (at any level of
nesting) inside a <TEXT> element which, in turn, can occur
anywhere inside the top-level document element.
A query is basically a path consisting of terms separated by ``/'', where each
term describes an SGML element. The syntax of queries is as follows:
A condition with an index matches only the index'th sub-element of the
enclosing element. Index counting starts from 0, so the first
sub-element is numbered 0. Conditions with indices and atests only match if
the index'th sub-element also satisfies the atests. Attribute tests
are not exhaustive, i.e. P[rend=it] will match <P n=45 rend=it> as
well as <P rend=it>. They will match against both explicitly present and
defaulted attribute values, using string equality. Bare anames are
satisfied by any value, explicit or defaulted. Matching of
queries is bottom-up, deterministic and shortest-first.
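As a minimal usage sketch (built only from the calls that appear in simpleq.c below, and therefore making the same assumptions about their signatures), the following program copies its input to its output while counting the <P> elements matched by the query ``.*/TEXT/.*/P'':

  #include <stdio.h>
  #include "nsl.h"

  /* Sketch: count the elements matching a query while copying the
     document through unchanged.  Non-matching material is written to
     the output by GetNextQueryItem; matched items are printed (and
     freed) explicitly, as in simpleq.c. */
  int main(void) {
    NSL_File inf, outf;
    NSL_Doctype dct;
    NSL_Query qu;
    NSL_Item *item;
    int n = 0;

    NSLInit(0);
    inf = SFFopen(stdin, NULL, NSL_read, "");
    dct = DoctypeFromFile(inf);
    outf = SFFopen(stdout, dct, NSL_write_normal, "");
    qu = ParseQuery((char *)".*/TEXT/.*/P");    /* any <P> anywhere inside a <TEXT> */
    while ((item = GetNextQueryItem(inf, qu, outf))) {
      n++;                                      /* this subtree matched the query */
      PrintItem(outf, item);                    /* copy the matched subtree to the output */
      FreeItem(item);                           /* each matched item must be freed */
    }
    SFclose(outf);
    fprintf(stderr, "%d matching <P> elements\n", n);
    return 0;
  }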
In this section we show some examples of NSL queries, assuming the
following DTD.
The SGML structure of a sample document which uses this DTD is shown in Figure 2.
The query: "CORPUS/DOC/TITLE/s"
means all s elements directly under TITLE's directly under DOC. This
is shown graphically in
Figure 3.
The query: "CORPUS/DOC/./s"
means all s's directly under anything directly under DOC, as
shown in
Figure 4.
"CORPUS/DOC/.*/s"
means all s's anywhere underneath DOC. ``.*'' can be thought of as
standing for all finite sequences of ``.''. For the example document
structure this means the same as CORPUS/DOC/./s, but in more nested
structures this would not be the case. An alternative way of
addressing the same sentences would be to specify .*/s as query.
We also provide a means of specifying the Nth node in a
particular local tree. So "./.[1]/.[2]/.[0]" means the 1st element
below the 3rd element below the 2nd element in a stream of
elements, as shown in Figure 5.
This is also the referent of
"CORPUS/DOC[1]/BODY[2]/s[0]"
assuming that all our elements are s's under BODY under DOC, which
illustrates the combination of positions and types.
".*/BODY/s[0] "
refers to the set of the first elements under any BODY which are also s's.
The referent of this is shown in
figure 6
Additionally, we can refer to attribute values in the square
brackets: ".*/s/w[0 rend=lc]" gets the initial elements under any
<s> element so long as they are words with rend=lc (perhaps
lower case words starting a sentence).
As will be obvious from the preceding description, the query language is
designed to provide a small set of orthogonal features. Queries which depend
on knowledge of prior context, such as ``the third element after the
first occurrence of a sentence having the attribute quotation'' are not
supported. It is however possible for tools to use the lower-level
API to find such items if desired. The reason for the limitation is
that without it the search engine might be obliged to keep potentially
unbounded amounts of context.
The following program simpleq.c shows how the NSL API query functions
can be used.
Include header file for character functions.
Include header file for string functions.
Include the header file for the NSL API.
Main program.
Initialise the NSL SGML API.
Read the command line options.
The optional -d option allows one to specify
a .ddb file externally. If not given, then we assume that
the input file contains a <?NSL DDB ... > line.
Open the input nSGML file, passing in the doctype declaration
dct if any has been specified by a -d option. If
the doctype declaration was not specified by a -d option, then it will
be set by reading from the input file on opening.
Open the output nSGML stream using the same doctype declaration.
Construct a query, which looks for words anywhere inside paragraphs
anywhere inside a text.
Read items of the SGML input text. When we find an item which
matches the query we execute the body of the while loop. Items which
do not match are written to the output stream by GetNextQueryItem. Each call of GetNextQueryItem creates a new item, which it is the
responsibility of the programmer to free once it has been used.
If we are inside the text of a word element, then strip off trailing
whitespace (in the inner while loop), and write the word and its
part of speech in a user defined format. Note that we use PrintItem to write the modified item to the output file. This is
in order to keep the nSGML output state up-to-date. Note that the
code here assumes that each word element contains only text and no
embedded SGML markup; more complex code would be needed to cope with
embedded markup.
NSL_Items are fixed but their contents are not freed
unless we do it ourselves.
At the very end we need to close the output nSGML stream.
We don't bother to explicitly free the NSL_Query since there was only
one created.
Compound Annotation and Links
. . .
<p id=en.c3.p4>
<w id=en.c3.p4.w1>
Time
</w>
<w id=en.c3.p4.w2>
flies
</w>
<w id=en.c3.p4.w3>
.
</w>
</p>
. . .
. . .
<p id=en.c3.p4>
<phr id=en.c3.p4.ph1 type=n doc=file1 from='id en.c3.p4.w1'>
</phr>
<phr id=en.c3.p4.ph2 type=v from='id en.c3.p4.w2'>
</phr>
</p>
<word>
<source doc=file1 from='id en.c3.p4.w1'>
</source>
<lex doc=lex1 from='id en.lex.40332'>
</lex>
</word>
Versions
<nsl>
<repl doc=original from='id hdr1'>
<!-- to get the original header-->
<text>
<repl from='id p1' to='id p324'>
<!-- the first swatch of unchanged text -->
<p id=p325>
<repl from='id p325.t1' to='id p325.t15'>
<!-- more unchanged text -->
<corr sic='procede' resp='ispell'>
<!-- the correction itself -->
<token id=p325.t16>
proceed
</token></corr>
<repl from='id p325.t17' to='id p325.t96'>
<!-- more unchanged text-->
</p>
<repl from='id p326' to='id p402'>
<!-- the rest of the unchanged text-->
</text>
</nsl>
Entities in nSGML
<s>Fran&ccedil;ois Martin said yesterday
that the following companies announced quarterly
results: IBM; AT&amp;T; &Xerox;</s>
Overview of the NSL API
Stream View
<P> This is some text with a name, <NAME>Fred Bloggs</NAME>
and a page break <PB> in it. </P>
<P> type=NSL_start_bit
This is some text with a name, type=NSL_text_bit
<NAME> type=NSL_start_bit
Fred Bloggs type=NSL_text_bit
</NAME> type=NSL_end_bit
and a page break type=NSL_text_bit
<PB> type=NSL_empty_bit
in it. type=NSL_text_bit
</P> type=NSL_end_bit
Tree View
NSL API data structure
Figure 1
<name>David<surname>McKelvie</surname></name>
simple.c -- A model NSL application
<HEADER>blah blah</HEADER>
<TEXT><P>
<W TYPE=det>The</W>
<W TYPE=nn>cat</W>
</P></TEXT>
<HEADER>blah blah</HEADER>
<TEXT><P>
<W TYPE=det>The/det</W>
<W TYPE=nn>cat/nn</W>
</P></TEXT>
simple [options] nsgmlfile
------- ---------
Allowed options (all of which are optional) are:
-d The name of the cached DOCTYPE .ddb file
-t name of attribute containing the POS information
(default TYPE)
-w name of word element
(default W)
-f print format for output words and their POS tags
(default ``%s/%s'')
#include <ctype.h>
#include <string.h>
#include "nsl.h"
int main(int argc, char **argv) {
NSL_Bit *bit;
NSL_File inf, outf;
NSL_Doctype dct=NULL;
char *paraLabel,*wordLabel,*textLabel,*ptr,*label,*tagVal=NULL,buf[100];
int in_para=0,in_text=0,ac=1,in_word=0,len;
/* defaults for command line arguments */
/* Name of attribute carrying tag -- set with -t */
char* tagAttr=(char*)"TYPE";
/* Name of word element -- set with -w */
char* wordTag=(char*)"W";
/* Format string for word, tag -- set with -f */
char* textFormat=(char*)"%s/%s";
NSLInit(0);
while (ac<argc) {
if (STREQ(argv[ac], "-d")) {
dct=DoctypeFromDdb(argv[++ac]);
ac++;
}
else if (STREQ(argv[ac], "-t")) {
ptr=tagAttr=argv[++ac];
ac++;
/* need upper case for attribute comparison */
while (*ptr) {
*ptr=toupper(*ptr);
ptr++;
};
}
else if (STREQ(argv[ac], "-w")) {
ptr=wordTag=argv[++ac];
ac++;
/* need upper case for tag comparison */
while (*ptr) {
*ptr=toupper(*ptr);
ptr++;
};
}
else if (STREQ(argv[ac], "-f")) {
textFormat=argv[++ac];
ac++;
}
else {
break;
};
};
inf=SFFopen(stdin, dct, NSL_read,"");
dct=DoctypeFromFile(inf);
outf=SFFopen(stdout, dct, NSL_write_normal,"");
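/* get unique names for the elements we care about; these let the
   label comparisons below use == */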
textLabel=ElementUniqueName(dct,(char*)"TEXT",4);
paraLabel=ElementUniqueName(dct,(char*)"P",1);
wordLabel=ElementUniqueName(dct,wordTag,0); /* length will be computed */
while ((bit=GetNextBit(inf))) {
switch (bit->type) {
case NSL_start_bit:
if((label=bit->label)==textLabel) {
in_text=1;
}
else if (in_text && label==paraLabel) {
in_para=1;
}
else if (in_para && label==wordLabel) {
in_word=1;
tagVal=GetAttrVal(bit->value.item,tagAttr);
};
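/* no break: fall through so that the start tag is printed by PrintItem below */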
case NSL_empty_bit:
PrintItem(outf, bit->value.item);
break;
case NSL_text_bit:
if (in_word) {
len=strlen(bit->value.body);
while (len>0 && strchr((char*)" \t\n",bit->value.body[len-1])) {
bit->value.body[--len]='\000';
};
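/* format the word and its POS tag; assumes the result fits in buf[100] */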
sprintf(buf,textFormat,bit->value.body,tagVal);
PrintText(outf,buf);
}
else {
/* text in some other context -- just print it */
PrintText(outf, bit->value.body);
}
break;
case NSL_end_bit:
if (in_para) {
if (bit->label==paraLabel) {
/* no longer in para */
in_para=0; /* NOTA BENE assume no nested para's! */
}
else if (bit->label==wordLabel) {
/* no longer in word */
in_word=0;
};
};
/* print it no matter what */
PrintEndTag(outf,bit->label);
break;
default:
SHOULDNT;
}; /* end switch */
FreeBit(bit);
}; /* end while */
SFclose(outf);
return 0;
}
NSL queries
<query>  := <term> ( '/' <term> )*
<term>   := <GI> <cond>? '*'?
<GI>     := <elementName> | '.'
<cond>   := '[' ( <index> | <atests> | <index> <atests> ) ']'
<index>  := <number>
<atests> := <atest> ( ' ' <atest> )*
<atest>  := <aname> ( '=' <aval> )?
That is, a query is a sequence of terms, separated by ``/''. Each term
describes either an SGML element or a nested sequence of SGML
elements. A term is given by an SGML element name, optionally
followed by a list of attribute specs (in square brackets), and
optionally followed by a ``*''. A term which ends in a ``*'' matches
a nested sequence of any number of SGML elements, including zero,
each of which matches the term without the ``*''. For example ``P*''
will match a <P> element, arbitrarily deeply nested inside other
<P> elements. The special GI ``.'' will match any SGML element
name. Thus, a common way of finding a <P> element anywhere inside
a document is to use the query ``.*/P''. Aname (attribute name) and
aval (attribute value) are as per SGML.
Examples of NSL queries
<!ELEMENT CORPUS - - (DOC+)>
<!ELEMENT DOC - - (DOCNO,TITLE,BODY,IT,NI) >
<!ELEMENT DOCNO - - (#PCDATA) >
<!ELEMENT TITLE - - (s+) >
<!ELEMENT BODY - - (s+) >
<!ELEMENT IT - - (#PCDATA) >
<!ELEMENT NI - - (#PCDATA) >
<!ELEMENT s - - (#PCDATA|w)* >
<!ELEMENT w - - (#PCDATA) >
<!ATTLIST BODY id ID #IMPLIED >
<!ATTLIST IT id ID #IMPLIED>
<!ATTLIST w rend CDATA #IMPLIED>
The hierarchical structure of an example document.
Figure 2
CORPUS/DOC/TITLE/s
CORPUS/DOC/.*/s
./.[1]/.[2]/.[0]
.*/BODY/s[0]
simpleq.c -- A model NSL application using queries
#include <ctype.h>
#include <string.h>
#include "nsl.h"
int main(int argc, char **argv) {
NSL_File inf, outf;
NSL_Doctype dct=NULL;
NSL_Query qu;
NSL_Item *item;
char *ptr, *tagVal=NULL,buf[100],qustr[100];
int ac=1,len;
/* defaults for command line arguments */
/* Name of attribute carrying tag -- set with -t */
char* tagAttr=(char*)"TYPE";
/* Name of word element -- set with -w */
char* wordTag=(char*)"W";
/* Format string for word, tag -- set with -f */
char* textFormat=(char*)"%s/%s";
NSLInit(0);
while (ac<argc) {
if (STREQ(argv[ac], "-d")) {
dct=DoctypeFromDdb(argv[++ac]);
ac++;
}
else if (STREQ(argv[ac], "-t")) {
ptr=tagAttr=argv[++ac];
ac++;
/* need upper case for attribute comparison */
while (*ptr) {
*ptr=toupper(*ptr);
ptr++;
};
}
else if (STREQ(argv[ac], "-w")) {
ptr=wordTag=argv[++ac];
ac++;
/* need upper case for tag comparison */
while (*ptr) {
*ptr=toupper(*ptr);
ptr++;
};
}
else if (STREQ(argv[ac], "-f")) {
textFormat=argv[++ac];
ac++;
}
else {
break;
};
};
inf=SFFopen(stdin, dct, NSL_read,"");
dct=DoctypeFromFile(inf);
outf=SFFopen(stdout, dct, NSL_write_normal,"");
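/* build the query: word elements anywhere inside <P>s anywhere inside a <TEXT> */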
strcpy(qustr,".*/TEXT/.*/P/.*/");
strcat(qustr,wordTag);
qu=ParseQuery(qustr);
while( ( item=GetNextQueryItem(inf, qu, outf ) ) ) {
len=strlen(item->data->first);
while (len>0 && strchr((char*)" \t\n",item->data->first[len-1])) {
item->data->first[--len]='\000';
};
tagVal=GetAttrStringVal(item,tagAttr);
sprintf(buf,textFormat,item->data->first,tagVal);
item->data->first = buf;
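/* point the item's text at the formatted buffer so that PrintItem
   writes the word/POS form; the original text pointer is simply overwritten */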
PrintItem(outf, item);
FreeItem(item);
}; /* end while */
SFclose(outf);
return 0;
}