SGML: RS/RE Processing
Subject: Re: Treatment of RS/RE
Date: 30 Sep 1996 10:29:31 -0700
From: jenglish@crl.com (Joe English)
Newsgroup: comp.text.sgml
Chris Arena <ccarena@pims01.psf.lmco.com> wrote:
>
>I am having some trouble understanding the rules for treatment of RS/RE
>in SGML.
So are we all...
>Is there a good text, thread, or FAQ which unambiguously explains the
>rules?
=========================================================================
Michael Sperberg-McQueen, Charles Goldfarb and James Clark came up with a
very good summary of the rules on the w3c-sgml mailing list a while ago:
| RE is insignificant (i.e. not passed to any downstream application,
| not part of the grove plan) when it occurs in any of the following
| patterns:
|
| start-tag nondata* RE
| RE nondata* end-tag
| RS nondata+ RE
|
| where nondata is defined as:
|
| nondata ::=
|       comment declaration
|     | processing instruction
|     | marked section declaration start
|     | marked section end
|     | included subelement
|     | shortref use declaration
|     | link set use declaration
|
| marked section declaration start ::=
|       marked section start
|     , status keyword specification
|     , dso
Note that the determination of when an RE is insignificant
is one of the last phases of parsing: it takes place after
all entity, character, and short references have been replaced
and after omitted start- and end-tags have been inferred.
The first pattern:
| start-tag nondata* RE
removes the first record-end in an element's content;
the pattern:
| RE nondata* end-tag
removes the last. The third pattern:
| RS nondata+ RE
discards record-ends from input lines containing nothing
but "nondata".
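
To make the three patterns concrete, here is a rough sketch in Python.
It is not a parser and not anything taken from ISO 8879; it only
matches the patterns above against a stream that has already been
tokenized. The token names ("start-tag", "end-tag", "RS", "RE",
"nondata", "data") and the function name insignificant_res are
made up for illustration. It also assumes that an RS sitting between
an RE and a following end-tag does not block the second pattern,
since the RS is itself discarded.

```python
def insignificant_res(kinds):
    """Return the indices of RE tokens matched by any of the three patterns.

    `kinds` is a list of hypothetical token kinds: "start-tag", "end-tag",
    "RS", "RE", "nondata", or "data".  Entity references are assumed to be
    expanded and omitted tags already inferred.
    """
    drop = set()
    for i, kind in enumerate(kinds):
        if kind != "RE":
            continue

        # Scan left over any run of nondata tokens.
        j = i - 1
        while j >= 0 and kinds[j] == "nondata":
            j -= 1
        left = kinds[j] if j >= 0 else None
        skipped = (i - 1) - j  # how many nondata tokens were passed over

        # Pattern 1: start-tag nondata* RE
        if left == "start-tag":
            drop.add(i)
        # Pattern 3: RS nondata+ RE  (at least one nondata token required)
        if left == "RS" and skipped >= 1:
            drop.add(i)

        # Scan right over nondata.  ASSUMPTION: RS is skipped as well,
        # because the end-tag normally sits on the next record.
        k = i + 1
        while k < len(kinds) and kinds[k] in ("nondata", "RS"):
            k += 1
        # Pattern 2: RE nondata* end-tag
        if k < len(kinds) and kinds[k] == "end-tag":
            drop.add(i)
    return drop


# <p>             start-tag RE
# text            RS data RE
# <!-- c -->      RS nondata RE
# more            RS data RE
# </p>            RS end-tag
example = ["start-tag", "RE", "RS", "data", "RE", "RS", "nondata", "RE",
           "RS", "data", "RE", "RS", "end-tag"]
print(sorted(insignificant_res(example)))
```

For this stream the first RE (after the start-tag), the RE on the
comment-only line, and the RE before the end-tag are reported; the RE
after "text" is kept. An empty line (RS immediately followed by RE)
is not matched by the third pattern, because it requires at least one
nondata item.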
--Joe English
jenglish@crl.com