SGML: RS/RE Processing


Subject: Re: Treatment of RS/RE
Date: 30 Sep 1996 10:29:31 -0700
From: jenglish@crl.com (Joe English)
Newsgroup: comp.text.sgml
Chris Arena <ccarena@pims01.psf.lmco.com> wrote:
>
>I am having some trouble understanding the rules for treatment of RS/RE
>in SGML.

So are we all...

>Is there a good text, thread, or FAQ which unambiguously explains the
>rules?

=========================================================================

Michael Sperberg-McQueen, Charles Goldfarb and James Clark
came up with a very good summary of the rules on the
w3c-sgml mailing list a while ago:

| RE is insignificant (i.e. not passed to any downstream application,
| not part of the grove plan) when it occurs in any of the following
| patterns:
|
|     start-tag nondata* RE
|     RE nondata* end-tag
|     RS nondata+ RE
|
| where nondata is defined as:
|
|     nondata ::=
|           comment declaration
|         | processing instruction
|         | marked section declaration start
|         | marked section end
|         | included subelement
|         | shortref use declaration
|         | link set use declaration
|
|     marked section declaration start ::=
|           marked section start
|         , status keyword specification
|         , dso

Note that the determination of when an RE is insignificant
is one of the last phases of parsing; it takes place after
all entity, character, and short references have been replaced
and after omitted start- and end-tags have been inferred.

The first pattern:

|     start-tag nondata* RE

removes the first record-end in an element's content;
the pattern:

|     RE nondata* end-tag

removes the last. The third pattern:

|     RS nondata+ RE

discards record-ends from input lines containing nothing
but "nondata".


--Joe English

  jenglish@crl.com
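
As a rough illustration of the three patterns in the summary above, here is a
minimal Python sketch that matches them literally against a hand-built token
list. The token model, the kind names ("start-tag", "end-tag", "RS", "RE",
"nondata", "data") and the function name insignificant_re_indexes are all
hypothetical and come from no SGML parser. The sketch assumes that references
have already been replaced and omitted tags inferred, that each nondata item
(including a whole included subelement) has been collapsed into a single
"nondata" token, and that any RS tokens not relevant to the third pattern have
already been dropped from the list.

```python
from typing import List, Set, Tuple

# Hypothetical token model: (kind, text), where kind is one of
# "start-tag", "end-tag", "RS", "RE", "nondata", "data".
Token = Tuple[str, str]


def insignificant_re_indexes(tokens: List[Token]) -> Set[int]:
    """Return the indexes of RE tokens that match one of the three patterns."""
    kinds = [kind for kind, _ in tokens]
    found: Set[int] = set()
    for i, kind in enumerate(kinds):
        if kind != "RE":
            continue
        # Walk left over any run of nondata tokens.
        j = i - 1
        while j >= 0 and kinds[j] == "nondata":
            j -= 1
        nondata_before = (i - 1) - j
        before = kinds[j] if j >= 0 else None
        # Walk right over any run of nondata tokens.
        k = i + 1
        while k < len(kinds) and kinds[k] == "nondata":
            k += 1
        after = kinds[k] if k < len(kinds) else None

        if before == "start-tag":                     # start-tag nondata* RE
            found.add(i)
        elif after == "end-tag":                      # RE nondata* end-tag
            found.add(i)
        elif before == "RS" and nondata_before >= 1:  # RS nondata+ RE
            found.add(i)
    return found


# Hand-built example stream (an assumption, not the output of a real parser).
EXAMPLE: List[Token] = [
    ("start-tag", "<p>"),
    ("RE", "\n"),               # index 1: first pattern
    ("data", "Hello"),
    ("RE", "\n"),               # index 3: matches nothing, stays significant
    ("RS", ""),
    ("nondata", "<!-- note -->"),
    ("RE", "\n"),               # index 6: third pattern
    ("data", "world"),
    ("RE", "\n"),               # index 8: second pattern
    ("nondata", "<?pi>"),
    ("end-tag", "</p>"),
]

print(sorted(insignificant_re_indexes(EXAMPLE)))  # → [1, 6, 8]
```

The third pattern uses nondata+ rather than nondata*, so in this sketch an RE
that directly follows an RS with no nondata in between is not discarded by
that pattern.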