Cover Pages Logo SEARCH
Advanced Search
ABOUT
Site Map
CP RSS Channel
Contact Us
Sponsoring CP
About Our Sponsors

NEWS
Cover Stories
Articles & Papers
Press Releases

CORE STANDARDS
XML
SGML
Schemas
XSL/XSLT/XPath
XLink
XML Query
CSS
SVG

TECHNOLOGY REPORTS
XML Applications
General Apps
Government Apps
Academic Apps

EVENTS
LIBRARY
Introductions
FAQs
Bibliography
Technology and Society
Semantics
Tech Topics
Software
Related Standards
Historic

[Non-] deterministic content models


Postings from June, 2001. See the XML-DEV archives for other contributions. Note that RELAX NG does not share assumptions of SGML/XML "DTD" syntax which imposes some unusual constraints (and notions of ambiguity).


Date: Thu, 14 Jun 2001 09:30:27 -0700
From: Joe English <jenglish@flightlab.com>
To: xml-dev@lists.xml.org
Subject: Re: Non-deterministic content model

Arthur Rother wrote:

> I came across this one in my first lessons on learning DTD's, where I tried
> to write a DTD for a game of chess:
>
>          ((whitemove, blackmove)*, whitemove?)

That's a neat example.  It's well-known that ((a,b)*,a?)
is hopelessly ambiguous (i.e., there's no way to write
it as deterministic content model), but I've never
seen a real-world example where that construct was
useful before now.

> If this is the only endless case, one could hardcode this in the parser.

It's not -- there's an infinite number of "hopelessly ambiguous"
regular languages.

> How do [XML Spy and XMetal] deal with it?

There are plenty of algorithms for regular expression
matching besides the one prescribed for SGML.
You can convert to a DFA (lex), backtrack (Perl),
use derivatives (TREX), or simulate an NDFA,
to name a few.

The only real problem with nondeterministic
content models in SGML has to do with start-tag
inference (and even that can be worked around).
Since XML doesn't support this feature, it's
not as big of a problem for implementations.


--Joe English

  jenglish@flightlab.com


Date: Fri, 15 Jun 2001 14:08:18 +0100
From: Francis Norton <francis@redrice.com>
To: Jeni Tennison <mail@jenitennison.com>
Cc: xml-dev@lists.xml.org
Subject: Re: Non-deterministic content model

Jeni Tennison wrote:
> 
> RELAX NG doesn't require deterministic patterns; the following
> schema should do it:
> 
<snip/>

Yes, I got this to work with James Clark's jing
(http://www.thaiopensource.com/relaxng/):

<grammar xmlns="http://relaxng.org/ns/structure/0.9">
   <start>
      <element name="game">
         <element name="white">
            <ref name="move"/>
         </element>
         <zeroOrMore>
            <element name="black">
               <ref name="move"/>
            </element>
            <element name="white">
               <ref name="move"/>
            </element>
         </zeroOrMore>
         <optional>
            <element name="black">
               <ref name="move"/>
            </element>
         </optional>
      </element>
   </start>
   <define name="move">
      <text/>
   </define>
</grammar>

The grammar expanded a bit because you can't simply say <element
name="white/>, you have to put in some kind of content description, even
if only <text/> or <empty/>, so I created a common type, move. 
 
> XML Schema forces deterministic model groups, so you can't use that.
>
Interesting.

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
elementFormDefault="qualified">
   <xs:element name="game">
      <xs:complexType>
         <xs:sequence>
            <xs:element ref="white"/>
            <xs:sequence minOccurs="0" maxOccurs="unbounded">
               <xs:element ref="black"/>
               <xs:element ref="white"/>
            </xs:sequence>
            <xs:element ref="black" minOccurs="0"/>
         </xs:sequence>
      </xs:complexType>
   </xs:element>
   <xs:element name="black">
      <xs:complexType/>
   </xs:element>
   <xs:element name="white">
      <xs:complexType/>
   </xs:element>
</xs:schema>

is an XML Schema equivalent.

XSV and MSXML4 detect the ambiguous content model, XML Spy 4 beta 1
ignores it and validates correctly.

Francis.


Date: Fri, 15 Jun 2001 14:08:18 +0100
From: Francis Norton <francis@redrice.com>
To: Jeni Tennison <mail@jenitennison.com>
Subject: Re: Non-deterministic content model

Jeni Tennison wrote:
> 
> RELAX NG doesn't require deterministic patterns; the following
> schema should do it:
> 
<snip/>

Yes, I got this to work with James Clark's jing
(http://www.thaiopensource.com/relaxng/):

<grammar xmlns="http://relaxng.org/ns/structure/0.9">
   <start>
      <element name="game">
         <element name="white">
            <ref name="move"/>
         </element>
         <zeroOrMore>
            <element name="black">
               <ref name="move"/>
            </element>
            <element name="white">
               <ref name="move"/>
            </element>
         </zeroOrMore>
         <optional>
            <element name="black">
               <ref name="move"/>
            </element>
         </optional>
      </element>
   </start>
   <define name="move">
      <text/>
   </define>
</grammar>

Prepared by Robin Cover for The XML Cover Pages archive. See SGML/XML Notion of Ambiguity (non-deterministic content models).


Globe Image

Document URL: http://xml.coverpages.org/determinism200106.html