[This local archive copy is from the official and canonical URL, http://www.geocities.com/ResearchTriangle/Lab/6259/XTech99/xtech99.zip; please refer to the canonical source document if possible.]

1. Intersection

1) Source schema 1 in the DTD syntax

This DTD allows <Fnote> in <Preface>, but does not allow <Fnote> in <Body>.

C:\regularity\cross>type a1.dtd

<!ELEMENT DOC     (Preface?, Body)>
<!ELEMENT Preface ((Para|Fnote)*)>
<!ELEMENT Body    (Para*)>
<!ELEMENT Para    (#PCDATA)>
<!ELEMENT Fnote   (#PCDATA)>

2) Source schema 2 in the DTD syntax

This DTD allows <Fnote> in <Body>, but does not allow <Fnote> in <Preface>.

C:\regularity\cross>type a2.dtd

<!ELEMENT DOC     (Preface, Body)>
<!ELEMENT Preface (Para*)>
<!ELEMENT Body    ((Para|Fnote)*)>
<!ELEMENT Para    (#PCDATA)>
<!ELEMENT Fnote   (#PCDATA)>

3) Target schema in the DTD syntax

Here is an intersection DTD, which I constructed by hand. We want to automatically construct such a schema.

C:\regularity\cross>type a3.dtd

<!ELEMENT  DOC     (Preface,Body)>
<!ELEMENT  Preface (Para*)>
<!ELEMENT  Body    (Para*)>
<!ELEMENT  Para    (#PCDATA)>
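
The claim that a3.dtd is the intersection can be spot-checked on a few sample documents. The following is a minimal sketch and not part of the original toolkit: the three DTDs are transcribed by hand into Python regular expressions over child-element names, and the `valid` helper, the `SAMPLES` list and the tuple encoding of documents are assumptions made for illustration.

```python
import re

# Each content model is a regex over space-terminated child names.
A1 = {"DOC": r"(Preface )?Body ", "Preface": r"(Para |Fnote )*",
      "Body": r"(Para )*", "Para": r"(#PCDATA )*", "Fnote": r"(#PCDATA )*"}
A2 = {"DOC": r"Preface Body ", "Preface": r"(Para )*",
      "Body": r"(Para |Fnote )*", "Para": r"(#PCDATA )*", "Fnote": r"(#PCDATA )*"}
A3 = {"DOC": r"Preface Body ", "Preface": r"(Para )*",
      "Body": r"(Para )*", "Para": r"(#PCDATA )*"}

def valid(dtd, node):
    """node is (tag, children); a child is another node or a str of text."""
    tag, children = node
    if tag not in dtd:
        return False
    names = "".join("#PCDATA " if isinstance(c, str) else c[0] + " "
                    for c in children)
    if not re.fullmatch(dtd[tag], names):
        return False
    return all(valid(dtd, c) for c in children if not isinstance(c, str))

SAMPLES = [
    # Preface and Body, no Fnote: valid against both sources.
    ("DOC", [("Preface", [("Para", ["x"])]), ("Body", [("Para", ["y"])])]),
    # No Preface: valid against a1.dtd only.
    ("DOC", [("Body", [])]),
    # Fnote in Preface: valid against a1.dtd only.
    ("DOC", [("Preface", [("Fnote", ["n"])]), ("Body", [])]),
    # Fnote in Body: valid against a2.dtd only.
    ("DOC", [("Preface", []), ("Body", [("Fnote", ["n"])])]),
]

for doc in SAMPLES:
    in_both = valid(A1, doc) and valid(A2, doc)
    # a3.dtd should accept exactly the documents accepted by both sources.
    assert in_both == valid(A3, doc)
    print(in_both)
```

Agreement on four samples does not prove equivalence; it only illustrates what the intersection is expected to accept and reject.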

4) Source schema 1 in the rule syntax

Before we show the construction, we have to introduce another syntax for expressing DTDs. In this syntax, the first DTD is represented as below:

C:\regularity\cross>type a1.rl

<!SCHEMA  (NT-4)>
<!ELEMENT Fnote>
<!ELEMENT Para>
<!ELEMENT Body>
<!ELEMENT Preface>
<!ELEMENT DOC>
<!VARIABLE #PCDATA>
<!RULE [NT-0]  Fnote   (NT-5*)>
<!RULE [NT-1]  Para   (NT-5*)>
<!RULE [NT-2]  Body   (NT-1*)>
<!RULE [NT-3]  Preface   ((NT-1|NT-0)*)>
<!RULE [NT-4]  DOC   ((NT-3|""),NT-2)>
<!RULE [NT-5]  #PCDATA>

Content models reference non-terminals rather than generic identifiers. NT-4 is the start non-terminal, and it has exactly one rule: <!RULE [NT-4] DOC ((NT-3|""),NT-2)>. This rule has "DOC" as its generic identifier and ((NT-3|""),NT-2) as its content model. The non-terminal NT-3 also has exactly one rule, with "Preface" as its generic identifier and ((NT-1|NT-0)*) as its content model.
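
To make the three parts of a rule concrete (non-terminal, generic identifier, content model), here is a small sketch that splits <!RULE ...> lines into those parts. The regular expression and the `parse_rules` helper are assumptions for illustration; the original tools are not described at this level of detail.

```python
import re

# One <!RULE ...> line: [non-terminal], generic identifier, optional content model.
RULE = re.compile(r"<!RULE\s+\[(NT-\d+)\]\s+(\S+)(?:\s+(.*\S))?\s*>")

def parse_rules(text):
    """Return {non-terminal: (generic identifier, content model or None)}."""
    rules = {}
    for line in text.splitlines():
        m = RULE.fullmatch(line.strip())
        if m:
            rules[m.group(1)] = (m.group(2), m.group(3))
    return rules

# The rules of a1.rl, copied from the listing above.
A1_RL = '''
<!RULE [NT-0]  Fnote   (NT-5*)>
<!RULE [NT-1]  Para   (NT-5*)>
<!RULE [NT-2]  Body   (NT-1*)>
<!RULE [NT-3]  Preface   ((NT-1|NT-0)*)>
<!RULE [NT-4]  DOC   ((NT-3|""),NT-2)>
<!RULE [NT-5]  #PCDATA>
'''

rules = parse_rules(A1_RL)
print(rules["NT-4"])  # ('DOC', '((NT-3|""),NT-2)')
print(rules["NT-5"])  # ('#PCDATA', None)
```

Keying the dictionary by non-terminal rather than by generic identifier matters later, because a schema in this syntax may have several rules for the same element.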

5) Source schema 2 in the rule syntax

C:\regularity\cross>type a2.rl

<!SCHEMA  (NT-4)>
<!ELEMENT Fnote>
<!ELEMENT Para>
<!ELEMENT Body>
<!ELEMENT Preface>
<!ELEMENT DOC>
<!VARIABLE #PCDATA>
<!RULE [NT-0]  Fnote   (NT-5*)>
<!RULE [NT-1]  Para   (NT-5*)>
<!RULE [NT-2]  Body   ((NT-1|NT-0)*)>
<!RULE [NT-3]  Preface   (NT-1*)>
<!RULE [NT-4]  DOC   (NT-3,NT-2)>
<!RULE [NT-5]  #PCDATA>

6) Intersection schema

Let us automatically construct the intersection schema.

C:\regularity\cross>rcross a1.ha a2.ha | useful | renum | ha2sch | sch2rl

<!SCHEMA  (NT-4)>
<!ELEMENT DOC>
<!ELEMENT Body>
<!ELEMENT Para>
<!ELEMENT Preface>
<!VARIABLE #PCDATA>
<!RULE [NT-4]  DOC   (NT-3,NT-2)>
<!RULE [NT-2]  Body   (NT-1*|"")>
<!RULE [NT-1]  Para   (NT-0*|"")>
<!RULE [NT-3]  Preface   (NT-1*|"")>
<!RULE [NT-0]  #PCDATA>

Observe that this is equivalent to a3.dtd. Thus, we have successfully created the intersection schema.
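
The equivalence can be seen by substituting each non-terminal with its generic identifier, which is possible here because every element has exactly one rule. The sketch below does that substitution; the `to_dtd` helper is hypothetical, and its simplification of `(X*|"")` to `(X*)` assumes content models of exactly that shape, as in the output above.

```python
import re

# Rules copied from the rcross output above: (non-terminal, tag, content model).
RULES = [
    ("NT-4", "DOC", "(NT-3,NT-2)"),
    ("NT-2", "Body", '(NT-1*|"")'),
    ("NT-1", "Para", '(NT-0*|"")'),
    ("NT-3", "Preface", '(NT-1*|"")'),
]
PCDATA = "NT-0"

def to_dtd(rules, pcdata):
    tags = [tag for _, tag, _ in rules]
    if len(set(tags)) != len(tags):
        # Two rules for one element: direct translation is not possible.
        raise ValueError("several rules share a generic identifier")
    name = {nt: tag for nt, tag, _ in rules}
    name[pcdata] = "#PCDATA"
    dtd = {}
    for _, tag, model in rules:
        # X* already matches the empty sequence, so the |"" branch is redundant.
        model = model.replace('*|"")', "*)")
        # Replace every non-terminal by the generic identifier of its rule.
        model = re.sub(r"NT-\d+", lambda m: name[m.group()], model)
        # DTDs write character data as (#PCDATA), without a star.
        dtd[tag] = model.replace("#PCDATA*", "#PCDATA")
    return dtd

for tag, model in to_dtd(RULES, PCDATA).items():
    print(f"<!ELEMENT {tag} {model}>")
```

The printed declarations match a3.dtd up to order and spacing.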

2. Union

1) Source schema 1 in the DTD syntax

This DTD does not allow <Fnote>.

C:\regularity\union>type a1.dtd

<!ELEMENT DOC     (Preface?, Body)>
<!ELEMENT Preface (Para*)>
<!ELEMENT Body    (Para*)>
<!ELEMENT Para    (#PCDATA)>

2) Source schema 2 in the DTD syntax

This DTD allows <Fnote> inside <Para>, but does not allow <Preface>.

C:\regularity\union>type a2.dtd

<!ELEMENT DOC     (Body)>
<!ELEMENT Body    (Para*)>
<!ELEMENT Para    (#PCDATA|Fnote)*>
<!ELEMENT Fnote   (#PCDATA)>

3) Almost-but-not-quite-union schema in the DTD syntax

We want to construct the union schema. I manually constructed a DTD (shown below), but it fails to capture the union schema.

C:\regularity\union>type a3.dtd

<!ELEMENT DOC     (Preface?, Body)>
<!ELEMENT Preface (Para*)>
<!ELEMENT Body    (Para*)>
<!ELEMENT Para    (#PCDATA|Fnote)*>
<!ELEMENT Fnote   (#PCDATA)>

This DTD allows <Preface> containing <Fnote>. However, a1.dtd does not allow <Fnote>, and a2.dtd does not allow <Preface>. Thus, documents matching the union schema never have <Preface> containing <Fnote>. Therefore, this DTD does not capture the union. The union schema should allow <Para> to have <Fnote> only when <Para> is subordinate to <Body>.
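
The over-acceptance can be reproduced with one document. This is a sketch under the same kind of assumptions as any hand transcription: the DTDs are rewritten as Python regular expressions over child names, and `valid` and the tuple encoding of the document are illustrative, not part of the original tools.

```python
import re

# Content models as regexes over space-terminated child names.
A1 = {"DOC": r"(Preface )?Body ", "Preface": r"(Para )*",
      "Body": r"(Para )*", "Para": r"(#PCDATA )*"}
A2 = {"DOC": r"Body ", "Body": r"(Para )*",
      "Para": r"(#PCDATA |Fnote )*", "Fnote": r"(#PCDATA )*"}
A3 = {"DOC": r"(Preface )?Body ", "Preface": r"(Para )*",
      "Body": r"(Para )*", "Para": r"(#PCDATA |Fnote )*",
      "Fnote": r"(#PCDATA )*"}

def valid(dtd, node):
    """node is (tag, children); a child is another node or a str of text."""
    tag, children = node
    if tag not in dtd:
        return False
    names = "".join("#PCDATA " if isinstance(c, str) else c[0] + " "
                    for c in children)
    if not re.fullmatch(dtd[tag], names):
        return False
    return all(valid(dtd, c) for c in children if not isinstance(c, str))

# A <Preface> whose <Para> contains an <Fnote>.
doc = ("DOC", [("Preface", [("Para", [("Fnote", ["note"])])]),
               ("Body", [])])

print(valid(A1, doc), valid(A2, doc), valid(A3, doc))  # False False True
```

The document is in neither source language, yet a3.dtd accepts it, so a3.dtd describes a strict superset of the union.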

4) Source schema 1 in the rule syntax

C:\regularity\union>type a1.rl

<!SCHEMA  (NT-3)>
<!ELEMENT Para>
<!ELEMENT Body>
<!ELEMENT Preface>
<!ELEMENT DOC>
<!VARIABLE #PCDATA>
<!RULE [NT-0]  Para   (NT-4*)>
<!RULE [NT-1]  Body   (NT-0*)>
<!RULE [NT-2]  Preface   (NT-0*)>
<!RULE [NT-3]  DOC   ((NT-2|""),NT-1)>
<!RULE [NT-4]  #PCDATA>

5) Source schema 2 in the rule syntax

C:\regularity\union>type a2.rl

<!SCHEMA  (NT-3)>
<!ELEMENT Fnote>
<!ELEMENT Para>
<!ELEMENT Body>
<!ELEMENT DOC>
<!VARIABLE #PCDATA>
<!RULE [NT-0]  Fnote   (NT-4*)>
<!RULE [NT-1]  Para   ((NT-4|NT-0)*)>
<!RULE [NT-2]  Body   (NT-1*)>
<!RULE [NT-3]  DOC   (NT-2)>
<!RULE [NT-4]  #PCDATA>

6) Union schema in the rule syntax

We can automatically construct the union schema.

C:\regularity\union>runion a1.ha a2.ha | useful | renum | ha2sch | sch2rl

<!SCHEMA  (NT-5|NT-7|NT-9)>
<!ELEMENT DOC>
<!ELEMENT Body>
<!ELEMENT Para>
<!ELEMENT Fnote>
<!ELEMENT Preface>
<!VARIABLE #PCDATA>
<!RULE [NT-5]  DOC   (NT-1)>
<!RULE [NT-7]  DOC   (NT-4,NT-1)>
<!RULE [NT-9]  DOC   (NT-8)>
<!RULE [NT-1]  Body   (NT-2*|"")>
<!RULE [NT-8]  Body   (NT-2*,NT-6,(NT-2|NT-6)*)>
<!RULE [NT-2]  Para   (NT-0*|"")>
<!RULE [NT-6]  Para   (NT-0*,NT-3,(NT-0|NT-3)*)>
<!RULE [NT-3]  Fnote   (NT-0*|"")>
<!RULE [NT-4]  Preface   (NT-2*|"")>
<!RULE [NT-0]  #PCDATA>

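Observe that the rule for NT-6 requires <Para> to contain at least one NT-3 (that is, <Fnote>), whereas the rule for NT-2 allows only #PCDATA in <Para>. NT-6 is referenced only from NT-8, and NT-8 is referenced only from NT-9, the rule for a <DOC> without <Preface>.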
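
To see how several rules per element are used during validation, the following sketch assigns non-terminals to a document bottom-up and accepts it if the root receives a start non-terminal. The rules are transcribed from the runion output above into Python regular expressions; the `nonterminals` and `accepted` functions and the brute-force search over child assignments are assumptions for illustration, not the algorithm of the original tools.

```python
import re
from itertools import product

# non-terminal -> (generic identifier, regex over space-terminated non-terminals)
RULES = {
    "NT-5": ("DOC", r"NT-1 "),
    "NT-7": ("DOC", r"NT-4 NT-1 "),
    "NT-9": ("DOC", r"NT-8 "),
    "NT-1": ("Body", r"(NT-2 )*"),
    "NT-8": ("Body", r"(NT-2 )*NT-6 (NT-2 |NT-6 )*"),
    "NT-2": ("Para", r"(NT-0 )*"),
    "NT-6": ("Para", r"(NT-0 )*NT-3 (NT-0 |NT-3 )*"),
    "NT-3": ("Fnote", r"(NT-0 )*"),
    "NT-4": ("Preface", r"(NT-2 )*"),
}
START = {"NT-5", "NT-7", "NT-9"}

def nonterminals(node):
    """Return the set of non-terminals that can derive this node."""
    if isinstance(node, str):
        return {"NT-0"}  # character data
    tag, children = node
    child_sets = [nonterminals(c) for c in children]
    found = set()
    for nt, (rule_tag, model) in RULES.items():
        if rule_tag != tag:
            continue
        # Try every way of picking one non-terminal per child.
        for choice in product(*child_sets):
            if re.fullmatch(model, "".join(c + " " for c in choice)):
                found.add(nt)
                break
    return found

def accepted(doc):
    return bool(nonterminals(doc) & START)

from_a1 = ("DOC", [("Preface", [("Para", ["p"])]), ("Body", [("Para", ["b"])])])
from_a2 = ("DOC", [("Body", [("Para", ["t", ("Fnote", ["n"])])])])
neither = ("DOC", [("Preface", [("Para", [("Fnote", ["n"])])]), ("Body", [])])

print(accepted(from_a1), accepted(from_a2), accepted(neither))  # True True False
```

The third document is the one that the hand-written a3.dtd wrongly accepts; here <Para> with <Fnote> can only be NT-6, which the rule for <Preface> does not allow.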

7) Loose mapping

The above union schema cannot be captured by the DTD syntax. However, we can automatically loosen the schema so that the result can be expressed in the DTD syntax.

C:\regularity\union>runion a1.ha a2.ha | useful | localize | ha2sch | sch2rl

<!SCHEMA  (NT-0)>
<!ELEMENT DOC>
<!ELEMENT Body>
<!ELEMENT Para>
<!ELEMENT Fnote>
<!ELEMENT Preface>
<!VARIABLE #PCDATA>
<!RULE [NT-0]  DOC   (NT-1|NT-4,NT-1)>
<!RULE [NT-1]  Body   (NT-2*|"")>
<!RULE [NT-2]  Para   ((NT-3|NT-5)*|"")>
<!RULE [NT-3]  Fnote   (NT-5*|"")>
<!RULE [NT-4]  Preface   (NT-2*|"")>
<!RULE [NT-5]  #PCDATA>

This schema is equivalent to a3.dtd, the DTD we constructed by hand.
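
One plausible reading of the loosening step is: forget which non-terminal a child has, keep only its generic identifier, and merge all rules of the same element into one alternation. The text does not describe how the localize tool actually works, so the `loosen` function below is only a guess at the idea, and `same_language` is a brute-force check over short child sequences rather than a proof.

```python
import re
from itertools import product

# Exact union schema: non-terminal -> (tag, regex over non-terminals).
RULES = {
    "NT-5": ("DOC", r"NT-1 "),
    "NT-7": ("DOC", r"NT-4 NT-1 "),
    "NT-9": ("DOC", r"NT-8 "),
    "NT-1": ("Body", r"(NT-2 )*"),
    "NT-8": ("Body", r"(NT-2 )*NT-6 (NT-2 |NT-6 )*"),
    "NT-2": ("Para", r"(NT-0 )*"),
    "NT-6": ("Para", r"(NT-0 )*NT-3 (NT-0 |NT-3 )*"),
    "NT-3": ("Fnote", r"(NT-0 )*"),
    "NT-4": ("Preface", r"(NT-2 )*"),
}
TAG = {nt: tag for nt, (tag, _) in RULES.items()}
TAG["NT-0"] = "#PCDATA"

def loosen(rules):
    """Merge all rules of one element, naming children by tag only."""
    merged = {}
    for nt, (tag, model) in rules.items():
        local = re.sub(r"NT-\d+", lambda m: TAG[m.group()], model)
        merged.setdefault(tag, []).append(f"(?:{local})")
    return {tag: "|".join(models) for tag, models in merged.items()}

# a3.dtd from the union section, transcribed by hand.
A3 = {"DOC": r"(Preface )?Body ", "Preface": r"(Para )*", "Body": r"(Para )*",
      "Para": r"(#PCDATA |Fnote )*", "Fnote": r"(#PCDATA )*"}
ALPHABET = ["DOC", "Preface", "Body", "Para", "Fnote", "#PCDATA"]

def same_language(regex_a, regex_b, max_len=4):
    """Compare two content models on every child sequence up to max_len."""
    for n in range(max_len + 1):
        for seq in product(ALPHABET, repeat=n):
            s = "".join(name + " " for name in seq)
            if bool(re.fullmatch(regex_a, s)) != bool(re.fullmatch(regex_b, s)):
                return False
    return True

loose = loosen(RULES)
print(all(same_language(loose[tag], A3[tag]) for tag in A3))  # True
```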

3. Difference

1) Source schema a1 in the DTD syntax

This DTD allows <Fnote>.

C:\regularity\diff>type a1.dtd

<!ELEMENT DOC     (Body)>
<!ELEMENT Body    ((Para|Fnote)*)>
<!ELEMENT Para    (#PCDATA)>
<!ELEMENT Fnote   (#PCDATA)>

2) Source schema a2 in the DTD syntax

This DTD does not allow <Fnote>.

C:\regularity\diff>type a2.dtd

<!ELEMENT DOC     (Body)>
<!ELEMENT Body    (Para*)>
<!ELEMENT Para    (#PCDATA)>

3) The difference schema in the DTD syntax

This DTD is constructed by hand. A document is valid against the difference schema exactly when it is valid against a1.dtd and contains at least one <Fnote>.

C:\regularity\diff>type a3.dtd

<!ELEMENT DOC   (Body)>
<!ELEMENT Body  (Para*,Fnote,(Fnote|Para)*)>
<!ELEMENT Para  (#PCDATA)>
<!ELEMENT Fnote (#PCDATA)>
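
Since a1.dtd and a2.dtd differ only in the content model of <Body>, the hand construction can be checked at the level of that one content model. The sketch below enumerates every child sequence up to a small length and confirms that the a3.dtd model accepts exactly the sequences accepted by a1.dtd but not by a2.dtd. The regular expressions are hand transcriptions and the `check` function is an illustrative assumption.

```python
import re
from itertools import product

B1 = r"(Para |Fnote )*"                # Body in a1.dtd
B2 = r"(Para )*"                       # Body in a2.dtd
B3 = r"(Para )*Fnote (Fnote |Para )*"  # Body in a3.dtd

def check(max_len=6):
    """True if B3 equals B1 minus B2 on all sequences up to max_len."""
    for n in range(max_len + 1):
        for seq in product(["Para", "Fnote"], repeat=n):
            s = "".join(name + " " for name in seq)
            in_diff = bool(re.fullmatch(B1, s)) and not re.fullmatch(B2, s)
            if in_diff != bool(re.fullmatch(B3, s)):
                return False
    return True

print(check())  # True
```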

4) The difference schema in the rule syntax

We can automatically construct the difference schema.

C:\regularity\diff>hadiff a1.ha a2.ha | useful | renum | ha2sch | sch2rl

<!SCHEMA  (NT-4)>
<!ELEMENT DOC>
<!ELEMENT Body>
<!ELEMENT Para>
<!ELEMENT Fnote>
<!VARIABLE #PCDATA>
<!RULE [NT-4]  DOC   (NT-3)>
<!RULE [NT-3]  Body   (NT-2*,NT-1,(NT-1|NT-2)*)>
<!RULE [NT-2]  Para   (NT-0*|"")>
<!RULE [NT-1]  Fnote   (NT-0*|"")>
<!RULE [NT-0]  #PCDATA>

5) Source schema b1 in the DTD syntax

This DTD allows <Fnote> in both <Preface> and <Body>.

C:\regularity\diff>type b1.dtd

<!ELEMENT DOC     (Preface, Body)>
<!ELEMENT Preface ((Para|Fnote)*)>
<!ELEMENT Body    ((Para|Fnote)*)>
<!ELEMENT Para    (#PCDATA)>
<!ELEMENT Fnote   (#PCDATA)>

6) Source schema b2 in the DTD syntax

This DTD has <Preface> and <Body>, but does not allow <Fnote>.

C:\regularity\diff>type b2.dtd

<!ELEMENT DOC     (Preface, Body)>
<!ELEMENT Preface (Para*)>
<!ELEMENT Body    (Para*)>
<!ELEMENT Para    (#PCDATA)>

7) Almost-but-not-quite-difference schema in the DTD syntax

I constructed this DTD by hand, but it fails to capture the difference schema. It requires both <Body> and <Preface> to contain at least one <Fnote>. However, a document should be valid against the difference schema if either <Body> or <Preface> contains an <Fnote>.

C:\regularity\diff>type b3.dtd

<!ELEMENT DOC     (Preface, Body)>
<!ELEMENT Body    (Para*,Fnote,(Fnote|Para)*)>
<!ELEMENT Preface (Para*,Fnote,(Fnote|Para)*)>
<!ELEMENT Para    (#PCDATA)>
<!ELEMENT Fnote   (#PCDATA)>

8) The difference schema in the rule syntax

We can automatically construct the difference schema.

C:\regularity\diff>hadiff b1.ha b2.ha | useful | renum | ha2sch | sch2rl

<!SCHEMA  (NT-7)>
<!ELEMENT DOC>
<!ELEMENT Body>
<!ELEMENT Para>
<!ELEMENT Fnote>
<!ELEMENT Preface>
<!VARIABLE #PCDATA>
<!RULE [NT-7]  DOC   (NT-6,NT-3|NT-6,NT-5|NT-4,NT-5)>
<!RULE [NT-3]  Body   (NT-2*|"")>
<!RULE [NT-5]  Body   (NT-2*,NT-1,(NT-1|NT-2)*)>
<!RULE [NT-2]  Para   (NT-0*|"")>
<!RULE [NT-1]  Fnote   (NT-0*|"")>
<!RULE [NT-4]  Preface   (NT-2*|"")>
<!RULE [NT-6]  Preface   (NT-2*,NT-1,(NT-1|NT-2)*)>
<!RULE [NT-0]  #PCDATA>

Note that the rule for NT-5 mandates <Fnote> in <Body> and the rule for NT-6 mandates <Fnote> in <Preface>. Each alternative in the rule for NT-7 (which is the start non-terminal) references at least one of NT-5 and NT-6.
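
The effect of the three alternatives in the NT-7 rule can be tabulated. In this sketch the only inputs are whether <Preface> and <Body> contain an <Fnote>; the `accepted` and `b3_accepted` functions are illustrative names, and the mapping from those two flags to non-terminals follows the rules listed above.

```python
from itertools import product

# Alternatives of the NT-7 rule, as (Preface non-terminal, Body non-terminal).
DOC_ALTERNATIVES = {("NT-6", "NT-3"), ("NT-6", "NT-5"), ("NT-4", "NT-5")}

def accepted(preface_has_fnote, body_has_fnote):
    """Acceptance by the generated difference schema."""
    preface = "NT-6" if preface_has_fnote else "NT-4"
    body = "NT-5" if body_has_fnote else "NT-3"
    return (preface, body) in DOC_ALTERNATIVES

def b3_accepted(preface_has_fnote, body_has_fnote):
    """Acceptance by the hand-written b3.dtd, which requires both."""
    return preface_has_fnote and body_has_fnote

for p, b in product([False, True], repeat=2):
    print(p, b, accepted(p, b), b3_accepted(p, b))
```

The two schemas disagree in the two rows where exactly one of <Preface> and <Body> contains an <Fnote>, which is where b3.dtd is too strict.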

9) Loose mapping

The above difference schema cannot be captured by the DTD syntax. However, we can automatically loosen the schema so that the result can be expressed in the DTD syntax.

C:\regularity\diff>hadiff b1.ha b2.ha | useful | renum | localize  | ha2sch | sch2rl

<!SCHEMA  (NT-4)>
<!ELEMENT DOC>
<!ELEMENT Body>
<!ELEMENT Para>
<!ELEMENT Fnote>
<!ELEMENT Preface>
<!VARIABLE #PCDATA>
<!RULE [NT-4]  DOC   (NT-3,NT-2)>
<!RULE [NT-2]  Body   ((NT-0|NT-1)*|"")>
<!RULE [NT-1]  Para   (NT-5*|"")>
<!RULE [NT-0]  Fnote   (NT-5*|"")>
<!RULE [NT-3]  Preface   ((NT-0|NT-1)*|"")>
<!RULE [NT-5]  #PCDATA>

This schema is equivalent to b1.dtd, not to the hand-constructed b3.dtd: loosening drops the requirement that at least one <Fnote> appear.