SGML: Arch engine examples

Architecture engine examples


Subject:      Arch engine examples? [was: SP 1.1 released]
From:         James Clark <jjc@jclark.com>
Date:         1996/06/10
Message-Id:   <9606101513.AA03030@jclark.com>

       ------------------------------------------------------

> Newsgroups: comp.text.sgml
> Date: 07 Jun 1996 01:05:27 -0400
> 
> In article <9606061215.AA16600@jclark.com> James Clark writes:
>  > 
>  > New in this release is an architecture engine. If you build an
>  > application with SP that works with documents conforming to some DTD,
>  > the architecture engine will allow it automatically to work with any
>  > document that conforms to the architecture which has that DTD as its
>  > meta-DTD.
> 
> I've seen oblique references to this, and now I've read the documentation.
> I think I get it, but I'd sure like to see some examples. Anybody
> got some I could look at?

Here's a toy example:

$ cat htmljjc.dtd 
<?ArcBase blink font>
<!entity % blinkDTD SYSTEM "htmlblnk.dtd">
<!notation blink system>
<!attlist #notation blink
  ArcDocF NAME #FIXED html
  ArcDTD CDATA #FIXED "%blinkDTD"
>
<!notation font system>
<!entity % fontDTD SYSTEM "htmlfont.dtd">
<!attlist #notation font
  ArcDocF NAME #FIXED html
  ArcDTD CDATA #FIXED "%fontDTD"
  ArcFormA NAME #FIXED htmlfont
>
 
<!ENTITY % font " TT | B | I | FONT ">
<!ATTLIST FONT 
  SIZE CDATA #REQUIRED
>
<!ENTITY % phrase "EM | STRONG | CODE | SAMP | KBD | VAR | CITE | BLINK">
<!ENTITY % strictDTD PUBLIC "-//IETF//DTD HTML Strict//EN">
<!ATTLIST BLINK
  HTMLFONT NAME #FIXED EM
>
%strictDTD;
$ cat htmlblnk.dtd 
<?ArcBase strict>
<!ENTITY % strictDTD PUBLIC "-//IETF//DTD HTML Strict//EN">
<!notation strict system>
<!attlist #notation strict
  ArcDocF NAME #FIXED html
  ArcDTD CDATA #FIXED "%strictDTD"
>
<!ENTITY % phrase "EM | STRONG | CODE | SAMP | KBD | VAR | CITE | BLINK">
<!ATTLIST BLINK strict NAME STRONG>
%strictDTD;
$ cat htmlfont.dtd 
<?ArcBase strict>
<!ENTITY % strictDTD PUBLIC "-//IETF//DTD HTML Strict//EN">
<!notation strict system>
<!attlist #notation strict
  ArcDocF NAME #FIXED html
  ArcDTD CDATA #FIXED "%strictDTD"
>
<!ENTITY % font " TT | B | I | FONT ">
<!ATTLIST FONT 
  SIZE CDATA #REQUIRED
>
%strictDTD;
$ cat html.sgm 
<!-- Try this with:
sgmlnorm -d
sgmlnorm -d -Ablink
sgmlnorm -d -Ablink -Astrict
sgmlnorm -d -Afont
sgmlnorm -d -Afont -Astrict
-->
<!DOCTYPE HTML SYSTEM "htmljjc.dtd">
<TITLE>A demo HTML document</TITLE>
<P>
Some <BLINK>blinking</BLINK> text.
<FONT SIZE="+3">Bigger size</FONT> text.
$ sgmlnorm -d -Ablink -c../pubtext/html.soc html.sgm 
<!DOCTYPE HTML SYSTEM "htmlblnk.dtd">
<HTML>
<HEAD>
<TITLE>A demo HTML document</TITLE>
</HEAD>
<BODY>
<P>Some <BLINK>blinking</BLINK> text.
Bigger size text.</P>
</BODY>
</HTML>
$ sgmlnorm -d -Ablink -Astrict -c../pubtext/html.soc html.sgm 
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML Strict//EN">
<HTML>
<HEAD>
<TITLE>A demo HTML document</TITLE>
</HEAD>
<BODY>
<P>Some <STRONG>blinking</STRONG> text.
Bigger size text.</P>
</BODY>
</HTML>
$ sgmlnorm -d -Afont -c../pubtext/html.soc html.sgm 
<!DOCTYPE HTML SYSTEM "htmlfont.dtd">
<HTML>
<HEAD>
<TITLE>A demo HTML document</TITLE>
</HEAD>
<BODY>
<P>Some <EM>blinking</EM> text.
<FONT SIZE="+3">Bigger size</FONT> text.</P>
</BODY>
</HTML>
$ sgmlnorm -d -Afont -Astrict -c../pubtext/html.soc html.sgm 
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML Strict//EN">
<HTML>
<HEAD>
<TITLE>A demo HTML document</TITLE>
</HEAD>
<BODY>
<P>Some <EM>blinking</EM> text.
Bigger size text.</P>
</BODY>
</HTML>
$

> I suspect there are some things (like cross reference processing)
> which are not expressible,

Only fairly simple transformations are expressible:

- you can omit elements

- you can omit data

- you can omit attributes

- you can rename elements

- you can rename attributes

- you can map the content of an element to an attribute

- you can map an attribute to content

That's more or less it.

An architecture engine is certainly not a general SGML transformation
system.  On the other hand, if a transformation is expressible, it's
easy to write, because the architecture engine knows about the result
DTD and defaults things appropriately.  Also the transformation
requires only limited lookahead, so you can easily do it in a browser
as you're reading in the document.
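
To make the "limited lookahead" point concrete, here is a minimal
sketch in Python of the renaming and omission cases from the list
above.  It is not how SP implements its architecture engine: the event
tuples, the function name architectural_view and the two mapping
tables are all assumptions made for illustration.  It covers only
omitting and renaming elements and attributes, not the mappings
between content and attributes, and it does no defaulting from the
result DTD.

```python
# Hypothetical sketch only: not SP's implementation or API.
# An event is ("start", name, attrs), ("data", text) or ("end", name).

def architectural_view(events, element_map, attribute_map=None):
    """Yield architectural-document events one input event at a time.

    element_map maps a client element name to its architectural name;
    elements not listed are omitted, but their content is kept.
    attribute_map maps (client element, client attribute) to the
    architectural attribute name; attributes not listed are omitted.
    """
    attribute_map = attribute_map or {}
    for event in events:
        kind = event[0]
        if kind == "data":
            yield event
        elif kind == "start":
            _, name, attrs = event
            if name in element_map:
                renamed = {attribute_map[(name, attr)]: value
                           for attr, value in attrs.items()
                           if (name, attr) in attribute_map}
                yield ("start", element_map[name], renamed)
        elif kind == "end":
            _, name = event
            if name in element_map:
                yield ("end", element_map[name])


# The paragraph from html.sgm above, viewed roughly as -Ablink -Astrict
# would see it: BLINK is renamed to STRONG, FONT is omitted.
doc = [
    ("start", "P", {}),
    ("data", "Some "),
    ("start", "BLINK", {}),
    ("data", "blinking"),
    ("end", "BLINK"),
    ("data", " text. "),
    ("start", "FONT", {"SIZE": "+3"}),
    ("data", "Bigger size"),
    ("end", "FONT"),
    ("data", " text."),
    ("end", "P"),
]

strict_view = list(architectural_view(doc, {"P": "P", "BLINK": "STRONG"}))
```

Each output event depends only on the current input event and the
fixed tables, which is why this kind of mapping can run while the
document is still being read.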

To get from typical SGML to HTML, there are at least a couple of
things you need that you can't do with an architecture engine.  One,
which you mentioned, is cross references.  Another is splitting an
SGML document into chunks that are convenient for delivery and
generating some sort of TOC to allow navigation of the chunks.

One way to deal with this would be to define a super-HTML which has
some additional element types corresponding to typical SGML usage
patterns such as ID/IDREF based linking, and perhaps some variant of
DIV which has a title element associated with it.  Use an architecture
engine to transform SGML documents into this super-HTML and then have
a custom tool that goes from super-HTML to regular HTML.
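
As a rough illustration of what the second stage might do, the Python
sketch below resolves ID/IDREF cross references in two passes.
Everything in it is hypothetical: the super-HTML element names XREF
and DIV, the use of a TITLE attribute in place of a title element, and
the function name resolve_xrefs are assumptions for the example, not
part of any existing DTD or tool.

```python
# Hypothetical sketch only: XREF, DIV and the TITLE attribute are
# invented super-HTML constructs, not part of any real DTD.
# An event is ("start", name, attrs), ("data", text) or ("end", name).

def resolve_xrefs(events):
    """Turn super-HTML cross references into plain HTML anchors."""
    # The whole document is held in memory because a reference may
    # come before the element it points to.
    events = list(events)

    # Pass 1: record the title of every element that carries an ID.
    titles = {}
    for event in events:
        if event[0] == "start" and "ID" in event[2]:
            attrs = event[2]
            titles[attrs["ID"]] = attrs.get("TITLE", attrs["ID"])

    # Pass 2: rewrite XREF and DIV in terms of regular HTML elements.
    out = []
    for event in events:
        kind, name = event[0], event[1]
        if kind == "start" and name == "XREF":
            target = event[2]["IDREF"]
            out.append(("start", "A", {"HREF": "#" + target}))
            out.append(("data", titles.get(target, target)))
        elif kind == "end" and name == "XREF":
            out.append(("end", "A"))
        elif kind == "start" and name == "DIV":
            ident = event[2]["ID"]
            out.append(("start", "H2", {}))
            out.append(("start", "A", {"NAME": ident}))
            out.append(("data", titles[ident]))
            out.append(("end", "A"))
            out.append(("end", "H2"))
        elif kind == "end" and name == "DIV":
            pass  # DIV has no counterpart in the output; drop its end tag
        else:
            out.append(event)
    return out


# A forward reference: the XREF appears before the DIV it points to.
doc = [
    ("start", "P", {}),
    ("data", "See "),
    ("start", "XREF", {"IDREF": "inst"}),
    ("end", "XREF"),
    ("end", "P"),
    ("start", "DIV", {"ID": "inst", "TITLE": "Installation"}),
    ("start", "P", {}),
    ("data", "Run make."),
    ("end", "P"),
    ("end", "DIV"),
]

html = resolve_xrefs(doc)
```

The first pass has to see the entire document before any XREF can be
filled in, which is the kind of unbounded lookahead an architecture
engine does not provide.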

James