SGML: Arch engine examples?
Subject: Arch engine examples? [was: SP 1.1 released]
Date: Mon, 10 Jun 1996 16:13:19 +0000
From: James Clark <jjc@jclark.com>
Newsgroups: comp.text.sgml
--------------------------------------------------------------------------
> From: connolly@w3.org (Dan Connolly)
> Newsgroups: comp.text.sgml
> Date: 07 Jun 1996 01:05:27 -0400
>
> In article <9606061215.AA16600@jclark.com> James Clark <jjc@jclark.com> writes:
> >
> > New in this release is an architecture engine. If you build an
> > application with SP that works with documents conforming to some DTD,
> > the architecture engine will allow it automatically to work with any
> > document that conforms to the architecture which has that DTD as its
> > meta-DTD.
>
> I've seen oblique references to this, and now I've read the documentation.
> I think I get it, but I'd sure like to see some examples. Anybody
> got some I could look at?
Here's a toy example:
$ cat htmljjc.dtd
<?ArcBase blink font>
<!entity % blinkDTD SYSTEM "htmlblnk.dtd">
<!notation blink system>
<!attlist #notation blink
ArcDocF NAME #FIXED html
ArcDTD CDATA #FIXED "%blinkDTD"
>
<!notation font system>
<!entity % fontDTD SYSTEM "htmlfont.dtd">
<!attlist #notation font
ArcDocF NAME #FIXED html
ArcDTD CDATA #FIXED "%fontDTD"
ArcFormA NAME #FIXED htmlfont
>
<!ENTITY % font " TT | B | I | FONT ">
<!ATTLIST FONT
SIZE CDATA #REQUIRED
>
<!ENTITY % phrase "EM | STRONG | CODE | SAMP | KBD | VAR | CITE | BLINK">
<!ENTITY % strictDTD PUBLIC "-//IETF//DTD HTML Strict//EN">
<!ATTLIST BLINK
HTMLFONT NAME #FIXED EM
>
%strictDTD;
$ cat htmlblnk.dtd
<?ArcBase strict>
<!ENTITY % strictDTD PUBLIC "-//IETF//DTD HTML Strict//EN">
<!notation strict system>
<!attlist #notation strict
ArcDocF NAME #FIXED html
ArcDTD CDATA #FIXED "%strictDTD"
>
<!ENTITY % phrase "EM | STRONG | CODE | SAMP | KBD | VAR | CITE | BLINK">
<!ATTLIST BLINK strict NAME STRONG>
%strictDTD;
$ cat htmlfont.dtd
<?ArcBase strict>
<!ENTITY % strictDTD PUBLIC "-//IETF//DTD HTML Strict//EN">
<!notation strict system>
<!attlist #notation strict
ArcDocF NAME #FIXED html
ArcDTD CDATA #FIXED "%strictDTD"
>
<!ENTITY % font " TT | B | I | FONT ">
<!ATTLIST FONT
SIZE CDATA #REQUIRED
>
%strictDTD;
$ cat html.sgm
<!-- Try this with:
sgmlnorm -d
sgmlnorm -d -Ablink
sgmlnorm -d -Ablink -Astrict
sgmlnorm -d -Afont
sgmlnorm -d -Afont -Astrict
-->
<!DOCTYPE HTML SYSTEM "htmljjc.dtd">
<TITLE>A demo HTML document</TITLE>
<P>
Some <BLINK>blinking</BLINK> text.
<FONT SIZE="+3">Bigger size</FONT> text.
$ sgmlnorm -d -Ablink -c../pubtext/html.soc html.sgm
<!DOCTYPE HTML SYSTEM "htmlblnk.dtd">
<HTML>
<HEAD>
<TITLE>A demo HTML document</TITLE>
</HEAD>
<BODY>
<P>Some <BLINK>blinking</BLINK> text.
Bigger size text.</P>
</BODY>
</HTML>
$ sgmlnorm -d -Ablink -Astrict -c../pubtext/html.soc html.sgm
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML Strict//EN">
<HTML>
<HEAD>
<TITLE>A demo HTML document</TITLE>
</HEAD>
<BODY>
<P>Some <STRONG>blinking</STRONG> text.
Bigger size text.</P>
</BODY>
</HTML>
$ sgmlnorm -d -Afont -c../pubtext/html.soc html.sgm
<!DOCTYPE HTML SYSTEM "htmlfont.dtd">
<HTML>
<HEAD>
<TITLE>A demo HTML document</TITLE>
</HEAD>
<BODY>
<P>Some <EM>blinking</EM> text.
<FONT SIZE="+3">Bigger size</FONT> text.</P>
</BODY>
</HTML>
$ sgmlnorm -d -Afont -Astrict -c../pubtext/html.soc html.sgm
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML Strict//EN">
<HTML>
<HEAD>
<TITLE>A demo HTML document</TITLE>
</HEAD>
<BODY>
<P>Some <EM>blinking</EM> text.
Bigger size text.</P>
</BODY>
</HTML>
$
> I suspect there are some things (like cross reference processing)
> which are not expressible,
Only fairly simple transformations are expressible:
- you can omit elements
- you can omit data
- you can omit attributes
- you can rename elements
- you can rename attributes
- you can map the content of an element to an attribute
- you can map an attribute to content
That's more or less it.
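To make the element cases in that list concrete, here is a rough Python sketch of renaming and omitting elements over a stream of parse events. It is only an illustration of the idea, not how SP implements it: the event tuples and the names `apply_forms`, `to_sgml`, `DOC`, `FONT_FORMS` and `STRICT_FORMS` are all made up for this example, and the mapping tables are written out by hand, whereas a real architecture engine derives them from the meta-DTD and the architectural form attributes.

```python
from typing import Dict, Iterable, Iterator, Optional, Tuple

# Hypothetical event model (an assumption of this sketch):
#   ("start", name, attrs), ("data", text), ("end", name)
Event = Tuple


def apply_forms(events: Iterable[Event],
                forms: Dict[str, Optional[str]]) -> Iterator[Event]:
    """Rename or omit element tags, one event at a time.

    forms maps an element name to its architectural form. None means
    the element has no form: its tags are dropped but its data still
    passes through. Names missing from forms map to themselves.
    """
    for event in events:
        kind = event[0]
        if kind == "data":
            yield event
            continue
        form = forms.get(event[1], event[1])
        if form is None:
            continue                            # omit the element's tags
        if kind == "start":
            yield ("start", form, event[2])     # attributes passed through as-is
        else:
            yield ("end", form)


def to_sgml(events: Iterable[Event]) -> str:
    """Serialize events back to tagged text."""
    out = []
    for event in events:
        if event[0] == "start":
            attrs = "".join(f' {k}="{v}"' for k, v in event[2].items())
            out.append(f"<{event[1]}{attrs}>")
        elif event[0] == "end":
            out.append(f"</{event[1]}>")
        else:
            out.append(event[1])
    return "".join(out)


# The paragraph from html.sgm, written as events.
DOC = [
    ("start", "P", {}),
    ("data", "Some "),
    ("start", "BLINK", {}), ("data", "blinking"), ("end", "BLINK"),
    ("data", " text.\n"),
    ("start", "FONT", {"SIZE": "+3"}), ("data", "Bigger size"), ("end", "FONT"),
    ("data", " text."),
    ("end", "P"),
]

FONT_FORMS = {"BLINK": "EM"}     # hand-written stand-in for the font architecture
STRICT_FORMS = {"FONT": None}    # strict HTML has no FONT element

if __name__ == "__main__":
    print(to_sgml(apply_forms(DOC, FONT_FORMS)))
    print(to_sgml(apply_forms(apply_forms(DOC, FONT_FORMS), STRICT_FORMS)))
```

Because `apply_forms` is a generator that looks at one event at a time, two mappings can be chained the same way `-Afont -Astrict` chains two architectures. Attribute renaming and the content/attribute mappings are left out of the sketch.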
An architecture engine is certainly not a general SGML transformation
system. On the other hand, if a transformation is expressible, it's
easy to write, because the architecture engine knows about the result
DTD and defaults things appropriately. Also the transformation
requires only limited lookahead, so you can easily do it in a browser
as you're reading in the document.
To get from typical SGML to HTML, there are at least a couple of
things you need that you can't do with an architecture engine. One,
which you mentioned, is cross references. Another is splitting an
SGML document into chunks that are convenient for delivery and
generating some sort of TOC to allow navigation of the chunks.
One way to deal with this would be to define a super-HTML which has
some additional element types corresponding to typical SGML usage
patterns such as ID/IDREF based linking, and perhaps some variant of
DIV which has a title element associated with it. Use an architecture
engine to transform SGML documents into this super-HTML and then have
a custom tool that goes from super-HTML to regular HTML.
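As a rough illustration of what the second stage might look like, here is a Python sketch that resolves ID/IDREF links. Everything in it is an assumption made for the example: the super-HTML is taken to have an `XREF` element with an `IDREF` attribute and to allow an `ID` attribute on target elements, the document is represented as `("start", name, attrs)`, `("data", text)` and `("end", name)` tuples, and `resolve_xrefs` is a made-up name. The point is only that this step has to see the whole document first, since a reference may point forward, which is exactly what a limited-lookahead architecture engine cannot do.

```python
def resolve_xrefs(events):
    """Turn hypothetical super-HTML XREF/ID markup into plain anchors."""
    events = list(events)   # buffer everything: targets may follow references

    # Pass 1: collect every ID declared anywhere in the document.
    ids = {ev[2]["ID"] for ev in events if ev[0] == "start" and "ID" in ev[2]}

    # Pass 2: rewrite references and targets.
    out = []
    for ev in events:
        if ev[0] == "start" and ev[1] == "XREF":
            target = ev[2]["IDREF"]
            if target not in ids:
                raise ValueError(f"unresolved cross reference: {target}")
            out.append(("start", "A", {"HREF": "#" + target}))
        elif ev[0] == "end" and ev[1] == "XREF":
            out.append(("end", "A"))
        elif ev[0] == "start" and "ID" in ev[2]:
            # Drop the ID attribute and mark the spot with a named anchor.
            attrs = {k: v for k, v in ev[2].items() if k != "ID"}
            out.append(("start", ev[1], attrs))
            out.append(("start", "A", {"NAME": ev[2]["ID"]}))
            out.append(("end", "A"))
        else:
            out.append(ev)
    return out
```

Splitting the document into chunks and generating a TOC would be further passes of the same kind and are not shown.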
James