DocBook and Jade for Literate Programming

Date:      Mon, 02 Nov 1998 10:42:02 -0600
From:      "W. Eliot Kimber" <eliot@isogen.com>
To:        dssslist@mulberrytech.com
Subject:   RE: DocBook and Jade for Literate Programming
           (The DSSSList  Digest V2 #178)

At 07:46 AM 11/2/98 -0800, Wroth, Mark wrote:

> Have you looked at W. Eliot Kimber's  "Using SGML Architectures and
> DSSSL to Do Literate Programming"
> (http://www.sil.org/sgml/kimberDSSSLLitProg.html)?  He appeared to be
> attacking the same basic problem, although I confess that his approach
> was beyond me (and appears specialized to DTDs, although I may be
> misinterpreting him badly).

Actually, it's for DSSSL specs, not DTDs. It takes advantage of the fact that DSSSL processors operate on the DSSSL architectural instance of the DSSSL spec, not on its base markup. That means you can put all sorts of things in your DSSSL spec, such as documentation, and they will be ignored as a result of the normal architectural processing that Jade does when it reads a DSSSL spec.

This is relevant to literate programming only if you write your output processor as an architecture-based process, which might be a good idea. You could do this with Jade since you can do architecture-based processing of your input document at no extra cost.

For example, say you define an architecture that provides the fundamental elements you need to combine code and documentation; let's call it "LitProgArch". You then write a DSSSL spec in terms of this architecture (that is, in terms of the element types and attributes the architectural DTD defines). You could then create "program documents" that either use this architectural DTD directly or specialize it.

For example, say our architecture defines three element types:

- LitProgDoc -- Document element
- Code       -- Contains literal source code
- Doc        -- Contains documentation

Declared like so (litprogarch.dtd):

<!-- Literate programming architectural DTD: -->
<!ELEMENT LitProgDoc
  - -
  (Code | Doc)*
>
<!ELEMENT Code
  - -
  (#PCDATA)*
>
<!ELEMENT Doc
  - -
  (#PCDATA)*
>
<!-- End of architectural DTD -->

An instance might look like this:

<!DOCTYPE LitProgDoc SYSTEM "litprogarch.dtd">
<LitProgDoc>
<Doc>This is a program</Doc>
<Code>
print "Hello world"
</Code>
</LitProgDoc>

However, you want more stuff in your documentation, so you create a specialized DTD derived from the LitProgArch architecture:

<!DOCTYPE MyLitProgDoc [
 <!-- Declare use of Literate Programming architecture: -->
 <?IS10744 arch name="LitProgArch"
   public-id="+//IDN isogen.com//DOCUMENT Literate Programming Architecture//EN"
   dtd-system-id="litprogarch.dtd"
   doc-element="LitProgDoc"
 >
 <!ELEMENT MyLitProgDoc 
   - -
   (Header,
    CodeSection+)
 >
 <!ATTLIST MyLitProgDoc
   LitProgArch
     NAME
     #FIXED "LitProgDoc"
 >
 <!ELEMENT Header
   - -
   (#PCDATA)* 
 >
 <!ATTLIST Header
   LitProgArch
     NAME
     #FIXED "Doc"
 >
 <!ELEMENT CodeSection  -- NOTE: Not mapped to anything in LitProgArch --
   - -
   (FuncHeader, 
    FuncBody)+
 >
 <!ELEMENT FuncHeader
   - -
   (#PCDATA)*
 >
 <!ATTLIST FuncHeader
   LitProgArch
     NAME
     #FIXED "Doc"
 >
 <!ELEMENT FuncBody
   - -
   (#PCDATA)*
 >
 <!ATTLIST FuncBody
   LitProgArch
     NAME
     #FIXED "Code"
 >
]>
<MyLitProgDoc>
<Header>This is a program</Header>
<CodeSection>
<FuncHeader>A function</FuncHeader>
<FuncBody>
def foo():
  return 1
</FuncBody>
</CodeSection>
</MyLitProgDoc>

If you work out the mapping in your head, it should be clear that resolving the mapping to LitProgArch (as indicated by the LitProgArch attributes) gives you a document that looks like this:

<LitProgDoc>
<Doc>This is a program</Doc>
<Doc>A function</Doc>
<Code>
def foo():
  return 1
</Code>
</LitProgDoc>

Note that the CodeSection start and end tags disappear because the element doesn't map to anything.
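
To make the mapping concrete, here is a minimal Python sketch that simulates it on an XML rendering of the specialized instance. It is only an illustration, not how SP or Jade implement architectural processing: the ARCH_FORMS table and the to_architectural function are hypothetical names standing in for the #FIXED LitProgArch attributes in the DTD, and any text directly inside an unmapped element is simply dropped.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the #FIXED LitProgArch attributes in the DTD:
# client element type -> architectural form. CodeSection is deliberately absent.
ARCH_FORMS = {
    "MyLitProgDoc": "LitProgDoc",
    "Header": "Doc",
    "FuncHeader": "Doc",
    "FuncBody": "Code",
}


def to_architectural(elem):
    """Return the list of architectural elements derived from elem."""
    children = []
    for child in elem:
        children.extend(to_architectural(child))
    form = ARCH_FORMS.get(elem.tag)
    if form is None:
        # Unmapped element: its tags vanish and its mapped descendants
        # move up into the nearest mapped ancestor.
        return children
    mapped = ET.Element(form)
    mapped.text = elem.text
    mapped.extend(children)
    return [mapped]


SOURCE = """<MyLitProgDoc>
<Header>This is a program</Header>
<CodeSection>
<FuncHeader>A function</FuncHeader>
<FuncBody>
def foo():
  return 1
</FuncBody>
</CodeSection>
</MyLitProgDoc>"""

arch_doc = to_architectural(ET.fromstring(SOURCE))[0]
print(arch_doc.tag, [child.tag for child in arch_doc])
```

The CodeSection wrapper leaves no trace in the result, while its FuncHeader and FuncBody children surface as Doc and Code directly under LitProgDoc.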

To process the specialized document with Jade, you'd do this:

jade -ALitProgArch -dlitprogarch.dsl -tSGML myprogram.sgml > out.py

The -A flag tells Jade to process the input document in terms of its mapping to the architecture named LitProgArch (the name used in the name= attribute of the architecture use declaration PI shown above). The rest is normal: -d names the DSSSL spec and -t selects the SGML output back end. The DSSSL spec might look something like this:


<!-- litprogarch.dsl -->
<!DOCTYPE dsssl-specification ...>
<dsssl-specification> 
&normal-sgml-output-stuff;
(element LitProgDoc
  (sosofo-append
    (make formatting-instruction
      data: "# Program generated by litprogarch.dsl")
    ;; Continue into the Doc and Code children.
    (process-children)))

(element Doc
  (make formatting-instruction
    data:
     (create-python-comment-block (current-node))))
 
(element Code
  (process-children))
</dsssl-specification>
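
As a rough illustration of what these three rules are meant to produce, the following Python sketch applies the same logic to the architectural instance shown earlier. It is an assumption-laden stand-in, not Jade output: create_python_comment_block is a hypothetical counterpart of the create-python-comment-block procedure the spec calls but does not define, and tangle is a hypothetical name for the driver.

```python
import xml.etree.ElementTree as ET


def create_python_comment_block(text):
    """Hypothetical helper: turn documentation text into '#' comment lines."""
    lines = text.strip().splitlines()
    return "\n".join("# " + line.strip() for line in lines)


def tangle(arch_doc):
    """Emit Python source from a LitProgDoc architectural instance."""
    # Mirrors the LitProgDoc rule: a header comment, then the children.
    out = ["# Program generated by litprogarch.dsl"]
    for child in arch_doc:
        text = child.text or ""
        if child.tag == "Doc":
            # Mirrors the Doc rule: documentation becomes a comment block.
            out.append(create_python_comment_block(text))
        elif child.tag == "Code":
            # Mirrors the Code rule: source code passes through unchanged.
            out.append(text.strip("\n"))
    return "\n".join(out) + "\n"


ARCH_INSTANCE = """<LitProgDoc>
<Doc>This is a program</Doc>
<Doc>A function</Doc>
<Code>
def foo():
  return 1
</Code>
</LitProgDoc>"""

source = tangle(ET.fromstring(ARCH_INSTANCE))
print(source)
```

Because the sketch only looks at Doc and Code, it works unchanged for any specialized document type whose elements map to those two forms, which is the point of writing the processor against the architecture rather than against a particular DTD.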

Now you have a framework and infrastructure for doing literate programming that people can specialize to their own ends without the need to completely re-implement the whole business.

Obviously, you'd want a much more sophisticated base architecture than I've shown here, but you get the idea.

Cheers,

<Address HyTime=bibloc>
W. Eliot Kimber, Senior Consulting SGML Engineer
ISOGEN International Corp.
2200 N. Lamar St., Suite 230, Dallas, TX 75202.  214.953.0004
www.isogen.com
</Address>

DSSSList info and archive:  http://www.mulberrytech.com/dsssl/dssslist