Erik Naggum's review of DSSSL (DIS)


Article: 6913 of comp.text.sgml
Newsgroups: comp.text.sgml
Date: 05 Dec 1994 23:07:38 UT
From: Erik Naggum <>
Organization: Naggum Software; +47 2295 0313
Message-ID: <>
Subject: DSSSL
[this copy selected from comp.text.sgml Digest 5/6] 


I have read the DSSSL spec.  I think I understand DSSSL.  I like DSSSL.

DSSSL is a fundamental standard, in the sense that you can build on it, and
in the sense that you have to build on it.  DSSSL is a standard that you
are supposed to work with.  it solves no problems on its own, but it makes
solving problems a task I think the right kind of people will enjoy.  it's
systematic in its approach.  it appears to be complete.

this is not a review -- I won't be able to write that until after the
voting officially closes 1994-12-25, although I have heard unofficial
statements to the effect that it will close on 1995-01-25.

this is a note of congratulations to James Clark, to Sharon Adler, and the
whole DSSSL team.  this is a suggestion that if you don't have time to
review the DIS, at least one die-hard SGML'er thinks this is the best thing
that happened to the world of SGML since SGML itself, maybe more than that.
this is a strong suggestion that if you don't know whether you should vote
yea or nay on DSSSL, you will not stand alone in voting yea, and you may
very well stand alone in voting nay.  this is good stuff.  this deserves to
become an International Standard.  (to put this statement in perspective: I
would not have suggested you vote for HyTime if I had reviewed it equally
carefully before being swayed by politically motivated argumentation that
it was important that it pass.  I don't think it should have passed.  it
was and is an immature standard.  DSSSL is a solid piece of work.)

DSSSL appears to be simple and easy to understand because it uses a very
clean, straight-forward, down-to-business language to express itself.  this
can be deceptive.  it's harder than it looks at first sight.  James Clark
gave a talk at an SGML seminar in Norway sponsored by the Norwegian SGML
Users' Group, wherein he cleared up many of the difficult points by giving
me the necessary key pieces to see that the design _is_ really clean and
neat, despite some apparent hairiness.

the query language is declarative, for instance.  that means you only tell
the system what you want it to have done, not what it should do.  its verbs
are a little too close to "what it should do" for me, but once you get the
hang of this adverbial approach to action, the pieces fall into place.
DSSSL follows in the path blazed by the LISP family of languages, which can
be expressed in the aphorism: "you can have elegant interfaces to complex
implementations, or kludgey interfaces to simple implementations", that is:
this is not a language you sit down to implement right out of the book.
SGML is hairy and therefore hard to both use and implement.  DSSSL is
elegant and therefore hard to implement, but easy to use.  DSSSL has done
the Right Thing, and provides the programmer access to things he will need,
not things that naturally fall out of the implementation in the form of an
API.  (now I know why I hate API's and the whole business of standardizing
them: people can never get things Right if they focus too narrowly on the
implementation.)  now, note that DSSSL is a real _language_.  whether you
run the expressions in a DSSSL environment, or regard them as data to be
acted upon is not DSSSL's decision.  you could do either.  (in fact, you will
do both, but that's only for advanced users.)
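
a minimal sketch of the "what, not how" idea and of a query that is both
runnable and plain data.  this is Python, not DSSSL syntax; the tuple tree,
the ("descendants", name) query form, and the names run and describe are
all hypothetical, made up for illustration only.

```python
# Illustrative only: none of this is taken from the DSSSL specification.
# A node is (name, children); a child is either a node or a text string.
doc = ("doc", [("sec", [("title", ["one"]), ("p", ["text"])]),
               ("sec", [("title", ["two"])])])


def descendants(node, name):
    """Yield every element called `name` below `node`, in document order."""
    _, children = node
    for child in children:
        if isinstance(child, str):
            continue  # text has no element name and no children
        if child[0] == name:
            yield child
        yield from descendants(child, name)


def run(query, node):
    """Treat the query as a program: evaluate it against a tree."""
    op, name = query
    if op != "descendants":
        raise ValueError("unknown query operator: " + op)
    return list(descendants(node, name))


def describe(query):
    """Treat the same query as data: inspect it without evaluating it."""
    op, name = query
    return f"{op} of type {name}"


# The query says what is wanted; the tree walk stays out of sight.
query = ("descendants", "title")
titles = run(query, doc)
summary = describe(query)
```

the same tuple is handed to run and to describe, which is the point: one
expression, used once as something to execute and once as something to read.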

the choice of Scheme as the expression language offers several obvious
advantages, and only one disadvantage: it will include a large number of
open and close parentheses.  (note: this is the standard joke argument
against LISP-like languages, which unfortunately too many neophytes take
seriously.)  if you're used to a large number of open and close angle
brackets, or even open and close tags, this is not an issue.  in fact, the
similarities between SGML and LISP-like languages are quite pronounced,
although central figures would deny this.
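
to make the similarity concrete, here is a small sketch that writes one and
the same tree once with tags and once with parentheses.  it is Python, and
the tuple representation and the function names to_sgml and to_sexpr are
assumptions made for this illustration, not anything DSSSL or SGML defines.

```python
# Illustrative only: a tiny element tree rendered two ways.
# A node is (name, children); a child is either a node or a text string.

def to_sgml(node):
    """Render the tree with open and close tags."""
    if isinstance(node, str):
        return node
    name, children = node
    inner = "".join(to_sgml(child) for child in children)
    return f"<{name}>{inner}</{name}>"


def to_sexpr(node):
    """Render the same tree with open and close parentheses."""
    if isinstance(node, str):
        return '"' + node + '"'
    name, children = node
    parts = [name] + [to_sexpr(child) for child in children]
    return "(" + " ".join(parts) + ")"


doc = ("sec", [("title", ["DSSSL"]), ("p", ["this is good stuff."])])

sgml = to_sgml(doc)
# <sec><title>DSSSL</title><p>this is good stuff.</p></sec>

sexpr = to_sexpr(doc)
# (sec (title "DSSSL") (p "this is good stuff."))
```

both renderings carry the same nesting; only the delimiters differ.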

the most obvious advantage of using Scheme is that the DSSSL team built on
the decades of experience that went into Scheme, not having to invent their
own language.  the second most obvious advantage of using Scheme is that
several of the large SGML vendors are already using languages from the LISP
family in their products, if not Scheme itself, and it has an inordinately
simple syntax that you learn in half an hour.  the language itself is one
of the most successful languages in terms of expressive power mastered per
unit of time expended in learning it.  this is programming language done
right.  unfortunately, getting it to be fast requires brains.  that leaves
your brain to think of other issues.  not everybody should care about how
to make computers fast -- some should also care what they should do.

well, enough marketing for Scheme.  once you try it, you'll understand.  be
warned that lots of people approach Scheme the same way they approach
controversial books (The Bell Curve comes to mind): they don't read it,
they don't know any of the things it actually says, but they have a hell of
a lot of opinions about it.  don't listen.  go read the book (i.e., DSSSL),
and get your paws on a Scheme system.  comp.lang.lisp and comp.lang.scheme
have FAQs that are two miles long, with an incredible number of LISP and
Scheme implementations, some as small as 64K of code on a SPARC (SIOD),
some as big as 10M of code on a SPARC (MIT-Scheme).

it has been said that there are two types of people: those who divide
people into two types and those who don't.  DSSSL will tell you which of
two types of SGML user you are.  EITHER you delve into the SGML Tree
Transformation Process (STTP), and you will remain there to play with SGML
transformation and manipulate documents and structures, and discover that
DSSSL makes it much easier (but still not easy) to compare SGML documents,
to merge deltas and fragments from several authors, to split SGML documents
into parts and to extract useful information from an SGML document.  I
wasn't thrilled about HyQ, because I never quite could figure out what kind
of result type it would return.  with DSSSL I know.  (well, I still have a
few questions in this department, but I think I know, as opposed to know I
don't.)  OR you delve into the SGML Tree Formatting Process (STFP) which
will enable you to attach formatting rules to queries, and you will find
yourself with an inordinately powerful language that can do very complex
things, but it still isn't PostScript.  the DSSSL Lite team has hooked onto
the STFP, and are doing wonders.  now, there's always a third alternative
when somebody tells you there are two choices: you can COMBINE STTP and
STFP, and you can now do an internal tree transformation that will prepare
for the tasks that are too hard to do in STFP alone, or, more correctly, are
too hard to do in one place.  this is synergy done right, too.  the mind
boggles at the possible outcomes of this combined process.  I don't yet
understand how I can use this to do all the things I'd like to do, but
that's because I have only been looking seriously at DSSSL for about two
months, off hours.  this is like getting a 500-piece LEGO set with all the
modern things in it and taking a time trip back to when you were still
creative and enthusiastic about new things.  (this is not a guarantee, so
no lawyers, please.)
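
a rough sketch of the combined process: first a tree-to-tree step, then a
formatting step with rules attached to element names.  this is Python, not
DSSSL, and it does not model the real STTP or STFP; the tuple tree, the
RULES table, and the names transform and format_tree are hypothetical.

```python
# Illustrative only: "transform first, then format" on a toy tree.
# A node is (name, children); a child is either a node or a text string.
doc = ("doc", [("sec", [("title", ["intro"]), ("p", ["hello"])]),
               ("sec", [("title", ["details"]), ("p", ["world"])])])


def number_sec(sec, n):
    """Return a copy of `sec` whose title text starts with its number."""
    name, children = sec
    new_children = []
    for child in children:
        if not isinstance(child, str) and child[0] == "title":
            new_children.append(("title", [f"{n}. "] + list(child[1])))
        else:
            new_children.append(child)
    return (name, new_children)


def transform(node):
    """Transformation step: build a new tree with numbered sections."""
    name, children = node
    out, n = [], 0
    for child in children:
        if not isinstance(child, str) and child[0] == "sec":
            n += 1
            out.append(number_sec(child, n))
        else:
            out.append(child)
    return (name, out)


# Formatting step: one rule per element name, applied to the formatted
# content of that element.  No rule has to know about section numbers.
RULES = {
    "doc": lambda text: text,
    "sec": lambda text: text + "\n",
    "title": lambda text: text.upper() + "\n",
    "p": lambda text: text + "\n",
}


def format_tree(node):
    """Format children first, then apply the rule for this element."""
    if isinstance(node, str):
        return node
    name, children = node
    inner = "".join(format_tree(child) for child in children)
    return RULES[name](inner)


output = format_tree(transform(doc))
```

the numbering would be awkward to express inside the formatting rules, since
each rule sees only its own element; doing it in a separate transformation
keeps each step simple, and the source tree is left untouched.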

the synergy between STTP and STFP is like the synergy between a DTD and an
LPD (Link Process Definition) in SGML, only this time with the semantics,
which I know from many failed experiments in telling people about how LINK
worked was the key abstraction that was too hard to grasp.  now, STTP and
STFP are what LINK should have been, and I really look forward to seeing them
become available in parsers and "SGML programming environments".

I'll leave it to others to do examples -- the ones I make are way too
abstract to tell people anything, anyway -- but I'm glad I've spent so much
time with DSSSL, and I look forward to seeing all the good stuff I think
people will do with this.  I'm not finished with DSSSL, either.  I'm only
really, really sorry that I got into it so late.  and to put this statement
in perspective as well: I have never been so dejected as when I found out
what people were actually _using_ SGML for, admittedly with a few bright
spots that did lighten up the whole dark winter night.  I think it will be
harder to screw up with DSSSL, because you can't really get anything useful
done if you don't grasp the core themes, which should take about two pages
of well-written prose by one who doesn't have to learn it thoroughly first
(i.e., not me).  best of all, I think DSSSL will make people write better
SGML.  as people perform better when there's a personally valuable reward
at suitable rest stops along the way and at the end, the fun that I think
DSSSL will bring back to SGML (at least for me), will nudge along those who
didn't really understand what SGML was about, and show them how they can do
wonders once they put things in container elements, and start using that
structure for something.  (that's with specific address to the HTML crowd,
if you didn't get it.)  you are encouraged to think in terms of structure,
not in terms of procedural layout commands when you work with DSSSL
queries.  you don't want to deal with all the state information that you
deal with in the other SGML processing languages, and you don't have to
fight state transitions.  with DSSSL, you don't think of them as such; they
fall out of the specifications.  that said, DSSSL is going to be a lot of
code to write, and I suspect that there's a market for graphical tools
here, too, but, please, don't cripple its elegance.  (take a look at
Apple's Dylan for how to make object-oriented language development
environments right, and still make them graphically oriented.)

I think the synergy of DSSSL with SGML is going to make SGML seem a lot
more attractive to those who have doubted and waited.  here's to cleaner
and more cost-effective SGML!

The check is in the mail.  This won't hurt.  You'll find it on the Web.