[Mirrored from: http://www5conf.inria.fr/fich_html/slides/dday/sgml/all.htm]

An SGML-Based Web Server

Jon Bosak, SunSoft

What is SGML?

Major industry DTDs (markup languages)

ATA 2100     aircraft industry
CALS         military, aerospace
CMC          pharmaceuticals
PCIS         semiconductors
DocBook      computer software
IBMIDDoc     IBM software
SAE J2008    automobile manufacturing
TMC T2008    truck manufacturing
TIM          telecommunications
EDGAR        Securities and Exchange Commission
ISO 12083    journal, book, and magazine publishing
ICADD        publishing for the print-disabled
TEI          academic and scholarly publishing
UTF          news media
HTML         World Wide Web

HTML is just one of many standardized special-purpose SGML markup languages.

HTML vs. most other SGML languages

Basic document model from DocBook:

<!ELEMENT Book - - ((Title, TitleAbbrev?)?, BookInfo?, ToC?, LoT*, Preface*,
                (((%chapter.gp;)+, Reference*) | Part+ | Reference+ |
                Article+), (%appendix.gp;)*, Glossary?, Bibliography?,
                (%index.gp;)*, LoT*, ToC? ) +(%ubiq.gp;) >
[...]
<!ELEMENT Chapter - - (DocInfo?, Title, TitleAbbrev?, (%sect1.gp;), (Index |
                Glossary | Bibliography)*) +(%ubiq.gp;) >
[...previously defined:]
<!ENTITY % sect1.gp "((%component.gp;)+, (Sect1* | RefEntry*)) | Sect1+ |
                RefEntry+" >
[...]
<!ELEMENT Sect1 - - (Title, TitleAbbrev?, (%nav.gp;)*, (((%component.gp;)+,
                (RefEntry* | Sect2*)) | RefEntry+ | Sect2+), (%nav.gp;)*)
                +(%ubiq.gp;) >

Basic document model from HTML 2.0:

<!ENTITY % html.content "HEAD, BODY">
<!ELEMENT HTML O O  (%html.content)>

<!ENTITY % body.content "(%heading | %text | %block | HR | ADDRESS)*">
<!ELEMENT BODY O O  %body.content>

HTML documents differ from documents marked up in most other standard SGML languages in that they lack a controlled hierarchical structure.
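
The difference between the two content models can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the two regular expressions are heavily simplified stand-ins for the HTML 2.0 BODY model and the DocBook Chapter model quoted above (not the real DTDs), and the helper name `matches` is hypothetical.

```python
import re

# Simplified stand-ins for the two content models (assumptions, not the real DTDs).
# A sequence of child elements is encoded as each element name followed by a space.
FLAT_BODY = re.compile(r"(?:(?:H1|H2|H3|P|UL|HR|ADDRESS) )*")
CHAPTER = re.compile(r"Title (?:TitleAbbrev )?(?:(?:Para )+(?:Sect1 )*|(?:Sect1 )+)")


def matches(model, children):
    """Return True if the sequence of child element names satisfies the model."""
    encoded = "".join(name + " " for name in children)
    return model.fullmatch(encoded) is not None


# Flat model: any mix of headings and blocks is accepted, in any order.
print(matches(FLAT_BODY, ["H3", "P", "H1", "H2"]))

# Hierarchical model: a chapter must open with a title, then content or sections.
print(matches(CHAPTER, ["Title", "Para", "Sect1"]))
print(matches(CHAPTER, ["Para", "Title", "Sect1"]))
```

The flat model accepts an H3 before an H1 without complaint, while the chapter-like model rejects a sequence that does not begin with a title. That enforced ordering and nesting is what a tool can later rely on.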

Implications of the HTML content model

HTML is too limited to serve as an adequate data format for large-scale commercial publishing.

HTML tools vs. generic SGML tools

HTML as a server format

Advantages of HTML on the server

. . . but it does not scale

Example:

3125 manually created hypertext links are required to make a table of contents for this one book

And what about revisions?

HTML as a client format

An HTML-based Web server is limited to a flat, unorganized (or tediously handcrafted) document space. A generic SGML Web server, on the other hand, delivers the power of a hierarchical object-oriented document database.
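
To illustrate why a hierarchical document space removes the handcrafting, here is a minimal Python sketch that derives table-of-contents links from a nested structure. Everything in it is an assumption made for illustration: the `Section` class, the `toc_entries` function, and the `sec-1.2` anchor scheme are hypothetical and do not describe any particular SGML server.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Section:
    """A node in a hypothetical document tree: a title plus nested subsections."""
    title: str
    children: List["Section"] = field(default_factory=list)


def toc_entries(node, path=()):
    """Yield (anchor, depth, title) for every section below node, in document order."""
    for i, child in enumerate(node.children, start=1):
        child_path = path + (i,)
        anchor = "sec-" + ".".join(str(n) for n in child_path)
        yield anchor, len(child_path), child.title
        yield from toc_entries(child, child_path)


book = Section("Book", [
    Section("Chapter 1", [Section("Section 1.1"), Section("Section 1.2")]),
    Section("Chapter 2", [Section("Section 2.1")]),
])

toc = list(toc_entries(book))
for anchor, depth, title in toc:
    # Indent by depth so the output mirrors the document hierarchy.
    print("  " * (depth - 1) + f'<A HREF="#{anchor}">{title}</A>')
print(len(toc), "links generated")
```

Because the links are computed from the structure, revising the book means regenerating the table of contents rather than editing each link by hand.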

Advantages of generic SGML on the server

Some SGML-based Web servers

http://occam.sjf.novell.com:8080/docs/toc.pubs_server.html

http://www.sgi.com/Technology/TechPubs/lib/display.cgi?4097

http://cobweb.sybase.com:8000/

And see:

http://www.w3.org/pub/Conferences/WWW4/Papers/112

Case study: Novell

Note: The speaker left Novell to work for SunSoft in January 1996. All descriptions of Novell's document server are valid as of that date but should not be taken as necessarily describing Novell's current direction. However, everything said in this presentation may be taken in a general way as applying to SunSoft's current direction.

Novell's problem (1991-1994)

They started with this...

And needed to get to all of these...

Wrong answer

This is an m × n solution.

Right answer

This is an m + n solution.
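
The arithmetic behind the two answers can be sketched as follows. The slide diagrams are not reproduced here, so this assumes the usual reading: m source formats, n delivery formats, and a single common intermediate format in the "right answer". The function names are hypothetical.

```python
def direct_converters(m: int, n: int) -> int:
    """One dedicated converter per (source format, output format) pair."""
    return m * n


def hub_converters(m: int, n: int) -> int:
    """m converters into the common format plus n converters out of it."""
    return m + n


for m, n in [(2, 3), (5, 8), (10, 20)]:
    print(f"m={m:2d} n={n:2d}  direct={direct_converters(m, n):3d}  "
          f"via common format={hub_converters(m, n):2d}")
```

Under the same assumption, adding one new output format costs m new converters in the direct scheme but only one in the common-format scheme.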

The Novell Publications Server (January 1996)

The Novell Publications Server (future)

The next step: generic SGML on the Web

Example:

http://www.ncsa.uiuc.edu/SDG/Software/WinMosaic/Viewers/panorama.htm

then

http://www.sq.com

The case for generic SGML on the Web

Examples of distributed document processing requiring generic SGML