[Archive copy of "Overview: XML, HTML, and all that", by Jon Bosak, Sun Microsystems, presented on April 11, 1997. Note that this text-only version is greatly impoverished; see the .ZIP archive if possible.]

Overview: XML, HTML, and all that

Jon Bosak
Sun Microsystems
April 11, 1997


What is text, really?

Structured markup: the basic idea

Why structured markup?

The separation of presentation from structure and content makes possible

Is HTML structured markup?

Specific features missing from HTML

HTML is not the optimum data format for database interchange or certain kinds of large-scale commercial publishing.

What is SGML?

Major industry DTDs (markup languages)

ATA 2100     aircraft industry
CALS         military, aerospace
DocBook      computer software
IBMIDDoc     IBM software
SAE J2008    automobile manufacturing
TMC T2008    truck manufacturing
EDGAR        Securities and Exchange Commission
ISO 12083    journal, book, and magazine publishing
ICADD        publishing for the print-disabled
TEI          academic and scholarly publishing
UTF          news media
HTML         World Wide Web

Advantages of generic SGML

Why not just put SGML on the Web?

SGML does provide the key features needed to support future large-scale data-intensive Web applications...

...but SGML is too big and far too complex for most Web software developers (not to mention site administrators).

The attempt to put SGML on the Web

The W3C activity "Generic SGML on the Web" was suggested at WWW5 in Paris, and initial participants were recruited at SGML Europe in Munich (May 1996).

"The goal of the W3C SGML activity is to enable generic SGML to be served, received, and processed on the Web. As in the case of HTML, the implementation of SGML on the Web will require attention not just to structure and content, but also to the standardization of linking and display functions."

What actually happened

XML Part 1 is a self-contained, easy-to-implement subset of SGML for use on the Web.

Current working draft of Part 1: http://www.w3.org/pub/WWW/TR/WD-xml-lang-970331.html

XML syntax vs. HTML syntax
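
As a rough illustration of the syntactic difference (a sketch, not taken from the talk): HTML tolerates omitted end tags, whereas XML requires every element to be closed explicitly, either with an end tag or with the empty-element form. The snippet below uses the parser in Python's standard xml.etree.ElementTree module; the sample markup and the helper name is_well_formed are invented for illustration.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if `text` parses as a well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# HTML-style markup: <p> and <br> are never closed.
html_style = "<body><p>First paragraph<br><p>Second paragraph</body>"

# XML-style markup: every element is closed; the empty element is <br/>.
xml_style = "<body><p>First paragraph<br/></p><p>Second paragraph</p></body>"

print(is_well_formed(html_style))  # False
print(is_well_formed(xml_style))   # True
```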


Current overview of the activity

The XML activity can be summed up as the adaptation of existing international publishing standards for use on the Web.

Is XML intended to replace HTML?

In a word: No.

Major XML application area #1: Database interchange

Example: Health care data

Bogus Web solution

Real Web solution

Why HTML can't handle the health care example

Generalized data exchange

The role of a hub format

Other applications fitting the database interchange model

Major XML application area #2: Distributed processing

Example: semiconductor data

Why HTML can't handle the semiconductor example

  1. It requires industry-specific markup that cannot be implemented within the confines of the fixed HTML tag set.

  2. It requires that the data representation be platform- and vendor-independent so that data from a variety of sources can be used to drive a variety of distributed applications.
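
The two requirements above can be sketched as follows. This is a hypothetical example, not the one shown in the talk: the element and attribute names (datasheet, part, supply-voltage, max-clock) are invented and do not come from any real semiconductor DTD. The point is only that a generic, vendor-neutral parser (here Python's standard xml.etree.ElementTree) can drive an application from industry-specific tags that the fixed HTML tag set has no room for.

```python
import xml.etree.ElementTree as ET

# Hypothetical industry-specific markup; none of these element or
# attribute names come from a real semiconductor DTD.
DATASHEET = """
<datasheet vendor="ExampleCorp">
  <part number="EX-100">
    <supply-voltage unit="V">3.3</supply-voltage>
    <max-clock unit="MHz">66</max-clock>
  </part>
  <part number="EX-200">
    <supply-voltage unit="V">5.0</supply-voltage>
    <max-clock unit="MHz">100</max-clock>
  </part>
</datasheet>
"""

def parts_by_voltage(xml_text, max_volts):
    """Return part numbers whose supply voltage is at most `max_volts`."""
    root = ET.fromstring(xml_text)
    selected = []
    for part in root.findall("part"):
        volts = float(part.findtext("supply-voltage"))
        if volts <= max_volts:
            selected.append(part.get("number"))
    return selected

print(parts_by_voltage(DATASHEET, 3.5))  # ['EX-100']
```

Because the selection logic depends only on the agreed element names, the same function would work on data from any source that follows the same markup.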

This is a widely applicable model!

Other applications fitting the distributed processing model

Major XML application area #3: User views of the data

Example: Views of our Solaris documentation, server-mediated...

... and client-mediated

Other applications that need user-controlled views

Major XML application area #4: Web agents

Example: the 500-channel TV guide

Once again, a category of applications depends on the ability to standardize on a form of data representation for a particular industry or problem domain.
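
A minimal sketch of such an agent, assuming an invented listings format (the listings and program elements and their attributes are hypothetical, as are the channel numbers and titles): once every provider publishes in the same representation, one small routine can search all of the feeds on the user's behalf.

```python
import xml.etree.ElementTree as ET

# Two hypothetical listing feeds that share one invented format.
FEED_A = """<listings channel="101">
  <program start="20:00" category="science">Deep Sea Survey</program>
  <program start="21:00" category="sports">Evening Match</program>
</listings>"""

FEED_B = """<listings channel="347">
  <program start="20:30" category="science">Orbit Report</program>
</listings>"""

def find_programs(feeds, category):
    """Collect (start, channel, title) for programs in `category`, sorted by start time."""
    hits = []
    for feed in feeds:
        root = ET.fromstring(feed)
        for program in root.iter("program"):
            if program.get("category") == category:
                hits.append((program.get("start"), root.get("channel"), program.text))
    return sorted(hits)

for start, channel, title in find_programs([FEED_A, FEED_B], "science"):
    print(start, channel, title)
```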

Hyperlinking in XML

(Tim Bray will present this.)

XML semantics

How do we make XML documents do something?

  1. Stylesheet-based approaches

  2. Programmatic approaches
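
To make the second approach concrete, here is a small sketch of a programmatic treatment, written in Python with the standard xml.etree.ElementTree module (the talk itself does not prescribe a language, and the function name render and the sample document are assumptions made for illustration). Instead of declaring rules in a stylesheet, application code walks the parsed tree and decides what each element means; here an emph element is rendered by wrapping its content in asterisks.

```python
import xml.etree.ElementTree as ET

def render(element):
    """Recursively turn an element into plain text, marking <emph> with asterisks."""
    inner = element.text or ""
    for child in element:
        inner += render(child)       # the child's own content
        inner += child.tail or ""    # text that follows the child
    if element.tag == "emph":
        return "*" + inner + "*"
    return inner

doc = ET.fromstring("<p>XML is <emph>not</emph> a replacement for HTML.</p>")
print(render(doc))  # XML is *not* a replacement for HTML.
```

A stylesheet expresses the same mapping declaratively, one rule per element type; the programmatic route trades that brevity for the freedom to do arbitrary computation on the data.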

CSS (Cascading Style Sheets)

Come to this afternoon's XML technology demo session to see what can be done with CSS and XML.

Limitations of CSS in complex applications

Document Style Semantics and Specification Language (DSSSL)

Some DSSSL stylesheet snippets

(element emph
  (make sequence
    font-weight: 'bold))

(define monospace-font-family "Courier")

(element code
  (make sequence
    font-family: monospace-font-family))

(declare-initial-value writing-mode 'left-to-right)
(declare-initial-value font-size 12pt)
(declare-initial-value line-spacing 14pt)
(declare-initial-value font-family "Helvetica")

(define paragraph-indent 24pt)

(element p
  (make paragraph
    first-line-start-indent: (if (first-sibling?)
                                 0pt
                                 paragraph-indent)))

(define chapter-title-style
  (style
    font-size: 18pt
    line-spacing: 24pt
    quadding: 'center))

(element (chapter title)
  (sosofo-append
    (make paragraph
      use: chapter-title-style
      span: 2
      (literal "Chapter "
               (format-number (ancestor-child-number) "1")))
    (make paragraph
      use: chapter-title-style
      span: 2)))

;; Raise b to a non-negative integer power n.
(define (expt b n)
  (if (= n 0)
      1
      (* b (expt b (- n 1)))))

DSSSL: Mixed languages

DSSSL: Mixed scripts

DSSSL: Top-to-bottom languages

DSSSL: Margin attachments

DSSSL: Rotated text areas

DSSSL: Multiple columns

DSSSL: Variant column flows

DSSSL: Column spans and zones

DSSSL: Synchronized columns

DSSSL: Asian language features

DSSSL: Advanced mathematics formatting

DSSSL demonstration: Jade


XML/Java demonstration: Jumbo

