[This local archive copy is from the official and canonical URL, http://www.w3.org/TR/1998/NOTE-XSL-and-CSS-19980911; please refer to the canonical source document if possible.]
Copyright © 1998 W3C (MIT, INRIA, Keio), All Rights Reserved. W3C liability, trademark, document use and software licensing rules apply.
This document is a NOTE made available by the W3 Consortium for discussion purposes. This indicates no endorsement of its content, nor that the Consortium has, is, or will be allocating any resources to the issues addressed by the NOTE.
The note is published in the hope that it may provide a useful viewpoint for understanding the relation between various Web specifications.
Comments should be sent to the authors.
This W3C Note describes how XSL [1] and CSS [2] can be used together. In particular, it discusses how XSL can be used as a bridge between complex XML-based documents and the CSS formatting model. It gives an outline of a system for displaying documents in XML-based formats as human-readable, or human-audible, text. To use the CSS properties in the language of XSL, it is necessary to invent an XML-based syntax, compatible with XSL, to represent CSS's properties. No new CSS properties, or other formatting semantics, are defined in this document.
CSS is a powerful and easy to use formatting language. The two levels defined to date, CSS1 and CSS2, offer a wealth of formatting properties, and the next level promises to add even more. CSS is implemented by many programs and the experience from those implementations is being fed back into the development of more advanced formatting properties.
CSS, however, is only a formatting language: it attaches style properties to the elements of a source document. It lacks facilities commonly found in report generators, mail-merge programs, etc., for massaging a set of data into a human-readable format. It assumes that process has been done by an external program. In effect, that is how much of the information on the Web today is produced: information from a database at the server side is extracted and put into an HTML template, which is sent to a client (browser) and formatted and displayed.
With the advent of XML, the expectation is that in many cases the original data, rather than the HTML representation of it, will be sent to the client. This gives the client a richer data-set to work with, but data transformations may be necessary. XSL will be able to perform these transformations.
We can see several ways of using XSL and CSS together:
This note only concerns itself with the last method. This document shows how the set of "CSS objects" might be defined.
The XSL language is still under development. At the time of writing, it is a W3C Working Draft. All syntax shown here is therefore tentative, and only meant to introduce the concepts.
The bulk of an XSL sheet is a series of pattern-action rules. The patterns are similar to CSS's selectors (in function, not necessarily in syntax), but the action part may create an arbitrary number of "objects." The action part of the rule is called the "template" in XSL, and a template and a pattern together are referred to as a "template rule."
An author of an XSL sheet selects a suitable set of objects for the task at hand. The set of objects could be anything for which a specification exists that defines their syntax inside XSL templates; below we show how that specification might look for CSS. The objects need not be formatting objects: they could, e.g., be objects that create SMIL elements or RDF elements. In principle, when an XML syntax for a data format already exists, it should be fairly easy to derive an XSL template format from that.
An XSL sheet looks like an XML document with a mixture of two kinds of elements: those defined by XSL and those defined by the object language.
The result of applying all matching patterns to a document recursively is a tree of objects. The resulting tree of objects is then interpreted, top-down, according to the definition of each object. If they are (hypothetical) HTML objects, they will produce an HTML document, probably one HTML element for each HTML object. If they are, as in this note, CSS objects, they will produce a certain rendering on screen or in some other medium.
To give a simple example, the template rule below shows how one XML element ("partnumber") expands to a series of CSS objects, with the content of the XML element expanded inside it. The "process-children" element indicates the place where the content of partnumber should be put. The elements "template" and "process-children" are defined by XSL; "chunk" is a CSS object (defined by this note). The non-XSL elements must be prefixed with a short string that ends in a colon, to distinguish them from XSL keywords; in this note we've used "css:".
<template match="partnumber">
  <css:chunk display="block" font-weight="bold" margin-top="20px">
    <css:chunk display="inline" color="red">
      Part number:<css:space/>
    </css:chunk>
    <css:chunk color="green">
      <process-children/>
    </css:chunk>
  </css:chunk> <!-- end of block -->
</template>
CSS also supports aural renderings. A similar template that produces CSS objects for audio output might be:
<template match="partnumber">
  <css:chunk speak="normal" voice-family="female"
             cue-before="partnumber-jingle.au" pause-after="15ms">
    Part number:
    <process-children/>
  </css:chunk>
</template>
These examples only serve to give the flavor of XSL. XSL supports much more powerful transformations than these two examples show.
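The pattern-action model described above, in which matching rules are applied recursively to produce a tree of objects, can also be illustrated outside of XSL. The following is only a minimal sketch in Python, not XSL itself: the rule table RULES, the function transform and the "emph" element are hypothetical names invented for the illustration, patterns are reduced to plain element names, and the objects are written as "chunk" without the "css:" prefix.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule table: maps a source element name (the "pattern")
# to the CSS properties of the chunk it produces (the "template").
RULES = {
    "partnumber": {"display": "block", "font-weight": "bold"},
    "emph": {"display": "inline", "font-style": "italic"},
}


def transform(source):
    """Apply the matching rule to `source`, then recurse into its children."""
    props = RULES.get(source.tag, {})   # no rule: a chunk without properties
    chunk = ET.Element("chunk", props)
    chunk.text = source.text            # text before the first child
    for child in source:                # plays the role of <process-children/>
        result = transform(child)
        result.tail = child.tail        # text that follows the child
        chunk.append(result)
    return chunk


source = ET.fromstring("<partnumber>AB-<emph>42</emph></partnumber>")
tree = transform(source)
print(ET.tostring(tree, encoding="unicode"))
```

The result is a tree of chunk objects that mirrors the source tree; a real XSL template may of course create any number of objects per source element, which this sketch does not attempt.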
As explained above, XSL is designed to be used with different sets of objects. The result-ns attribute at the top of the XSL sheet declares the short string that is used as the prefix (we've chosen "css" in this note, but we could have used "fo", or "r", or anything else), and another attribute then binds the prefix to the definition of the objects:
<stylesheet result-ns="css" xmlns:css="http://www.w3.org/TR/XSL-for-CSS">
  ... rest of sheet, with CSS objects...
</stylesheet>
Note: as explained earlier, the syntax of XSL is still being developed. Although there will be ways to write selectors (patterns) and templates, and to declare the set of objects, the syntax will not be frozen until XSL is issued as a W3C Recommendation.
CSS doesn't have an XML syntax, which makes defining its XSL template syntax slightly harder than it would be for, e.g., SMIL or MathML. Below are a few principles for the conversion, and some examples of the result.
CSS has properties like font and border, but also font-size and border-top, which allow small aspects of a font or border to be specified. The CSS objects for XSL could either maintain this redundancy, or allow only one way to set a property. To minimize the number of surprises for people using the CSS objects, allowing both the shorthand and the individual properties is probably advisable.
CSS properties become XML attributes in the XSL syntax. Some CSS properties (font-family, content) accept both quoted strings and keywords. In the XSL syntax that would become font-family="'Times Roman', serif", which invites errors. Some possible ways to avoid double quoting are given below.
The main CSS object is called chunk. A chunk has properties and usually some text content and/or embedded objects (often other chunks). If the output medium is visual, a chunk typically produces a single box, although it may also produce multiple boxes, if its 'display' property is 'inline', or no boxes at all, if 'display' is 'none'.
Some auxiliary objects may be necessary, either variants of chunks with extra functionality (e.g., anchor), or objects to get around restrictions on XSL syntax (e.g., switch).
A chunk is reminiscent of the {...}-block of a normal CSS rule. For example:
{ font-size: 10px; color: #FB9; text-indent: 1em }
would become:
<css:chunk font-size="10px" color="#FB9" text-indent="1em">
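The correspondence is mechanical enough to sketch in a few lines of Python. This is a rough illustration only: the function name declarations_to_chunk is hypothetical, and the naive split on ";" assumes that no property value contains a semicolon inside a quoted string.

```python
from xml.sax.saxutils import quoteattr


def declarations_to_chunk(block):
    """Turn '{ prop: value; ... }' into a <css:chunk ...> start tag."""
    body = block.strip().lstrip("{").rstrip("}")
    attributes = []
    for declaration in body.split(";"):
        if not declaration.strip():
            continue                    # skip the empty piece after a final ';'
        name, _, value = declaration.partition(":")
        # quoteattr adds the surrounding quotes and escapes XML specials
        attributes.append(f"{name.strip()}={quoteattr(value.strip())}")
    return "<css:chunk " + " ".join(attributes) + ">"


print(declarations_to_chunk("{ font-size: 10px; color: #FB9; text-indent: 1em }"))
```

Called on the declaration block above, the sketch prints the same start tag as shown in the example.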
Pseudo-classes in CSS serve to select elements based on information other than what can be learned from the source document. Examples are ":active," ":visited," and ":hover." One way to handle them is with a switch object that contains chunks for all possible states, and to let the renderer switch between the chunks, based on the truth of some condition attached to them. For example:
<css:switch text-decoration="underline" background="red" font-style="italic">
  <css:chunk condition="active | hover" color="...">...</css:chunk>
  <css:chunk condition="visited" color="...">...</css:chunk>
</css:switch>
The switch object contains the properties common to the alternatives, and each alternative has a condition attribute that contains the condition under which this chunk is displayed. (If there is more than one URL, a condition like "visited" also needs a way to indicate which URL is visited.)
Note that the "first-child" pseudo-class is handled by XSL's patterns directly.
Pseudo-elements in CSS refer to parts of a displayed element for which there is no (or cannot be) mark-up in the source document. The ":before" and ":after" pseudo-elements are used to insert new elements where the source didn't have any, and ":first-letter" and ":first-line" refer to the first letter/line of a block box as actually displayed on the screen.
Since XSL templates allow an arbitrary number of objects to be created, the ":before" and ":after" are automatically catered for. The ":first-letter" and ":first-line" probably need something like the switch object above. For example:
<css:compound font="12pt Times" line-height="1.2" text-align="left">
  <css:first-line font-variant="small-caps" color="green"/>
  <process-children/>
</css:compound>
The compound object is like a normal chunk, but it may have two special children, first-line and first-letter, that hold the properties of the pseudo-elements.
For paged media, CSS2 allows the characteristics of the pages to be described with @-rules. Since XSL has no place for global declarations (at least not in the August 1998 draft), the best place to put them is probably near the root of the generated object tree. An @page-rule might translate to a page object:
<template match="/">
  <css:page size="landscape" margin="1.5in 1in" marks="crop"/>
  <css:page name="left" .../>
  <css:page name="rotated" .../>
</template>
Selection of the output medium could be handled outside of XSL, as it is for CSS level 1. That means that to write an XSL sheet for two media, say print and screen, one has to write two sheets. They could still import a sheet with common rules.
Another @-rule in CSS is the one for defining Web-fonts. These, again, should probably become objects that are attached close to the root of the object tree:
<template match="/">
  <css:font font-family="Pantani" panose1="4726402695" font-style="all" />
</template>
Replaced elements could be handled with a replaced object, which has the combined attributes of the chunk and the object element from HTML:
<css:replaced src="/Icons/w3c_home.png" type="image/png" params="..." border="solid red" .../>
Hyperlink source anchors and form elements have not only style, but also a behavior when a user activates them. In CSS that behavior doesn't need to be specified, since the displayed boxes on the screen have a direct link to elements in the source, and those elements come with their own semantics. Because of the transformation that takes place in XSL templates, that back-link is not available, and the transformation needs to carry any behavior information forward from the source elements to the generated objects. How this can best be done is still an open issue. Introducing objects like anchor (like chunk, but with an extra href attribute), and form, input, etc. may be a solution.
To make the syntax slightly easier to read, it may be possible to make counters into counter objects. XSL allows literal text to be inserted in the templates directly, obviating the need for the content property.
There are several possibilities for defining whitespace handling inside templates. The easiest seems to be to define that whitespace in templates is not significant, except as separator between words. Another way to express this is to say that whitespace is collapsed: leading and trailing whitespace is removed, and any other sequences of whitespace characters replaced by a single space. To get extra spaces, or newlines or tabs, they have to be inserted explicitly. Below we will use space to insert a space and newline to insert a hard line break. XSL will probably provide a generic text object for that (somewhat similar to the pre of HTML), in which case space and newline can be defined as XSL macros.
<template match="fig">
  <css:chunk text-align="center">
    Figure <css:space/>
    <css:counter name="figno" style="upper-alpha" font-weight="bold" color="blue" />
    <css:newline/>
    <process-children/>
  </css:chunk>
</template>
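The collapsing rule described above (remove leading and trailing whitespace, replace every other run of whitespace by a single space) is small enough to state as code. The Python sketch below is an illustration only; the name collapse_whitespace is hypothetical, and "whitespace" is taken to mean the four XML whitespace characters: space, tab, carriage return and line feed.

```python
import re

# The four characters that XML treats as white space.
XML_WHITESPACE = re.compile(r"[ \t\r\n]+")


def collapse_whitespace(text):
    """Squeeze each whitespace run to one space, then trim both ends."""
    return XML_WHITESPACE.sub(" ", text).strip(" ")


print(repr(collapse_whitespace("  Figure \n\t  caption   text ")))
```

Under this rule the line breaks and indentation inside a template never reach the output, which is why explicit objects such as space and newline are needed.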
XSL provides a predefined object that inserts the value of an attribute. Using that, the attr(...) function of CSS would be replaced by
<value-of expr="attribute(...)" />
Font-family is another property that uses a mixture of quoted strings and keywords. It could be split into two. That would make the inheritance model different, but avoid quotes inside quotes:
<css:chunk font-family="Helvetica, gill sans" generic-font-family="sans-serif">...</css:chunk>
text-align also accepts strings; a similar split might be possible.
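For font-family, the split into named families and a generic family could be derived from an existing CSS value roughly as follows. This is a sketch under assumptions: the function split_font_family is hypothetical, the set of generic names is the one defined by CSS2, and the naive split on "," assumes that no quoted family name contains a comma.

```python
# The five generic font families defined by CSS2.
GENERIC_FAMILIES = {"serif", "sans-serif", "cursive", "fantasy", "monospace"}


def split_font_family(value):
    """Split a CSS font-family value into (named families, generic family)."""
    named, generic = [], None
    for item in value.split(","):
        item = item.strip()
        if item in GENERIC_FAMILIES:
            generic = item                    # unquoted keyword
        elif item:
            named.append(item.strip("'\""))   # drop the CSS quotes
    return ", ".join(named), generic


print(split_font_family("'Times Roman', serif"))
```

The first part of the result would go into the font-family attribute and the second into generic-font-family, so that neither attribute value needs quotes inside quotes.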
It may lead to more easily readable sheets if a small number of convenience objects, in the form of "subclasses" or "curried" versions of chunk, is added. Obvious candidates are hr (horizontal rule, a chunk with certain border properties preset), block (a chunk with 'display' preset to 'block'), and inline (analogous for 'inline').
Here is an example of a template rule that uses those derived objects. It formats para elements from the source document as CSS block elements with a red pilcrow sign at the end.
<template match="para">
  <css:block margin-top="1.2em">
    <process-children />
    <css:inline color="#F00">¶</css:inline>
  </css:block>
</template>
The authors wish to thank Tim Berners-Lee, James Clark, Martin Dürst, Chris Lilley, Vincent Quint and Steve Zilles for their comments on this Note. Still, any views expressed in this Note are solely those of the authors.