[This local archive copy is from the official and canonical URL, http://www.redrice.com/ci/generatingXslValidators.html, 1999-05-20; please refer to the canonical source document if possible.]
1999-05-20
An XSL stylesheet can be used to generate XSL validators from XML schemas. This document outlines the mechanics of this process and speculates on other uses for this technology.
This note builds directly on the idea of using XSL as a validator for XML schemas, described by Rick Jelliffe in "Using XSL for Structural Validation".
It is possible to hand-code an XSL stylesheet that will validate an XML document against some or all of the constraints of an XML schema. This note presents the case for generating such validator stylesheets automatically by transforming an XML schema through an XSL validator-generator. The resulting XSL validator can then be used at run-time to validate XML documents that claim to conform to the original XML schema. It returns an XML document that lists the invalid elements, or that is empty if the input document is valid.
It is perfectly plausible to suppose that a single schema language such as xml-schema will emerge as the dominant standard, and that parsers from leading parser suppliers will come to have built-in support for validating XML documents against schemas in that language, in the same way that there is already support for xml-data in the IE5 XML parser.
However the XSL approach has a number of specific advantages, namely that it will provide:
This illustration generates an XSL validator for part of the element content of a DCD schema.
<?xml version="1.0"?>
<DCD xmlns:RDF="http://www.w3.org/TR/WD-rdf-syntax#">
  <ElementDef Type="Booking" Model="Elements" Content="Closed">
    <Description>Describes an airline reservation</Description>
    <Element>LastName</Element>
    <Element>FirstInitial</Element>
    <Element>SeatRow</Element>
    <Element>SeatLetter</Element>
    <Element>Departure</Element>
    <Element>Class</Element>
  </ElementDef>
  <!-- example omits boring field declarations -->
  <ElementDef Type="SeatRow" Model="Data" Datatype="i1" Min="1" Max="72"/>
  <ElementDef Type="SeatLetter" Model="Data" Datatype="char" Min="A" Max="K"/>
  <ElementDef Type="Class" Model="Data" Datatype="char" Default="1"/>
</DCD>
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/TR/WD-xsl">

  <!-- match the root element -->
  <xsl:template match="/">
    <xsl:element name="zxsl:stylesheet">
      <xsl:attribute name="xmlns:zxsl">http://www.w3.org/TR/WD-xsl</xsl:attribute>

      <!-- the root processor -->
      <xsl:element name="zxsl:template">
        <xsl:attribute name="match">/</xsl:attribute>
        <xsl:element name="zxsl:apply-templates">
          <xsl:attribute name="select">*</xsl:attribute>
        </xsl:element>
      </xsl:element>

      <!-- stick the error processor in place -->
      <xsl:element name="zxsl:template">
        <xsl:attribute name="match">*</xsl:attribute>
        <P>Error: <xsl:element name="zxsl:node-name"/> <xsl:element name="zxsl:value-of"/></P>
      </xsl:element>

      <xsl:apply-templates select="*|@*|comment()|text()"/>
    </xsl:element>
  </xsl:template>

  <!-- priority doesn't work in IE5 - make this the lowest priority match by putting it first... -->
  <xsl:template match="*|@*|comment()|pi()|text()">
    <xsl:apply-templates select="*|@*|comment()|pi()|text()"/>
  </xsl:template>

  <!-- pick up the top level ElementDefs -->
  <xsl:template match="DCD/ElementDef">
    <xsl:element name="zxsl:template">
      <xsl:attribute name="match">/<xsl:value-of select="@Type"/></xsl:attribute>
      <xsl:element name="zxsl:apply-templates"/>
    </xsl:element>
    <xsl:apply-templates select="*|@*|comment()|pi()|text()"/>
  </xsl:template>

  <!-- now the content models -->
  <xsl:template match="ElementDef/Element">
    <xsl:element name="zxsl:template">
      <xsl:attribute name="match"><xsl:value-of select="../@Type"/>/<xsl:value-of/></xsl:attribute>
      <xsl:element name="zxsl:apply-templates"/>
    </xsl:element>
    <xsl:apply-templates select="*|@*|comment()|pi()|text()"/>
  </xsl:template>

</xsl:stylesheet>
<zxsl:stylesheet xmlns:zxsl="http://www.w3.org/TR/WD-xsl">
  <zxsl:template match="/">
    <zxsl:apply-templates select="*" />
  </zxsl:template>
  <zxsl:template match="*">
    <P>Error: <zxsl:node-name /> <zxsl:value-of /></P>
  </zxsl:template>
  <zxsl:template match="/Booking">
    <zxsl:apply-templates />
  </zxsl:template>
  <zxsl:template match="Booking/LastName">
    <zxsl:apply-templates />
  </zxsl:template>
  <zxsl:template match="Booking/FirstInitial">
    <zxsl:apply-templates />
  </zxsl:template>
  <zxsl:template match="Booking/SeatRow">
    <zxsl:apply-templates />
  </zxsl:template>
  <zxsl:template match="Booking/SeatLetter">
    <zxsl:apply-templates />
  </zxsl:template>
  <zxsl:template match="Booking/Departure">
    <zxsl:apply-templates />
  </zxsl:template>
  <zxsl:template match="Booking/Class">
    <zxsl:apply-templates />
  </zxsl:template>
  <zxsl:template match="/SeatRow">
    <zxsl:apply-templates />
  </zxsl:template>
  <zxsl:template match="/SeatLetter">
    <zxsl:apply-templates />
  </zxsl:template>
  <zxsl:template match="/Class">
    <zxsl:apply-templates />
  </zxsl:template>
</zxsl:stylesheet>
<Booking>
  <LastName>Bray</LastName>
  <FirstInitial>T</FirstInitial>
  <SeatRow>33</SeatRow>
  <SeatLetter>B</SeatLetter>
  <Departure>1997-05-24T07:55:00+1</Departure>
  <Offence>Speeding</Offence>
</Booking>
<P>Error: Offence Speeding</P>
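The generator in this illustration targets the IE5 WD-xsl dialect and builds every output instruction with xsl:element. As a rough sketch only, the same generator might be written as follows in the later XSLT 1.0 syntax, where xsl:namespace-alias lets the generated instructions appear as literal elements. The namespace urn:example:generated-xsl and the Errors wrapper element are assumptions made for this sketch and are not part of the original example.

```xml
<?xml version="1.0"?>
<!-- Sketch only: XSLT 1.0 syntax, not the WD-xsl dialect used in the illustration.
     urn:example:generated-xsl is a placeholder namespace chosen for this sketch. -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:zxsl="urn:example:generated-xsl">

  <!-- Elements written with the zxsl prefix are output in the XSLT namespace -->
  <xsl:namespace-alias stylesheet-prefix="zxsl" result-prefix="xsl"/>
  <xsl:output method="xml" indent="yes"/>

  <!-- Emit the skeleton of the validator, then one template per declaration -->
  <xsl:template match="/DCD">
    <zxsl:stylesheet version="1.0">
      <!-- the root processor; Errors is an assumed wrapper element -->
      <zxsl:template match="/">
        <Errors><zxsl:apply-templates select="*"/></Errors>
      </zxsl:template>
      <!-- the error processor: reached only by elements no other template matches -->
      <zxsl:template match="*">
        <P>Error: <zxsl:value-of select="name()"/>
          <zxsl:text> </zxsl:text>
          <zxsl:value-of select="."/></P>
      </zxsl:template>
      <xsl:apply-templates select="ElementDef"/>
    </zxsl:stylesheet>
  </xsl:template>

  <!-- pick up the top level ElementDefs -->
  <xsl:template match="ElementDef">
    <zxsl:template match="/{@Type}">
      <zxsl:apply-templates select="*"/>
    </zxsl:template>
    <xsl:apply-templates select="Element"/>
  </xsl:template>

  <!-- now the content models -->
  <xsl:template match="ElementDef/Element">
    <zxsl:template match="{../@Type}/{.}">
      <zxsl:apply-templates select="*"/>
    </zxsl:template>
  </xsl:template>

</xsl:stylesheet>
```

In XSLT 1.0 the default priority of a pattern such as Booking/LastName is higher than that of the bare wildcard, so in this sketch the error template does not need to be placed first in the generated stylesheet.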
Although XSLT supports string processing, pattern matching, and the generation and processing of lists of nodes, it is not clear whether pure XSLT has enough programming features to validate all the constraints of any one schema language (especially since some useful features, such as regular expressions, may still be added). Using JavaScript or Java extensions would provide a possible solution in this case.
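Some simple data constraints do appear to be within reach of pure XSLT. The following is a minimal sketch, assuming XSLT 1.0 syntax rather than the WD-xsl dialect of the illustration, of validator templates for the Min and Max attributes declared for SeatRow and SeatLetter in the DCD example. Reading Min="A" Max="K" as an alphabetical range, the Errors wrapper element, and the wording of the messages are all assumptions made for this sketch.

```xml
<?xml version="1.0"?>
<!-- Sketch only: hand-written XSLT 1.0 checks for two range constraints -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Errors is an assumed wrapper; it is empty when no check fails -->
  <xsl:template match="/">
    <Errors><xsl:apply-templates/></Errors>
  </xsl:template>

  <!-- Suppress the built-in rule that would copy text to the output -->
  <xsl:template match="text()"/>

  <!-- SeatRow: Min="1" Max="72"; a non-numeric value becomes NaN and also fails -->
  <xsl:template match="Booking/SeatRow[not(number(.) &gt;= 1 and number(.) &lt;= 72)]">
    <P>Error: SeatRow out of range: <xsl:value-of select="."/></P>
  </xsl:template>

  <!-- SeatLetter: Min="A" Max="K", read here as a single letter from A to K -->
  <xsl:template match="Booking/SeatLetter[string-length(.) != 1 or not(contains('ABCDEFGHIJK', .))]">
    <P>Error: SeatLetter out of range: <xsl:value-of select="."/></P>
  </xsl:template>

</xsl:stylesheet>
```

The sketch does not check that SeatRow is a whole number, and constraints that need true pattern matching on strings would still call for the extensions mentioned above.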
Performance is obviously an issue if validation is to be performed at run-time, particularly in the transaction-like environment of middleware. Using XSL may well be slower than using a parser's built-in schema support. However, there is likely to be considerable investment in optimising the performance of XSL, and there is also the possibility of pre-compiling an XSL validation stylesheet to Java (as with SAXON) or C, which would allow compilation or other optimisations.
These uses are possibly more plausible if only in that they have lower performance requirements. But we have to start somewhere - the more schemas add value, the more they will be used in tools and projects, the more ways of adding value will emerge.
Copyright (C) 1999 Francis Norton. Feel free to publish this in any way you like, but please keep my name on it and try to update it to the most recent version.