XSet -> RDF Schema via XSLT, EBNF Groves, Bonsai and more...
Subject: XSet -> RDF Schema via XSLT, EBNF Groves, Bonsai and more...
From: Jonathan Borden <jborden@mediaone.net>
To: XML-Dev Mailing list <xml-dev@xml.org>
Date: Tue, 15 Aug 2000 15:46:27 -0400
I have placed a short working description of XSet: the XML EBNF property set at:
http://www.openhealth.org/XSet
This contains links to goodies such as an XSLT stylesheet that transforms XSet into an RDF Schema, and the (still very buggy) beginnings of an XSLT stylesheet that will (hopefully :-)) transform XSet into an ISO Property Set.
On the topic of using XSLT to transform RDF<->Infoset, Dan Connolly has posted links to a couple of nice XSLTs at:
http://lists.w3.org/Archives/Public/www-rdf-interest/2000Aug/0061.html
Eric van der Vlist and I have been having an offline discussion about his technique of using a special SAX parser to "expand" entity declarations into "Common XML" content. The advantage of this approach is that XPath and XSLT can be used to process the resultant abstract document (which in his example preserves the entity reference).
This is very similar to the approach of XSet which logically "expands" an XML document into a full-fidelity grove. I have posted an example of an XSet "expansion" of Eric's sample document:
http://www.openhealth.org/XSet/ericvdvexample.xml
and the XSet expansion:
http://www.openhealth.org/XSet/ericvdvxset.xml
Note: this is not a full "grove" because I have pruned constant string and whitespace (S) nodes.
Eric has written an XT output handler to "compress" a resultant transformed tree back into its XML format. We have discussed how this handler, as well as his parser (which is derived from Aelfred2), could serve as the basis for a full-fidelity XSet processor. A goal is to provide an XSet "bonsai" (a pruning, twisting and compression document) which tells the processor what level of detail to provide. For example: should it generate "element" events alone, or add STag, ETag and EmptyElementTag events?
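To make the "level of detail" idea concrete, here is a minimal sketch in Python. It is not Eric's parser or any XSet processor: the class name, the `detail` switch and the event tuples are all hypothetical, and a stock SAX parser cannot distinguish an EmptyElementTag from an STag followed by an ETag, so that event is left out.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class DetailHandler(ContentHandler):
    """Records events at one of two hypothetical levels of detail.

    detail="element": one ("element", name) event per element.
    detail="tags":    separate ("STag", name) and ("ETag", name) events.
    """

    def __init__(self, detail="element"):
        super().__init__()
        self.detail = detail
        self.events = []

    def startElement(self, name, attrs):
        if self.detail == "element":
            self.events.append(("element", name))
        else:
            self.events.append(("STag", name))

    def endElement(self, name):
        # At the coarse level the end tag is pruned away entirely.
        if self.detail != "element":
            self.events.append(("ETag", name))


def parse_events(xml_text, detail="element"):
    """Parse xml_text and return the recorded event list."""
    handler = DetailHandler(detail)
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.events
```

Calling `parse_events(doc, "element")` and `parse_events(doc, "tags")` on the same document gives the pruned and the more detailed event streams respectively; a real bonsai document would presumably select among many more node types than these two.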
XMTP (http://www.openhealth.org/documents/xmtp.htm) is an XSet expansion of a MIME document. In the same way that an XSet expansion of an XML document can be produced by a modified SAX parser, an XMTP expansion of a MIME message can be produced by a MIME parser which emits SAX events.
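As a rough illustration of a MIME parser driving SAX events, the sketch below uses Python's standard email parser and feeds a SAX content handler (here `XMLGenerator`, which simply serializes the events). The element names `message`, `header` and `body` are invented for this example and are not the XMTP vocabulary; multipart messages are not handled.

```python
import io
from email import message_from_string
from xml.sax.saxutils import XMLGenerator
from xml.sax.xmlreader import AttributesImpl


def mime_to_xml(raw):
    """Emit SAX events for a simple (non-multipart) MIME message.

    Element names are hypothetical, chosen only for illustration.
    """
    msg = message_from_string(raw)
    out = io.StringIO()
    gen = XMLGenerator(out, encoding="utf-8")

    gen.startDocument()
    gen.startElement("message", AttributesImpl({}))

    # One <header> element per header field, in message order.
    for name, value in msg.items():
        gen.startElement("header", AttributesImpl({"name": name}))
        gen.characters(value)
        gen.endElement("header")

    # Multipart payloads (a list of parts) are skipped in this sketch.
    payload = msg.get_payload()
    gen.startElement("body", AttributesImpl({}))
    gen.characters(payload if isinstance(payload, str) else "")
    gen.endElement("body")

    gen.endElement("message")
    gen.endDocument()
    return out.getvalue()
```

Because the generator is just another SAX content handler, it could be swapped for any handler that builds a tree or drives a transformation.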
This technique provides a general mechanism for XPath/XPointer addressing, and XSLT transformation, of arbitrary syntaxes expressible in EBNF. This is the essence of the grove paradigm.
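A toy example may help show the idea. The grammar below is hypothetical (it is not taken from XSet), and the query uses the limited XPath subset supported by Python's ElementTree rather than a full XPath engine:

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical toy grammar, written in EBNF:
#   list ::= pair (';' pair)*
#   pair ::= name '=' value
PAIR = re.compile(r"\s*([A-Za-z_]\w*)\s*=\s*([^;]*?)\s*(?:;|$)")


def expand(text):
    """Expand a string matching the toy grammar into an element tree,
    with one element per production that was matched."""
    text = text.strip()
    root = ET.Element("list")
    pos = 0
    while pos < len(text):
        m = PAIR.match(text, pos)
        if m is None:
            raise ValueError("no pair at offset %d" % pos)
        pair = ET.SubElement(root, "pair")
        ET.SubElement(pair, "name").text = m.group(1)
        ET.SubElement(pair, "value").text = m.group(2)
        pos = m.end()
    return root


def lookup(text, name):
    """Address a value in the non-XML source with a path expression."""
    root = expand(text)
    return [v.text for v in root.findall("./pair[name='%s']/value" % name)]
```

Once the source has been expanded, `lookup("a=1; b=two", "b")` addresses the value by production name instead of by string offsets, which is the kind of addressing the paragraph above describes.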
[Subsequent note:
Just to give an idea of how big a job a "full fidelity" property set is, consider the production S of the XML Recommendation, which matches one or more whitespace characters (space, tab, CR, LF). There are, by my eyeball count, 74 instances of S in the production rules. In order to make the Infoset suitable for generating an exact replica of the original, *at least* 74 new information item properties would be required for the representation of whitespace alone!
Neither I nor (I suspect) anyone else has even 1/74th of the patience required to specify such a thing. It has been done once for SGML, and an amazing tour de force it is (and with properties like this, one can see that subsetting via grove plans is a positive necessity). So if you need that level of detail, you know where to find it, since every XML document is an SGML document.
I would welcome a document specifying the Infoset as a grove plan of the SGML property set. If anyone can whip up such a thing by, say, mid-September, I will with great enthusiasm include it as another non-normative appendix, with full credit to the author.]
Jonathan Borden
The Open Healthcare Group
http://www.openhealth.org
Prepared by Robin Cover for The XML Cover Pages archive.