A communiqué from Dave Carlson (Ontogenics Corp., Boulder, Colorado) reports on the creation of an XML Schema that covers all of XHTML Basic (this may be the first complete XML Schema for XHTML Basic). Details are given in the white paper Modeling XHTML with UML. Carlson writes: "There are 3-4 situations where it is a bit lenient in accepting markup that it shouldn't, but overall it seems to work quite well. This model makes very heavy use of inheritance to capture the XHTML concept of content groups, such as Flow, Block, Inline, etc. I have generated two different schemas: one uses extension of complexType definitions, the other employs a copy-down strategy to avoid extension. Both schemas work with the XSV validator... What's interesting about this is that the schema was automatically generated from a UML model. The white paper includes all the UML class diagrams for the XHTML Basic modules. I've written a schema generator that produces schemas from any UML tool that can export an XMI 1.0 document representing the model. This model of XHTML was created using Rational Rose... the generated schema also provides a good stress test case for validation tools."
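The difference between the two generated schemas (extension of complexType definitions versus copy-down) can be illustrated with a minimal Python sketch. This is not Carlson's generator and does not read XMI: the two-class MODEL, its element names, and the helper functions are all hypothetical, the inheritance shown is invented for illustration rather than taken from the white paper, and the final 2001 XML Schema namespace is assumed.

```python
import xml.etree.ElementTree as ET

# Assumption: the final XML Schema namespace; drafts current in early 2001 used another URI.
XS = "http://www.w3.org/2001/XMLSchema"
ET.register_namespace("xs", XS)

# Hypothetical toy model: class name -> (parent class or None, own child elements).
# The hierarchy is invented for illustration; it is not the white paper's model.
MODEL = {
    "Inline": (None, ["em", "strong"]),
    "Flow": ("Inline", ["p", "div"]),
}


def xs(tag):
    """Return the namespace-qualified name of an XML Schema element."""
    return f"{{{XS}}}{tag}"


def all_children(name):
    """Collect inherited children first, then the class's own children."""
    parent, children = MODEL[name]
    inherited = all_children(parent) if parent else []
    return inherited + children


def add_sequence(parent_el, names):
    """Append an xs:sequence of optional element references."""
    seq = ET.SubElement(parent_el, xs("sequence"))
    for n in names:
        ET.SubElement(seq, xs("element"), {"ref": n, "minOccurs": "0"})


def generate(strategy):
    """Build a schema using either the 'extension' or the 'copy-down' strategy."""
    schema = ET.Element(xs("schema"))
    for name, (parent, own) in MODEL.items():
        ctype = ET.SubElement(schema, xs("complexType"), {"name": name})
        if parent and strategy == "extension":
            # Subclass refers to its base type and lists only its own children.
            content = ET.SubElement(ctype, xs("complexContent"))
            ext = ET.SubElement(content, xs("extension"), {"base": parent})
            add_sequence(ext, own)
        else:
            # Copy-down: inherited children are repeated in the subclass itself.
            add_sequence(ctype, all_children(name))
    return schema


if __name__ == "__main__":
    for strategy in ("extension", "copy-down"):
        print(f"--- {strategy} ---")
        print(ET.tostring(generate(strategy), encoding="unicode"))
```

With the extension strategy the derived type stays short but depends on the validator supporting type derivation; with copy-down every type is self-contained at the cost of repeated content.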
Description of 'Modeling XHTML with UML': "This white paper describes the first complete XML Schema for XHTML Basic, which was adopted as a W3C Recommendation in December 2000. The W3C Recommendation specifies XHTML Basic with a DTD implementation, principally because DTDs were the only recommendation in force at that time. However, we will soon reach a point when the W3C has two schema recommendations, and there are several other XML schema/validation languages that are competing for our attention (RELAX, TREX, and Schematron). Thus, a new approach was taken to produce the XML Schema described here: the XHTML Basic specification was manually reverse-engineered into a Unified Modeling Language (UML) class diagram, then the Schema was automatically generated from that UML model. The Schema generation tool was developed by Ontogenics Corp. (the creator of this XMLmodeling.com portal). Other schema languages can be produced in a similar manner; prototypes are under development for generation of DTD and RELAX..."
From the paper: "XHTML Basic, as its name suggests, represents the essential core of elements required for presentation of hypertext documents. XHTML Basic was designed to become the document format used by Web clients with limited display capabilities, such as mobile phones, PDAs, pagers, and television settop boxes. In addition to reformulating HTML as valid XML documents, XHTML Basic is also part of a broader effort for the Modularization of XHTML, which decomposes the previous monolithic HTML and XHTML 1.0 specifications into separable, reusable modules. Another useful application involves embedding XHTML content within other XML vocabularies. In fact, it is this requirement that created our original motivation for producing a UML model of XHTML elements. We are using UML to design XML vocabularies such as product catalogs, bibliographies, and e-learning content. In those applications, it's often necessary to support HTML presentation content within other elements; for example, within a product's description or within a mini-tutorial embedded in a training markup language. If XHTML elements such as <div>, <p>, or <table> are available as classes in a UML package, then including them within other vocabularies is a simple matter of drawing an association between classes in a UML diagram. The schema generator takes care of the rest, including generation of the necessary import statements for the XHTML schema definitions..."
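The embedding scenario described in the paper, where an association to an XHTML class results in an import of the XHTML schema definitions, might produce output along the lines of the following Python sketch. Everything specific here is an assumption made for illustration: the urn:example:catalog namespace, the product, name, and description elements, the xhtml-basic.xsd location, and the premise that the XHTML schema declares a global div element. It shows only the shape of such a schema, not the output of the Ontogenics generator.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"   # assumption: final XML Schema namespace
XHTML = "http://www.w3.org/1999/xhtml"    # XHTML namespace
ET.register_namespace("xs", XS)


def xs(tag):
    """Return the namespace-qualified name of an XML Schema element."""
    return f"{{{XS}}}{tag}"


def catalog_schema(xhtml_schema_location):
    """Build a hypothetical catalog schema whose description holds XHTML div elements."""
    schema = ET.Element(xs("schema"), {
        "targetNamespace": "urn:example:catalog",  # hypothetical vocabulary
        # Declared by hand because the prefix is used only inside attribute values.
        "xmlns:xhtml": XHTML,
    })
    # The import makes the XHTML definitions visible to this schema.
    ET.SubElement(schema, xs("import"), {
        "namespace": XHTML,
        "schemaLocation": xhtml_schema_location,
    })
    product = ET.SubElement(schema, xs("element"), {"name": "product"})
    seq = ET.SubElement(ET.SubElement(product, xs("complexType")), xs("sequence"))
    ET.SubElement(seq, xs("element"), {"name": "name", "type": "xs:string"})
    desc = ET.SubElement(seq, xs("element"), {"name": "description"})
    dseq = ET.SubElement(ET.SubElement(desc, xs("complexType")), xs("sequence"))
    # Assumes the imported schema declares a global element named div.
    ET.SubElement(dseq, xs("element"), {"ref": "xhtml:div", "maxOccurs": "unbounded"})
    return schema


if __name__ == "__main__":
    print(ET.tostring(catalog_schema("xhtml-basic.xsd"), encoding="unicode"))
```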
Principal references:
- Modeling XHTML with UML. By Dave Carlson (CTO, Ontogenics Corp., Boulder, Colorado), WWW. March, 2001. 13 pages.
- Description of the white paper
- Post to 'xmlschema-dev@w3.org' mailing list
- Modeling the UDDI Schema with UML. White paper.
- Modeling XML Applications with UML, by David Carlson. April 2001.
- Contact: Dave Carlson
- "XML Schemas" - Main reference page.