Trang Multi-Format Schema Converter Supports DTD to W3C XML Schema Conversion
Trang: Multi-format schema converter
Date: Wed, 22 Jan 2003 18:48:32 +0700
From: James Clark <email@example.com>
To: firstname.lastname@example.org
Subject: ANN: Trang (multi-format schema converter)
I am happy to announce a new release of Trang, my multi-format schema converter. Trang is written in Java, and available under a BSD-style license. In this release, I have added an input module for DTDs based on my DTDinst program. This implies that Trang can now convert directly from DTDs to W3C XML Schema (XSD). This may make it of interest to people outside the RELAX NG community, which is why I am announcing it here.
Although there are other DTD to XSD converters available, Trang has some unique features:
- It is able to reliably turn parameter entities into the higher-level semantic constructs available in XSD (simple types, groups, attribute groups). It can do this even in the presence of arbitrarily deep nesting of parameter entity references within parameter entity declarations. At the same time, it accurately follows XML 1.0 rules on parameter entity expansion, so that any valid XML 1.0 DTD can be handled. If a parameter entity is used in a way that does not correspond to any of these higher-level semantic constructs (for example parameter entities used to allow a change of namespace prefix in the DTD), then references to that parameter entity are simply expanded.
- It supports namespaces, including DTDs that mix multiple namespaces. It interprets DTDs in a namespace-aware way and can automatically create additional files and move declarations between files so as to create valid XML Schemas that accurately capture the intended semantics of the DTD.
- It can create good-quality, idiomatic XSD, which takes advantage of features such as substitution groups.
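To make "simply expanded" concrete, here is a minimal Python sketch of textual expansion of nested parameter entity references. It is an illustration only, not Trang's implementation: it ignores most XML 1.0 details (external entities, the spaces added around replacement text, declaration order), and the entity names and values are hypothetical rather than the real MathML definitions.

```python
import re

# Matches a parameter entity reference such as %ptoken;
PE_REF = re.compile(r"%([A-Za-z_][\w.\-]*);")

def expand(text, entities, _active=()):
    """Recursively replace %name; references using the `entities` mapping."""
    def substitute(match):
        name = match.group(1)
        if name in _active:
            # An entity that refers to itself would never terminate.
            raise ValueError(f"recursive parameter entity: {name}")
        if name not in entities:
            raise KeyError(f"undeclared parameter entity: {name}")
        return expand(entities[name], entities, _active + (name,))
    return PE_REF.sub(substitute, text)

# Hypothetical declarations with two levels of nesting.
entities = {
    "ptoken": "mi | mn",
    "petoken": "ms",
    "tokens": "%ptoken; | %petoken;",
    "content": "(%tokens; | mrow)*",
}

print(expand("%content;", entities))  # (mi | mn | ms | mrow)*
```

Plain expansion like this discards the grouping that the entity names expressed, which is the information a converter has to keep if it wants to emit named groups or simple types instead.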
Trang is not limited to converting from DTD to XSD. It supports the following schema languages for XML:
- RELAX NG (XML syntax)
- RELAX NG compact syntax
- XML 1.0 DTDs
- W3C XML Schema
A schema written in any of the supported schema languages can be converted into any of the other supported schema languages, except that W3C XML Schema is supported for output only, not for input.
Trang is constructed around a RELAX NG object model designed to support schema conversion. For each schema language supported for input, there is an input module that can convert from the schema language into this internal object model. Similarly, for each schema language supported for output, there is an output module that can convert from the internal object model into the schema language.
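The input-module / internal-model / output-module arrangement can be sketched as follows. This is a toy Python illustration under stated assumptions, not Trang's actual classes: `SchemaModel`, the "compact" and "xmlish" formats, and every function name are hypothetical, and the model holds only named patterns.

```python
from dataclasses import dataclass, field
from typing import Dict

@dataclass
class SchemaModel:
    """Toy stand-in for the internal object model: name -> content pattern."""
    definitions: Dict[str, str] = field(default_factory=dict)

def read_compact(text: str) -> SchemaModel:
    """Input module for a toy 'name = pattern' syntax, one definition per line."""
    model = SchemaModel()
    for line in text.splitlines():
        if "=" in line:
            name, pattern = line.split("=", 1)
            model.definitions[name.strip()] = pattern.strip()
    return model

def write_compact(model: SchemaModel) -> str:
    """Output module for the same toy syntax."""
    return "\n".join(f"{n} = {p}" for n, p in model.definitions.items())

def write_xmlish(model: SchemaModel) -> str:
    """Output module for a toy XML-like syntax (output only, like XSD in Trang)."""
    return "\n".join(
        f'<define name="{n}">{p}</define>' for n, p in model.definitions.items()
    )

INPUT_MODULES = {"compact": read_compact}
OUTPUT_MODULES = {"compact": write_compact, "xmlish": write_xmlish}

def convert(text: str, source: str, target: str) -> str:
    """Convert by going through the shared model rather than pairwise."""
    if source not in INPUT_MODULES:
        raise ValueError(f"{source} is not supported for input")
    if target not in OUTPUT_MODULES:
        raise ValueError(f"{target} is not supported for output")
    return OUTPUT_MODULES[target](INPUT_MODULES[source](text))

print(convert("inline = em | strong", "compact", "xmlish"))
```

With a shared model, supporting a new language means writing one input module, one output module, or both, rather than a separate converter for every pair of languages.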
Trang aims to produce human-understandable schemas; it tries for a translation that preserves all aspects of the input schema that may be significant to a human reader, including the definitions, the way the schema is divided into files, annotations and comments.
Trang has a command-line user interface. It has no graphical user interface.
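As a rough sketch of driving the command-line interface from a script, the Python below builds a Trang invocation. It assumes the `java -jar trang.jar inputFile outputFile` form described in the Trang manual, with the schema languages inferred from the file extensions; the jar location and the file names are hypothetical, and the options available in this particular release may differ.

```python
import subprocess

def trang_command(input_path, output_path, jar="trang.jar"):
    """Build the argument list for one Trang run (jar path is an assumption)."""
    return ["java", "-jar", jar, input_path, output_path]

def run_trang(input_path, output_path, jar="trang.jar"):
    """Run the conversion; requires Java and the Trang jar to be installed."""
    subprocess.run(trang_command(input_path, output_path, jar), check=True)

# Hypothetical file names: convert a DTD into a W3C XML Schema.
print(" ".join(trang_command("mathml2.dtd", "mathml2.xsd")))
```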
Trang can be downloaded from:
An example of its capabilities as a DTD to XSD converter is available (temporarily) at:
This contains a dtd subdirectory containing the unmodified W3C MathML DTD, and an xsd subdirectory containing the unmodified output from Trang.
If you study this example, you may wonder why Trang turns the PresInCont and Presentation parameter entities into groups rather than abstract elements. The answer is that, as written, the DTD uses multiple inheritance:
<!ENTITY % PresInCont "%ptoken; | %petoken; | %plschema; | %peschema; | %pactions;" >
<!ENTITY % Presentation "%ptoken; | %petoken; | %pscreschema; | %plschema; | %peschema; | %pactions;">
If you change it to use single inheritance:
<!ENTITY % Presentation "%PresInCont; | %pscreschema;">
then Trang will use abstract elements.
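The difference between the two forms can be checked mechanically. The Python sketch below is only an illustration of the reasoning, not Trang's actual heuristic: it treats each parameter entity as a group and asks whether any member is referenced directly by more than one group. In XSD 1.0 an element can name only one substitution group head, so a member with two direct parents cannot be modelled with abstract elements alone.

```python
from collections import defaultdict

def direct_parents(groups):
    """Map each member to the set of groups that reference it directly."""
    parents = defaultdict(set)
    for group, members in groups.items():
        for member in members:
            parents[member].add(group)
    return parents

def is_single_inheritance(groups):
    """True when every member is referenced directly by at most one group."""
    return all(len(p) <= 1 for p in direct_parents(groups).values())

pres_in_cont = ["ptoken", "petoken", "plschema", "peschema", "pactions"]

# The DTD as written: both entities list the same members independently.
as_written = {
    "PresInCont": pres_in_cont,
    "Presentation": ["ptoken", "petoken", "pscreschema",
                     "plschema", "peschema", "pactions"],
}

# The rewritten form: Presentation is defined in terms of PresInCont.
rewritten = {
    "PresInCont": pres_in_cont,
    "Presentation": ["PresInCont", "pscreschema"],
}

print(is_single_inheritance(as_written))  # False
print(is_single_inheritance(rewritten))   # True
```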
Prepared by Robin Cover for The XML Cover Pages archive. See other details in the 2003-01-23 news item "Trang Multi-Format Schema Converter Supports DTD to W3C XML Schema Conversion."