This document contains the release notes and usage information for the experimental DITA XML Schema on developerWorks(R).
This DITA release includes an implementation of the topic architecture in XML Schema. The DTDs are still the canonical representation of DITA. The design pattern for the DITA Schemas is based on the W3C XML Schema 1.0 Specification and may be subject to change in the future.
The specialization process and design pattern for the DITA XML Schema are still being developed and refined. As such, the specialization process stated below should not be interpreted as "the definitive process", but simply as one method to specialize information types and domains.
A public mailing/discussion list to support users of DITA is available at Yahoo! Groups 'dita-users'.
There are a number of tools available to create, validate, or transform DITA XML Schemas. Here is a small list:
You can invoke Xerces XML document validation using SAX or DOM via the command-line:
java sax.Counter -v -s [xmlDocument]
java dom.Counter -v -s [xmlDocument]
The DITA XML Schema architecture attempts to follow the naming convention established in the current DITA DTD architecture. Each element has its own named content model, for example topic.class for the topic element. Attributes that have an enumerated list of values in the DTD have their own class too, such as importance-att.class for the importance attribute.
This version of the DITA XML Schema does not use W3C XML Schema inheritance to model the DITA Architecture. In previous attempts to use the more efficient inheritance model, various Schema processors have implemented the "particle restriction" rules inconsistently. In order to have the same functionality as substitutionGroups without inheritance, a new layer was added to the design pattern.
Creating a New Information Type
Here are some simple steps that will make specialization easier according to the present design pattern:
This schema document is a shell for the new information type. In it one includes parent information types and existing or new domains.
<xs:include schemaLocation="mySpec.mod" />
<xs:complexType name="myElement.class" mixed="true" > </xs:complexType>
<xs:choice minOccurs="0" maxOccurs="unbounded"> </xs:choice>
<xs:attribute ref="class" default="- topic/ph mySpec/mySpecElement "/>
The class attribute value starts with a prefix character (a minus sign in this example), ends with white space, and contains a list of blank-delimited values. Each value has two parts: the first part identifies a topic type, and the second part (after a /) identifies an element type. The class attribute must include a mapping for every topic type in the specialized type's ancestry, even those in which no element renaming occurred.
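To make the structure of the class value concrete, here is a minimal Java sketch that splits a value into its type/element pairs. The class DitaClassValue and its parse method are hypothetical names introduced for illustration only; they are not part of the DITA distribution. The sketch assumes the value begins with a minus or plus sign followed by a blank and ends with a blank.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; it is not part of the DITA distribution. */
final class DitaClassValue {

    /** One "type/element" pair, such as topic/ph or mySpec/mySpecElement. */
    static final class Mapping {
        final String type;     // topic type or domain identifier
        final String element;  // element name within that type

        Mapping(String type, String element) {
            this.type = type;
            this.element = element;
        }
    }

    /** Splits a class attribute value into its mappings, in the order they are listed. */
    static List<Mapping> parse(String value) {
        boolean prefixed = value.startsWith("- ") || value.startsWith("+ ");
        if (!prefixed || !value.endsWith(" ")) {
            throw new IllegalArgumentException("Malformed class value: '" + value + "'");
        }
        List<Mapping> mappings = new ArrayList<>();
        // Drop the prefix character, then split the rest on runs of blanks.
        for (String token : value.substring(1).trim().split("\\s+")) {
            int slash = token.indexOf('/');
            // Each token must look like type/element with exactly one slash.
            if (slash <= 0 || slash == token.length() - 1 || slash != token.lastIndexOf('/')) {
                throw new IllegalArgumentException("Expected type/element but found: '" + token + "'");
            }
            mappings.add(new Mapping(token.substring(0, slash), token.substring(slash + 1)));
        }
        return mappings;
    }
}
```

For the default shown above, parse returns two mappings: topic/ph followed by mySpec/mySpecElement. A value that lacks the trailing blank is rejected.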
This file is a new part of the specialization for this release. The main reason for this new file is to mimic schema inheritance without using the inheritance model in the W3C XML Schema 1.0 specification. The process is very similar to the DITA DTD design pattern. Each element has its own named group content model.
Each information type has its own *.grp file. In it one defines a new group for each new specialized element in the information type. More will be explained
<xs:group name="myElement">
  <xs:sequence>
    <xs:element ref="myElement" />
  </xs:sequence>
</xs:group>
<xs:redefine schemaLocation="mySpec.grp">
  <xs:group name="keyword">
    <xs:choice>
      <xs:group ref="keyword"/>
      <xs:group ref="md-d-keyword" />
    </xs:choice>
  </xs:group>
</xs:redefine>
Creating a New Domain
<xs:complexType name="myDomainElement.class" mixed="true" > </xs:complexType>
<xs:choice minOccurs="0" maxOccurs="unbounded"> </xs:choice>
<xs:group name="md-d-pre">
  <xs:choice>
    <xs:element ref="myDomainElement"/>
  </xs:choice>
</xs:group>
<xs:attribute ref="class" default="+ topic/pre md-d/myDomainElement "/>
For a domain element, the value of the class attribute must start with a plus sign. Elements provided by domains should be qualified by the domain identifier.
Integrating a Domain in an Information Type Shell Document
Each type of domain specialization integrates slightly differently with the information type shell document.
<xs:redefine schemaLocation="mySpec.grp">
  <xs:group name="pre">
    <xs:choice>
      <xs:group ref="pre"/>
      <xs:group ref="md-d-pre" />
    </xs:choice>
  </xs:group>
</xs:redefine>
<xs:include schemaLocation="mySpec_domains.mod" />
<xs:include schemaLocation="topic_domains.mod"/>
<xs:redefine schemaLocation="ui-domain.mod">
  <xs:group name="ui-d-ph">
    <xs:choice>
      <xs:group ref="ui-d-ph"/>
      <xs:group ref="md-d-mydomainelt" />
    </xs:choice>
  </xs:group>
</xs:redefine>
Using Sun Java Development Kit (JDK)1.4.X and Xerces 2.6.X
The Sun JDK 1.4.X has a built-in XML parser as part of the distribution. It is called Crimson. Unfortunately, Sun's Crimson parser only supports DTD validation.
Sun provides a mechanism to override the classes in the JDK. It's called the Endorsed Standards Override Mechanism.
For example, copy a version of xercesImpl.jar and xml-apis.jar to <java-home>/jre/lib/endorsed. Here <java-home> refers to the directory where the Sun JDK is installed. Check the Xerces version via the included Version application using the following command-line syntax.
java -Dorg.xml.sax.driver=org.apache.xerces.parsers.SAXParser org.apache.xerces.impl.Version
Note: Java 2 Platform, Standard Edition (J2SE) 5.0 and the IBM Developer Kit include Xerces 2.6.2 as the built-in XML parser.
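Besides checking the Xerces version, it can help to confirm which parser implementation JAXP actually selects at run time. The following is a minimal sketch using only standard JAXP factory calls; the class name ParserInfo is a hypothetical name chosen for this example. The class names it prints depend on the JDK and on any jars found through the override mechanism, so no particular output is assumed here.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;

/** Hypothetical diagnostic class; reports the JAXP factory implementations in use. */
final class ParserInfo {

    /** Returns the class name of the DOM factory that JAXP resolves. */
    static String domFactoryName() {
        return DocumentBuilderFactory.newInstance().getClass().getName();
    }

    /** Returns the class name of the SAX factory that JAXP resolves. */
    static String saxFactoryName() {
        return SAXParserFactory.newInstance().getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println("DOM factory: " + domFactoryName());
        System.out.println("SAX factory: " + saxFactoryName());
    }
}
```

If the printed names do not refer to a Xerces implementation after the jars have been copied, the override has most likely not taken effect.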
You can use the Java application ValidateXMLDoc to validate an XML document using an external XML Schema. The XML parser must support JAXP 1.2; for Xerces, that means version 2.6.x. For example:
java ValidateXMLDoc ../DITA-XS-readme.xml -s ditabase.xsd

Usage: java ValidateXMLDoc xmlDoc [options]
-----------------------------------------------------------------------------------------------------------
The application will attempt to validate the instance document using the DOCTYPE value by default.
options:
  -s            Validate the instance document using the defined noNamespaceSchemaLocation value.
  [xmlSchema]   Validate instance document using an external XML Schema
                URI: The location of an external no namespace XML Schema relative to the xml document.
                This will override the DTD/XML Schema that is defined in the XML document
-----------------------------------------------------------------------------------------------------------
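For readers who want to see what JAXP 1.2 schema validation involves, here is a minimal sketch. It is not the source of ValidateXMLDoc; the class SchemaValidationSketch and its validate method are hypothetical. It uses the two standard JAXP 1.2 properties, schemaLanguage and schemaSource, and takes the document and schema as strings so that it stays self-contained. A File or URI string could be supplied as the schema source instead.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

/** Hypothetical sketch of schema validation through the JAXP 1.2 properties. */
final class SchemaValidationSketch {
    static final String SCHEMA_LANGUAGE = "http://java.sun.com/xml/jaxp/properties/schemaLanguage";
    static final String SCHEMA_SOURCE = "http://java.sun.com/xml/jaxp/properties/schemaSource";
    static final String W3C_XML_SCHEMA = "http://www.w3.org/2001/XMLSchema";

    /** Validates xml against xsd and returns the error messages; an empty list means valid. */
    static List<String> validate(String xml, String xsd) {
        final List<String> errors = new ArrayList<>();
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            factory.setValidating(true);
            // Ask for W3C XML Schema validation instead of DTD validation.
            factory.setAttribute(SCHEMA_LANGUAGE, W3C_XML_SCHEMA);
            // An external schema overrides any schema location hint in the document.
            factory.setAttribute(SCHEMA_SOURCE,
                    new ByteArrayInputStream(xsd.getBytes(StandardCharsets.UTF_8)));

            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new ErrorHandler() {
                @Override public void warning(SAXParseException e) { /* ignored in this sketch */ }
                @Override public void error(SAXParseException e) { errors.add(e.getMessage()); }
                @Override public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
            });
            builder.parse(new InputSource(new StringReader(xml)));
        } catch (SAXException | IOException | ParserConfigurationException e) {
            // Fatal errors (for example, malformed XML) end the parse and are recorded once here.
            errors.add("fatal: " + e.getMessage());
        }
        return errors;
    }
}
```

Omitting the schemaSource property leaves the parser to locate the schema from the xsi:noNamespaceSchemaLocation value in the instance document, which corresponds to the -s option without an external schema.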
The Java application TransformUsingXMLSchema is used to transform an XML document using XML Schema validation. Most transform engines use DTD validation by default to build an in-memory document. The transformation engines need explicit instruction to use XML Schema validation instead of DTDs. The xsi:noNamespaceSchemaLocation attribute must be specified in the XML document for the application to work as expected. For example:
java TransformUsingXMLSchema ../lawnmower.xml ../../xsl\topic2html.xsl ../lawnmower.html

Usage: java TransformUsingXMLSchema xmlDoc xsltDoc htmlDoc
-----------------------------------------------------------------------------------------------------------
xmlDoc:  The external URI location of an XML document to transform
xsltDoc: The external URI location of an XSL stylesheet
htmlDoc: The external URI location to write the resultant HTML document
-----------------------------------------------------------------------------------------------------------
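The idea of giving the transformation engine a schema-validated document can be sketched as follows. This is not the source of TransformUsingXMLSchema; the class SchemaAwareTransformSketch is a hypothetical name, and it takes strings rather than file locations to stay self-contained. The document is first parsed into a DOM with schema validation switched on, so that attribute defaults declared in the schema are present, and that DOM is then handed to the XSLT engine.

```java
import java.io.ByteArrayInputStream;
import java.io.StringReader;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical sketch: parse with XML Schema validation, then transform the resulting DOM. */
final class SchemaAwareTransformSketch {

    static String transform(String xml, String xsd, String xslt) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            factory.setValidating(true);
            factory.setAttribute("http://java.sun.com/xml/jaxp/properties/schemaLanguage",
                    "http://www.w3.org/2001/XMLSchema");
            // Assumption for this sketch: the schema is passed in directly. With files on
            // disk, the xsi:noNamespaceSchemaLocation hint in the document can be used instead.
            factory.setAttribute("http://java.sun.com/xml/jaxp/properties/schemaSource",
                    new ByteArrayInputStream(xsd.getBytes(StandardCharsets.UTF_8)));

            DocumentBuilder builder = factory.newDocumentBuilder();
            // DefaultHandler ignores validation errors and rethrows only fatal ones.
            builder.setErrorHandler(new DefaultHandler());
            Document doc = builder.parse(new InputSource(new StringReader(xml)));

            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            // The DOM already carries the schema-supplied defaults, such as the class attribute.
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }
}
```

Passing a StreamSource of the raw XML file straight to the transformer would skip this step, and the class attribute defaults would then depend on whatever parser the engine chooses.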
Note: The implied value for the class attribute, defined in the schema, is necessary for DITA XSLT scripts to work properly. If you see no output, or text-only output, this is usually an indication that the class attribute's default value has not been provided during parsing. Use the normalize.xsl transform to check the output of the parser: if the class attribute is missing, or the value is not bounded at the end by a space, the transforms cannot do class-based matching properly.
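The reason the trailing space matters can be shown with a small sketch. DITA stylesheets typically test the class value with an XPath expression of the form contains(@class, ' topic/ph '), where the token is bounded by a space on each side. The Java method below mirrors that test; the class ClassMatch is a hypothetical name used only for this illustration.

```java
/** Hypothetical helper that mirrors the XPath test contains(@class, ' type/element '). */
final class ClassMatch {

    /** Returns true when classValue contains the token bounded by a blank on both sides. */
    static boolean matches(String classValue, String token) {
        return classValue != null && classValue.contains(" " + token + " ");
    }
}
```

With the value "- topic/ph mySpec/mySpecElement ", the token mySpec/mySpecElement matches. If the final space is lost during parsing, the same token no longer matches, and a missing class attribute (null here) matches nothing, which is why the transforms then produce no output or text-only output.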
Eric Sirois esirois@ca.ibm.com IBM Corporation
Java is a registered trademark of Sun Microsystems, Inc.