DITA XML Schema Readme

This document contains the release notes and usage information for the experimental DITA XML Schema on developerWorks(R).

Installing DITA XML Schema

The DITA XML Schema has the following directory structure within the standard DITA package:

 schema
   java

Using DITA XML Schemas

There are a number of tools available to create and validate DITA XML Schemas. Here is a small list:

  • WebSphere(R) Studio Advanced Developer 5.1
  • jEdit 4.1
  • Xerces 2.4.X

You can invoke Xerces XML document validation using SAX or DOM from the command line:

 java sax.Counter -v -s [xmlDocument]
 java dom.Counter -v -s [xmlDocument]

Information for DITA XML Schema

The DITA XML Schema architecture attempts to follow the naming convention established in the current DITA DTD architecture. Each element has its own named content model, e.g., topic.class for the topic element. Attributes that have an enumerated list of values in the DTD have their own class too, e.g., importance-att.class for the importance attribute.

  • All the elements in the base topic module are global. This will allow any element to be specialized.
  • All the attributes are local to the element's content model, except for the class attribute.
  • All the parameter entities in the DTD are named element groups with the same names as in the DTD, e.g., basic.ph and ph.cnt.
  • All elements in a specialized schema module are declared globally as abstract elements, except for the root element. The specialized element must also be declared locally in the derived content models.

XML Schema Specialization

Here are some simple steps that will make specialization easier:

  1. Create a new XML Schema document.
    • Copy the contents of the parent specialization into the new *.xsd file.
    • Add an include statement for the *.mod in the XML Schema document.
      <xs:include schemaLocation="mySpec.mod" />
  2. Create a new XML Schema module.
    • Create a new global element.
      • Add the substitutionGroup attribute. Its value is the parent specialization element.
      • If the new element is not a root element, set the abstract attribute to true.
    • Create a new content model for the root element. For most elements the following template will do.
          <xs:complexType name="myElement.class" mixed="" >
            <xs:complexContent>
                <xs:restriction base="myParentElement.class">
      
                </xs:restriction>
             </xs:complexContent>
          </xs:complexType>
    • Copy the content of the parent element's content model, that is, everything inside its <xs:choice>, <xs:sequence>, or <xs:all> element, into the new element's content model. For example:
          <xs:choice minOccurs="0" maxOccurs="unbounded">
      
          </xs:choice>
    • Local element substitution
      • Modify the class attribute default value
      • Optional - modify element content model
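
The substitutionGroup and restriction steps above can be sketched with a miniature schema embedded in a small Java program. Everything here is an assumption made for illustration: the element names (body, title, myParentElement, myElement), the class attribute values, and the class name SpecializationSketch are hypothetical and are not taken from the DITA modules. The sketch also leaves out the abstract attribute and the local element declarations described above, and it uses the javax.xml.validation API, which arrived with JAXP 1.3 and is therefore newer than the JDK levels this readme discusses.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class SpecializationSketch {
    // Hypothetical miniature schema: myElement specializes myParentElement.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='title' type='xs:string'/>"
      // A container that only refers to the parent element.
      + "<xs:element name='body'><xs:complexType><xs:sequence>"
      + "  <xs:element ref='myParentElement' maxOccurs='unbounded'/>"
      + "</xs:sequence></xs:complexType></xs:element>"
      // Parent element and its named content model.
      + "<xs:element name='myParentElement' type='myParentElement.class'/>"
      + "<xs:complexType name='myParentElement.class'>"
      + "  <xs:sequence><xs:element ref='title'/></xs:sequence>"
      + "  <xs:attribute name='class' type='xs:string'"
      + "      default='- topic/myParentElement '/>"
      + "</xs:complexType>"
      // Specialized element: global, in the parent's substitution group.
      + "<xs:element name='myElement' type='myElement.class'"
      + "    substitutionGroup='myParentElement'/>"
      // Content model copied from the parent; class default modified.
      + "<xs:complexType name='myElement.class'>"
      + "  <xs:complexContent><xs:restriction base='myParentElement.class'>"
      + "    <xs:sequence><xs:element ref='title'/></xs:sequence>"
      + "    <xs:attribute name='class' type='xs:string'"
      + "        default='- topic/myParentElement mySpec/myElement '/>"
      + "  </xs:restriction></xs:complexContent>"
      + "</xs:complexType>"
      + "</xs:schema>";

    // Returns true when the instance document is valid against XSD.
    static boolean isValid(String xml) {
        Schema schema;
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
        } catch (SAXException e) {
            throw new IllegalStateException("schema did not compile", e);
        }
        try {
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // validation error in the instance document
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // body only declares myParentElement, yet myElement is accepted.
        System.out.println(isValid(
            "<body><myElement><title>t</title></myElement></body>"));
    }
}
```

Because myElement is in the substitution group of myParentElement, it is accepted wherever the parent is referenced, which is the effect the substitutionGroup step is after.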

Validating the XML Schemas

The DITA XML Schema is an experiment to determine the merits of using XML Schema to create and maintain specialized content. The XML schemas should be considered a work in progress. Any document that is valid against the current DTDs in the DITA package will also be valid against the XML schemas.

The schemas can be validated using Xerces, IBM's Schema Quality Checker from alphaWorks(R), or most XML editors.

Validating the XML schemas using Xerces or IBM's SQC will generate some errors related to the particle restriction rules:

[Error] task.mod:157:37: rcase-Recurse.2: There is not a complete functional mapping between the particles.
[Error] task.mod:157:37: derivation-ok-restriction.5.3.2: Error for type 'step.class'.  The particle of the type is not a valid restriction of the particle of the base.

[Error] task.mod:96:41: rcase-MapAndSum.1: There is not a complete functional mapping between the particles.
[Error] task.mod:96:41: derivation-ok-restriction.5.3.2: Error for type 'taskbody.class'.  The particle of the type is not a valid restriction of the particle of the base. 

These errors will not affect document validation. One of the requirements for XML Schema 1.1 is to relax the particle restriction rules.

Java Application - ValidateXMLDoc

You can use the Java(TM) application ValidateXMLDoc to validate an XML document using an external XML Schema. The XML parser must support JAXP 1.2; for Xerces, that means version 2.4.x. For example:

java ValidateXMLDoc ../DITA-XS-readme.xml -s ditabase.xsd
        
Usage: java ValidateXMLDoc xmlDoc [options]
-----------------------------------------------------------------------------------------------------------
The application will attempt to validate the instance document using the DOCTYPE value by default.
options:
-s             Validate the instance document using the defined noNamespaceSchemaLocation value.
[xmlSchema]    Validate instance document using an external XML Schema
            
URI: The location of an external no namespace XML Schema relative to the xml document.
       This will override the DTD/XML Schema that is
       defined in the XML document.
-----------------------------------------------------------------------------------------------------------
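
For readers curious how validation against an external schema can be requested through JAXP 1.2, here is a minimal sketch. It is not the source of ValidateXMLDoc; the class name Jaxp12ValidationSketch and the validate method are hypothetical. The two property URIs and the schema language value are the ones defined by JAXP 1.2, and passing an InputSource as the schemaSource value is assumed to be supported by the parser in use, as it is in Xerces.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

class Jaxp12ValidationSketch {
    // Property names and schema language value defined by JAXP 1.2.
    static final String SCHEMA_LANGUAGE =
        "http://java.sun.com/xml/jaxp/properties/schemaLanguage";
    static final String SCHEMA_SOURCE =
        "http://java.sun.com/xml/jaxp/properties/schemaSource";
    static final String W3C_XML_SCHEMA = "http://www.w3.org/2001/XMLSchema";

    // Parses xml with schema validation against xsd and returns the
    // validation errors that were reported (empty when the document is valid).
    static List<String> validate(InputSource xml, InputSource xsd) {
        final List<String> errors = new ArrayList<String>();
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            factory.setValidating(true);
            factory.setAttribute(SCHEMA_LANGUAGE, W3C_XML_SCHEMA);
            factory.setAttribute(SCHEMA_SOURCE, xsd); // external schema
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) {
                    // warnings are ignored in this sketch
                }
                public void error(SAXParseException e) {
                    errors.add("error: " + e.getMessage());
                }
                public void fatalError(SAXParseException e) throws SAXException {
                    throw e; // not well-formed; stop parsing
                }
            });
            builder.parse(xml);
        } catch (SAXException e) {
            errors.add("fatal: " + e.getMessage());
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return errors;
    }
}
```

A caller would wrap file locations in InputSource objects, for example new InputSource(new File(path).toURI().toString()), and treat an empty list as a valid document.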

Using JDK 1.3 and Xerces 2.4.X

Add the xercesImpl.jar and xml-apis.jar files to the system CLASSPATH.

Using JDK 1.4.X and Xerces 2.4.X

The Sun and IBM JDK 1.4.X have a built-in XML parser as part of the distribution. Sun provides a mechanism to override the classes in the JDK, called the Endorsed Standards Override Mechanism.

For example, copy a version of the xercesImpl.jar and xml-apis.jar to <java_home>/jre/lib/endorsed. Here <java_home> refers to the directory where the JDK is installed. Check the Xerces version via the included Version application using the following command-line syntax:

java -Dorg.xml.sax.driver=org.apache.xerces.parsers.SAXParser org.apache.xerces.impl.Version
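
As a complementary check, the small program below prints which JAXP factory implementations the running JVM actually selects, which can help confirm whether an override took effect. The class name ParserInUse is hypothetical. The interpretation of the output is an assumption: a class name beginning with org.apache.xerces suggests that a standalone Xerces jar was picked up, while any other name suggests the parser bundled with the JDK.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;

class ParserInUse {
    // Class name of the DOM factory implementation chosen by JAXP.
    static String domFactoryClass() {
        return DocumentBuilderFactory.newInstance().getClass().getName();
    }

    // Class name of the SAX factory implementation chosen by JAXP.
    static String saxFactoryClass() {
        return SAXParserFactory.newInstance().getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println("DOM factory: " + domFactoryClass());
        System.out.println("SAX factory: " + saxFactoryClass());
    }
}
```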

Java Application - TransformUsingXMLSchema

The Java(TM) application TransformUsingXMLSchema transforms an XML document using XML Schema validation. Most transformation engines use DTD validation by default to build an in-memory document, so they need explicit instruction to use XML Schema validation instead. The xsi:noNamespaceSchemaLocation attribute must be specified in the XML document for the application to work as expected. For example:

java TransformUsingXMLSchema ../lawnmower.xml ../../xsl/topic2html.xsl ../lawnmower.html
        
Usage: java TransformUsingXMLSchema xmlDoc xsltDoc htmlDoc
-----------------------------------------------------------------------------------------------------------
xmlDoc:       The external URI location of an XML document to transform
xsltDoc:      The external URI location of an XSL stylesheet
htmlDoc:      The external URI location to write the resultant HTML document
-----------------------------------------------------------------------------------------------------------
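
One way to get this behaviour is to build the DOM with a schema-validating parser first and then hand that DOM to the XSLT engine, so that attribute defaults supplied by the schema are already present in the tree. The sketch below illustrates the idea and is not the source of TransformUsingXMLSchema. The class name SchemaAwareTransformSketch and its transform method are hypothetical, and the optional schema argument is an addition made only to keep the sketch self-contained: when it is null, the parser is left to follow the xsi:noNamespaceSchemaLocation hint in the document.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

class SchemaAwareTransformSketch {
    // Parses xml with XML Schema validation, then applies the stylesheet
    // to the resulting DOM and returns the serialized output.
    static String transform(InputSource xml, InputSource xsdOrNull, Source xslt) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            factory.setValidating(true);
            // JAXP 1.2 property: validate with XML Schema rather than a DTD.
            factory.setAttribute(
                "http://java.sun.com/xml/jaxp/properties/schemaLanguage",
                "http://www.w3.org/2001/XMLSchema");
            if (xsdOrNull != null) {
                // Assumption: an explicit schema replaces the document's hint.
                factory.setAttribute(
                    "http://java.sun.com/xml/jaxp/properties/schemaSource",
                    xsdOrNull);
            }
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) {
                    // warnings are ignored in this sketch
                }
                public void error(SAXParseException e) throws SAXException {
                    throw e; // treat validation errors as failures
                }
                public void fatalError(SAXParseException e) throws SAXException {
                    throw e;
                }
            });
            Document doc = builder.parse(xml);

            Transformer transformer =
                TransformerFactory.newInstance().newTransformer(xslt);
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }
}
```

Parsing and transforming in two explicit steps keeps the choice of validation in the caller's hands instead of relying on whatever parser configuration the transformation engine uses internally.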

Note: The implied value for the class attribute, defined in the schema, is necessary for DITA XSLT scripts to work properly. If you see no output, or text-only output, this is usually an indication that the class attribute's default value has not been provided during parsing. Use the normalize.xsl transform to check the output of the parser: if the class attribute is missing, or the value is not bounded at the end by a space, the transforms cannot do class-based matching properly.
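
The reason the trailing space matters can be shown with a few lines of Java that mimic a class-based match of the form contains(@class, ' topic/p '), where the token is looked up with a space on each side. This is a sketch of the matching idea only, not code from the DITA transforms; the class name ClassMatchSketch and the method matches are hypothetical.

```java
class ClassMatchSketch {
    // True when token appears in classValue delimited by a space on both sides,
    // mirroring an XPath test such as contains(@class, ' topic/p ').
    static boolean matches(String classValue, String token) {
        return classValue != null && classValue.contains(" " + token + " ");
    }

    public static void main(String[] args) {
        // With the trailing space, the last token can be matched.
        System.out.println(matches("- topic/topic task/task ", "task/task"));
        // Without it, the last token is not bounded and the match fails.
        System.out.println(matches("- topic/topic task/task", "task/task"));
        // A missing class attribute never matches.
        System.out.println(matches(null, "task/task"));
    }
}
```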

Eric Sirois
esirois@ca.ibm.com
IBM Corporation

Java is a registered trademark of Sun Microsystems, Inc.