Researchers at several IBM labs are collaborating on enhancements to the DITA XML system. The Darwin Information Typing Architecture (DITA) is "an XML architecture for designing, writing, managing, and publishing technical documentation, whether in print, as online help, or on the Web. It implements the principles of information design, information typing, and information architecture." To the extent that a "topic" is the basic DITA architectural unit, the system has some affinities to Topic Maps; to the extent that it features modular content design and optimizes content reuse, it is similar to information mapping. DITA supports a unique transclusion mechanism that is validated under DTD processing rules. Michael Priestley (IBM Toronto Software Development Laboratory) has collaborated in the creation of a new DITA resource collection which features key articles and presentations, together with references to the DTDs and XSLT transformation specifications. Presentations on DITA will be made by Don Day and Mike Temple at upcoming May 2003 meetings, including the Society for Technical Communication (STC) 50th Annual Conference and the Arbortext User's Group International Conference (AUGI 2003).
Transclusion in DITA
DITA design focuses upon the "topic" as a conceptual unit of authoring. Larger documents can be created by aggregating topic units. A "content referencing" (conref) mechanism may be used at the markup level to combine several topics into a single document, or to share content among topics:
"The DITA design has a unified content reuse mechanism by which an element can replace itself with the content of a like element elsewhere, either in the current topic or in a separate topic that shares the same content models. The distinction between reusable content and reusing content, which is enshrined in the file entity scheme, disappears: Any element with an ID, in any DITA topic, is reusable by conref."
"DITA's conref 'transclusion' mechanism is similar to the SGML conref mechanism, which uses an empty element as a reference to a complete element elsewhere. However, DITA requires that at least a minimal content model for the referencing element be present, and performs checks during processing to ensure that the replacement element is valid in its new context. This mechanism goes beyond standard XInclude, in that content can be incorporated only when it is equivalent: If there is a mismatch between the reusing and reused element types, the conref is not resolved. It also goes beyond standard entity reuse, in that it allows the reused content to be in a valid XML file with a DTD. The net result is that reused content gets validated at authoring time, rather than at reuse time, catching problems at their source."
"Content referencing can be used at any scope of elements in a DITA document, from a keyword phrase that contains only PCDATA to a whole topic with other nested topics. Conref can cross file boundaries, using the same syntax as that of the href attribute on the xref element. If your authoring DTD allows topic nesting, you can create a set of minimal child topics and then use their conref attributes to pull in content from fully populated topics in other files..." [from the FAQ]
The DITA Architecture
Through topic granularity and topic type specialization, DITA brings the following benefits of the object-oriented model to information sets:
- Encapsulation: The designer of the topic type only needs to address a specific, manageable problem domain. The author only needs to learn the elements that are specific to the topic type. The implementer of the processing for the topic type only needs to process elements that are special.
- Polymorphism: Special topic types can be treated as more generic topic types for common processing. The class attribute preserves at all times the derivation hierarchy of an element. At any time, a topic may be generalized back to any earlier form, and if the class attributes are preserved, these topics may be re-specialized. One use of this capability would be to allow two separate disciplines to merge data at an earlier common part of the specialization hierarchy, after which they can be transformed into one, the other, or a brand new domain and set of infotyped topics.
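The generalization round trip described under polymorphism can be sketched in Python. This is an illustration under assumptions, not the DITA toolkit's own transform: it assumes class attribute values of the form `- topic/body concept/conbody `, listing `module/element` pairs from most general to most specific, and the helper names `ancestry`, `generalize`, and `respecialize` are hypothetical.

```python
import xml.etree.ElementTree as ET

SOURCE = (
    '<concept class="- topic/topic concept/concept ">'
    '<title class="- topic/title ">Caching</title>'
    '<conbody class="- topic/body concept/conbody ">'
    '<p class="- topic/p ">Caches trade memory for speed.</p>'
    '</conbody>'
    '</concept>'
)


def ancestry(el):
    """Return [(module, name), ...] ordered from most general to most specific."""
    tokens = el.get("class", "").split()
    return [tuple(token.split("/", 1)) for token in tokens if "/" in token]


def generalize(root, module):
    """Rename each element to its ancestor form in `module`; class is untouched."""
    for el in root.iter():
        for mod, name in ancestry(el):
            if mod == module:
                el.tag = name
                break


def respecialize(root):
    """Rename each element back to the most specific form its class records."""
    for el in root.iter():
        chain = ancestry(el)
        if chain:
            el.tag = chain[-1][1]


root = ET.fromstring(SOURCE)
original_tags = [el.tag for el in root.iter()]

generalize(root, "topic")
general_tags = [el.tag for el in root.iter()]

respecialize(root)
restored_tags = [el.tag for el in root.iter()]
```

Because the class attributes are never modified, the element names can be recovered after generalization, which is the property the text relies on for merging data at a common ancestor type.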
DITA can be considered object-oriented due to:
- Data and processors that are separated from their environment and can be chunked to provide behaviors similar to object-orientation (such as override transforms that modify or redefine earlier behaviors).
- Classification of elements through a sequence of derivations that are progressively more specific, possibly more constrained, and always rigidly tied to a consistent processing or rendering model.
- Inheritance of behaviors, to the extent that new elements either fall through to behaviors for ancestors in their derivation hierarchy, or can be mapped to modified processors that extend previous behaviors.
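The fall-through and override behaviors listed above can be sketched as a small dispatcher in Python. The handler table, the function names, and the `- topic/li task/step ` class value are assumptions made for illustration; the DITA resources cited here describe XSLT transforms, which this sketch only imitates in spirit.

```python
import xml.etree.ElementTree as ET


def render_li(el):
    """Generic behavior for list items."""
    return "* " + (el.text or "")


def render_step(el):
    """Override behavior for the specialized 'step' element."""
    return "Step: " + (el.text or "")


def dispatch(el, handlers):
    """Use the most specific handler available, falling back to ancestors."""
    tokens = [t for t in el.get("class", "").split() if "/" in t]
    for token in reversed(tokens):  # most specific type first
        if token in handlers:
            return handlers[token](el)
    return el.text or ""  # no handler anywhere in the derivation chain


step = ET.fromstring('<step class="- topic/li task/step ">Open the lid.</step>')

base_handlers = {"topic/li": render_li}
inherited = dispatch(step, base_handlers)

override_handlers = {**base_handlers, "task/step": render_step}
overridden = dispatch(step, override_handlers)
```

With only the base handlers, the specialized element is rendered by the behavior of its ancestor; adding a handler for the specialized type overrides that behavior without touching the base table.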
With discipline and ingenuity, some of the benefits of topic information sets can be obtained from a book DTD; in particular, chunking techniques can generate topics out of a book DTD. In DITA, the converse approach is possible: a book can be assembled from a set of DITA topics. In both cases, however, the adaptation is secondary to the primary purpose of the DTD. If you are primarily authoring books, it makes the most sense to use a DTD that is designed for books; if you are primarily authoring topics, it makes sense to use a DTD that is designed for topics and can scale to large, processable collections of topics. [adapted from the Introduction]
Principal references:
- Update 2003-06-24: "IBM Development Team Publishes Updated DITA Toolkit and Language Reference."
- "Introduction to the Darwin Information Typing Architecture: Toward Portable Technical Information."
- [March 21, 2001] IBM's Darwin Information Typing Architecture (DITA).
- [May 15, 2002] IBM's Darwin Architecture Supports Enhancements for Domain Specialization, Content Reuse, and Linking Logic.
- "Darwin Information Typing Architecture (DITA XML)." New reference collection for DITA resources, created with the assistance of Michael Priestley.