The OASIS Cover Pages: The Online Resource for Markup Language Technologies
Created: May 15, 2002.

IBM's Darwin Architecture Supports Enhancements for Domain Specialization, Content Reuse, and Linking Logic.

Communiqués from Don Day and Michael Priestley of IBM describe new features in the 2002-05 update of IBM's XML-Based Darwin Information Typing Architecture (DITA). The DITA XML-based architecture "provides a way for documentation authors and architects to create collections of typed topics that can be easily assembled into various delivery contexts. Topic specialization is the process by which authors and architects can define topic types, while maintaining compatibility with existing style sheets, transforms, and processes. The new topic types are defined as an extension, or delta, relative to an existing topic type, thereby reducing the work necessary to define and maintain the new type." Improving upon the original release of March 2001, DITA v1.0 features "a logical extension of specialization that has now been incorporated into DITA: the ability to extend existing content markup to represent domains of specialized markup that are common across particular sets of typed topics (hardware vs. software, for example)." The DITA design has a unified content reuse mechanism which enables one to combine several topics into a single document: "an element can replace itself with the content of a like element elsewhere, either in the current topic or in a separate topic that shares the same content models. The distinction between reusable content and reusing content, which is enshrined in the file entity scheme, disappears: any element with an ID, in any DITA topic, is reusable by 'conref' transclusion. The linking logic also now supports type checking and takes advantage of the short description element to provide progressive disclosure."

The DITA designers report that they are "trying to take full advantage of the semantic awareness built into the specialization model, while at the same time making that model more flexible and extensible. [This yields] more flexibility at design time, and more rigorous validation at authoring and build time."

From the new 'Domain Specialization' documentation:

The Darwin Information Typing Architecture (DITA) is an XML architecture for extensible technical information. A domain extends DITA with a set of elements whose names and content models are unique to an organization or field of knowledge. Architects and authors can combine elements from any number of domains, leading to great flexibility and precision in capturing the semantics and structure of their information.

In DITA, the topic is the basic unit of processable content. The topic provides the title, metadata, and structure for the content. Some topic types provide very simple content structures. For example, the concept topic has a single concept body for all of the concept content. By contrast, a task topic articulates a structure that distinguishes pieces of the task content, such as the prerequisites, steps, and results.

In most cases, these topic structures contain content elements that are not specific to the topic type. For example, both the concept body and the task prerequisites permit common block elements such as p paragraphs and ul unordered lists.

Domain specialization lets you define new types of content elements independently of topic type. That is, you can derive new phrase or block elements from the existing phrase and block elements. You can use a specialized content element within any topic structure where its base element is allowed. For instance, because a p paragraph can appear within a concept body or task prerequisite, a specialized paragraph could appear there, too.
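The fallback behavior described above can be sketched in a few lines of Python. This is a minimal illustration, not IBM's implementation: it assumes each element carries a `class` attribute listing its ancestry from general to specific, as published DITA DTDs do, and the `warning` element, the `safety-d` domain, and the `HANDLERS` table are hypothetical names invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical processing rules, written only for the base element types.
HANDLERS = {
    "topic/p": lambda el: f"<p>{el.text}</p>",
    "topic/ph": lambda el: f"<span>{el.text}</span>",
}

# A hypothetical domain element specialized from the base paragraph.
# Assumption: the class attribute lists ancestry from general to specific.
SAMPLE = '<warning class="+ topic/p safety-d/warning ">Disconnect power first.</warning>'


def class_tokens(el):
    """Return the module/element tokens, most specialized first."""
    tokens = [t for t in el.get("class", "").split() if "/" in t]
    return list(reversed(tokens))


def render(el):
    """Use the most specific handler available, falling back to the base type."""
    for token in class_tokens(el):
        handler = HANDLERS.get(token)
        if handler is not None:
            return handler(el)
    raise LookupError(f"no handler for element <{el.tag}>")


if __name__ == "__main__":
    print(render(ET.fromstring(SAMPLE)))
```

Because no rule exists for the specialized element, the sketch processes it with the rule for the base paragraph, which is how a specialized element can remain compatible with existing transforms.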

[Summary:] Through topic specialization and domains, DITA provides the following benefits:

(1) Simpler topic design: The document designer can focus on the structure of the topic without having to foresee every variety of content used within the structure.

(2) Simpler topic hierarchies: The document designer can add new types of content without having to add new types of topics.

(3) Extensible content for existing topics: The document designer can reuse existing types of topics with new types of content.

(4) Semantic precision: Content elements with more specific semantics can be derived from existing elements and used freely within documents.

(5) Simpler element lists for authors: The document designer can select domains to minimize the element set. Authors can learn the elements that are appropriate for the document instead of learning to disregard unneeded elements.

In short, the DITA domain feature provides for great flexibility in extending and reusing information types. The highlight, programming, and UI domains provided with the base DITA release are only the beginning of what can be accomplished...

DITA content reuse is supported by the conref attribute: "you can throw a conref attribute on just about any element, to grab content from an equivalent element in another DITA topic..." This mechanism, said by Eliot Kimber to provide the equivalent of HyTime's #ELEMENT value reference facility, is also weakly expressed in ISO 8879 (SGML) by #CONREF attributes. Note that SGML's CONREF [content reference attribute] feature was highly controversial; it was dropped by XML.

DITA's conref "transclusion" mechanism is similar to the SGML conref mechanism, which uses an empty element as a reference to a complete element elsewhere. However, DITA requires that at least a minimal content model for the referencing element be present, and performs checks during processing to ensure that the replacement element is valid in its new context. This mechanism goes beyond standard XInclude, in that content can be incorporated only when it is equivalent: If there is a mismatch between the reusing and reused element types, the conref is not resolved. It also goes beyond standard entity reuse, in that it allows the reused content to be in a valid XML file with a DTD. The net result is that reused content gets validated at authoring time, rather than at reuse time, catching problems at their source.
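The type check described above can be illustrated with a small Python sketch. It is not the DITA toolkit's resolver: the same-document `#id` address form is a simplification assumed for this example, element equivalence is reduced to a tag-name comparison, and `resolve_conrefs` is a hypothetical helper.

```python
import copy
import xml.etree.ElementTree as ET

# Two references to the same paragraph: one from a matching element type
# and one from a mismatched type.
DOC = """
<topic id="install">
  <body>
    <p id="warn">Back up your data first.</p>
    <p conref="#warn"/>
    <ul conref="#warn"/>
  </body>
</topic>
"""


def resolve_conrefs(root):
    """Resolve conref attributes in place; return references left unresolved."""
    by_id = {el.get("id"): el for el in root.iter() if el.get("id")}
    unresolved = []
    for el in list(root.iter()):
        ref = el.get("conref")
        if ref is None:
            continue
        target = by_id.get(ref.lstrip("#"))
        # Mismatched or missing targets are skipped, not forced into place.
        if target is None or target.tag != el.tag:
            unresolved.append(ref)
            continue
        for child in list(el):
            el.remove(child)
        el.text = target.text
        el.extend([copy.deepcopy(child) for child in target])
        del el.attrib["conref"]
    return unresolved


if __name__ == "__main__":
    tree = ET.fromstring(DOC)
    print(resolve_conrefs(tree))
    print(ET.tostring(tree, encoding="unicode"))
```

In this sketch the referencing paragraph takes on the content of the referenced paragraph, while the list element that points at a paragraph keeps its conref attribute and is reported as unresolved.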

Content referencing can be used at any scope of elements in a DITA document, from a keyword phrase that contains only PCDATA to a whole topic with other nested topics. Conref can cross file boundaries, using the same syntax as that of the href attribute on the xref element. If your authoring DTD allows topic nesting, you can create a set of minimal child topics and then use their conref attributes to pull in content from fully populated topics in other files.
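As a rough illustration of a reference that crosses file boundaries, the sketch below splits a conref value into its parts. The `file#topicid/elementid` address form is an assumption taken from published DITA specifications; this article says only that conref shares its syntax with the href attribute of xref. The file name and IDs are invented, and `parse_conref` is a hypothetical helper.

```python
from typing import NamedTuple, Optional


class ConrefTarget(NamedTuple):
    file: Optional[str]        # None means "the current file"
    topic_id: Optional[str]
    element_id: Optional[str]  # None means "the topic itself"


def parse_conref(value: str) -> ConrefTarget:
    """Split an assumed file#topicid/elementid address into its parts."""
    file_part, _, fragment = value.partition("#")
    topic_id, _, element_id = fragment.partition("/")
    return ConrefTarget(file_part or None, topic_id or None, element_id or None)


if __name__ == "__main__":
    # A phrase-level reference into another file (hypothetical names).
    print(parse_conref("shared/warnings.dita#safety/backup"))
    # A whole-topic reference, as a minimal child topic might use.
    print(parse_conref("shared/warnings.dita#safety"))
```

The same parser covers both ends of the scope described above: a single element inside a topic in another file, and an entire topic pulled in by a minimal child topic.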

From the FAQ document: "Darwin [is the name because] it uses the principles of specialization and inheritance. Information Typing capitalizes on the semantics of topics (concept, task, reference) and of content (messages, typed phrases, semantic tables). The architecture provides vertical headroom (new applications) and edgewise extension (specialization into new types) for information..."

Robin Cover, Editor