[This local archive copy is from the official and canonical URL, http://test.bellanet.org/xml/files/comment.doc?Ois=y;template=dml2.cfm; please refer to the canonical source document if possible.]





DML Documents


Development Markup Language

Version 0.01.00

Commentary

1998-10-22 
 

DML design goals 

1. DML will support the markup of information describing the development activities of organizations working in the area of international development.

Information describing development activities, which has been referred to as "process-related" information, includes descriptions of projects, programs, loans and credits. 

2. DML will support the valid markup of records that contain only the mandatory data elements described by the current CEFDA standard.

CEFDA provides a basic level of development activity description that has already been agreed upon by a wide range of development organizations. Allowing valid markup using only CEFDA mandatory elements will encourage development organizations currently using CEFDA to adopt DML. 

3. DML will support markup that extends CEFDA in areas that have already been formally or informally identified as requiring more complete, detailed or more extensive information.

A number of improvements or extensions to CEFDA have been discussed within the INDIX community or implemented within pilot projects such as the GK-AIMS project of the Global Knowledge Partners program. These areas include the use of authorities for institutional names, authoritative description of sectors, increased detail in financial information, and better description of programs. 

4. DML will allow for multilingual markup.

Development organizations work in a number of different languages. DML should support description in a variety of languages. 

5. DML will be developed quickly.

In order to avoid fragmentation of the development community in supporting competing Document Type Definitions, DML must be developed quickly. Version 0.01 of the DTD is a draft intended to stimulate comment, suggestions and discussion.

6. DML should be easily usable by a wide range of browser software, style sheets and other software.

Features of DML should be among those most likely to be quickly implemented in XML parsers, renderers, and stylesheets such as CSS and XSL. Reasonable use of the data should be possible even without sophisticated application software. 

7. DML will be easily implemented by development agencies.

Development agencies should be able to implement simple DML markup easily. For most agencies, this will mean producing DML as output from another information system, rather than marking up original text.  

8. To the extent possible, DML will be consistent with other developing metadata schemes.

Other metadata schemes, such as the Dublin Core standard for describing document-like information objects and the Government Information Locator Service (GILS) used for describing government information resources, continue to evolve. As the means of integrating multiple metadata schemes (RDF, semantic maps) develop, DML should evolve to remain consistent with these schemes. 

9. DML will provide a base on which to develop richer and more powerful means of exchanging, sharing and using development activity information.

While a simple DML document should not require more than the required CEFDA data elements, provision must be made for the exchange of more complex and richer sets of data to ensure that the markup language will continue to meet the needs of the development community in the future. 

Comments on the DTD 

Naming Conventions 

Element names consisting of more than one word are joined with an underscore (_), as in executing_org. Where part of an element name represents a hierarchy, and that hierarchy is reflected in the name for clarity’s sake, the hierarchical parts of the name are joined by a full stop (.), as in org.name. 

%optlink - Optional Link Elements 

To support both the simple exchange of information elements such as organization names and the use of networked sources of authoritative information such as organizational authority files, some elements in the DML markup have been defined to allow optional linking. The optional linking element can carry linking information if this is available to the organization creating the DML document. Optional linking elements include organization information (to allow for links to an organization authority file) and sector codes (to link to an authoritative source for sector codes). Elements with few possible values, such as Terms of Assistance, are not good candidates for optional linking; instead, coded values should be converted into human-readable forms with stylesheets or specific application processors. Bibliographic references to documents can also use optional linking to access more complete descriptions. 

Examples:

1. An organization has no means of referencing an institutional authority file. The DML document contains only the name of the agency.  

<executing_org>

<org.name>

Canadian International Development Agency

</org.name>

</executing_org> 

2. An organization can, and chooses to, provide a link to a networked authority file of development organizations, located at DevOrgs International, a yet-to-be-established consortium constituted to maintain information about development organizations. Users can get additional information on that particular organization by traversing the link (e.g. by clicking on the organization name). 

<executing_org xml:link="simple" href="http://www.devorgs.org/auth?CIDA"

show="embed">

<org.name>

Canadian International Development Agency 
</org.name>

</executing_org> 

Traversing this link will retrieve information from the DevOrgs site and embed it into the current document. The information about the executing agency might then appear as in the following example. 

<executing_org xml:link="simple" href="http://www.devorgs.org/auth?CIDA">

<org.name lang="en">

Canadian International Development Agency

</org.name>

<org.acronym>

CIDA

</org.acronym>

<city>

Hull

</city>

<prov_state>

Quebec

</prov_state>

<country>

Canada

</country>

<contact.uri href="http://www.acdi-cida.gc.ca/index.htm">

http://www.acdi-cida.gc.ca/index.htm

</contact.uri>

</executing_org> 

Developments in XML processors will need to be monitored for support for optional linking. 
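
As an illustration of how an application might treat both forms of the element uniformly, the following Python sketch reads an <executing_org> element and reports the organization name together with the link target, if one is present. It uses only the standard library xml.etree.ElementTree module. The function name describe_org is hypothetical and not part of DML, and the sketch only detects the link; it does not traverse it or embed anything.

```python
import xml.etree.ElementTree as ET

# Example 1 from the commentary: name only, no link information.
NAME_ONLY = """<executing_org>
  <org.name>Canadian International Development Agency</org.name>
</executing_org>"""

# Example 2 from the commentary: the same element carrying optional link attributes.
WITH_LINK = """<executing_org xml:link="simple"
    href="http://www.devorgs.org/auth?CIDA" show="embed">
  <org.name>Canadian International Development Agency</org.name>
</executing_org>"""


def describe_org(xml_text):
    """Return the organization name and the optional link target (or None)."""
    elem = ET.fromstring(xml_text)
    # iter() matches on the literal tag name, so the full stop in
    # "org.name" needs no special handling.
    name_elem = next(elem.iter("org.name"), None)
    name = ""
    if name_elem is not None and name_elem.text:
        name = name_elem.text.strip()
    # href is optional: get() returns None when the attribute is absent.
    return {"name": name, "link": elem.get("href")}


if __name__ == "__main__":
    print(describe_org(NAME_ONLY))
    print(describe_org(WITH_LINK))
```

Because the link attributes are optional, the same code path serves organizations that can reference an authority file and those that cannot.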

%addr Address Information 

Address information may appear for organizations in any one of the roles in which they may appear (i.e. funding organization, cofunding organization, executing entity or reporting organization). Address information may also be associated with a particular contact. This entity allows the information to be standardized across these different elements. 

%contact Contact Information

Contact information may be associated with organizations in any one of the different roles they play in relation to a development activity. GK-AIMS and CEFDA take different approaches to this kind of information. In GK-AIMS, contact information is subsidiary to an organization unit, so that there is a fixed contact point for that organization. This approach should render the notion of a CEFDA Contact (210) redundant, since the Contact is associated with the Funding Organization. However, the contact for further documentation on a project or program is not necessarily the contact designated for the whole organization, in which case the CEFDA Contact would fill this role. In this version of DML, flexibility built into the DTD allows for both approaches, but at the cost of some increased complexity and potential redundancy. This approach needs to be validated with development organizations. The GILS standard provides for a Contact sub-element "Hours of Service" which is not present in CEFDA or GK-AIMS, but which has been added here (contact.hours). 

%org Organization

Organizations may play a number of different roles in relation to a development activity: that of funding organization, cofunding organization, executing entity, funding source or organization reporting the development activity. This entity allows the format of organizational information to be standardized across these different roles. Following the GK-AIMS format, a contact is allowed for each organization/role combination. 

%mixed Mixed Content

Some organizations may provide extensive abstracts which can be separated into paragraphs, headings or unnumbered lists. This mixed content allows text in these elements to be marked up with some basic display-oriented markup to improve readability and clarity of long texts. 

activity Activity

<activity> is the main element of the DML. Its attributes include “language”, which corresponds to the machine-usable version of the Language of Record data element, and “id”, which provides a unique identifier for this element so that other resources can link to it. The “id” attribute in particular is restricted in form: it must begin with an alphabetic character and contain only alphabetic characters or the symbols ‘.’ or ‘-’, i.e. full stop or hyphen. Given that documents may be assembled with activities from different organizations, it would be wise to encourage some standardization in the format of the activity id attribute; one suggestion would be to use the agency-assigned activity identifier, substituting ‘-’ for any non-alphabetic character, and prefixing this identifier with the acronym for the agency itself followed by a ‘.’. One can argue that there is little apparent need in the human-readable elements for either what CEFDA terms “Record Identifier” or (perhaps less convincingly) “Record Language”; neither of these elements has been included in this version of DML. 
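
The suggested id format can be sketched in a few lines of Python. This is only an illustration of the suggestion above, not a normative rule: the function names are hypothetical, "alphabetic" is assumed here to mean the ASCII letters A-Z and a-z, and the agency acronym is assumed to be purely alphabetic already. Note that a literal reading of the suggestion replaces digits with hyphens as well, so distinct agency identifiers can collapse to the same id.

```python
import re


def make_activity_id(agency_acronym, agency_activity_id):
    """Build an activity id as '<acronym>.<identifier>'.

    Every non-alphabetic character of the agency-assigned identifier
    (digits included) is replaced by a hyphen, as the commentary suggests.
    Assumption: 'alphabetic' means ASCII letters only.
    """
    cleaned = re.sub(r"[^A-Za-z]", "-", agency_activity_id)
    return f"{agency_acronym}.{cleaned}"


def is_valid_activity_id(value):
    """Check the stated form: a leading letter, then letters, '.' or '-'."""
    return re.fullmatch(r"[A-Za-z][A-Za-z.\-]*", value) is not None


if __name__ == "__main__":
    # "ZA/0042" is a made-up agency identifier used only for illustration.
    activity_id = make_activity_id("CIDA", "ZA/0042")
    print(activity_id, is_valid_activity_id(activity_id))
```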

The Activity element contains four elements which correspond to the four categories of fields specified in CEFDA. These categories serve to group the different CEFDA fields, but could be dispensed with in DML if the tags are not useful in formatting record displays. Of the four categories, CEFDA places administrative information first. However, this administrative information is less important to the user than the descriptive part of the record, which includes critical information elements such as the title. Since XML documents will frequently be processed in a one-pass, sequential fashion, the most important information should appear first in the document. For this reason, the administrative information in DML has been moved to appear as the last element contained within <activity> instead of the first as CEFDA would suggest. 

title and trans_title Title and Translated Title

Title and translated title could be merged into a single field with the attribute "xml:lang" used to distinguish between different language versions of the title. However it is clear from the definition in CEFDA that a distinction is being made between the title in one of the official languages of the funding organization (hence "the official title" or at least one of the official titles) and a title which has been translated in order to make the information more widely available or more easily understood. Both elements have been retained in the Development Markup Language to continue this distinction.  
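
To illustrate why keeping the two elements apart can be useful to a display application, the following Python sketch picks a title for a reader who prefers a given language. It assumes, purely for illustration, that both <title> and <trans_title> carry an xml:lang attribute; the sample record, its titles, and the function name pick_title are all hypothetical and are not taken from the DTD.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical record: official title in French, translated title in English.
SAMPLE = """<activity>
  <title xml:lang="fr">Projet d'approvisionnement en eau</title>
  <trans_title xml:lang="en">Water supply project</trans_title>
</activity>"""


def pick_title(activity_xml, preferred_lang):
    """Return the official title if it is in the preferred language,
    otherwise a matching translated title, otherwise the official title."""
    root = ET.fromstring(activity_xml)
    official = next(root.iter("title"), None)
    if official is not None and official.get(XML_LANG) == preferred_lang:
        return (official.text or "").strip()
    for translated in root.iter("trans_title"):
        if translated.get(XML_LANG) == preferred_lang:
            return (translated.text or "").strip()
    # No match: fall back to the official title, if the record has one.
    return (official.text or "").strip() if official is not None else None


if __name__ == "__main__":
    print(pick_title(SAMPLE, "en"))
    print(pick_title(SAMPLE, "fr"))
```

With a single merged element, the application could still select by language but could no longer tell which version is the official one.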

Coded Values

Many of the fields in the CEFDA format, such as Terms of Assistance, Type of Activity and Country/Region were designed to carry coded values. Additional elements discussed but never implemented include Sector. Coded values were used with these data elements to provide a controlled vocabulary that would unambiguously identify a particular value, and to provide a way to share information as independently as possible of the original language of description.  

The intention was always that the coded value would be translated into one or more easily comprehended formats in one or more languages when the information was being delivered to the end user. For example, the country/region codes of the CEFDA format are expanded into country/region names in the INDIX Development Activity Information (DAI) database. DML is intended not only for computer-to-computer transfer but also for end user display, where this information must be readable by the end user. A variety of approaches to this kind of data are possible. 

1. The code is an attribute of an empty element.

The XML application or XML stylesheet must convert the attribute value into a human-readable and comprehensible format, such as "grant".  

Example:

<terms_assistance code="1"/> 

2. The element includes a readily understandable description of the value in a given language, while still carrying the code as an attribute value. 

Example:

<terms_assistance code="1">Grant</terms_assistance> 

In this case the language would be assumed to be the language of the parent element (“language of record” in CEFDA parlance), but even this could be specified: 

<terms_assistance xml:lang="en" code="1">Grant</terms_assistance> 

3. The element could be used as a link to embed a description of the coded value. 

<terms_assistance href="http://www.bellanet.org/terms/en?1" actuate="auto" show="embed" lang="en" code="1">Grant</terms_assistance> 

In the latter case, the actual presence of a descriptive string (e.g. "Grant") is superfluous since the element will be replaced with the result of the lookup of the hypertext reference, at least if the hypertext reference returns a valid value. 
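
The first two approaches can be handled by one small routine in the consuming application. The Python sketch below is an assumption about how such software might behave, not a description of any existing DML processor: it uses the element's own text when present (approach 2) and otherwise looks the code up in a table (approach 1). Only the pairing of code "1" with "Grant" comes from the examples above; the table layout, the fallback text, and the name render_terms are hypothetical.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Lookup table keyed by language, then by code. Only "1" -> "Grant" is
# taken from the commentary; a real table would come from the CEFDA code lists.
TERMS_LABELS = {"en": {"1": "Grant"}}


def render_terms(xml_text, default_lang="en"):
    """Return a human-readable label for a <terms_assistance> element."""
    elem = ET.fromstring(xml_text)
    # Approach 2: the element already carries a readable description.
    if elem.text and elem.text.strip():
        return elem.text.strip()
    # Approach 1: empty element, so expand the code attribute ourselves.
    code = elem.get("code")
    lang = elem.get(XML_LANG, default_lang)
    return TERMS_LABELS.get(lang, {}).get(code, f"[code {code}]")


if __name__ == "__main__":
    print(render_terms('<terms_assistance code="1"/>'))
    print(render_terms('<terms_assistance xml:lang="en" code="1">Grant</terms_assistance>'))
```

Unknown codes are shown in bracketed form rather than dropped, so a record remains usable even when the lookup table is incomplete.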

For elements with large numbers of possible values and frequently changing information (such as organizational information), the further information obtained is worth the overhead of traversing a link. For coded values with small numbers of values, such as Terms of Assistance and Status, the overhead of providing a link to an authoritative source is too great; these elements have been defined as empty elements, and style sheets or XML software will have to render the attribute code into a human-readable form. Elements with several hundred values, such as Country/Region or Sector (when standard sectors are eventually defined for activity information), fall somewhere in between. In this third case, arguments could be made both for and against providing a link to an authoritative source. In this version of the DTD, both possibilities have been provided for, with a code to allow the XML processor to provide the relevant information, and attributes to support linking. 

cofunding_org Co-funding organization

This element has been drawn from GK-AIMS; its absence from the CEFDA format has been noted. 

budget Budget Information

CEFDA was intended primarily for use in text-based information retrieval systems, where certain fields would be searchable, but display capabilities were simple. The ability within XML to specify more specific data elements, as well as the importance of budget information, requires greater detail or granularity in the description of budget information. Two possible elements are found here: a budget total, for organizations that wish to report only a single, total amount of the activity budget; and budget line for organizations that wish to provide more detailed information, broken down by budget year, sub-activity, or currency.  

Source documentation

DML has been designed for the markup of metadata about development activities, including references to documents that describe those activities. Other related metadata schemes or profiles currently in development and use include Dublin Core (a fifteen element set designed for describing a wide range of document-like objects), and the Government Information Locator Service (GILS), an ISO 23950 profile for the description of government information resources. Dublin Core in particular is of interest because of its possible application in searching a wide range of information over the Internet. GILS is of interest in that government agencies, such as bilateral development agencies, may be encouraged to make information available in this format, though actual implementations of GILS are still relatively rare. While the semantics of both these formats are weak, the Dublin Core elements have been used for the description of source documentation because of the widespread support for Dublin Core. However, the way in which metadata sets will be related is still in development, with a number of different proposals and issues to be settled. (For cross-domain searching, for example, DML might wish to present an Activity Title as a Dublin Core Title; however in describing a documentary reference, the Dublin Core Title is more appropriately and precisely used to describe the title of the document describing the resource.) In view of these uncertainties, this version of DML simply adopts Dublin Core semantics for bibliographic descriptions. These DML elements can be converted to more explicit Dublin Core elements in the future when the mechanisms for doing so are better established.