Metadata Registries: Averting a Tower of XML Babel


Submitted to: XTECH'99
March 7-11, 1999


XML DTDs and schemas facilitate the development of community-specific XML dialects (MathML, ChemML, ...). However, the ease of DTD and schema development raises the specter of a tower of XML Babel. Shared metadata registries (a.k.a. repositories) are essential for developing common XML dialects that allow applications such as electronic commerce to be deployed among heterogeneous user communities. Such shared registries are needed for interoperability at both the syntactic and semantic levels.

Current and proposed practice in the XML community (i.e., DTDs and XML Schema proposals) does not address the administration, maintenance, integration, and standardization of data element definitions, nomenclature, value encodings, etc. Current metadata mechanisms (e.g., database schemas) traditionally use terse definitions of terms (data elements) written by database administrators, with few citations of external sources. These mechanisms, as currently used, are inadequate for applications such as electronic commerce and systems integration. Data element definitions need to be specified by subject area specialists (lawyers, accountants, etc.) and accompanied by detailed references to external documents (e.g., legal decisions and codes, accounting standards, etc.), which may not be available on the World Wide Web.

We discuss the requirements of metadata registries and the adequacy of various existing and proposed registry standards (ISO 11179, ANSI X3.285), schema standards (RDF Schema, XML Schema, XML query languages, KIF, XMI), and other related standards (measurement units, naming standards) for addressing these problems. Specifically, we consider issues of expressiveness vs. computational tractability, the ability to reference and query schema fragments, support for measurement units and dimensionality, specification of schema mappings, metadata support for aggregation (summarizability), value encodings and translation, etc. We also describe some efforts currently underway to implement these standards using XML.

Frank Olken and John L. McCarthy

Computer Scientists


Lawrence Berkeley National Laboratory

Shared metadata registries (or repositories) are necessary to specify the shared semantics that large-scale XML Internet applications depend on. We discuss the adequacy and implementation status of various standards efforts to address these needs, and the need to tie data element definitions to external specifications (legal codes and decisions, accounting standards) that are not yet on the World Wide Web.

Persons concerned with large-scale Internet-based XML applications such as electronic commerce, medical records, and environmental data systems.

