[This local archive copy is from the official and canonical URL, http://www.lbl.gov/~olken/mendel/w3c/papers/xtech99/abstract.html; please refer to the canonical source document if possible.]


Metadata Registries: Averting a Tower of XML Babel

Authors

Submitted to: XTECH'99
March 7-11, 1999

Abstract

XML DTDs/schemas facilitate the development of community-specific XML dialects (MathML, ChemML, ...). However, the ease of DTD/schema development raises the specter of a tower of XML Babel. Shared metadata registries (a.k.a. repositories) are essential for developing common XML dialects and deploying applications (such as e-commerce) across heterogeneous user communities. Such shared registries are essential to interoperability at both the syntactic and semantic levels.

Current and proposed practice in the XML community (i.e., DTDs and XML Schema proposals) does not address the administration, maintenance, integration, and standardization of data element definitions, nomenclature, value encodings, etc. Current mechanisms for metadata (e.g., database schemas) traditionally use terse definitions of terms (data elements) written by database administrators, with few references to external sources. These mechanisms, as currently used, are inadequate for applications such as electronic commerce and systems integration. Data element definitions need to be specified by subject area specialists (lawyers, accountants, etc.) and accompanied by detailed references to external documents (e.g., legal decisions and codes, accounting standards, etc.), which may not be available on the World Wide Web.
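
As an illustration only, a registry entry of the kind argued for above might be sketched as follows. The class names, field names, and the rule that every definition must carry at least one citation are assumptions made for this sketch; they are not taken from ISO 11179, ANSI X3.285, or any other standard, and the example entry cites a placeholder document rather than a real one.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Citation:
    """Pointer to an external authority, which may exist only in print."""
    source: str                 # title of the legal code, standard, etc.
    locator: str                # section or paragraph within the source
    url: Optional[str] = None   # None when the source is not on the Web


@dataclass
class DataElement:
    """One registered data element definition (hypothetical layout)."""
    identifier: str
    name: str
    definition: str
    steward: str                # subject area specialist responsible
    citations: List[Citation] = field(default_factory=list)


class Registry:
    """In-memory stand-in for a shared metadata registry."""

    def __init__(self) -> None:
        self._elements: Dict[str, DataElement] = {}

    def register(self, element: DataElement) -> None:
        # Refuse duplicates so each identifier has exactly one definition.
        if element.identifier in self._elements:
            raise ValueError(f"duplicate identifier: {element.identifier}")
        # Refuse definitions with no external citation (assumed policy).
        if not element.citations:
            raise ValueError(f"no citation for: {element.identifier}")
        self._elements[element.identifier] = element

    def lookup(self, identifier: str) -> DataElement:
        return self._elements[identifier]


# Made-up entry; the cited document is a placeholder, not a real standard.
registry = Registry()
registry.register(DataElement(
    identifier="example:net-income",
    name="Net income",
    definition="Placeholder text that a subject area specialist would supply.",
    steward="accounting",
    citations=[Citation(source="Placeholder accounting standard",
                        locator="section 1")],
))
```

The citation carries an optional URL because, as noted above, the authoritative document may not be available on the Web at all; the source title and locator are then the only way to find it.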

We discuss the requirements of metadata registries and the adequacy of various existing and proposed registry standards (ISO 11179, ANSI X3.285), schema standards (RDF Schema, XML Schema, XML query languages, KIF, XMI), and other related standards (measurement units, naming standards) to address these problems. Specifically, we consider issues of expressiveness vs. computational tractability, the ability to reference/query schema fragments, support for measurement units and dimensionality, specification of schema mappings, metadata support for aggregation (summarizability), value encodings and translation, etc. We also describe some efforts currently underway to implement these standards using XML.
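
To make the point about measurement units and dimensionality concrete, here is a minimal sketch of how a registry might record units so that software can check whether two values are commensurable before converting between them. The `Unit` class, the three-exponent dimension vector, and the `convert` function are hypothetical names chosen for this example and do not come from any of the standards listed above.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Unit:
    """A measurement unit as a registry might record it (hypothetical)."""
    symbol: str
    dimension: Tuple[int, int, int]  # exponents of (length, mass, time)
    to_si: float                     # multiply by this to reach the SI base unit


METRE = Unit("m", (1, 0, 0), 1.0)
INCH = Unit("in", (1, 0, 0), 0.0254)   # 1 inch is exactly 0.0254 m
KILOGRAM = Unit("kg", (0, 1, 0), 1.0)


def convert(value: float, source: Unit, target: Unit) -> float:
    """Convert a value between units, refusing mismatched dimensions."""
    if source.dimension != target.dimension:
        raise ValueError(
            f"cannot convert {source.symbol} to {target.symbol}: "
            "dimensions differ"
        )
    return value * source.to_si / target.to_si


length_in_metres = convert(100.0, INCH, METRE)
```

Recording the dimension alongside the scale factor is what lets the conversion from inches to kilograms fail loudly instead of silently producing a number.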


Presentation Title:

Metadata Registries:
Averting a Tower of XML Babel

Authors:

Frank Olken and John L. McCarthy

Authors' Job Title:

Computer Scientists

Organization:

Lawrence Berkeley National Laboratory

Postal Address:

Frank Olken and John McCarthy
Lawrence Berkeley National Laboratory
Mailstop 50B-3238
1 Cyclotron Road
Berkeley, CA 9720

Email Address:

olken@lbl.gov, JLMcCarthy@lbl.gov

Telephone:

510-486-5891 (Olken), 510-486-5307 (McCarthy)

Fax:

510-486-4004

Two-sentence description for brochure:

Shared metadata registries (or repositories) are necessary to specify shared semantics to support large scale XML Internet applications. We discuss the adequacy and implementation status of various standards efforts to address these needs, and the need to tie data element definitions to external specifications (legal codes/decisions and accounting standards) that are not yet on the World Wide Web.

Target audience:

Persons concerned with large scale Internet-based XML applications such as electronic commerce, medical records, and environmental data systems.

Biographical information:

Frank Olken holds a Ph.D. in Computer Science from UC Berkeley and has worked on statistical and scientific databases for 15 years. He is also interested in computational biology, bioinformatics, knowledge representation, distributed (data acquisition and control) systems, workflow, and electric power grid computations.

One of the first persons to write about "metadata," John McCarthy has worked on metadata issues and database design for over 25 years. He has participated in the development of databases and metadata standards for various kinds of scientific and administrative information, including multi-dimensional census tables, material properties, and genomic applications.

Maintained by Frank Olken at Lawrence Berkeley National Laboratory. Email: olken@lbl.gov Last updated: January 8, 1999