Open Forum on Metadata Registries Draft Agenda

This is a draft agenda. All topics, speakers, times, and duration of presentations are "proposed". The speaker arrangements are being made. The agenda may continue to change as the organizers complete their work.

Dress for the conference is business casual.

The schedule for each day will be:
8:30 AM - Noon, Sessions
Noon - 2 PM, Lunch
2:00 - 6:00 PM, Sessions
There will be 20-minute breaks at approximately 10:00 AM and 3:50 PM.


Monday, January 17, 2000

Monday, Jan. 17, 2000 Tuesday, Jan.18, 2000 Wednesday, Jan. 19, 2000 Thursday, Jan. 20, 2000 Friday, Jan. 21, 2000

Track A | Track B

Sessions Sessions Room 1 | Room 2
Room 3 | Room 4
Room 1 | Room 2
Room 3 | Room 4

TWO CONCURRENT TRACKS—Tutorials and Presentations:
TRACK A: The International Organization for Standardization (ISO)/International Electrotechnical Commission (IEC) 11179 Family of Standards and
TRACK B: Tutorials on Topics Relating to Metadata Registries—Terminologies, Thesauri, Ontology, Syntax, and Extensible Markup Language (XML).

This day has two tracks of concurrent sessions. A series of tutorials will describe each part of the ISO/IEC 11179 family of standards. Another series of tutorials will introduce the fundamentals of related standards that may help to extend the semantic management capabilities of ISO/IEC 11179 metadata registries. The presentations are intended to describe the practical use of 11179 standards and to introduce concepts and terms that will be used throughout the Open Forum.

CONCURRENT TRACKS
TUTORIAL TRACK A: Tutorials on the ISO/IEC 11179 Family of Standards

What are the major concepts in ISO/IEC 11179? What terms are used to express the concepts? How can I use 11179? As different disciplines come together, we find that we use the same information technology terms to mean entirely different things, while labeling identical concepts with different terms. This tutorial day will describe the fundamental concepts in the ISO/IEC 11179 family of standards and tune your ear to the terms used in the standard. Standards developers and practitioners will describe how each part of the standard is used in practice. This day also lays out the basic discipline needed for a data standardization/data administration program in any organization or industry.

8:30 - 8:40 AM

Welcome & Introductory Remarks
Work Group (WG) 2 Convener

8:40 - 9:30 AM

Overview of ISO/IEC 11179 Family of Standards and Tutorial on ISO/IEC 11179, Part 1: Framework for the Specification and Standardization of Data Elements
Daniel W. Gillman, Mathematical Statistician
U.S. Department of Commerce, Bureau of the Census

What are the primary concepts in the ISO/IEC 11179 family of standards? How do the parts of 11179 fit together? This presentation gives an overview of the standard and a framework for the major concepts. This gives the "big picture" of the multi-part standard.

9:30 - 10:00 AM

ISO/IEC 11179-4: Rules and Guidelines for the Formulation of Data Definitions
Tommie Curtis
Science Applications International Corporation (SAIC), Systems Development Center (SDC)

What does the data mean? Only the programmer knows. Or maybe the computer analyst knows. Sometimes if you read the input manual, you can guess what the data might mean. Those were the bad old days. This presentation describes the Part 4 rules and guidelines, which help people to write definitions for data. Definitions are essential for managing the semantic content of data. Definitions help secondary users to understand what the data means. Definitions help primary users to be clear about what data they want to capture. Definitions are a primary means for distinguishing between different data and for establishing consistency between data trading partners. Part 4 is useful wherever data semantics should be understood: in data standards, case tools, data dictionaries, repositories, warehouses, models, message designs…. This Part is sometimes used, even when no other part of 11179 is needed.

10:00 - 10:20 AM
Break

10:20 - Noon

ISO/IEC 11179-3: Basic Attributes of Data Elements (also covers part 2 as classified components)
Joe Christensen, Head, National Information Development Unit
Australian Institute of Health and Welfare

Metadata, which describes data, is used for many purposes. What are the minimum essential metadata attributes to collect? How are the attributes related? What does an expanded set of attributes look like? How can I classify my data? Part 3 specifies a minimum set of attributes and also provides a model of a broad number of attributes that can serve many purposes--system design, documentation for data users, data administration--to name a few. Rather than just dumping all of this information into a large text document, each piece of metadata has a place. This facilitates metadata management, makes it possible to develop useful human interfaces, and makes it possible for a wide range of software to access and utilize the specific metadata attributes as needed. It is also possible to draw profiles of the Part 3 metamodel to build registries for particular purposes. This presentation describes the basic attributes and the full Part 3 metamodel.

Noon - 2 PM
Lunch

2:00 - 2:30 PM

ISO/IEC 11179-5: Naming and Identification Principles for Data Elements
Judy Griffin, Project Manager
HAZMED

How do I unambiguously identify data? How do I name it? Definitions and the potentially valid values capture the essential semantics of data. A non-intelligent identifier is assigned to unambiguously identify data. To make it easy to talk about it, humans like to name data elements. But organizations, disciplines, program offices, researchers and various data management systems and programming languages all like different names. Here is how to identify data and prepare naming conventions so that everyone gets what they need.
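
As a rough illustration of the separation described above, the sketch below keeps one non-intelligent identifier per data element and attaches context-specific names to it. The class, the field names, the identifier value, and the naming contexts are all hypothetical; none of them is taken from ISO/IEC 11179-5.

```python
from dataclasses import dataclass, field


@dataclass
class DataElement:
    # Hypothetical structure: one opaque identifier, many context-specific names.
    identifier: str
    definition: str
    names: dict = field(default_factory=dict)  # naming context -> name

    def name_in(self, context: str) -> str:
        # Fall back to the identifier when a context has no registered name.
        return self.names.get(context, self.identifier)


element = DataElement(
    identifier="DE-000123",  # assumed value; carries no meaning by itself
    definition="The name of the state in which a facility is located.",
    names={
        "business": "Facility State Name",
        "database": "FAC_STATE_NM",
    },
)
```

Each community looks up the name it prefers with `element.name_in(...)`, while the identifier stays the same for everyone.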

2:30 - 3:20 PM

ISO/IEC 11179-6: Levels of Compliance for ISO/IEC 11179 Registries
Phong Ngo, Assistant Vice President
Science Applications International Corporation (SAIC)

There is no mother of all metadata registries. Here is how to establish your own Registration Authority and metadata registry. Here is how to combine multiple metadata registries into a higher-level metadata registry. Fit your registry to your own purpose. Small metadata registries can be sent with data products. Large metadata registries can standardize data for whole industries. Organizational metadata registries can describe the data holdings of an organization and encourage consistency within the organization. Levels of compliance are described. These enable metadata to be broadly accessed and shared. This presentation also describes quality control in metadata registries--a series of status levels that indicate the quality of the metadata.

3:20 - 3:50 PM

Capturing Business Rules in a Metadata Registry
Anju Rathi, Senior Information Specialist
Electronic Data Systems (EDS)

A user finds a "standardized" data element--so what? There are many kinds of business rules that might be associated with data, particularly standardized data. Are all systems required to implement this data element? By when? This presentation explores ways to capture selected types of business rules in a metadata registry.

3:50 - 4:10 PM
Break

4:10 - 4:50 PM

ISO/IEC TR 15452 Data Value Domains
Judith Newton, Computer Specialist
National Institute of Standards and Technology (NIST)

How can you record and manage the potentially valid values for a data element? How can you map between different representations (California = CA = 06)? ISO/IEC Technical Report 15452 Information Technology - Specification of data value domains addresses practical issues encountered in documenting and sharing data value domains. It describes a standardized process of definition and application of sets of possible valid values for data elements to assist in the sharing and reuse of information across national and international organizations. This document complements and extends the attributes of data elements specified in ISO 11179.
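
The mapping question above (California = CA = 06) can be pictured with a small sketch. It assumes a value domain stored as one record per value meaning, with the permissible value in each representation. The dictionary layout, the representation labels, and the function name are illustrative assumptions, not structures defined in ISO/IEC TR 15452.

```python
# Hypothetical value domain: each value meaning maps to its permissible
# value in several representations (name, postal abbreviation, numeric code).
STATE_VALUE_DOMAIN = {
    "California": {"name": "California", "postal": "CA", "numeric": "06"},
}


def convert(value: str, source: str, target: str) -> str:
    """Map a permissible value from one representation to another."""
    for representations in STATE_VALUE_DOMAIN.values():
        if representations[source] == value:
            return representations[target]
    raise KeyError(f"{value!r} is not a permissible value in {source!r}")
```

For example, `convert("CA", "postal", "numeric")` looks up the record whose postal representation is CA and returns its numeric representation.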

4:50 - 6:00 PM

Content Consistency - How to Populate an ISO/IEC 11179 Metadata Registry
Larry Fitzwater, Registrar, Environmental Data Registry (EDR)
U.S. Environmental Protection Agency
Judith Newton, Computer Specialist
National Institute of Standards and Technology (NIST)
Lois Fritts, Standards Analyst
Science Applications International Corporation (SAIC), Systems Development Center (SDC)

Suppose you build a metadata registry, acquire a freeware version, or at some point in the future acquire a commercial registry package. How do you register metadata of interest? ISO/IEC 11179 parts 2 through 6 go into detail about how to fill in particular attributes. Other attributes are not covered so extensively. A technical report is being written to describe a process for entering metadata. In addition to general explanations, it provides "cookbook" examples of how to register types of data elements that are frequently found. A major concern of the report is to help users enter metadata in such a way that it can easily be shared between metadata registries. This presentation describes a work in progress.


Monday, January 17, 2000

CONCURRENT SESSIONS
TUTORIAL TRACK B: Tutorials on Topics Relating to Metadata Registries—Terminologies, Thesauri, Ontologies, Syntaxes, and  XML.

Semantics management--what is the source of this river? Whether we are concerned with documents on the web, databases, or electronic data interchange--virtually any text or data--how can we register the meanings? How can we specify concepts? How can we record the terms associated with concepts? And how can we organize the concepts into structures (thesauri, taxonomies, ontologies) that convey additional meaning beyond definitions? Managing the semantics of data involves keeping track of definitions for data elements, definitions for data element concepts, and value meanings for the permissible values. The permissible values may be structured, for example into a taxonomy of biological critters. The names associated with data elements may contain meaning. Work is underway to extend 11179 to better manage the semantics that underlie data elements. Presentations look at standards and techniques that help with the fundamentals of semantic management for metadata registries. Other presentations show techniques being used for particular subject areas.

Extensible Markup Language (XML) is the latest Web wave to surf, extending the powers of HyperText Markup Language (HTML). Metadata registries can use the powers of XML to facilitate access to metadata registries, to transport metadata between registries, to transport metadata to human users, and to display the metadata on the screen in attractive ways. There is collaboration between SC 32 and the World Wide Web Consortium (W3C) to specify what is needed for metadata registries. The work is taking place in the W3C XML-Schema Work Group and the XML-Query Work Group. A presentation reports on expected XML capabilities and how to use them.

8:30 - 8:40 AM

Welcome & Introductory Remarks
Bruce Bargmeyer, Computer Scientist
U.S. Environmental Protection Agency
Chair, SC 32

8:40 - 9:30 AM

Overview of ANSI 12620 and ISO 1087 as they relate to ISO 11179
Sue Ellen Wright, Chair, American National Standards Institute (ANSI) 12620
Kent State University

What insights can we gain from ISO 1087 - Terminology Vocabulary and from ISO 12620 - Terminology - Computer Applications - Data Categories? This presentation covers the fundamentals of these two standards and how they can be useful for registering the semantics of data and text.

9:30 - 10:00 AM

Crosswalk between ANSI 12620, ISO 1087, and Part 3 of ISO/IEC 11179
Douglas Mann, Senior Research Scientist
Battelle Memorial Institute

There are many similar notions in the ISO standards for terminology, data category terminology, and the basic attributes of data elements. This session will present these notions in the context of their standards and show how they relate.

10:00 - 10:20 AM
Break

10:20 - 11:30 AM

Terminological analysis of information models according to the CEN/TC251/WG2 approach
Angelo Rossi Mori
Consiglio Nazionale delle Ricerche, Italy

URL: http://zeus.eulogos.it/itb/

11:30 - Noon

MetaModel for Terminology Reference System (TRS)
Shawn Jones, Technical Team Leader, Software Development
Indus Corporation
http://www.epa.gov/trs

Working with SC 32/WG 2, the U.S. Environmental Protection Agency has developed a testbed for capturing terminology and thesaurus constructs for use in metadata registries. It is being tested with terms for the environment including terms used in various state environmental agencies, terms found in environmental thesauri, and terms in the General Multilingual Environmental Thesaurus (GEMET). This presentation describes the metamodel used in the testbed implementation. It is hoped that this provides a starting point for extending the 11179 metamodel to better address fundamental aspects of semantics management.

Noon - 2:00 PM
Lunch

2:00 - 3:10 PM

ANSI Ad Hoc Committee on Ontologies, Overview of proposed Ontology Standards as They Relate to ISO 11179
Adam Farquhar
Schlumberger

What is an ontology? How can one be used? How can metadata registries facilitate the development and use of ontologies? This presentation describes the fundamentals of ontologies and describes work of the ANSI Ad Hoc Committee on Ontologies and a worldwide group of ontology experts. There is talk of incorporating some or much of this work into SC 32 Working Group 2. This presentation helps to find the common ground between ontologies and metadata registries in pursuit of semantic management.

3:10 - 3:50 PM

11179 for United Nations/Electronic Data Interchange for Administration, Commerce, and Transport (UN/EDIFACT)
TBA

This presentation describes how 11179 is used in the development of EDIFACT messages. It looks at how terms may be used in analysis and naming of data elements.

3:50 - 4:10 PM
Break

4:10 - 5:10 PM

XML Schema Language & Metadata Registries
Frank Olken, Staff Scientist
Lawrence Berkeley National Laboratory
John McCarthy, Computer Scientist
Lawrence Berkeley National Laboratory

The XML Schema Language is intended to replace XML DTDs (Document Type Definitions) for the specification of the structure of XML documents (both conventional and data exchange documents generated from databases). We anticipate the release of a proposed recommendation for the XML Schema Language for balloting by members of the World Wide Web Consortium shortly before the Open Forum on Metadata Registries. In this tutorial we will discuss the major features of the proposed XML Schema Language, its anticipated uses, and its relevance to metadata registries. Specifically, we will discuss: the basic types supported, the use of facets for type specifications, derived types, type constructors for composite types, inheritance, and the use of XML namespaces in XML schemas. We will also briefly describe the Object Management Group's XMI (XML Metadata Interchange Format) standard, which may be used to exchange metadata to and from ISO 11179 Data Registries.

The audience is presumed to have familiarity with data administration, HTML, and some database schema language (e.g., SQL DDL). No familiarity with XMI or XML Schema Language is assumed.

The instructors, Frank Olken and John McCarthy, are both members of the XML Schema Working Group of the World Wide Web Consortium. This work is supported by the U.S. EPA.
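
To give a feel for the facets and derived types mentioned in the abstract, here is a minimal sketch in Python. It reads a small schema fragment in which a derived type restricts a basic string type with a pattern facet, then applies that facet to sample values. The fragment uses the syntax and namespace of the W3C XML Schema language as later published, which may differ from the draft discussed in the tutorial. The type name `StateCode` and the helper `is_valid` are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

NS = {"xs": "http://www.w3.org/2001/XMLSchema"}

# Hypothetical schema fragment: a derived type that restricts xs:string
# with a pattern facet (exactly two upper-case letters).
SCHEMA = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:simpleType name="StateCode">
    <xs:restriction base="xs:string">
      <xs:pattern value="[A-Z]{2}"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>"""

root = ET.fromstring(SCHEMA)
restriction = root.find("xs:simpleType/xs:restriction", NS)
base_type = restriction.get("base")                      # the type being derived from
pattern = restriction.find("xs:pattern", NS).get("value")  # the facet value


def is_valid(value: str) -> bool:
    # Apply the pattern facet to the whole value, as a validator would.
    return re.fullmatch(pattern, value) is not None
```

A registry could store the derived type's name, base type, and facets as metadata and hand them to software that checks incoming values. This sketch only reads the facet back; it is not a schema validator.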

5:10 - 6:00 PM

NASA Data Entity Dictionary Specification Language
Lou Reich
Computer Science Corporation

The NASA Data Entity Dictionary Specification Language (DEDSL) can be viewed as a subset of 11179-3. It addresses both the abstract level (as does 11179) and a concrete syntax level (in two forms: parameter value language and XML). This presentation describes the current draft version, its content, and its relationship to 11179-3.


DAY 2

Tuesday, January 18, 2000

IMPLEMENTATIONS OF ISO/IEC 11179-BASED METADATA REGISTRIES

8:30 - 8:50 AM

Implementing ISO/IEC 11179: Bits and Pieces or the Whole Enchilada
Bruce Bargmeyer, Computer Scientist
U.S. Environmental Protection Agency
Chair, SC 32

ISO/IEC 11179 can be used in part or in whole. Many people only use Part 4 for data definitions, or Part 5 for identifying and naming data. Most organizations will not implement the entire 11179 metamodel. Many will draw profiles of the 11179 attributes for particular metadata trading communities. They may extend the attributes as needed. Presentations on this day describe planned and operational metadata registry implementations, as well as other uses of ISO/IEC 11179.

8:50 - 9:50 AM

Metadata Registries for the Environment: Environmental Data Registry
Marian Cody, Team Leader for the Information and Data Management Team
U.S. Environmental Protection Agency
Kathleen Gundry, Technical Project Lead
Science Applications International Corporation (SAIC), Systems Development Center (SDC)
Shawn Jones, Technical Team Leader, Software Development
Indus Corporation

URL: www.epa.gov/edr

The EPA Administrator established the Environmental Data Registry (EDR) as the Agency resource for describing and standardizing environmental data. The EDR supports several strategic goals of EPA, including One Stop reporting, the Reinvention of Environmental Information, and the Public Right to Know. It is used for describing environmental data found inside and outside of EPA. It is used by state environmental clean-up program offices to facilitate sharing data among themselves--data that is held only by states and not reported to EPA. The EDR is used to record the results of discussions that rage between program offices about data content and design. It is populated with metadata describing a wide spectrum of environmental data including data in environmental information systems, environmental EDI messages, an environmental data warehouse, environmental regulations, etc. Well-formed data is registered for voluntary use. Mandatory data standards are registered for agencywide implementation. The EDR is accessible from the World Wide Web and each month serves up hundreds of thousands of pages. Users download metadata for data elements and groups of data elements. Users also download the entire registry contents.

9:50 - 10:10 AM
Break

10:10 - 11:30 AM

United States Health Information Knowledgebase
Joe Christensen, Head, National Information Development Unit, Australian Health Information Knowledge Base
Australian Institute of Health and Welfare

Glenn M. Sperle, Computer Specialist
Health Care Financing Administration (HCFA), Office of Clinical Standards and Quality, Information Systems Group
http://hmrha.hirs.osd.mil/registry/

The United States Health Information Knowledgebase (USHIK) is a joint project of the Department of Defense, Health Affairs and the Department of Health and Human Services, Health Care Financing Administration (HCFA). The goal of the USHIK project is to build, populate, demonstrate, and make available for general use an ISO/IEC 11179-based data registry to assist in cataloging and harmonizing data elements across multiple organizations. Various methods of viewing and searching for data descriptions will be discussed or demonstrated.

Gregg Seppala, Data Administrator
Department of Veteran Affairs

11:30 - Noon

Developing Partnerships: The 11179 Metadata Registry Implementers Coalition
Captain Robert W. Mayes, R.N., Director, Information Systems Group
Health Care Financing Administration (HCFA)


This presentation describes the 11179 Metadata Registry Implementers Coalition. This is a forum for information exchange on the implementation of metadata registries based on ISO/IEC 11179. Coalition members are interested in developing implementations, influencing commercial vendors to support ISO/IEC 11179 in their tools, developing methods to support metadata exchange between metadata registries, sharing information and lessons learned on implementation approaches, being an advocate and clearinghouse for metadata registry issues, and developing partnerships to support data management across organizations.

Noon - 2:00 PM
Lunch

2:00 - 2:45 PM

Metadata Registries for Aeronautics and Space
John Garrett, Senior Analyst
Raytheon-STX, National Aeronautics and Space Administration (NASA) Goddard Space Flight Center

The NASA Control Authority data description registration, while broader than just data elements, includes the registration of data elements in data dictionaries. The presentation will describe this system, and its relation to 11179.

2:45 - 3:30 PM

Metadata Registries for Intelligent Transportation Systems (ITS)
Thomas M. Kurihara, Program Manager, ITS, Standards Activities
Institute of Electrical and Electronics Engineers (IEEE) Intelligent Transportation System (ITS)

Summary of the status and content of the ITS Data Registry and IEEE P1489, draft Metadata for ITS Data Registry.

URL: http://grouper.ieee.org/groups/scc32/index.html

3:30 - 3:50 PM
Break

3:50 - 4:30 PM

Metadata Registries for Electronic Commerce
Terry Allen, Information Architect
Commerce One

Organization for the Advancement of Structured Information Standards (OASIS) and the XML.org DTD and schema repository.

4:30 - 5:15 PM

Metadata Registries for Census and Demographics
TBA

Description of the content, design, population, query, maintenance, and implementation of a statistical metadata registry and the tools to use it.

5:15 - 6:00 PM

Metadata Registries – Come and Get it: Freeware Versions
11179 - Lite
11179 - Pro
Shawn Jones, Technical Team Leader, Software Development
Indus Corporation

Freeware versions of 11179 metadata registries are being developed for distribution. The freeware will include metadata entry and query capabilities and will come pre-loaded with some metadata. The freeware metadata registries may be used to initiate registry use within an organization, to distribute metadata along with data products, etc. This presentation tells what will be available and when.

Other freeware versions?
TBD

6:00 PM
End of Sessions

6:30 - 8:30 PM
Reception and No Host Bar


DAY 3

Wednesday, January 19, 2000

CONCEPT MANAGEMENT WITHIN METADATA REGISTRIES

The capability of the Internet to access, interchange, and display information is creating an enormous demand to understand the semantics, the meaning, of the text and data. Semantics management, in turn, opens up new possibilities for access, interchange and presentation. Wednesday's sessions address the management of concepts, terms, and terminology structures. Within the context of metadata registries, this level of semantic management can extend the benefits of the metamodel described in 11179, Part 3. Efforts to use meaning to group and identify data content will be described, especially methodologies addressing standardization and systematization. Concept collection techniques, interoperability, mapping scenarios and structural guidelines will also be discussed.

8:30 - 8:50 AM

Semantics Management - Terminology Extensions for 11179
Bruce Bargmeyer, Computer Scientist
U.S. Environmental Protection Agency
Chair, SC 32

This presentation discusses possible interrelationships between metadata registries, terminology, and various concept structures, such as controlled vocabularies, taxonomies, thesauri, data elements, and ontologies. Deployment technologies are suggested for each type of concept structure. The concept structures and deployment technologies are discussed and demonstrated during other presentations in this Open Forum.

8:50 - 9:30 AM

General Multilingual Environmental Thesaurus (GEMET)
Paolo Meozzi
European Environment Agency
Sigfus Bjarnason
European Environment Agency
Bruno Felluga
Consiglio Nazionale delle Ricerche (CNR), Istituto Tecnologie Biomediche (ITBM), Italy

URL: http://www.itbm.rm.cnr.it/HTML/rrdafrme.htm

Wolf-Dieter Batschi
Umweltbundesamt Berlin

GEMET was started in Europe by drawing together national thesauri to form a common core of environmental concepts arranged into thesaurus and theme structures (topic trees). Each concept is given linguistic expression in 12 languages. The US has joined this effort, adding American English, and GEMET is also being extended to include Chinese, Arabic and other languages. This presentation describes the content and structure of GEMET, its uses and the development/management strategy.

9:30 - 10:15 AM

U.S. Environmental Protection Agency Terminology Reference System (TRS): A Repository of Environmental Concepts
Linda Spencer, Computer Scientist
U.S. Environmental Protection Agency
Shawn Jones, Technical Team Leader, Software Development
Indus Corporation
Stuart Gagnon, Internet Librarian for Terminology Projects
Garcia Consultants Incorporated (GCI)

URL: http://www.epa.gov/trs

To increase the retrievability of environmental information available only as data, the EPA has turned to managing semantic context and the conceptual definitions associated with terminologies as they relate to data elements in metadata registries. To facilitate the global interchange of environmental data elements, there must be a mechanism in place to enable the mapping between different languages and different terminology systems within languages. The Terminology Reference System serves as a registry for semantic concepts and their definitions. TRS functionality will include multilinguality and structural (hierarchical, including polyhierarchical) and relational (thematic, source-critical and alphabetical) arrangements. Associating, or indexing, TRS terminology to data elements within the Environmental Data Registry can help increase expectations for finding, understanding and sharing environmental metadata.
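
The polyhierarchical arrangement mentioned above can be pictured with a short sketch in which a concept may have more than one broader concept. The concepts, the `BROADER` table, and the `ancestors` function are invented for illustration; they are not drawn from the TRS, GEMET, or any other thesaurus.

```python
# Hypothetical concepts; a polyhierarchy lets one concept have several
# broader concepts.
BROADER = {
    "acid rain": ["air pollution", "precipitation"],
    "air pollution": ["pollution"],
    "precipitation": ["weather"],
    "pollution": [],
    "weather": [],
}


def ancestors(concept: str) -> set:
    """Collect every broader concept reachable from the given concept."""
    found = set()
    stack = list(BROADER.get(concept, []))
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(BROADER.get(current, []))
    return found
```

Indexing a data element to a concept in such a structure makes it findable under every broader concept returned by `ancestors`, not just under a single parent.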

10:15 - 10:35 AM
Break

10:35 - 11:20 AM

Multilingual Thesauri
Michele Hudon, M.Bibl., Ph.D.
Ecole de bibliotheconomie et des sciences de l’information, University of Montreal

Methodologies for accommodating the diversity of languages, naming schemes, discipline specific terminologies, and multiple thesauri within the data registry environment.

11:20 - Noon

Federated Thesauri
Ralf Nikolai
Forschungszentrum Informatik (FZI)

URL: http://computer.org/conferen/proceed/meta/199/papers/49/nikolai.html

Discussion of the loose coupling of autonomous thesauri to improve semantic interoperability between metadata registries. Analysis of missing information richness in many classical thesauri and suggested possibilities to overcome problems. References to thesaurus management and thesaurus browser functions for catalogue systems.

Noon - 2:00 PM
Lunch

2:00 - 2:50 PM

Making Standards Work in E-Commerce and Among Jurisdictions
IT-Enablement of Data Element-Based Standards
Jake Knoppers
Canaglobe International Inc.

Electronic commerce (or e-business), like commerce and business in general, consists of rule-based business transactions that make extensive use of code sets, often through tables, representing agreed upon and predefined possible choices of common aspects of such business transactions among autonomous organizations. These code sets represent a peculiar category of data-element based standards. Commonly known examples include code sets for countries, currencies, languages, commodities, commercial terms, payment methods, airports, dangerous goods, hazardous materials, etc. Information technology (IT) enablement is the term used to recognize the need for a standard to serve as a tool to transform such standards and currently accepted business conventions from their current manual perspective to a computerized one. At the same time, it is important to incorporate localization and multilingual requirements from both a jurisdictional and human-interface perspective. Two new standardization work items to provide such tools, sponsored by JTC1/SC32, are being balloted by ISO/IEC JTC1 (see the documents N5846 and N5847 found in the JTC1 document register at <http://www.jtc1.org>). This session will summarize the need for and explain "IT-enablement", outline a pragmatic approach to meeting the jurisdictional and cultural adaptability challenges, present some practical examples and report on progress of standardization work on these two new work items.

2:50 - 3:40 PM

Thesauri and Metadata Schemata -- Vocabularies, Tools, Metrics
Speaker: TBA

Integrating existing thesauri, taxonomies, authority lists, and metadata schemata with a registry that will make the necessary translations and connections among these various elements.

3:40 - 4:00 PM
Break

4:00 - 5:00 PM

Where Does Terminology End and Metadata Begin?
Harold Solbrig, Technical Specialist, Senior Product Development Specialist
Mayo Foundation

A common conceptual model is necessary if computerized information is to be unambiguously and "meaningfully" exchanged. The discovery and definition of this model is often not a trivial task. There is, however, a shortcut available in many situations - the partitioning of the conceptual space of a trade or specialty is often named and defined by a specialized terminology. This talk discusses some of the information that can be contained within a specialized terminology and how that information can be integrated into a common metadata repository.

5:00 - 6:00 PM

Linking Standard Terminologies to the Health Level 7 (HL7) Reference Information Model (RIM)
Stanley M. Huff, M.D., Senior Medical Informaticist,
Intermountain Health Care
Professor (Clinical),
Department of Medical Informatics, University of Utah
Chair-Elect, HL7

The goal of the HL7 organization is to create greater interoperability between heterogeneous medical information systems. One strategy for achieving this goal is the use of a common data model, the HL7 RIM, to establish a common understanding of objects that are the subject of communication between systems. This session will describe the HL7 RIM, including a description of the kind of metadata that it contains, the process by which it is maintained, and how it relates to the creation of standardized messages for the exchange of medical information. An important aspect of the RIM is the ability to link coded data elements in the model to standardized medical terminologies that have been registered with the HL7 organization. Linking terminologies to the model leads to the ability to define unambiguous messages for the exchange of data between networked systems.


DAY 4

Thursday, January 20, 2000

Presentations Scheduled for Room 1
The Importance of Environmental Metadata

8:30 - 9:15 AM

Environmental Cleanup Data
Mike Cullen
U.S. Environmental Protection Agency

Review the five major areas of environmental cleanup information – location; contaminants of concern; remedies; project schedule/progress; and site characterization.

9:15 - 10:00 AM

Using the Environmental Data Registry to Support Cleanup
Larry Fitzwater, Registrar, Environmental Data Registry
U.S. Environmental Protection Agency
Kathleen Gundry, Technical Project Lead
Science Applications International Corporation (SAIC), Systems Development Center (SDC)

Overview of the environmental track, review of the importance of environmental metadata, and the design concepts and beginnings of the Environmental Data Registry.

10:00 - 10:20 AM
Break

10:20 AM - Noon

Data Sharing Strategies for U.S. Environmental Protection Agency and States
Kent Gray, Director,
Division of Environmental Response and Remediation,
Department of Environmental Quality
State of Utah

Discuss Association of State and Tribal Solid Waste Management Officials/U.S. EPA data sharing strategies.

Noon - 2:00 PM
Lunch

2:00 - 2:55 PM

Data Standards and Associated Systems: Biological Taxonomy, Integrated Taxonomic Information System (ITIS) and Chemical Identification, Chemical Registry System (CRS)
Roy McDiarmid, Zoologist and Curator of Amphibians and Reptiles
Smithsonian Institution
http://research.calacademy.org/taf/proceedings/itis.html
Marian Cody, Team Leader for the Information and Data Management Team
U.S. Environmental Protection Agency

Discuss REI data standards for biological taxonomy and chemical identification, and demonstrate the Integrated Taxonomic Information System and Chemical Registry System.

2:55 - 3:50 PM

Collaboration on Environmental Data Between Government Agencies
TBA
Department of Defense

Discuss opportunities for coordination and collaboration on environmental metadata between government agencies.

3:50 - 4:10 PM
Break

4:10 - 5:00 PM

Environmental Data Exchange Network Demonstration
Jerry Fowler, Member Technical Staff
Microelectronics and Computer Technology Corporation (MCC)

Greg Pitts, Director, Environmental Programs
Microelectronics and Computer Technology Corporation (MCC)

Demonstrate the Intelligent Information Services used to access data held by three US Federal agencies and the European Union Environment Protection Agency. The presentation will demonstrate the use of query agents, brokers, mediators, resource agents, mapping facilities, and an ontology.

5:00 - 6:00 PM

Using the Web and XML for Environmental Data and Metadata
Frank Olken
Lawrence Berkeley National Laboratory
John McCarthy
Lawrence Berkeley National Laboratory

XML (eXtensible Markup Language) is the new standard successor language for both SGML (the 20-year-old ISO Standard Generalized Markup Language used by high-end publishers) and HTML (HyperText Markup Language), a dialect of SGML used for web documents. The XML family of standards developed by the World Wide Web Consortium (W3C) is now being adopted by most major software vendors (including Microsoft, Netscape, Oracle, IBM, Arbortext, Adobe, etc.). Lawrence Berkeley National Lab will demonstrate how XML and XSL (Extensible Stylesheet Language) can be used to facilitate retrieval, display, and exchange of environmental data and metadata. Three scenarios will be demonstrated using data from EPA's Envirofacts CERCLIS database and corresponding metadata from the Environmental Data Registry:

1. Select a specified report or set of data items from a particular database such as CERCLIS in Envirofacts, and then automatically retrieve metadata about the selected data items from the EDR and use XML to "package" the user-selected metadata components from EDR.

2. Show how XML Stylesheets can be used to format metadata thus retrieved from the EDR in different ways, including tables and name=value text.

3. Given a user-specified set of data from CERCLIS in Envirofacts and corresponding metadata from the EDR, automatically create a structured XML document containing that data and metadata, delimited by XML tags that specify each data and metadata component. Such dynamically created XML documents can be used to exchange data between different agencies, organizations, and databases.
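The scenarios above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Lawrence Berkeley demonstration itself: the element names (exchange, metadata, dataElement, data, record, value) and the sample values are hypothetical and do not come from CERCLIS or the EDR, and plain Python stands in for the XSL stylesheet of scenario 2.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample values; not real CERCLIS or EDR content.
data_rows = [{"SITE_NAME": "Example Site", "SITE_STATE": "UT"}]
metadata = {
    "SITE_NAME": {"definition": "Name of the cleanup site",
                  "datatype": "character"},
    "SITE_STATE": {"definition": "State in which the site is located",
                   "datatype": "character"},
}


def build_document(rows, meta):
    """Scenario 3: wrap data and metadata in explicit XML tags."""
    root = ET.Element("exchange")
    meta_el = ET.SubElement(root, "metadata")
    for name, attrs in meta.items():
        elem = ET.SubElement(meta_el, "dataElement", name=name)
        for key, value in attrs.items():
            ET.SubElement(elem, key).text = value
    data_el = ET.SubElement(root, "data")
    for row in rows:
        rec = ET.SubElement(data_el, "record")
        for name, value in row.items():
            ET.SubElement(rec, "value", name=name).text = value
    return root


def as_name_value(root):
    """Scenario 2 stand-in: render the metadata as name=value text."""
    lines = []
    for elem in root.find("metadata"):
        for child in elem:
            lines.append(f"{elem.get('name')}.{child.tag}={child.text}")
    return "\n".join(lines)


doc = build_document(data_rows, metadata)
xml_text = ET.tostring(doc, encoding="unicode")
print(xml_text)
print(as_name_value(doc))
```

Because the data and metadata travel in one tagged document, a receiving system can parse either part without knowing the layout of the originating database.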


Thursday, January 20, 2000

Presentations Scheduled for Room 2

8:30 - 10:00 AM

Health Care Metadata Management Across Organizations
Government Computerized Patient Record (G-CPR)

Dave Riley

The Department of Defense, the Department of Veterans Affairs, and the Indian Health Service are collaborating to develop the framework for a standards-based, comprehensive exchange of medical record information across enterprises. This session presents lessons learned from the G-CPR Reference Terminology Modeling Work Group.

10:20 - 11:05 AM

Health Care Terminology
ISO TC215 Healthcare Informatics WG3 Health Concept Representation

Christopher G. Chute, M.D., Dr.P.H.
Head, Medical Information Resources
ISO TC215 WG3 Vice Chair
Mayo Foundation

Emerging standards of metadata and content interface with the infrastructure of controlled terminologies and concept representation. The ISO WG on Health Concepts is defining a suite of meta-standards that provide consistency and structure to health concept systems, coordinating with the broader environment of metadata representations.

11:05 - 11:15 AM
Break

11:15 - Noon

Quality Indicators for Controlled Health Vocabularies
Peter Elkin, M.D., FACP, Consultant,
Mayo Clinic Rochester
Mayo Foundation

Over the last 10 years, medical informatics research on controlled health vocabularies has articulated a set of principles that stipulate criteria for a well-formed controlled health vocabulary. These criteria will be presented along with relevant literature supporting their adoption. Implementing a well-formed set of quality indicators will give terminology developers much-needed information to guide them in performing both internal and independent validation studies. These studies are essential to the proper evolution and maintenance of large-scale controlled health vocabularies.

Noon - 2 PM
Lunch

2:00 - 2:50 PM

Centers for Disease Control Efforts in Data Integration
Ron Fickner, Director,
Prevention Informatics Office
National Center for Human Immunodeficiency Virus (HIV), Sexually Transmitted Diseases (STD) & Tuberculosis (TB) Prevention/Centers for Disease Control

The Centers for Disease Control's (CDC) efforts to integrate its many national surveillance systems, and the central role of metadata in achieving that end.

2:50 - 3:50 PM

Issues to be Confronted to Enable the Implementation and Maintenance of a National Healthcare Industry-wide Data Registry
Peter Waegemann, Chair, ANSI Health Informatics Standards Board
Medical Records Institute

The concept of an industry-wide healthcare metadata registry providing for the collection, definition, and standardization of data elements, and the terms used to define them, from multiple participating independent health care standards organizations will be explored. Metadata registration roles, policies and procedures in a confederated environment will be discussed.

3:50 - 4:10 PM
Break

4:10 - 5:05 PM

Taking Care of Business: What is involved in setting up a metadata registry?
Captain Robert W. Mayes, R.N., Director, Information Systems Group
Health Care Financing Administration

An overview of the practical considerations involved in implementing a registry in the real world.

5:05 - 6:00 PM

Metadata Registration Roles, Policies and Procedures in a Confederated Environment
Genevieve Speier
Office of Clinical Standards and Quality
Health Care Financing Administration

The presentation will examine how the various registration roles described in 11179 might be distributed in a confederated registry environment where the overall registry administrator has no organizational connection to, or control over, the various submitting organizations.


Thursday, January 20, 2000

Presentations Scheduled for Room 3

Morning: Panel on Metadata Experiences and Plans for the Future

8:30 - 9:15 AM

The Census Bureau Corporate Metadata Repository
Sam Highsmith
U.S. Department of Commerce, Bureau of the Census

This presentation describes work underway at the Census Bureau to develop a Corporate Metadata Registry.

9:15 - 10:00 AM

Integrated Information Solutions
Mark Wallace
U.S. Department of Commerce, Bureau of the Census

The Integrated Information Solutions (IIS) Program articulates a vision and historic opportunity for the U.S. Census Bureau, the Department of Commerce, the larger federal statistical community, and the citizens and taxpayers of our Nation as we enter the 21st Century. The IIS Program will implement a modernized, customer-driven, cross-program, and cross-agency integrated data access and dissemination service capability at the Census Bureau. IIS will broaden information delivery, reduce data user burden, increase efficiencies, and reduce redundancies by providing standards, processes and tools in the administration of a corporate metadata repository; product conception, design, and development; and new disclosure techniques. IIS will serve as a model and catalyst for change in the federal statistical reporting community. Moreover, it will build critical capabilities in the Nation's emerging statistical and spatial data infrastructures that will support global, national, regional, local, and individual decision support systems. Steps are presently being taken to implement this new program at the U.S. Census Bureau.

10:00 - 10:20 AM
Break

10:20 - 11:05 AM

The Canadian Metadata Repository
Paul Johanis
Statistics Canada

11:05 - Noon

Metadata Approaches at Australian Bureau of Statistics (ABS)
Don Bartley
Australian Bureau of Statistics

Noon - 2 PM
Lunch

2:00 - 2:55 PM

Users and Metadata
Cathryn Dippo
U.S. Department of Labor, The Bureau of Labor Statistics

For several years, BLS has sponsored and conducted user-focused research related to the BLS, CPS, and FedStats websites. In the course of that research, numerous metadata-related issues have surfaced. One overarching issue is: what is the minimal set of metadata needed by a user? Obviously, this set varies by user expertise and the task at hand. But how? In this session, I will give a summary of our work to date, including the problems found and the results of some initial experiments on a particular task.

2:55 - 3:50 PM

A User's Perspective on Metadata
Ernie Boyko
Statistics Canada

3:50 - 4:10 PM
Break

4:10 - 5:00 PM

An Evaluation of Alternative Variable Naming Schemes
Cathryn Dippo
U.S. Department of Labor, The Bureau of Labor Statistics

Surveys often involve thousands of variables, and users face a daunting task in finding the ones that will meet their needs. The problem is more than one of search and retrieval; the cognitive issues, especially user comprehension, need to be addressed. Is there a naming convention that helps users with their cognitive task? An exploratory experiment has been conducted using alternative naming conventions, some of which take into account the standard in Part 5 of ISO/IEC 11179. In this session, the results of the experiment will be discussed and suggestions sought on next steps.

5:00 - 6:00 PM

Terminology for Statistics
Stephanie Haas, Associate Professor
University of North Carolina


Thursday, January 20, 2000

Presentations Scheduled for Room 4

8:30 - 9:15 AM

Object Interface for an ISO/IEC 11179 Metadata Registry
Tom Culpepper, Senior Software Engineer
3M

Part 3 of ISO/IEC 11179 is stated in the form of a conceptual data model that provides the attributes for identifying the characteristics of data that are necessary to clearly describe, inventory, analyze, and classify data. However, 11179 does not provide an interface that allows for interoperability in a distributed environment. A New Work Item will specify the behavioral aspects of a data registry. The behavior will be stated in the form of an interface specification better known as a Metadata Query Service. The Metadata Query Service will provide a well-defined way to access the information in a data registry, thus making it possible for applications to interoperate with one another by calling the services named in the interface. Such an interface could be used in information technologies such as global information locator services, whose focus is on providing the public, government, and industry sectors with information discovery and retrieval facilities.

Information relating to this effort can be found at:
http://www.nist.gov/L8/sc32wg2/projects/11179obj/
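To make the idea of a query interface concrete, the following Python sketch shows one possible shape for such a service. It is an assumption for illustration only: the class names, method names, and sample data element are hypothetical and are not taken from ISO/IEC 11179 or from the New Work Item.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class DataElement:
    """A small, hypothetical subset of registry attributes."""
    identifier: str
    name: str
    definition: str


class InMemoryRegistry:
    """Hypothetical query service backed by a dictionary.

    Applications call get() and search() and never touch the
    underlying storage, which is what lets them interoperate.
    """

    def __init__(self, elements: List[DataElement]) -> None:
        self._by_id: Dict[str, DataElement] = {
            e.identifier: e for e in elements
        }

    def get(self, identifier: str) -> Optional[DataElement]:
        """Return the data element with this identifier, if any."""
        return self._by_id.get(identifier)

    def search(self, text: str) -> List[DataElement]:
        """Return elements whose name or definition contains text."""
        needle = text.lower()
        return [
            e for e in self._by_id.values()
            if needle in e.name.lower() or needle in e.definition.lower()
        ]


registry = InMemoryRegistry([
    DataElement("DE-1", "Site Name", "Name of a cleanup site"),
    DataElement("DE-2", "Patient Birth Date", "Date a patient was born"),
])
print(registry.get("DE-1"))
print([e.identifier for e in registry.search("patient")])
```

A distributed implementation would expose the same two operations over a network protocol; the in-memory dictionary here is only a placeholder for a real registry.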

9:15 - 10:00 AM

Registering Business Objects
Hajime Horiuchi, Professor, Management Information Systems, Tokyo International University
Managing Director, Consortium for Business Object Promotion (CBOP)

This session will deal with experiences in standardization of Business Objects by a Japanese consortium. The presentation also discusses the basic mechanism and framework for standardizing Business Objects across various business domains.

10:00 - 10:20 AM
Break

10:20 - 11:05 AM

Selected Community Architectures for Networked Information Systems (CANIS) projects and ISO/IEC 11179
Eric H. Johnson, Programmer
Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign

URL: http://csdl.tamu.edu/DL.95/papers/johncoch/johncoch.html

11:05 - Noon

European Territorial Management Information Infrastructure (ETeMII) and Open GIS Consortium
John Rowley
GEOBASE Consulting Ltd., United Kingdom

European Union Fifth Framework projects--working with geo-spatial data, technologies and methods--will be used to promote a contribution to a European Territorial Management Information Infrastructure (ETeMII). Those projects using reference data, metadata services and implementing data and interface standards will also be clustered. This will be a step towards the creation of a European Information Infrastructure.

The key issues of the project are to:
· make the link between Territorial Management projects,
· provide feedback on users' needs to standards makers,
· raise awareness of the benefits of implementing standards,
· disseminate information technologies to decision makers and citizens, thus helping to make data accessible to all,
· bring together European stakeholders of Territorial Management,
· give the international dimension that is mandatory in the framework of the global market, i.e. NSDI in USA, NSDIPA in Japan, ANZLIC in Australia/New Zealand, etc.

Noon - 2:00 PM
Lunch

2:00 - 2:55 PM

Geospatial Data and Metadata Registries
Doug Nebert, Clearinghouse Coordinator
U.S. Geological Survey

2:55 - 3:50 PM

Art and Architecture Thesaurus Browser; Thesaurus of Geographic Names; Vocabulary Program
Patricia Harpring, Senior Editor, Vocabulary Program
Getty Research Institute

URL: http://shiva.pub.getty.edu/aat_browser and http://shiva.pub.getty.edu/tgn_browser

3:50 - 4:10 PM
Break

4:10 - 5:00 PM

NASA Search Service
Steve Hughes
NASA Jet Propulsion Laboratory

This session describes a planned implementation of a search service that uses DEDSL in the context of XML and CORBA techniques. It is a prototype effort to search across distributed systems at JPL, including the Planetary Data System, using DEDSL to describe the data elements for searching.

5:00 - 6:00 PM

Lexicon Query Service
Tom Culpepper, Senior Software Engineer
3M

The Lexicon Query Service (LQS) is a read-only specification for accessing the content of medical terminology systems. The LQS is an adopted Object Management Group (OMG) standard. It was developed by the CORBAmed Domain Task Force under OMG. The Lexicon "L" portion of the LQS deals with terminology and the underlying terminology model. The Query Service "QS" portion of the LQS deals with accessing information in a well-defined way.


DAY 5

Friday, January 21, 2000

Presentations Scheduled for Room 1

8:30 - 9:15 AM

Ontology Building Across Heterogeneous Databases
Michael Huhns, Professor 
University of South Carolina

URL: http://www.mcc.com/projects/infosleuth/publications/intranet-java.html

A discussion of how the 11179 model can be utilized through ontology agents to map across heterogeneous databases.

9:15 - 10:00 AM

Mapping Users, Words, and Terminology
Stephanie W. Haas, Associate Professor
School of Information and Library Science, University of North Carolina

Construction of mappings between agency terminology and user terminology, with an emphasis on where these crosswalks and mapping services should live in the agency's 11179 metadata model.

10:00 - 10:20 AM
Break

10:20 - 11:10 AM

Toward Searching Dissimilar Metadata: Organizing Semantic Relationships in the 11179 Context
Frederic Gey
U.C. Berkeley
URL: http://www.clis.umd.edu/info/events/ieee.html

A wealth of databases whose content is textual, numeric, or mixed is now appearing on the internet through the World Wide Web. Indeed, classical bibliographic search companies such as DIALOG have now developed web interfaces (http://www.dialogweb.com). Search technology to locate and retrieve these databases is currently quite primitive, consisting primarily of internet search engines or local webcrawlers that find potentially relevant web pages serving as entry points to complex (and heterogeneous) database applications. However, each of these database applications has its own idiosyncratic metadata that describes the structure and detailed content of the database, and this description may not (indeed, usually will not) correspond to the ordinary language search terms submitted by the less experienced searcher.

This talk describes a DARPA-funded research project that is developing advanced search capabilities to discover and navigate unfamiliar metadata. Using what are known as "Entry Vocabulary Modules", we create associations between ordinary language and the domain-specific technical metadata vocabulary used to describe databases. The process of developing the entry vocabulary uses both natural language processing modules and statistical language techniques for extracting key phrases (e.g. 'ink jet printer') to map to specialized classifications. The technique also has application to cross-language retrieval, where metadata classifications in one language can be mapped to documents in another language that have been indexed using the original language's metadata.
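The association between ordinary language and classification codes can be illustrated with a toy Python sketch. It is a deliberately simplified stand-in, not the project's method: it counts word and code co-occurrences in a handful of hypothetical training records, and the classification codes are invented for the example.

```python
from collections import Counter, defaultdict

# Hypothetical training records: (ordinary-language text, classification code).
training = [
    ("ink jet printer cartridge", "PRINTING_EQUIPMENT"),
    ("laser printer toner", "PRINTING_EQUIPMENT"),
    ("jet engine turbine", "AIRCRAFT_ENGINES"),
]


def build_associations(records):
    """Count how often each word appears with each classification code."""
    assoc = defaultdict(Counter)
    for text, code in records:
        for word in set(text.lower().split()):
            assoc[word][code] += 1
    return assoc


def suggest_codes(assoc, query):
    """Rank classification codes by summed co-occurrence with query words."""
    scores = Counter()
    for word in query.lower().split():
        scores.update(assoc.get(word, {}))
    return [code for code, _ in scores.most_common()]


assoc = build_associations(training)
print(suggest_codes(assoc, "ink jet printer"))
```

A searcher who types everyday words is thus pointed toward the specialized code a database actually uses, even when the code itself shares no words with the query.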

11:10 - Noon


Friday, January 21, 2000

Presentations Scheduled for Room 2

8:30 - 10:00 AM

Forging National Health Information Agreements
Joe Christensen, Head, National Information Development Unit
Australian Institute of Health and Welfare
Hetty Khan, Health Informatics Specialist
National Center for Health Statistics/ Centers for Disease Control

Instituting a process for improving cooperation on the development, collection and exchange of data, and to facilitate access to uniform health information by community groups, health professionals, payers, and government and non-government organizations.

10:00 - 10:20 AM
Break

10:20 - 11:05 AM

The ESRD Health Information Agreement and the National Renal Data Dictionary
Speaker: TBA

11:05 - Noon

Unified Medical Language System (UMLS)
Speaker: TBA


Friday, January 21, 2000

Presentations Scheduled for Room 3

8:30 - 9:15 AM

Working on Ontologies: A Theoretical Approach
Steve Geraldo
Consiglio Nazionale delle Ricerche, Italy

URL: http://saussure.irmkant.rm.cnr.it/onto/index.htm
URL: http://www.ladseb.pd.cnr.it/infor/Ontology/ontology.html

9:15 - 10:00 AM

Ontologies
Eduard Hovy
Information Sciences Institute, University of Southern California

URL: http://www.isi.edu/nsf/papers/hovy2.htm

10:00 - 10:20 AM
Break

10:20 - 11:05 AM

Ontological Bridges
Fritz Lehmann
Cycorp

Building ontological bridges between different knowledge bases, thesauri, and standards, and the role of 11179 data registries.

11:05 - Noon

HIKE (HPKB Integrated Knowledge Environment): An Integrated Knowledge Environment for HPKB (High Performance Knowledge Bases)
Albert D. Lin
Science Applications International Corporation
Barbara H. Starr
Science Applications International Corporation

Modern knowledge-based systems development faces several major problems: the high cost of knowledge base (KB) construction, knowledge base reusability, collaboration in knowledge base construction, and knowledge sharing. We address these problems by introducing a research project sponsored by the Defense Advanced Research Projects Agency (DARPA), called High Performance Knowledge Bases (HPKB). We first describe the individual technology components of the project, which provide solutions to the above problems. These components fall into a large variety of functional categories: (1) knowledge servers, editors, and KB composition tools; (2) advanced knowledge representation, reasoning methods, inference tools, and remote knowledge sharing; (3) knowledge acquisition tools, machine learning, and natural-language-information-retrieval techniques; and (4) problem-solving methods including pattern detection, situation monitoring processes, reusable problem-solving libraries, and formal descriptions of problem-solving tasks. Following the descriptions of the technology components, we introduce the integrated architecture that we are developing at Science Applications International Corporation (SAIC), called the HPKB Integrated Knowledge Environment (HIKE). The integrated system is intended to provide a knowledge-based system development environment independent of technology components and problem domains. We then discuss how the integrated system solves the challenge problems (CP). We conclude the paper by describing the current status of the project and the evaluation of HPKB.

Keywords: Knowledge construction, knowledge reuse, knowledge sharing, collaboration, integrated knowledge-based system, problem-solving method, challenge problem, crisis management, battlespace understanding, infrastructure, distributed object environment, object-oriented, integrated knowledge environment.

URL: http://hike.saic.com/saic/documents/KDEX98/HPKB-KDEX98.htm


Friday, January 21, 2000

Presentations Scheduled for Room 4

8:30 - 9:15 AM

Library Classification Schema and ISO/IEC 11179
Gerard McKiernan, Associate Professor
Science and Technology Librarian and Bibliographer
Iowa State University

URL: http://www.public.iastate.edu/~CYBERSTACKS/CTW.htm


9:15 - 10:00 AM

Zthes
Mike Taylor
System Simulation Ltd.

URL: http://www.n-four.demond.co.uk/mirk/zthes-02.html

10:00 - 10:20 AM
Break

10:20 - Noon

Networked Knowledge Organization Systems (NKOS), California Environmental Resources Evaluation System (CERES) and ISO/IEC 11179
Quinn Hart, Programmer/Systems Analyst III
Center for Spatial Technologies and Remote Sensing (CSTARS), University of California, Davis
Gail Hodge
Center for Spatial Technologies and Remote Sensing (CSTARS), University of California, Davis
Linda Hill