ISKO Working Group
"Knowledge Organization and the Internet"

| Request for discussion (RfD):            |
|   "Knowledge organization and management |
|   of heterogeneous subject data          |
|   with Topic Maps and ontologies"        | 
started: 1999-07-25

(This document can be retrieved via:

There is a rising interest in classical knowledge organization
issues within knowledge management for knowledge-intensive industries
(e.g. healthcare, pharmaceutics, chemistry, extensive technical
documentation, internet portals to virtual libraries). This trend is
indicated, for example, by job postings in which information
scientists with expertise in ontology engineering are sought for
commercial KM projects.

I would like to stimulate discussion about:
  all aspects of the fruitful interrelation between the
  foundations of knowledge organization and applied knowledge
  management of large, digitized, heterogeneous resources,
  especially using Topic Maps in combination with ontological tools.
and volunteer to summarize this emerging discussion later on.

Update 2000-04-05:

A follow-up to this document is my talk at XML Europe 2000, Paris, on
Knowledge Organization with Topic Maps.

Topic Maps and other ontological tools are technical aids for
knowledge organization, and their application again gives rise to
such crucial issues as the comparability, compatibility and
interoperability of knowledge order systems. Seen from the KO
perspective, those tools are nothing really new, but provide "just" a
new environment for old and mostly unsolved problems. Although one
can model conceptual knowledge order languages (e.g. thesauri) and
indexes with them, and they make maintenance and usage easier, the
construction of those languages remains an intellectual enterprise.

It may therefore be worthwhile to strengthen the link between the
scientific field of knowledge organization, with its principles and
methods, and application-oriented standards in KM. I am sure that KO
can and should provide valuable insights, advice and consultancy to
knowledge managers (CKOs) in charge of implementing corporate
memories, and KO experts should proactively shape the advancement of
applied knowledge organization.

Large knowledge management efforts in multinational companies often
give priority to issues of technology (e.g. digitization, intranet
applications, CBR, yellow pages), to the strategic change of the
business culture and to the improved management of human resources.
The equally crucial conceptual organization of the resources
themselves, however, typically receives less attention, and is less
well paid. Only recently has conceptual knowledge organization become
a hot topic, especially via Topic Maps, ontologies, and the merging
of heterogeneous subject data in metadata initiatives.

Suppose the following situation: Several hundred thousand documents
of heterogeneous structure and content are available in electronic
form, and the pool is growing rapidly.
Fortunately, they are all marked up in XML and presented with XSL,
and DTDs have been defined for all major document types. Note that
this is not necessarily a library environment, but rather a business
scenario, e.g. for a pharmaceutical research lab or a financial
investment group. Structured metadata exists for part of the pool,
but its quality varies and its semantics differ, because it stems
from various sources. The structure of the metadata conforms to a
standard such as DC, embedded in the RDF mechanism. Interoperability
may be achieved by XML vocabularies that make XML resources
self-describing. The CKO directs a team that is acquainted with the
recent literature on metadata, XML and RDF (see e.g. a
current-cites-based bibliography).

Alternatively, ontological and conceptual tools and languages (such
as ODE, Ontolingua, Ontosaurus, WebOnto, OML - Ontology Markup
Language (a SHOE encoding in XML), CKML - Conceptual Knowledge Markup
Language, or WONDEL) are used to model such metadata structures.

On top of that information warehouse, ontology-based intermediation
and brokering services could be set up. Ontological information
agents have been suggested that gather documents, extract parts
according to semantic profiles (e.g. using XQL engines on
user-defined DTDs) and index them. The agents could communicate with
FIPA ACL or KIF/KQML performatives, and IDL/CORBA would also be
appropriate.

Topic Maps
----------

For major web references, see:

Topic Maps (TOMs) are the online equivalent of printed
back-of-the-book indexes (or better): one can organize knowledge
according to semantic categories and aid others in navigation. This
metadata is a structured view over a set of information resources
that themselves need not be structured. The structuring explicitly
models an access structure to the knowledge contained in a
collection. Topic Maps are a new ISO/IEC 13250 standard (with a long
history) within the document description and processing languages.
By defining a syntax, this standard makes it possible to interchange
the information necessary to collaboratively build and maintain
indexes. Semantic networks (such as thesauri and more formal
ontologies) can be modeled and merged as XML structures, and raw data
(documents) can be associated with TOMs (subject or metadata). As
topic indexes can be merged for collections (the original motivation
for Topic Maps), this aids in applications where heterogeneous
subject data stemming from different indexes must be combined in a
consistent manner. Because user communities can define their own
semantics, it is no longer necessary for central authorities to
enforce their knowledge order language.

Topic Maps constitute views on heterogeneous information
repositories, and arbitrary views can be realized. "The same topic
can be overlaid on different pools of information, just as different
topic maps can be overlaid on the same pool of information to provide
different views to different users."

It is claimed that Topic Maps are "the most comprehensive and
efficient solution to the problem of creating and maintaining
consistent master indexes of sets of documents that have different
owners and maintainers", and that this solution scales very well.

To quote from sgmlnews: TOMs are a standard "for layering
multidimensional topic spaces on top of information assets. The Topic
Map standard covers concepts like topics, associations, occurences
and facets/metadata. Topic Maps are expected to have a major impact
on future information systems".

It is expected that e.g. publishers of encyclopedias, reference
works, legal information or technical documentation will employ TOMs.
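
To make the merging of topic indexes described above more concrete,
here is a minimal sketch in Python. It is not the ISO/IEC 13250
syntax or its merging rules, and it is not taken from tmproc. It
simply assumes that two topics with the same name denote the same
subject. The class names, the "broader-term" association type and the
file paths are hypothetical and serve only as illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    """A subject, with pointers (occurrences) into the document pool."""
    name: str
    occurrences: set = field(default_factory=set)  # resource addresses


@dataclass
class TopicMap:
    topics: dict = field(default_factory=dict)      # name -> Topic
    associations: set = field(default_factory=set)  # (type, name_a, name_b)

    def add_occurrence(self, name, resource):
        self.topics.setdefault(name, Topic(name)).occurrences.add(resource)

    def associate(self, assoc_type, name_a, name_b):
        # Make sure both ends of the association exist as topics.
        for name in (name_a, name_b):
            self.topics.setdefault(name, Topic(name))
        self.associations.add((assoc_type, name_a, name_b))


def merge(left, right):
    """Return a new map. ASSUMPTION: equal names mean the same subject."""
    merged = TopicMap()
    for source in (left, right):
        for topic in source.topics.values():
            merged.topics.setdefault(topic.name, Topic(topic.name))
            for resource in topic.occurrences:
                merged.add_occurrence(topic.name, resource)
        merged.associations |= source.associations
    return merged


# Two independently maintained indexes over different document pools
# (hypothetical example data).
lab_index = TopicMap()
lab_index.add_occurrence("aspirin", "reports/trial-17.xml")
lab_index.associate("broader-term", "aspirin", "analgesics")

library_index = TopicMap()
library_index.add_occurrence("aspirin", "handbook/ch3.xml")

combined = merge(lab_index, library_index)
print(sorted(combined.topics["aspirin"].occurrences))
```

After merging, the topic "aspirin" points into both pools, while the
two source indexes stay unchanged. The hard part in practice is
exactly what this sketch assumes away: deciding when two topics from
different indexes really denote the same subject.
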

More detailed material:

  The approved standard itself:
  A recent Topic Map workshop 6/99:
    (a very readable introduction by Michel Biezunski)
  Michel Biezunski: Topic Maps at a glance
  Steve Pepper: Euler, Topic Maps, and Revolution

A first, free implementation of a TOM processor (software that is
capable of importing, exporting, querying and manipulating TOMs)
already exists, and examples are available:

  tmproc: (in Python)

First questions for discussion:
-------------------------------

- What does the TOM standard mean for thesaurus standardization
  efforts?
- Which sophisticated thesauri have been or will be enhanced and
  modeled with TOMs or other ontological tools? What are the
  experiences?
- Do you know of good examples where the merging of heterogeneous
  subject data was aided by TOMs, and it worked? On which principles?
- Do you know of planned applications of TOMs within large KM
  projects/consortia?
- Are there other tools or examples available?
- Should a tutorial for the 6th ISKO be prepared?
- Should ISKO update its Warsaw research seminar recommendations on
  the comparability and compatibility of knowledge order languages to
  this environment, and thus position itself more visibly towards KM?
- What could ISKO offer to CKOs?

Best regards
Alexander Sigel

----------------------------------------------
Alexander Sigel, M.A.
Informationszentrum Sozialwissenschaften
Lennéstr. 30, D-53113 Bonn, Germany
+49 228 2281 170 tel, +49 228 2281 120 fax
Homepage: pending PhD Project: Adaptive user-centered indexing in the
social sciences. Combines conceptual indexing strategies, domain
analysis, viewpoints, frame-based KR, user modeling, relevance
reasons ...

============================================================

Discussion and Hints
====================

1999-07-28:

Workshop: SGML/XML-Einsatz in der Lexikographie
  (Use of SGML/XML in lexicography)
21.9.1999, Heidelberg
GLDV working groups Hypermedia, Lexicography and Text Technology;
Forschungsstelle Deutsches Rechtswoerterbuch
Topics of discussion include: possible uses of the Topic Map standard
for modeling lexical data
Contact: Ingrid Lemberg,