Topic Maps and RDF, having long ago agreed to agree, finally find themselves on the same page

Extreme Markup Languages 2005 opened with a bang Tuesday, August 2. The conference, which five years ago famously witnessed an RDF/Topic Maps shootout turned treaty-signing, showcased two dazzling efforts to make the alternative relationship technologies interoperable.

Not surprisingly, given each camp's preference for its own approach, one presentation gave RDF/OWL the dominant role, the other Topic Maps. On first examination, both seem viable routes to this long-desired goal.

Interoperability has taken longer than first envisioned, as previous efforts foundered on the few crucial differences between Topic Maps and RDF. In the intervening years, Topic Maps have acquired an XML syntax and a formal data model, with constraint and query languages under development. RDF and its constraint language, OWL, have progressed to W3C recommendation status, with the SPARQL query language also appearing.

It was the maturation of the data models that provided the path taken by the author of the first paper, Anne Cregan [Building Topic Maps in OWL-DL]. Exploring the possibility of a mapping between RDF and TM, she observed that the Topic Map Data Model (TMDM), published in January of this year, is an entity-relationship model with a couple of constraints. And OWL, she recognized, is a language that can represent entity-relationship models and constraints. Instead of creating a semantic mapping of core Topic Map constructs to core OWL constructs, she mapped them to an OWL ontology. Thus, authoring a Topic Map in OWL requires only importing the TMDM ontology and populating it with instances.

Re-creating the TMDM as an OWL-DL ontology, Cregan said, opens up the advantages OWL has with its formal semantics and its developed toolset for such things as visualizing and populating ontologies. It also brings opportunities for expressing constraints, constraint checking, automated reasoning, and querying, capabilities which are not yet formalized in TMCL and TMQL. In effect, this approach makes OWL-DL an alternative syntax for representing a Topic Map, with the benefit that the OWL ontology's constraints force compliance with the TMDM upon the Topic Map author. Seen this way, a Topic Map in OWL-DL syntax requires no real mapping to get to the XML syntax, unlike the approaches taken heretofore, just syntactic manipulation. Cregan even envisioned a day when any such widely accepted OWL ontology might be importable directly by Topic Map engines.

Cregan, who is a PhD student at the University of New South Wales and affiliated with the Knowledge Representation and Reasoning Program of the National Information and Communications Technology Australia (NICTA) Centre of Excellence, pointed to a few unresolved issues with this approach in her presentation. She noted that relaxing the absolute adherence to the TMDM makes it possible to use simpler OWL constructs such as InstanceOf and SubClassOf for Associations.

The second paper, by Lars Marius Garshol, argued that Topic Maps are higher-level than RDF, with more built-in semantics, and that while it makes more sense to represent Topic Maps in RDF rather than the other way around, this sort of object mapping will not suffice for interoperability [The Q Model: A unifying model for RDF and Topic Maps]. Instead he proposed a so-called Q model, which could be regarded as extending RDF triples with a fourth element representing the identity of the triple (thus avoiding the bloat of reification): four elements, hence Q for quads. An insoluble dilemma in handling scope led Garshol to add a fifth element to represent the context. Q, which would then stand for quints, could be used to efficiently implement combined Topic Map/RDF engines. A concern kept firmly in mind but not explored in this paper is that models in Q be query

Garshol is the Development Manager at Ontopia, a Topic Maps vendor, and has been working on this problem since 2002. The problem with previous, object-mapping approaches, he said, is that while their results retain all the semantics of the Topic Map, they do so in an "unnatural" way, requiring far more triples than the same information authored natively in RDF. (Using his 2002 approach on the widely known opera Topic Map yielded an RDF file with eleven times as many triples as there were "TAOs," that is, Topics, Associations and Occurrences, in the original.) And this transmuted TM/RDF doesn't merge with native RDF and requires queries to be formulated differently.
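The reification overhead that a statement identifier is meant to avoid can be made concrete. The sketch below is not taken from Garshol's paper: it uses only the standard RDF reification vocabulary (rdf:Statement, rdf:subject, rdf:predicate, rdf:object), while the `ex:` identifiers and the function names `reify` and `as_quad` are hypothetical, chosen for illustration. It compares what it takes to give a single statement an identity in plain triples with what it takes in a quad.

```python
# Namespace of the standard RDF reification vocabulary.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def reify(statement_id, subject, predicate, obj):
    """Return the four triples that describe one statement under RDF reification.

    These triples only describe the statement; asserting it as well
    takes the original triple in addition, for five in total.
    """
    return [
        (statement_id, RDF + "type", RDF + "Statement"),
        (statement_id, RDF + "subject", subject),
        (statement_id, RDF + "predicate", predicate),
        (statement_id, RDF + "object", obj),
    ]


def as_quad(statement_id, subject, predicate, obj):
    """Carry the statement's identity as a fourth element (hypothetical layout)."""
    return (subject, predicate, obj, statement_id)


# Hypothetical example data, not drawn from the opera Topic Map.
triple = ("ex:puccini", "ex:composed", "ex:tosca")
reified = reify("ex:stmt1", *triple)
quad = as_quad("ex:stmt1", *triple)

print(len(reified) + 1, "triples to assert and identify the statement")
print(1, "quad to do the same:", quad)
```

The position of the identifier within the tuple is an arbitrary choice here; the point is only that identity travels with the statement instead of being described by extra statements.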

Garshol's final Q Model contains subject, predicate, statement-id, context and object. The statement-id must be unique and may not be used as a predicate. In order to express Associations without "bloat" he proposes using Association templates, the downside being that these prohibit Association-role reification. While going from triples to quints may be a barrier to the acceptance of the Q model, Garshol showed how both Topic Maps and RDF could move into and out of Q without loss of information. This, too, seems to offer a way to apply RDFS/OWL inferencing to Topic Maps.
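As a rough illustration of the five-element statement and the two rules reported above, here is a minimal sketch of an in-memory quint store. It is not Garshol's implementation: the class names `Quint` and `QuintStore`, the field order, the optional context, and the `ex:` identifiers are all assumptions made for the example. Only the five elements and the two constraints on the statement-id come from the presentation as described.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Quint:
    """One five-element statement (hypothetical field layout)."""
    subject: str
    predicate: str
    obj: str
    statement_id: str
    context: Optional[str] = None  # assumption: a statement may lack a context


class QuintStore:
    """Toy store enforcing the two statement-id rules described in the text."""

    def __init__(self):
        self._by_id = {}          # statement_id -> Quint
        self._predicates = set()  # every predicate seen so far

    def add(self, quint):
        # Rule 1: the statement-id must be unique.
        if quint.statement_id in self._by_id:
            raise ValueError("statement-id already in use: " + quint.statement_id)
        # Rule 2: a statement-id may not be used as a predicate (checked both ways).
        if quint.statement_id in self._predicates:
            raise ValueError("statement-id is already used as a predicate")
        if quint.predicate in self._by_id:
            raise ValueError("predicate is already used as a statement-id")
        self._by_id[quint.statement_id] = quint
        self._predicates.add(quint.predicate)

    def triples(self):
        """Project the quints down to plain triples, dropping identity and context."""
        return [(q.subject, q.predicate, q.obj) for q in self._by_id.values()]

    def in_context(self, context):
        """Return the statements made in the given context."""
        return [q for q in self._by_id.values() if q.context == context]


store = QuintStore()
store.add(Quint("ex:puccini", "ex:composed", "ex:tosca", "ex:s1", "ex:opera"))
# A statement about the first statement, using its id as the subject.
store.add(Quint("ex:s1", "ex:source", "ex:someSource", "ex:s2"))
```

Because the identifier is part of the statement, the second quint can refer to the first directly, without a reified description of it. How a real engine would index quints, or merge them with native RDF, is outside what this sketch attempts.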

— Roger Sperberg