[This local archive copy mirrored from: http://www.ornl.gov/sgml/wg8/document/1937.htm; see the canonical version of the document.]

ISO/IEC CD 13250

ISO/IEC JTC1/SC18/WG8

Document Processing and Related Communication --
Document Description and Processing Languages


Title:
ISO/IEC CD 13250 Information Processing -- SGML Applications -- Topic Navigation Maps
Source:
ISO JTC1/SC18/WG8
Project:
JTC1.18.67
Status of Document:
Committee Draft
Requested action:
Approval of Committee Draft
Summary of major points:
Revised CD
Distribution:
National bodies participating in or observing the activities of JTC1 SC18/WG8

Information Processing -- SGML Applications -- Topic Navigation Maps

Introduction

This standard provides facilities for creating, maintaining and interchanging topic-based navigational aids to large corpora of documents containing interrelated information. It can be applied to any form of electronic information archive, irrespective of the way in which the data is encoded.

If you want to find information in a library of printed publications there are three typical techniques you can use:

  1. Refer to the library catalogue to find publications on specific topics that have been written by a specific author or have a known title
  2. Use bibliographic entries in publications already found to identify other relevant documents
  3. Use the table of contents or index of a publication to find where the information you require is located within a particular publication.

To help supplement your understanding of the text you are reading you can:

  1. look up the meaning of an abbreviation in another publication
  2. look up the meaning of a word or phrase in a dictionary (which may be domain dependent)
  3. look up equivalent words in a thesaurus
  4. look up foreign equivalents to words or phrases in a multilingual dictionary.

As people switch from searching paper-based publications for information to using electronic search engines to find data in electronic archives there is a need to provide electronic equivalents of each of these techniques. At present, however, on-line document browsers are restricted to a small subset of the required functionality.

Setting up a catalogue of electronic 'documents' provided at a web site that consists of discrete files is not a particularly difficult task, though very few sites yet provide classified directories for accessing their documents. What is more difficult is to set up a catalogue that applies to a wide range of sites. Many web search engines seek to do this by classifying the documents they reference using some locally determined classification scheme. The fact that these schemes are rarely based on widely known classification schemes, such as the Dewey Decimal Classification and the Library of Congress Classification used by many libraries, makes it difficult to determine the relationship between different web classification schemes.

Note: Setting up a catalogue for virtual sites, where pages are generated in response to user queries, is generally not possible.

Existing Internet web sites, which are based on the limited set of data linking facilities provided by HyperText Markup Language (HTML), allow bibliographic references to on-line documents to be turned into activatable links to a particular Uniform Resource Locator (URL). While this technique works as long as the storage location of the identified data does not change, it is prone to error when documents are moved within an archive, or are copied to alternative archives.

Internet URLs can reference previously named points within documents. This allows the development of active tables of contents, where clicking on the relevant line takes you to the uniquely named heading of a section of the document.

Indexes are much harder to develop using HTML. Apart from the fact that every point referenced in the document has to be assigned a unique name, there is an associated problem in that there is no equivalent to printed page numbers that can be used to reference the multiple points at which a term applies in the electronic version of the publication. For this reason few web documents have indexes. Instead users rely on full text search facilities to find all occurrences of a term, relying on the fact that the term should have been defined the first time it was used.

When it comes to finding the meaning of acronyms, words and phrases then a web-wide full text search engine can sometimes help, but typically such general purpose search tools will return a listing of thousands of documents that use the term rather than, say, identifying a dictionary that has a definition of the term.

When it comes to finding equivalent terms, either in the same language or in another language, the WWW is of no real help. You have to return to printed thesauri or multilingual dictionaries to identify search terms to use in other languages, as on-line equivalents are rarely available.

Both printed and electronic publications fail to deal with the real needs of many readers. If you ask a reader what he actually wants to know about a subject, he will typically give you answers such as:

Answering such questions requires expert knowledge of the subject. It also requires that the subject be tracked over time, both prior to and after the original publication. If such information is to be provided over any length of time the task of providing supplementary information related to a publication must be done collaboratively. For this to be possible there must be a clear record of the role any supplementary knowledge associated with a publication is designed to play so that those enhancing the knowledge base at a later date can fully understand the purpose of existing information.

Many existing proposals for providing intelligent search facilities on the WWW are based on the concept of adding metadata to a publication that will allow search engines to automatically associate it with appropriate classification categories. Different sets of metadata have been proposed to allow for different types of selection process. Commonly used examples include the Dublin Core metadata set for describing publications and the Open Software Description for describing software. In many cases the XML (eXtensible Markup Language) subset of ISO's SGML (Standard Generalized Markup Language) has been proposed as the markup language most suitable for encoding this metadata.

There can be many reasons why a particular piece of information needs to be referenced. Often publications and electronic archives are referenced by different communities for different reasons. This means that there can be many different sources of supplementary data associated with a file, and the same part of a file can be referenced from more than one such source. If all this data has to be added into the source document it will be constantly changing.

It is important that users have a mechanism for identifying which additional sources of information apply to which parts of a document. When looking at the document they need to be able to identify the parts of the publication of interest to them and then request a list of sources of additional information on that subject. This has to be done without necessarily having permission to change the original form in which the publication was stored.

ISO/IEC 10744 defines a Hypermedia/Time-based Structuring Language known as HyTime. HyTime incorporates techniques that allow additional information to be associated with publications without having to be stored as part of the source document. Like HTML and XML, HyTime is based on SGML. HyTime is not, however, confined to referencing publications coded in SGML. HyTime can reference any text, image, sound effect or spatial area within a multimedia publication.

An important aspect of HyTime is that it allows the addresses of information to be stored, in a system-independent manner, separately from references to these addresses. This allows automatic archive management facilities to be built into systems. These facilities can be used to check which addresses need to be changed when an archive is restructured, or transferred to a new location.

Using HyTime it is possible to develop systems where expert knowledge, expressed in the form of topic navigation maps, can be associated with electronic information sets in a way that can be controlled by the reader. At the same time it becomes possible to create knowledge bases that can be shared by different information sets. This standard shows how HyTime can be used to create such 'topic navigation maps' to guide readers around sources of related information.

Scope

This standard provides a technique, based on facilities defined in ISO/IEC 10744:1997, for identifying information objects that share a common topic. It can also be used to define the relationships between sets of related topics.

The techniques defined in this standard can, among other uses, be used to define:

Normative references

Definitions

The terms and definitions given in ISO 8879:1986 and ISO/IEC 10744:1997 and the following terms and definitions apply to this standard.

Architectural forms
Rules for creating and processing components of documents. Four kinds of architectural form are defined in ISO/IEC 10744:1997 -- element form, attribute form, data entity form, and data attribute form.
Architectural Form Definition Requirements (AFDR)
The requirements, specified in Annex A of ISO/IEC 10744:1997, for the formal definition of the architectural forms by which an enabling document architecture governs the SGML representation of its documents.
Hub document
The HyTime document in which access to a hyperdocument begins.
HyTime hyperlink
An information structure, defined using architectural forms defined in ISO/IEC 10744:1997, that represents a relationship between two or more objects.
Occurrence role
A role identified by the name of an attribute defined as an anchor role of a HyTime hyperlink, or an element defined as an anchor specification within such a link.
Semantic assignment
A specialized HyTime hyperlink that connects the information objects sharing a common semantic.
TNM engine
An application that can process topic navigation maps.
Topic
The set of objects referenced by a semantic assignment element.
Topic relation
Relationship between two or more topics or sets of semantic assignments.

Purpose

This standard provides facilities for creating, maintaining and interchanging topic-based navigational aids to large corpora of documents containing interrelated information. The standard makes a distinction between the highly concentrated and independent 'topic navigation maps' -- sets of relations between formally defined topics covered in a given corpus -- defined within this standard, and the links to, and addresses of, occurrences of relevant information within the corpora themselves, which are defined using facilities provided by ISO/IEC 10744, which defines the Hypermedia/Time-based Structuring Language known as HyTime.

Topic navigation maps can improve the accessibility of information by facilitating, and to some extent automating, the task of providing, and imposing editorial consistency and maintainability on, navigational resources. Topic navigation maps are designed to simplify groupware-supported production of data for which navigational aids such as indexes, glossaries, tables of contents, lists and catalogs need to be generated. Topic navigation maps can also be used to enhance the navigability of very large information bases.

This standard provides an SGML architecture, defined according to the rules specified in the SGML Extended Facilities annex of ISO/IEC 10744, for creating and maintaining data that classifies information in documents according to topic, and classifies topics with respect to each other. The standard will help to increase consistency, and decrease redundancy, not only in navigational aids within documents, but also in navigational aids used with multiple documents, such as master indexes. The discipline that can be imposed by using the facilities provided in this standard will assist those who create and/or collect electronic information sets, and who wish to provide a given collection with a unified, consistent, and minimally redundant topic index.

The Standard Generalized Markup Language (SGML) defined in ISO 8879:1986 allows all kinds of documents to become databases. By providing ways to navigate data stores so that parts of documents that are relevant to a particular topic can be easily found and organized rapidly by machine, this standard augments the suitability of SGML for electronic document interchange. The number and complexity of indexable topics, and the relationships between them, greatly exceeds the number and complexity of relations normally represented in traditional databases or, for that matter, in the kinds of indexes normally found in books. The number of topic relationships that might usefully be represented with respect to any reasonably large collection of documents is, in fact, for all practical purposes limitless. Moreover, even in archived documents, new kinds of topic relationships can be expected to appear from time to time. This standard, therefore, is specifically designed to allow multiple topic maps to be created over a period of time for any collection of data, and to allow for different topic maps to be inter-related.

An application-neutral, internationally understandable, rigorous, and yet flexible and open way to represent topical indexes, such as the one set forth in this standard, can help to make indexes easier to make, easier to maintain, and easier to use. As new relationships are discovered and included as part of a topic architecture, the architecture changes. Many specialists may have to collaborate and contribute, over a number of years, to an evolving topic navigation map, which at any given time must unambiguously and comprehensibly govern all maintenance activities. Unless those who are adding and/or maintaining anchors have clear guidance, the instantiation of that topic navigation map -- the index itself -- may become unsound and unsafe.

There are many cases where individuals need to create their own classification of the information sets they frequently access. For example, when researching a subject a reader may wish to create his own method for linking relevant files together. This standard, therefore, provides a mechanism by which individuals can create their own 'views' of an information set, or of the relationships between topics defined in existing topic navigation maps by other experts.

A topic navigation map defines both topics and the relations that they bear to one another. It must, therefore, permit:

to be represented, universally interchanged, processed, merged, and used for data navigation.

This standard provides SGML architectural forms that can be used in conjunction with the architectural forms defined in the HyTime standard to link topic definitions with information sources, in order to support applications that provide:

Topic navigation maps are defined using TNM.SemanticAssignment-form attributes that assign a known meaning, universe and weighting to a topic, and TNM.TopicRelationship-form attributes that identify relationships between topics. Categories of topics may be iteratively identified and described by linking suitable topics to other topics belonging to the category.

A topic map is created by linking, using HyTime hyperlinks, several pieces of information about a topic through a semantic assignment.

Architectural form definition

This clause formally defines the enabling architectures required to implement topic navigation maps. It is defined using the specification for Architectural Form Definition Requirements in the SGML Extended Facilities annex of ISO/IEC 10744:1997. The meta-DTD defined by this document starts, therefore, with a (non-processable) markup declaration of the form:

<!AFDR "ISO/IEC 10744:1997">

The SGML declaration accompanying a document containing a topic navigation map shall include the token ArcBase in its APPINFO field to indicate that details of the SGML architectures for which support is required will be found in processing instructions starting with the (default) ArcBase keyword. If another name is required to identify applicable processing instructions it can be specified, as part of the same token, by using the form ArcBase=ArcUsed.

An SGML document type definition (DTD) that uses the facilities defined in this standard shall contain, as near as possible to the start of the part of the DTD that contains definitions using HyTime or the architectural forms defined in this document, a processing instruction, coded using the document's concrete syntax, whose contents are IS10744 ArcBase TNM HyTime, where IS10744 indicates that this standard conforms to the Architectural Form Definition Requirements specified in Annex A of ISO/IEC 10744:1997, ArcBase is the name assigned for identifying architectural bases in the APPINFO section of the SGML declaration associated with the DTD, TNM is a mnemonic used throughout this standard to identify components of a Topic Navigation Map, and HyTime is the name of the base architecture from which topic navigation maps are derived.

A DTD that uses the facilities defined in this standard shall contain an SGML notation declaration and architectural support attribute definitions based on the template shown below, modified if necessary to conform to the concrete syntax defined in the associated SGML declaration:

<!NOTATION TNM PUBLIC "ISO 13250:1998//NOTATION AFDR ARCBASE
               Topic Navigation Map
               Architecture Definition Document//EN"
            -- A derived architecture used in conformance with the
               Architectural Form Definition Requirements of
               International Standard ISO/IEC 10744.           -->

<!ATTLIST #NOTATION TNM
  ArcFormA       NAME    #FIXED "TNM"
  ArcNamrA       NAME    TNM-name
                         -- Can be used to assign locally
                            meaningful names to TNM attributes --
  ArcDTD         CDATA   #FIXED "TNM-DTD"
  ArcDocF        NAME    "TNM-hub"
  ArcAuto        (ArcAuto|nArcAuto) "nArcAuto"
  ArcQuant       CDATA   #FIXED "NAMELEN 24"
  -- NAMELEN has been extended to 24 to allow "natural" naming
     to be applied to Topic Navigation Map attribute values -- >

Editor's Question: Do we need other attributes here, such as ArcBridF and ArcSuprA (or is the fact that the standard HyTime options are defined in the next section sufficient)?

Note: Definitions of the attributes defined in this declaration are provided in Annex A of the 2nd Edition of ISO/IEC 10744.

The DTD must also contain the following entity definition, with its associated notation declaration, which is referenced by the architecture support attributes given above:

<!NOTATION AFDRMeta --AFDR Meta-DTD Notation--
            PUBLIC "ISO/IEC 10744:1997//NOTATION AFDR
                    Meta-DTD Notation//EN" >

<!ENTITY TNM-DTD    -- Meta-DTD for navigation map instances --
            PUBLIC "ISO 13250:1998//DTD AFDR Meta-DTD
                    Topic Navigation Map//EN" CDATA AFDRMeta >

In addition support must be declared for at least a minimal set of HyTime functionality by including the following declarations:

<!-- HyTime Support Declarations -->

<!NOTATION
 HyTime         -- HyTime Architecture --
                -- A base architecture used in conformance with the
                   Architectural Form Definition Requirements of
                   International Standard ISO/IEC 10744. --
 PUBLIC "ISO/IEC 10744:1997//NOTATION AFDR ARCBASE
         Hypermedia/Time-based Structuring Language (HyTime)//EN"
>

<!ATTLIST  #NOTATION HyTime
 ArcFormA NAME     HyTime
 ArcNamrA NAME     HyNames
 ArcSuprA NAME     sHyTime
 ArcIgnDA NAME     HyIgnD
 ArcDocF  NAME     HyDoc
 ArcDTD   CDATA    "HyTime"
 ArcQuant CDATA    #FIXED "NAMELEN 9"
 -- NAMELEN must be at least 9 because the HyTime meta-DTD uses
    8-character parameter entity names. --
 ArcDataF NAME     #FIXED HyBridN
 ArcBridF NAME     #FIXED HyBrid
 ArcAuto  (ArcAuto|nArcAuto) nArcAuto
 ArcOptSA NAMES    "GenArc base locs links"
 GenArc            -- General architecture facilities --
          CDATA    -- Lextype: csname+ --
                   "HyLex ireftype lextype"
 base              -- Base module facilities --
          CDATA    -- Lextype: csname+ --
                   "bos bosspec conloc dimspec HyDimLst HyDimSpc valueref"
 locs              -- Location address module facilities --
                   -- Clause: 6.3 --
          CDATA    -- Lextype: csname+ --
                   "agrovdef bibloc dataloc datatok grovplan listloc mixedloc
                    multloc nameloc nmsploc pathloc pgrovdef proploc queryloc
                    refctl referatt refloc reftype relloc spanloc treecom treeloc
                    treetype"
 links             -- Hyperlinks module facilities --
                   -- Clause: 6.3 --
          CDATA    -- Lextype: csname+ --
                   "agglink anchloc clink hylink ilink linkloc traverse varlink"
 anysgml           -- Any SGML declaration allowed --
                   -- Clause: 6.3 --
          (anysgml|nanysgml) anysgml
 exrefs            -- External references allowed --
                   -- Clause: 6.3 --
          (exrefs|nexrefs)   exrefs
 refmodel          -- SGML model groups for reftype --
                   -- Clause: 6.3 --
          (SGMLmdl|nSGMLmdl) SGMLmdl
 hyqcnt            -- Highest quantum count limit --
                   -- Clause: 6.3 --
          NUMBER   -- Constraint: power of 2 >= 32  --
                   32
 manyanch          -- Maximum number of anchors allowed in
                      hyperlinks --
                   -- Clause: 6.3 --
          NUMBER   -- Constraint: must be >= 2 --
                   #IMPLIED    -- Default: no limit --
>

<!NOTATION AFDRMeta
 PUBLIC "ISO/IEC 10744:1997//NOTATION AFDR Meta-DTD Notation//EN"
>

<!ENTITY
 HyTime         -- HyTime meta-DTD --
                -- Clause: 6.3 --
                PUBLIC "ISO/IEC 10744:1997//DTD AFDR Meta-DTD
                        Hypermedia/Time-based Structuring Language (HyTime)//EN"
                CDATA AFDRMeta
>

All of the addressing mechanisms allowed by HyTime may be used in the context of this standard. However, each application can limit itself to a more specific list of the options relevant to its context by removing unused options from the declarations given above.

Defining the base element of a Topic Navigation Map

A topic navigation map may form a HyTime hub document. If this is the case, the base document element of the topic navigation map shall conform (except that the list of options provided in each parameter entity may be a subset of that shown) to the following architectural forms:

<!-- HyTime Topic Navigation Map Document Hub -->

<!entity %
 loc            -- Location address forms --
 "anchloc|bibloc|dataloc|linkloc|listloc|mixedloc|nameloc|
  nmsploc|pathloc|proploc|queryloc|relloc|treeloc"          >

<!entity %
 link           -- Hyperlink forms --
 "agglink|clink|hylink|ilink|varlink"                       >

<!entity %
 resbase        -- Base module resource forms --
 "bosspec"                                                  >

<!entity %
 resloc         -- Location address module resource forms --
 "agrovdef|datatok|grovplan|pgrovdef"                       >

<!entity %
 resorce        -- All resource architectural forms --
 "%resbase;|%resloc;"                                       >

<!element TNM-hub  -- Topic Navigation Map HyTime hub document element --
           - O ANY +(%loc;|%link;|%resorce;)
-- OptionalAttributes [base]: bos, bosspcat --
-- OptionalAttributes [locs]: dgrvplan --
-- CommonAttributes [GenArc]: dvlatt, etfullnm, id, irefmodl,
 ireftype, lextype --
-- CommonAttributes [base]: dtxtatt, valueref --
-- CommonAttributes [locs]: refctl, refloc, reftype, rflocspn -- >

<!attlist TNM-hub
  TNM      NAME  #FIXED "TNM-hub"
  HyTime   NAME  #FIXED "HyDoc"    >

Note: The formal definition of the elements and attributes defined in this section is provided in Clause 6 of ISO/IEC 10744:1997.

Semantic assignment -- TNM.SemanticAssignment

A semantic assignment is a specialized HyTime hyperlink that connects the information objects sharing a common semantic. The set of objects referenced by a semantic assignment element is known as a topic. The objects located by the hyperlink have the common property of being anchors of a semantic assignment element. Therefore, one can distinguish:

Common examples of topics are index and/or glossary entries. An index entry is a set of locations sharing the semantics associated with the term that is displayed in the index. Indexed terms are normally displayed in alphabetical order at each level in the index's hierarchy. A glossary entry is a topic that has associated with it text that is considered to be its definition. In creating a glossary, defined terms are normally displayed as headings for their definitions, with terms typically being listed in alphabetical order. This standard enables topics that play at the same time the role of index and glossary entries: the contents of the defining element, or one of its attributes, indicates the text to be used to identify the topic while an attribute identifies the location of a formal definition of the topic, which could form a nested subelement of the topic definition or be stored in a completely separate document.

When a topic navigation map semantic assignment element is defined its anchrole attribute value (or, in the case of the varlink architectural form, the anchor specification elements it contains) have to be specified. The role of each anchor attribute (or anchor specification element) indicates the type of occurrences identified by the topic. These lists of anchor addresses are known as 'occurrence roles'. There is no limit to what can be represented and distinguished through occurrence roles, nor to the number of occurrence roles. In particular, the anchors for a given anchor role need not be prespecified, but can be ascertained through a HyTime query location. It is entirely the realm of the application to decide what to do when all anchor roles are not filled. (A "null" address could be interpreted as no occurrence, for example.) The purpose of differentiating between different kinds of occurrence roles is to help users distinguish between different kinds of targets and navigate with more precision in a large set of information objects.
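As an informal illustration that is not part of this standard, the relationship between a topic and its occurrence roles can be pictured as a mapping from role names to lists of anchor addresses. The Python sketch below assumes hypothetical role names and address strings, and treats an unfilled anchor role as "no occurrence", which is only one of the interpretations an application may choose.

```python
from typing import Dict, List

# Hypothetical in-memory model: occurrence role name -> anchor addresses.
# The role names and address strings are invented for illustration only.
OccurrenceRoles = Dict[str, List[str]]


def occurrences(roles: OccurrenceRoles, role: str) -> List[str]:
    """Return the anchors filling a role; an unfilled or unknown role yields none."""
    return list(roles.get(role, []))


def filled_roles(roles: OccurrenceRoles) -> List[str]:
    """List, in sorted order, the roles that have at least one anchor."""
    return sorted(name for name, anchors in roles.items() if anchors)


sun: OccurrenceRoles = {
    "definition": ["glossary.sgm#sun"],
    "mention": ["guide.sgm#ch2", "guide.sgm#ch7"],
    "illustration": [],  # role declared but not filled
}
```

A user interface built on such a model could offer "definition" and "mention" as separate navigation targets for the same topic.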

The semantic assignment attribute type form (TNM.SemanticAssignment) can be used to qualify as many HyTime hyperlink element types as desired in a DTD; each such element becomes a semantic assignment element type. Different semantic assignment element types will typically have different sets of anchors described in their anchrole attribute.

Most semantic assignment elements will be assigned a unique identifier (id) so that they can become part of a topic navigation map topic relationship (see Clause ???).

Note: Those semantic assignment elements that are intended to become part of a topic navigation map topic relationship must have a unique identifier (id). To prepare semantic assignment elements that can later play a role in a topic relationship, it is a good practice to assign ids to all semantic assignment definitions, although this is not an absolute requirement of this standard, since there will be cases where setting up ids will be irrelevant.

The mnemonic attribute allows a short name to be given to the semantic definition. This is the name that will typically be assigned to the topic within lists or menus of topics. The contents of the mnemonic attribute may be used by some applications to control the order in which entries are to be sorted. Where no mnemonic is assigned the value of the unique identifier attribute will be used as the mnemonic.

The topictitle attribute can optionally be used to assign a full title to the topic where this is not defined in the contents of the semantic assignment element. When this attribute has no value assigned to it, and there are no textual contents for the element, the value of the mnemonic attribute (which may be inherited from the id attribute) will form the topic title.
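The defaulting rules for the mnemonic and topictitle attributes can be sketched informally as follows. This Python fragment is an illustration under stated assumptions, not a normative algorithm: the function names are hypothetical, and the choice to let an explicit topictitle take precedence over element contents when both are present is an assumption, since the text above does not settle that case.

```python
from typing import Optional


def effective_mnemonic(id_: Optional[str], mnemonic: Optional[str]) -> Optional[str]:
    """Use the mnemonic if given; otherwise fall back to the unique identifier."""
    return mnemonic if mnemonic else id_


def effective_title(
    id_: Optional[str],
    mnemonic: Optional[str],
    topictitle: Optional[str],
    content: Optional[str],
) -> Optional[str]:
    """Resolve the topic title: topictitle, then element contents, then mnemonic."""
    if topictitle:  # assumption: an explicit attribute value wins over contents
        return topictitle
    if content and content.strip():
        return content.strip()
    return effective_mnemonic(id_, mnemonic)
```

For example, an element with id "t1", no mnemonic, no topictitle and no textual contents would be titled "t1" under these rules.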

The topicdef attribute can optionally be used to identify where the topic has been formally defined. It will normally point to either a uniquely named definition within the same document or to a HyTime location address element within the current document that identifies where the definition can be found.

The domains attribute identifies the knowledge domain(s) in which the semantic definition will be valid. There is no limit to the number of domains a topic can be associated with. Domain names must be defined as tokens conforming to the domain-name lexical type.

Editor's Note: I have presumed that domain names must form a set of space delimited tokens, with underlines or hyphens replacing spaces. Michel would prefer a more general purpose mechanism for identifying filtering agents. This could take the form of a list of the names of attributes that can be used to control filtering that is similar to the way an anchrole attribute defines which attributes identify occurrences of uses of the topic. What limitations should be placed on the construction of domain nametokens?

Editor's Note: Michel has also pointed out another potential problem here: 'The domain currently applies to the topic as a whole, and can not be fine-tuned at the level of anchors. Now suppose I am speaking about the topic "Sun". I have a domain called "Tourism", and another one called "Astronomy". I want to be able to distinguish the display of the topic and its occurrences depending on the domain. With the mechanism proposed here, we need to create two separate topics for Sun, one within the domain "Astronomy", one within the domain "Tourism". It's like if the domain would be the parent of the semantic assignment.' -- Does this imply that domains should be inherited (e.g. assigning a domain attribute to the container of a set of semantic assignments that do not have specific values for the domain attribute will cause that value to be inherited)? If so this attribute needs to be defined as a separate architectural form.

An application that can process topic navigation maps is said to be a 'TNM engine'. Among its other functions, a TNM engine must be able to suppress an element with a particular value as one of its domain-name values in a domains attribute. The question of what it means, in any particular case, for information to be so 'disqualified' is entirely the realm of the application. In general, though, the purpose of disqualification by knowledge domain is to avoid wasting the user's time and attention on irrelevant information. It is the responsibility of the application to inform the TNM engine whenever domains become valid or invalid due to changes in user context; this minimizes transmission of unwanted information. (In some applications, a user can say that all knowledge domains are always valid, and then see everything. In other applications, domains can be used for separating access levels depending on the degree of classification for different parts of the document, as defined by the hyperdocument editor in a way that cannot be modified by individual end-users.)
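Suppression by knowledge domain can be pictured with the following informal Python sketch, which is not part of this standard. It assumes that a domains attribute value is a space-delimited list of tokens (as presumed in the Editor's Note above), that a topic is suppressed as soon as any one of its domains is currently invalid, and that a topic with no domains is never suppressed. The topic identifiers and the function names are hypothetical.

```python
from typing import Dict, Iterable, List, Set


def parse_domains(value: str) -> Set[str]:
    """Split a space-delimited domains attribute value into a set of tokens."""
    return set(value.split())


def visible_topics(topics: Dict[str, str], invalid: Iterable[str]) -> List[str]:
    """Return ids of topics that carry none of the currently invalid domains.

    `topics` maps a topic id to its raw domains attribute value.
    """
    blocked = set(invalid)
    return [
        topic_id
        for topic_id, domains in topics.items()
        if not (parse_domains(domains) & blocked)
    ]


topics = {
    "sun-astro": "Astronomy",
    "sun-tour": "Tourism Leisure",
    "sun-general": "",  # no domains declared
}
```

Here the application would call visible_topics again whenever the user context changes the set of invalid domains.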

A TNM engine is responsible for maintaining a namespace of domains for each mnemonic, and a namespace of mnemonics for each domain. Given a domain, a TNM engine must be able to provide a comprehensive list of all mnemonics declared in semantic assignment elements within the document's bounded object set (BOS).
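One possible way to maintain the two namespaces is a pair of mutually consistent indexes, sketched informally below in Python. The class name, method names and sample values are hypothetical, and the sketch says nothing about how a real TNM engine would traverse the bounded object set to collect the declarations.

```python
from collections import defaultdict
from typing import DefaultDict, Iterable, List, Set


class TopicNamespaces:
    """Two-way index between mnemonics and the domains they are declared in."""

    def __init__(self) -> None:
        self._domains_by_mnemonic: DefaultDict[str, Set[str]] = defaultdict(set)
        self._mnemonics_by_domain: DefaultDict[str, Set[str]] = defaultdict(set)

    def declare(self, mnemonic: str, domains: Iterable[str]) -> None:
        """Record one semantic assignment's mnemonic and its domains."""
        for domain in domains:
            self._domains_by_mnemonic[mnemonic].add(domain)
            self._mnemonics_by_domain[domain].add(mnemonic)

    def mnemonics_in(self, domain: str) -> List[str]:
        """All mnemonics declared for a domain, in sorted order."""
        return sorted(self._mnemonics_by_domain.get(domain, set()))

    def domains_of(self, mnemonic: str) -> List[str]:
        """All domains in which a mnemonic is declared, in sorted order."""
        return sorted(self._domains_by_mnemonic.get(mnemonic, set()))


ns = TopicNamespaces()
ns.declare("sun", ["Astronomy", "Tourism"])
ns.declare("moon", ["Astronomy"])
```

Keeping both directions updated in a single declare step avoids the two namespaces drifting apart as assignments are added.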

Where appropriate the optional weighting attribute can be used to indicate the relevance of a particular instance of a topic to the application. The specified integer could, for instance, be used to control the sequence of related assignments.
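As an illustration of how an application might use the weighting to control sequence, the Python sketch below sorts assignments from most to least relevant. It is a sketch under stated assumptions, not a required behaviour: the function names and sample titles are hypothetical, and capping values above 100 is one choice that the attribute definition permits but does not mandate.

```python
from typing import List, Optional, Tuple

DEFAULT_WEIGHT = 100  # default given in the attribute definition


def effective_weight(weighting: Optional[str]) -> int:
    """Return the weight used for ordering.

    Values above 100 are capped at 100; the draft allows an application
    to do this but does not require it.
    """
    if weighting is None:
        return DEFAULT_WEIGHT
    return min(int(weighting), 100)


def order_by_relevance(items: List[Tuple[str, Optional[str]]]) -> List[str]:
    """Sort (title, weighting) pairs, most relevant first; ties keep input order."""
    ranked = sorted(items, key=lambda item: effective_weight(item[1]), reverse=True)
    return [title for title, _ in ranked]


# Hypothetical assignments: (topic title, weighting attribute value or None).
print(order_by_relevance(
    [("Sunburn", "20"), ("Solar flare", "250"), ("Sunspot", None), ("Eclipse", "75")]
))
# ['Solar flare', 'Sunspot', 'Eclipse', 'Sunburn']
```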

<!attlist

-- TNM.SemanticAssignment --

--Topic Navigation Map Semantic Assignment attributes --

-- Clause ???? --

(agglink|clink|hylink|ilink|varlink)

  TNM         NAME        "TNM.SemanticAssignment"

  id          -- Unique identifier for assignment --

              -- ISO/IEC 10744 Clause: A.5.2 --

              ID          #IMPLIED

              -- Each TNM.SemanticAssignment should be assigned a unique

                 identifier to allow the topic to be used as an

                 anchor for a topic relationship link. As all topics need

                 not be anchors, the id is not required. --

  mnemonic    -- Mnemonic used to identify the topic in menus,

                 or to determine the order of topics --

              -- The short name for the subject matter of

                 this definition; machine-processable identifier.

                 Can be seen as a "semantically-loaded identifier"

                 (which may or may not be unique)  --

              CDATA       #IMPLIED -- Default: Value of id attribute --

  topictitle  -- Title by which semantic is referred to --

              CDATA       #IMPLIED -- Default: Contents of element,

                                      or value of mnemonic attribute --

  topicdef    -- Topic semantic definition pointer --

              -- Identifier(s) of element(s) containing text that

                 defines the topic. --

              -- Lextype: (IDREFS)--

              CDATA       #IMPLIED

              -- Default: No formal definition. --

  domains     -- Domains relevant to topic --

              -- Defines the knowledge domains (semantic universes) in which

                 this topic is relevant. This attribute is generally used to

                 filter out non-relevant topics according to a list of

                 domains chosen by the user. --

              -- Lextype: (domain-name+) --

              -- Question: What constraints should be placed on

                           a domain-name? Should it

                           be a valid SGML name or any text with

                           spaces replaced by underlines so that

                           spaces delimit universe specifications?

                           Where is the relevant lextype declared?--

              CDATA      #IMPLIED -- Default: All knowledge domains --

  weighting   -- Weighting of assignment relevance within domain(s) --

              -- Indicates relevance of topic to associated domains.

                    0 = possibly irrelevant.

                  100 = highly relevant.

                 If a weight is more than 100, it can indicate extreme

                 or compelling relevance, though weights greater

                 than 100 may be deemed to be 100 by any application.--

              NUMBER      #IMPLIED --Default: 100--

>

Representation of relationships between topics -- TNM.TopicRelationship

Topics may be linked to one another by means of HyTime hyperlinks that carry the attributes defined in the topic relationship attribute type architectural form (TNM.TopicRelationship). Any number of relationships can exist between any two or more topics. Linked topics create topic navigation maps. Topic navigation maps can provide skeletal structures to which related occurrences can subsequently be added.

Note: Topic navigation maps provide a flexible mechanism for defining database schemas. Topic navigation maps allow relational and hierarchical views of data models to be combined.
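The way linked topics form a navigable map can be pictured with a small data structure. The Python sketch below is illustrative only: the class `TopicRelationship`, its fields, and the function `related_topics` are hypothetical names, and the anchor role names and topic identifiers are sample data. Each relationship records which topics fill each anchor role, so one relationship can connect two or more topics.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Set


@dataclass
class TopicRelationship:
    kind: str                      # e.g. the element type name of the link
    anchors: Dict[str, List[str]]  # anchor role -> ids of the topics filling it


def related_topics(topic_id: str, relationships: Iterable[TopicRelationship]) -> Set[str]:
    """Return every topic that shares at least one relationship with topic_id."""
    found: Set[str] = set()
    for rel in relationships:
        members = {t for ids in rel.anchors.values() for t in ids}
        if topic_id in members:
            found |= members - {topic_id}
    return found


# Hypothetical links between thesaurus-style topics.
links = [
    TopicRelationship("broader-terms", {"term": ["reality"], "broader": ["existence", "being"]}),
    TopicRelationship("narrower-terms", {"term": ["reality"], "narrower": ["fact"]}),
    TopicRelationship("broader-terms", {"term": ["fact"], "broader": ["reality"]}),
]
print(sorted(related_topics("fact", links)))     # ['reality']
print(sorted(related_topics("reality", links)))  # ['being', 'existence', 'fact']
```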

Normally the element type name for the topic relationship element will sufficiently identify the type of relationship being defined. Alternatively, the name of the relationship between the objects identified by the occurrence roles for the associated link element can form the displayable contents of the topic relationship element. The relationship attribute can optionally be used to identify where the relevant relationship has been recorded. It will normally point either to a uniquely named definition within the same document or to a HyTime location address element within the current document that identifies where the definition can be found.

All other attributes in this architectural form have the same role as for the TNM.SemanticAssignment attribute type architectural form (see Clause ????).

<!attlist

-- TNM.TopicRelationship --

-- Topic Navigation Map Topic Relationship --

-- Clause ???? --

(agglink|clink|hylink|ilink|varlink)

  TNM         NAME        "TNM.TopicRelationship"

  id          -- Unique identifier for relationship --

              -- ISO/IEC 10744 Clause: A.5.2 --

              ID          #IMPLIED

              -- Each TNM.TopicRelationship can be assigned a unique

                 identifier to allow the relationship to be used as an

                 anchor for a topic relationship link. As all relationships

                 need not be anchors, the id is not required. --

  relationship -- Relationship definition identifier --

              -- Identifier(s) of element(s) containing text that

                 defines the relationship. --

              -- Lextype: (IDREFS)--

              CDATA       #IMPLIED

              -- Default: Relationship is defined sufficiently by name or contents of link element. --

  domains     -- Domains relevant to relationship --

              -- Defines the knowledge domains (semantic universes) in which

                 this relationship is relevant. This attribute is generally used

                 to filter out non-relevant relationships according to a list of

                 domains chosen by the user. --

              -- Lextype: (domain-name+) --

              -- Question: What constraints should be placed on

                           a domain-name? Should it

                           be a valid SGML name or any text with

                           spaces replaced by underlines so that

                           spaces delimit universe specifications?

                           Where is the relevant lextype declared?--

              CDATA      #IMPLIED -- Default: All knowledge domains --

  weighting   -- Weighting of assignment relevance within domain(s) --

              -- Indicates relevance of topic to associated domains.

                    0 = possibly irrelevant.

                  100 = highly relevant.

                 If a weight is more than 100, it can indicate extreme

                 or compelling relevance, though weights greater

                 than 100 may be deemed to be 100 by any application.--

              NUMBER      #IMPLIED --Default: 100--

>

Informative Annex A: Examples

Note: These examples have deliberately been chosen to challenge some of the assumptions made in the definitions of the architectural forms, and should not be taken as indicating the best way of implementing topic navigation maps. Whether or not examples will be provided in the published standard is still to be debated. For the time being these examples should be viewed only as something designed to stimulate discussion about the preceding architectural forms.

No SGML declaration is provided for the following examples. It is presumed that the SGML declaration used does not change the markup character set or the delimiters associated with the reference concrete syntax defined in ISO 8879, other than the fact that colons are permitted within names, and that the SGML declaration ends with the following statement:

APPINFO "ArcBase"

Note: Extending names so that colons can occur within them allows the namespace rules commonly employed by SGML applications to be adopted in the following text. However, as some of the examples deliberately suggest uses for more than one colon, they do not conform to the namespace rules currently being proposed for XML.

The examples in this annex start with a reference to a file containing the following library definitions for a topic navigation map architecture:

      <!-- Architecture definition for Topic Navigation Maps -->

<!-- This file can be referred to using the following public identifier:

   "ISO 13250//DTD AFDR TNM Meta-DTD Support Declarations//EN"

-->

<!AFDR "ISO/IEC 10744:1997">

<!NOTATION TNM PUBLIC "ISO 13250:1998//NOTATION AFDR ARCBASE

               Topic Navigation Map

               Architecture Definition Document//EN"

            -- A derived architecture used in conformance with the

               Architectural Form Definition Requirements of

               International Standard ISO/IEC 10744.           -->

<!ATTLIST #NOTATION TNM

  ArcFormA       NAME    #FIXED "TNM"

  ArcNamrA       NAME    TNM-name

                         -- Can be used to assign locally

                            meaningful names to TNM attributes --

  ArcDTD         CDATA   #FIXED "TNM-DTD"

  ArcDocF        NAME   "TNM-hub"

  ArcAuto        (ArcAuto|nArcAuto) "nArcAuto"

  ArcQuant       CDATA   #FIXED "NAMELEN 24"

  -- NAMELEN has been extended to 24 to allow "natural" naming

     to be applied to Topic Navigation Map attribute values -- >



<!NOTATION SGML PUBLIC "ISO 8879:1997//NOTATION Standard Generalized

                        Markup Language//EN" >

<!ATTLIST #NOTATION SGML

  boslevel       NUMBER  #IMPLIED

                 -- Default determined by doc's boslevel attribute --

  inbos          (inbos|notinbos)  inbos    -- Is it in/out of bos? --

  subhub         (subhub|nosubhub) subhub   -- TNMs can be subhub documents --

>

<!NOTATION AFDRMeta --AFDR Meta-DTD Notation--

            PUBLIC "ISO/IEC 10744:1997//NOTATION AFDR

                    Meta-DTD Notation//EN" >

<!ENTITY TNM-DTD    -- Meta-DTD for navigation map instances --

            PUBLIC "ISO 13250:1998//DTD AFDR Meta-DTD

                    Topic Navigation Map//EN" CDATA AFDRMeta >

         <!-- HyTime Support Declarations -->

<!NOTATION

 HyTime         -- HyTime Architecture --

                -- A base architecture used in conformance with the

                   Architectural Form Definition Requirements of

                   International Standard ISO/IEC 10744. --

       PUBLIC "ISO/IEC 10744:1997//NOTATION AFDR ARCBASE

               Hypermedia/Time-based Structuring Language (HyTime)//EN"

>

<!ATTLIST #NOTATION HyTime

 ArcFormA NAME     HyTime

 ArcNamrA NAME     HyNames

 ArcSuprA NAME     sHyTime

 ArcIgnDA NAME     HyIgnD

 ArcDocF  NAME     #FIXED HyDoc

 ArcDTD   CDATA    "HyTime"

 ArcQuant CDATA    #FIXED "NAMELEN 9"

 -- NAMELEN must be at least 9 because the HyTime meta-DTD uses

    8-character parameter entity names. --

 ArcDataF NAME     #FIXED HyBridN

 ArcBridF NAME     #FIXED HyBrid

 ArcAuto  (ArcAuto|nArcAuto) nArcAuto

 -- For some HyTime forms, the generic identifiers of client elements

    are meaningful to a HyTime engine and must be customized.

    Therefore, automatic form mapping may be inappropriate. --

 ArcOptSA NAMES    "GenArc base locs links"

 GenArc         -- General architecture facilities --

          CDATA -- Lextype: csname+ --

                "HyLex ireftype lextype"

 base           -- Base module facilities --

          CDATA -- Lextype: csname+ --

                "bos bosspec conloc dimspec HyDimLst HyDimSpc valueref"

 locs           -- Location address module facilities --

                -- Clause: 6.3 --

          CDATA -- Lextype: csname+ --

                "agrovdef bibloc dataloc datatok grovplan listloc mixedloc

                 multloc nameloc nmsploc pathloc pgrovdef proploc queryloc

                 refctl referatt refloc reftype relloc spanloc treecom treeloc

                 treetype"

 links          -- Hyperlinks module facilities --

                -- Clause: 6.3 --

          CDATA -- Lextype: csname+ --

                "agglink anchloc clink hylink ilink linkloc traverse varlink"

 anysgml        -- Any SGML declaration allowed --

                -- Clause: 6.3 --

         (anysgml|nanysgml) anysgml

 exrefs         -- External references allowed --

                -- Clause: 6.3 --

         (exrefs|nexrefs) exrefs

 refmodel       -- SGML model groups for reftype --

                -- Clause: 6.3 --

         (SGMLmdl|nSGMLmdl) SGMLmdl

 hyqcnt         -- Highest quantum count limit --

                -- Clause: 6.3 --

         NUMBER -- Constraint: power of 2 >= 32  --

                32

 manyanch       -- Maximum number of anchors allowed in

                   hyperlinks --

                -- Clause: 6.3 --

         NUMBER -- Constraint: must be >= 2 --

                #IMPLIED  -- Default: no limit --

>

<!NOTATION AFDRMeta

 PUBLIC "ISO/IEC 10744:1997//NOTATION AFDR Meta-DTD Notation//EN"

>

<!ENTITY

 HyTime         -- HyTime meta-DTD --

                -- Clause: 6.3 --

        PUBLIC "ISO/IEC 10744:1997//DTD AFDR Meta-DTD

                Hypermedia/Time-based Structuring Language (HyTime)//EN"

                CDATA AFDRMeta

>

Creating a Combined Thesaurus and Dictionary

The following example illustrates how the architectural forms defined in this standard could be combined with the hylink and varlink element type architectural forms defined in ISO/IEC 10744:1997 to create a (simplified) document type definition for a navigable thesaurus/dictionary:

<!DOCTYPE thesaurus [

<!ENTITY % TNM.library PUBLIC

   "ISO 13250//DTD AFDR TNM Meta-DTD Support Declarations//EN" >

%TNM.library;

<!ENTITY % translations -- Sources of translations --

              "french-equivalent|german-equivalent|italian-equivalent" >

<!ENTITY % locs        -- Local location address forms --

              "external-definition|example|%translations;">

<!ELEMENT thesaurus  - O (entry+) +(definitions|%locs;) >

<!ATTLIST thesaurus  TNM       NAME  #FIXED "TNM-hub"

                     HyTime    NAME  #FIXED "HyDoc"

>

<!ELEMENT entry - O (term, equivalent-terms, broader-terms?,

                     narrower-terms?, french-equivalents?,

                     german-equivalents?, italian-equivalents?) >

<!ATTLIST entry

  TNM                NAME   "TNM.TopicRelationship"

  id                 ID     #REQUIRED

  relationship       CDATA  #FIXED "thesaurus-relationships"

  domains            CDATA  #IMPLIED -- Default: All domains --

  HyTime             NAME   varlink

  anchrole           NAME   #IMPLIED --Default: Not a self anchor --

  linktrav           -- Hyperlink traversal rules:

                        A any traversal or departure --

                     -- Constraint: one per anchor or one for all --

                     -- Lextype: (("I"|"R"|"D"|"A"|"N"|"P"|"ID"|"RD"|

                                   "EI"|"ER"|"ED"|"EN"|"EP"|"ERD")+) --

                     NAMES  "A"

  listtrav           -- Traversal between members of list anchors.

                        A Adjacent traversal

                        W Wrapping traversal --

                     -- Lextype: (("N"|"L"|"R"|"A"|"LW"|"RW"|"AW")+) --

                     NAMES "AW"

>

<!ELEMENT (term|equivalent-terms|broader-terms|narrower-terms|

           french-equivalents|german-equivalents|italian-equivalents)

               - O (#PCDATA) -- Reference --

                             -- Note: Requires use of refloc facility -- >

<!ATTLIST (term|equivalent-terms|broader-terms|narrower-terms|

           french-equivalents|german-equivalents|italian-equivalents)

  TNM                NAME   TNM.TopicRelationship

  domains            CDATA  #IMPLIED -- Default: All domains --

  HyTime             NAME   anchspec

  anchrole           NAMES  #IMPLIED -- Default: GI (element type) --

  loctype            CDATA  #FIXED "#CONTENT IDLOC"

                     -- Content of element is a list of IDs --

  relationship       (term|equivalent-terms|broader-terms|narrower-terms|

                      french-equivalents|german-equivalents|italian-equivalents)

                           #IMPLIED -- Default: GI (element type) --

  weighting       -- Indicates relevance to associated terms.

                        0 = probably irrelevant.

                      100 = definitely relevant.

                     If a weight is more than 100, it can indicate extreme

                     or compelling relevance, though weights greater

                     than 100 may be deemed to be 100 by any application.--

                     NUMBER  #IMPLIED --Default: 100--

  linktrav           -- Hyperlink traversal rules:

                        A any traversal or departure (EID) --

                     -- Constraint: one per anchor or one for all --

                     -- Lextype: (("I"|"R"|"D"|"A"|"N"|"P"|"ID"|"RD"|

                                   "EI"|"ER"|"ED"|"EN"|"EP"|"ERD")+) --

                     NAMES  "A"

  listtrav           -- Traversal between members of list anchors.

                        A adjacent traversal

                        W wrapping traversal --

                     -- Lextype: (("N"|"L"|"R"|"A"|"LW"|"RW"|"AW")+) --

                     NAMES "AW"

  multmem            -- May anchor have multiple members? --

                     (single|list|corlist) #FIXED list

  emptyanch          -- Is empty anchor an error? --

                     -- Constraint: one per anchor or one for all --

                     -- Lextype: ("ERROR"|"NOTERROR") --

                     NAMES "error"

>

<!ELEMENT definitions -- Definitions of term --

          - O (local-definition|external-definition|example|citation)+>

<!ATTLIST definitions

  TNM         NAME        TNM.SemanticAssignment

  id          ID          #REQUIRED

  mnemonic    -- Short name for the term --

              CDATA       #IMPLIED -- Default: Same as id --

  topictitle  -- Full form for term --

              CDATA       #IMPLIED -- Default: Same as mnemonic (could

                                               be inherited from id) --

  topicdef    -- Topic semantic definition pointer --

              -- Identifier(s) of element(s) containing text that

                 defines the topic. --

              -- Lextype: (IDREFS)--

              CDATA       #IMPLIED

              -- Default: Embedded local-definition elements --

  domains     -- Knowledge domains in which definition is relevant --

              -- Lextype: (domain-name+) --

              CDATA       #IMPLIED -- Default: All knowledge domains --

  weighting   -- Indicates relevance of term to domain(s):

                 terms could be ordered based on relevance

                 within domain.

                   0 = probably irrelevant.

                 100 = definitely relevant.

                 If a weight is more than 100, it can indicate extreme

                 or compelling relevance, though weights greater

                 than 100 may be deemed to be 100 by any application.--

              NUMBER      #IMPLIED -- Default: 100 --

  HyTime      NAME        hylink

  anchrole    NAMES       #FIXED

                          "OED        #LIST

                           Websters   #LIST

                           Chambers   #LIST

                           examples   #LIST"

  anchcstr    NAMES       "COND"

  OED         CDATA   -- Reference --           #IMPLIED

                      -- Default: Term not defined in OED --

  Websters    CDATA   -- Reference --           #IMPLIED

                      -- Default: Term not defined in Websters --

  Chambers    CDATA   -- Reference --           #IMPLIED

                      -- Default: Term not defined in Chambers --

  examples    CDATA   -- Reference --           #IMPLIED

                      -- Default: No examples of use of term found --

  linktrav    -- Hyperlink traversal rules. One or more of:

                 E traversal after external arrival

                 I traversal after internal arrival

                 R return traversal after internal arrival

                 D departure after internal arrival

                 A any traversal or departure (EID)

                 N no traversal after internal arrival

                 P no internal arrival --

              -- Constraint: one per anchor or one for all --

              -- Lextype: (("I"|"R"|"D"|"A"|"N"|"P"|"ID"|"RD"|

                            "EI"|"ER"|"ED"|"EN"|"EP"|"ERD")+) --

              NAMES   #IMPLIED  -- Default: Any traversal or departure --

  listtrav    -- Traversal between members of list anchors.

                 One of:

                  N no traversal

                  L left traversal

                  R right traversal

                  A adjacent (both left and right) traversal

                 optionally concatenated with:

                  W wrapping traversal --

              -- Constraint: one per anchor or one for all --

              -- Lextype: (("N"|"L"|"R"|"A"|"LW"|"RW"|"AW")+) --

              NAMES   #IMPLIED  -- Default: No list traversal --

  emptyanch   -- Is empty anchor an error? --

              -- Constraint: one per anchor or one for all --

              -- Lextype: ("ERROR"|"NOTERROR") --

              NAMES   "noterror" -- Allows topic maps that have no

                                              existing occurrences to be defined --

>

<!ELEMENT local-definition   - O (#PCDATA|list)+ >

<!ELEMENT list O O (item+)>

<!ELEMENT item - O (#PCDATA) >

<!ELEMENT (external-definition|example|%translations;)

           --External object locators --

           -- Addresses objects by name in a name space --

           - O (#PCDATA) -- Lextype: (word|literal)+ -- >

<!ATTLIST (external-definition|example|%translations;)

  HyTime      NAME    "nmsploc"

  id          ID      #REQUIRED

  locsrc      -- Location source document --

              ENTITY #REQUIRED

  impsrc      -- Implied location source --

              (grovert|implicit|ptreert|referatt|referrer) implicit

  namespc     -- Name space property of location source from

                 which named nodes are selected --

              NAME  "element"

  --? Must all (or any) of the following be defined for a nmsploc? --

  notspace    -- If namespc name is invalid? --

              NAME -- Lextype: ("ERROR"|"IGNORE") -- ERROR

  notname     -- If name is not valid in namespc? --

              NAME -- Lextype: ("ERROR"|"IGNORE") -- ERROR

  cantcnst    -- Cannot construct grove from located data --

              NAME -- Lextype: ("ERROR"|"IGNORE") -- ERROR

  notprop     -- If not property of locsrc --

              NAME -- Lextype: ("ERROR"|"IGNORE") -- ERROR

  apropsrc    -- Use additional property source? --

              (apropsrc|solesrc)                     solesrc

  direct      -- Direct or indirect value? --

              (direct|indirect)                      indirect

>

<!ELEMENT citation -- Location of objects not electronically available --

              - O (#PCDATA) >

<!ATTLIST citation

              HyTime      NAME     bibloc

>

<!ENTITY OED PUBLIC "-//OUP//DOCUMENT Oxford English Dictionary//EN">

<!ENTITY Websters PUBLIC "-//Longmans//DOCUMENT Dr Webster's Abridged Dictionary//EN">

<!ENTITY Chambers PUBLIC "-//Chambers//DOCUMENT Dictionary//EN">

<!ENTITY Harraps PUBLIC

         "-//Harraps//DOCUMENT Shorter French Dictionary//FR">

<!ENTITY Duden PUBLIC

         "-//Duden//DOCUMENT Stilworterbuch//DE">

<!ENTITY Italian-Gem PUBLIC

         "-//Collins//DOCUMENT Gem Italian Dictionary//IT">

<!ENTITY % references -- The set of entity definitions that identifies all

                         sources of documents referenced in the examples --

           SYSTEM "references.ent" >

%references;

]>

<thesaurus>

<entry id="term1234">

<term>reality</term>

<equivalent-terms>substantiality in-accordance-with-fact absoluteness factualness no-lie</equivalent-terms>

<broader-terms>existence being</broader-terms>

<narrower-terms>fact thing entity</narrower-terms>

<french-equivalents>harraps:realite harraps:reel</french-equivalents>

<german-equivalents>DE:Realitat DE:Wirklichkeit DE:Naturtreue DE:Tatsache

DE:Faktum</german-equivalents>

<italian-equivalents>IT:gem:realta</italian-equivalents>

</entry>

...

<definitions id="reality" OED="OED:reality-1 OED:reality-3" Websters="US:reality-2" Chambers="Scotch:reality" examples="Russell.B:5:reality Russell.B:7:substantiality Leibnitz:1:philosophy">

<local-definition>being real; existent thing; the real nature of</local-definition>

<external-definition id="OED:reality-1" locsrc="OED">reality-1</external-definition>

<external-definition id="OED:reality-3" locsrc="OED">reality-3</external-definition>

<external-definition id="US:reality-2" locsrc="Websters">reality-2</external-definition>

<external-definition id="Scotch:reality" locsrc="Chambers">reality</external-definition>

<example id="Russell.B:5:reality" locsrc="Russell.B:5">reality</example>

<example id="Russell.B:7:substantiality" locsrc="Russell.B:7">substantiality</example>

<example id="Leibnitz:1:philosophy" locsrc="Leibnitz:1">philosophy</example>

</definitions>

<definitions id="in-accordance-with-fact" topictitle="In accordance with facts"

examples="Leibnitz:1:philosophy Russell.B:5:facts">

<local-definition>That which agrees with well substantiated facts</local-definition>

<example id="Russell.B:5:facts" locsrc="Russell.B:5">facts</example>

</definitions>

...

<definitions id="local-definitions">

<local-definition id="thesaurus-relationships">In this thesaurus, each entry consists of:

<list>

<item>a compulsory term definition

<item>one or more identifiers of equivalent terms

<item>optionally, one or more identifiers of broader terms which can be deemed to encompass the term in the category title

<item>optionally, one or more narrower terms which can be deemed to form a sub-category of the term in the category title

<item>optionally, one or more pointers to French equivalents of the term

<item>optionally, one or more pointers to German equivalents of the term

<item>optionally, one or more pointers to Italian equivalents of the term.

</list>

</local-definition>

...

</definitions>

...

<french-equivalent id="harraps:realite" locsrc="Harraps">

realite</french-equivalent>

<german-equivalent id="DE:Wirklichkeit" locsrc="Duden">

Wirklichkeit</german-equivalent>

...

</thesaurus>

Note: Many existing examples of the use of topic maps rely on the relationships between people. The following example is an attempt to fully define the relations a person may have using the topic relations.

A genealogy could be defined as part of a topic navigation map by defining a topic relation element type, and supporting elements, entities and processing instructions, of the following form:

<!DOCTYPE genealogy [

<!ENTITY % TNM.library PUBLIC

           "ISO 13250//DTD AFDR TNM Meta-DTD Support Declarations//EN" >

%TNM.library;

<!ENTITY % calendars PUBLIC "ISO 13250//NONSGML LEXTYPES

                             TNM-example UTC date order//EN">

<!-- Note: This is an identifier for some of the lextypes defined in

   the unnamed set in Annex C.1.1 of the 2nd edition of ISO/IEC 10744. -->

<?IS10744 LEXUSE calendars>

<!ELEMENT genealogy - O (person+) >

<!ATTLIST genealogy title CDATA #IMPLIED >

<!ELEMENT person - O (family-name, previous-family-name*, given-names,

                      born, died?, relations) >

<!ATTLIST person id     ID    #REQUIRED    >

<!ELEMENT (family-name|previous-family-name|given-names) - O (#PCDATA)>

<!ELEMENT (born|died) - O (#PCDATA) -- Lextype: UTCdate -- >

<!ELEMENT relations - O EMPTY >

<!ATTLIST relations

  TNM                  NAME   TNM.TopicRelationship

  id                   ID     #REQUIRED

  relationship         CDATA  -- Lextype: (IDREF) --

                              #IMPLIED -- Default: No formal definition --

  domains              CDATA  #IMPLIED    -- Default: not specified (not

                                             relevant to genealogy) --

  --? Should all (or some) of the TNM attributes be optional? --

  HyTime               NAME   hylink

  anchrole             NAMES  #FIXED

                              "person

                               maternal-ancestors   #LIST

                               paternal-ancestors   #LIST

                               spouses              #LIST

                               other-child-creators #LIST

                               sons                 #LIST

                               daughters            #LIST"

  anchcstr             NAMES "OMIT REQUIRED OMIT OMIT OMIT OMIT OMIT"

  person               IDREF  #IMPLIED -- Default: ID of parent element --

  maternal-ancestors   IDREFS #REQUIRED -- Lextype: (born

                                                     #ORDER calendar) --

  --? What is the correct form for an instruction to the

      effect that ancestors must be listed so that the earliest

      is listed first? --

                                        -- Reftype: person --

  paternal-ancestors   IDREFS #IMPLIED  -- Lextype: (born

                                                     #ORDER calendar) --

                                        -- Reftype: person --

  maternal-siblings    IDREFS #IMPLIED  -- Reftype: person --

  paternal-siblings    IDREFS #IMPLIED  -- Reftype: person --

  --Question: Would calendar order be appropriate for siblings?--

  spouses              CDATA  -- Reftype: person--

                        -- Lextype: (IDREF, UTC-date, UTC-date?)+ --

                        -- Constraint: First UTC-date identifies

                           start of relationship, second one

                           identifies end of relationship --

                        #IMPLIED

  other-child-creators IDREFS #IMPLIED  -- Reftype: person --

  sons                 IDREFS #IMPLIED  -- Reftype: person --

  daughters            IDREFS #IMPLIED  -- Reftype: person --

                       --? Is birth data significant for children? --

  linktrav             -- Hyperlink traversal rules:

                          A any traversal or departure (EID) --

                       -- Constraint: one per anchor or one for all --

                       -- Lextype: (("I"|"R"|"D"|"A"|"N"|"P"|"ID"|"RD"|

                                     "EI"|"ER"|"ED"|"EN"|"EP"|"ERD")+) --

                       NAMES  "A"

  listtrav             -- Traversal between members of list anchors.

                          A adjacent (both left and right) traversal

                          W wrapping traversal --

                       -- Lextype: (("N"|"L"|"R"|"A"|"LW"|"RW"|"AW")+) --

                       NAMES "AW"

  emptyanch            -- Is empty anchor an error? --

                       -- Constraint: one per anchor or one for all --

                       -- Lextype: ("ERROR"|"NOTERROR") --

                       NAMES "error noterror noterror noterror

                              noterror noterror noterror"

 >

]>

<genealogy title="The Bryan Family">

<person id="mtb">

<family-name>Bryan

<given-names>Martin Terry

<born>1948-03-08

<relations

 maternal-ancestors="mm flm smb"

 paternal-ancestors="eb lfb"

 spouses="gmb 1978-06-17"

 sons="dab pab">

<person id="gmb">

<family-name>Bryan

<previous-family-name>Oxley

<given-names>Gillian Margaret

<born>1951-03-06

<relations

 maternal-ancestors="mo"

 paternal-ancestors="no"

 spouses="mtb 1978-06-17"

 sons="dab pab">

<person id="dab">

<family-name>Bryan

<given-names>David Alexander

<born>1984-04-18

<relations

 maternal-ancestors="no mo gmb"

 paternal-ancestors="lfb smb mtb">

For the preceding declarations to work there would need to be a lexical ordering definition of the following form within the file identified through the LEXUSE processing instruction:

<!--

This file is identified by the following public identifier:

  "ISO 13250//NONSGML LEXTYPES TNM-example UTC date order//EN"

This set extends the default lexical ordering set, HyOrd, found

in HyTime.

-->

<!LEXORD calendar  -- UTC date order --

           SPEC PUBLIC "ISO 13250//NOTATION LEXORD

                        TNM-example UTC date order//EN" > 
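The ordering that the calendar declaration is intended to supply can be illustrated outside SGML. The following Python sketch is not part of this standard. It assumes that dates are written in the YYYY-MM-DD form used in the born elements of the example, and the names born, by_calendar and by_text are hypothetical. Under that assumption, comparing the date strings character by character gives the same order as comparing the calendar dates.

```python
from datetime import date

# Birth dates copied from the <born> elements of the genealogy example.
born = {"gmb": "1951-03-06", "dab": "1984-04-18", "mtb": "1948-03-08"}

# Order the person identifiers by calendar date.
by_calendar = sorted(born, key=lambda pid: date.fromisoformat(born[pid]))

# Order the same identifiers by plain string comparison of the date text.
by_text = sorted(born, key=lambda pid: born[pid])

print(by_calendar)
print(by_calendar == by_text)
```

The two orders agree only because every field is zero-padded and the most significant field comes first. Dates written in any other form would need an explicit ordering definition.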

Editor's Note: Do we need to define date ordering as part of ISO 13250? Michel has suggested that we should adopt the HyTime HyCalSpc notation without modification. I wonder whether this would be overkill (it allows Julian dates, and times to the nearest 100th of a second, to be defined).

The Text Encoding Initiative (TEI) has defined the following set of elements for defining taxonomies within TEI class declarations:

<!ELEMENT classDecl   - - (taxonomy+)>

<!ATTLIST classDecl       %a.global; >

<!ELEMENT taxonomy    - - (category+ |

                           ((bibl | biblStruct | biblFull), category*))>

<!ATTLIST taxonomy        %a.global; >

<!ELEMENT category    - - (catDesc, category*)>

<!ATTLIST category        %a.global; >

<!ELEMENT catDesc     - o (%phrase.seq; | textDesc) >

<!ATTLIST catDesc         %a.global; >

The example of the use of this construct given in the TEI Guidelines for Electronic Text Encoding and Interchange (TEI P3) is:

<taxonomy id=B>

 <bibl>Brown Corpus</bibl>

 <category id=B.A><catDesc>Press Reportage

 <category id=B.A1><catDesc>Daily</category>

 <category id=B.A2><catDesc>Sunday</category>

 <category id=B.A3><catDesc>National</category>

 <category id=B.A4><catDesc>Provincial</category>

 <category id=B.A5><catDesc>Political</category>

 <category id=B.A6><catDesc>Sports</category>

 ...

 </category>

 <category id=B.D><catDesc>Religion

 <category id=B.D1><catDesc>Books</category>

 <category id=B.D2><catDesc>Periodicals and tracts</category>

 </category>

 ...

</taxonomy>

To use this existing data as a topic navigation map you would extend the existing definitions as follows:

<!ENTITY % TNM.TR  --Attributes used to define a Topic Navigation Map

                     Topic Relation--

 'TNM         NAME        "TNM.TopicRelationship"

  id          -- Unique identifier for assignment --

              -- ISO/IEC 10744 Clause: A.5.2 --

              ID          #IMPLIED

              -- Each TNM.TopicRelationship can be assigned a unique

                 identifier to allow the relationship to be used as an

                 anchor for a topic relationship link. As all topics need

                 not be anchors, the id is not required. --

  relationship -- Relationship definition identifier --

              -- Identifier(s) of element(s) containing text that

                 defines the topic. --

              -- Lextype: (IDREFS)--

              CDATA       #IMPLIED

              -- Default: Relationship is not formally defined --

  domains     -- Domains relevant to topic --

              -- Defines the knowledge domains (semantic universes) in which

                 this relationship is relevant. This attribute is generally used

                 to filter out non-relevant relationships according to a list of

                 domains chosen by the user. --

              -- Lextype: (domain-name+) --

              CDATA      #IMPLIED -- Default: All knowledge domains --

  weighting   -- Weighting of assignment relevance within domain(s) --

              -- Indicates relevance of topic to associated domains.

                    0 = possibly irrelevant.

                  100 = highly relevant.

                 If a weight is more than 100, it can indicate extreme

                 or compelling relevance, though weights greater

                 than 100 may be deemed to be 100 by any application.--

              NUMBER      #IMPLIED --Default: 100-- '

>

<!ENTITY % TNM.SA   --Attributes used to define a Topic Navigation Map

                      Semantic Assignment--

 'TNM         NAME        "TNM.SemanticAssignment"

  id          -- Unique identifier for assignment --

              -- ISO/IEC 10744 Clause: A.5.2 --

              ID          #IMPLIED

              -- Each TNM.SemanticAssignment should be assigned a unique

                 identifier to allow the topic to be used as an

                 anchor for a topic relationship link. As all topics need

                 not be anchors, the id is not required. --

  mnemonic    -- Mnemonic used to identify topic in menus --

              -- The short name for the subject matter of

                 this definition; machine-processable identifier.

                 Can be seen as a "semantically-loaded identifier"

                 (which may or may not be unique)  --

              CDATA       #IMPLIED

  topictitle  -- Title by which semantic is referred to --

              CDATA       #IMPLIED -- Default: Contents of element,

                                      or value of mnemonic attribute --

  topicdef    -- Topic semantic definition pointer --

              -- Identifier(s) of element(s) containing text that

                 defines the topic. --

              -- Lextype: (IDREFS)--

              CDATA       #IMPLIED

              -- Default: No formal definition --

  domains     -- Domains relevant to topic --

              -- Defines the knowledge domains (semantic universes) in which

                 this topic is relevant. This attribute is generally used to

                 filter out non-relevant topics according to a list of

                 domains chosen by the user. --

              -- Lextype: (domain-name+) --

              CDATA      #IMPLIED -- Default: All knowledge domains --

  weighting   -- Weighting of assignment relevance within domain(s) --

              -- Indicates relevance of topic to associated domains.

                    0 = possibly irrelevant.

                  100 = highly relevant.

                 If a weight is more than 100, it can indicate extreme

                 or compelling relevance, though weights greater

                 than 100 may be deemed to be 100 by any application.--

              NUMBER      #IMPLIED --Default: 100-- '

>
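How an application might apply the domains and weighting attributes can be sketched as follows. This Python fragment is illustrative only and is not part of this standard. The Topic class and the functions effective_weight and visible_topics are hypothetical names. The sketch assumes that domain names are separated by white space, that an empty user selection means no filtering, that an omitted weighting takes the default of 100 given for TNM.TR, and that the application chooses to treat weights above 100 as 100.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Topic:
    mnemonic: str
    domains: Optional[str] = None    # None models an omitted (#IMPLIED) attribute
    weighting: Optional[int] = None  # None models an omitted (#IMPLIED) attribute


def effective_weight(topic: Topic) -> int:
    """Treat a missing weight as 100 and cap larger weights at 100."""
    if topic.weighting is None:
        return 100
    return min(topic.weighting, 100)


def visible_topics(topics: List[Topic], selected: List[str]) -> List[Topic]:
    """Keep topics relevant to a selected domain, highest weight first."""
    wanted = set(selected)
    kept = [
        t for t in topics
        # An omitted domains attribute means "all knowledge domains".
        if not wanted or t.domains is None or wanted & set(t.domains.split())
    ]
    return sorted(kept, key=effective_weight, reverse=True)


topics = [
    Topic("daily", domains="press", weighting=40),
    Topic("books", domains="religion", weighting=250),
    Topic("general"),
]
print([t.mnemonic for t in visible_topics(topics, ["press"])])
```

The same filtering would apply to relationships declared with TNM.TR, since both attribute sets describe domains and weighting in the same terms.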

<!ENTITY % TNM.hylink

 'HyTime               NAME   hylink

  linktrav             -- Hyperlink traversal rules. One or more of:

                           E traversal after external arrival

                           I traversal after internal arrival

                           R return traversal after internal arrival

                           D departure after internal arrival

                           A any traversal or departure (EID)

                           N no traversal after internal arrival

                           P no internal arrival --

                       -- Constraint: one per anchor or one for all --

                       -- Lextype: (("I"|"R"|"D"|"A"|"N"|"P"|"ID"|"RD"|

                                     "EI"|"ER"|"ED"|"EN"|"EP"|"ERD")+) --

                       NAMES  "A"

  listtrav             -- Traversal between members of list anchors.

                           A adjacent (both left and right) traversal

                           W wrapping traversal --

                       -- Lextype: (("N"|"L"|"R"|"A"|"LW"|"RW"|"AW")+) --

                       NAMES "AW"

  emptyanch            -- Is empty anchor an error? --

                       -- Constraint: one per anchor or one for all --

                       -- Lextype: ("ERROR"|"NOTERROR") --

                       NAMES "noterror"  '

>
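The Lextype comments in TNM.hylink restrict each attribute value to tokens drawn from a fixed set. A minimal Python sketch of such a check follows. It is not a HyTime implementation, and the function name is_valid_names is hypothetical. Folding tokens to upper case is an assumption, based on the default value noterror being written in lower case while the lexical type lists NOTERROR. Accepting more than one token for emptyanch is also an assumption, based on the constraint "one per anchor or one for all".

```python
# Token sets copied from the Lextype comments for each attribute.
LINKTRAV = {"I", "R", "D", "A", "N", "P", "ID", "RD",
            "EI", "ER", "ED", "EN", "EP", "ERD"}
LISTTRAV = {"N", "L", "R", "A", "LW", "RW", "AW"}
EMPTYANCH = {"ERROR", "NOTERROR"}


def is_valid_names(value: str, allowed: set) -> bool:
    """Return True if value holds one or more tokens, all of them allowed."""
    tokens = value.upper().split()
    return bool(tokens) and all(token in allowed for token in tokens)


print(is_valid_names("A", LINKTRAV))
print(is_valid_names("AW", LISTTRAV))
print(is_valid_names("WA", LISTTRAV))
print(is_valid_names("noterror", EMPTYANCH))
```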

<!ELEMENT classDecl   - - (taxonomy+)>

<!ATTLIST classDecl       %a.global; >

<!--Question: Is there any logical reason why classDecl should not be

              declared as a TNM-hub even though it is not the

              base document element for a TEI document?

              Is this a valid way of identifying this element as a subhub,

              or can this only be done by declaring the subhub as

              an external entity?

              At present TNM-hub is defined as the ArcDoc form

              for the architecture, and ArcDoc is constrained to

              be a document element (though it does not need to

              be the base document element). This example raises the

              question of hubs being only part of a larger entity,

              as taxonomies are defined as part of a TEI corpus.-->

<!ELEMENT taxonomy    - - (category+ |

                           ((bibl | biblStruct | biblFull), category*))>

<!ATTLIST taxonomy %a.global;

                   %TNM.TR;

                   %TNM.hylink;

  anchrole         NAMES  #FIXED

                          "taxonomy-of

                           categories-in        #LIST"

  anchcstr         NAMES  "SELF REQUIRED"

  categories-in    IDREFS #IMPLIED -- Default: Children of element

                                               type category --

>

<!ELEMENT category    - - (catDesc, category*)>

<!ATTLIST category    %a.global;

                      %TNM.SA;

                      %TNM.hylink;

  anchrole        NAMES       #FIXED

                  "descriptor

                   sub-categories #LIST

                   sources    #LIST"

  anchcstr        NAMES "OMIT"

  descriptor      IDREF   #IMPLIED  -- Reftype: catDesc --

                  -- Default: Nested category description --

  sub-categories  IDREFS  #IMPLIED  -- Reftype: category --

                                    -- Default: Implied from

                                       nesting of categories--

  sources         IDREFS  #IMPLIED  -- Default: None at present --

>

<!--Note: The id attribute in a.global may have

  to be redefined for catDesc to make it required.-->
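The default stated for the sub-categories attribute, implied from the nesting of category elements, can be sketched as follows. This Python fragment is illustrative only and is not part of this standard. The Category class and the function effective_sub_categories are hypothetical, and the rule that an explicit attribute value replaces the implied one is an assumption. The identifiers are taken from the TEI example given earlier in this annex.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Category:
    id: str
    description: str
    children: List["Category"] = field(default_factory=list)
    # Explicit IDREFS value of the sub-categories attribute, if supplied.
    sub_categories: Optional[List[str]] = None


def effective_sub_categories(category: Category) -> List[str]:
    """Use the explicit attribute if present, else the nested categories."""
    if category.sub_categories is not None:
        return list(category.sub_categories)
    return [child.id for child in category.children]


religion = Category("B.D", "Religion", [
    Category("B.D1", "Books"),
    Category("B.D2", "Periodicals and tracts"),
])
print(effective_sub_categories(religion))
print(effective_sub_categories(religion.children[0]))
```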

Comment from Michel: Why are the subcategories not defined as TR instead of anchroles in the SA? Because category cannot be both a semantic assignment and a topic relationship. Without changing the TEI DTD definition, or using link process definitions, is there any other way of defining category nesting as a relationship?

Editor's Note: Whilst the above seems to be sufficient to turn the TEI structure into a valid topic navigation map it needs to be checked for accuracy by someone with experience in using TEI catalogues.

Informative Annex B: TNM Meta-DTD

This annex brings together all of the formal productions of this standard in a form that creates a valid SGML meta-DTD for topic navigation maps.

Note: While the copyright of this standard rests with ISO, the formal SGML definitions provided in this Annex may be copied providing that the following text is included, for example as an SGML comment, with the copy:

(C) International Organization for Standardization 1998.

Permission to copy in any form is granted for use with conforming

Topic Navigation Maps as defined in ISO 13250, providing that

this notice is included with all copies.

The meta-DTD constructs required for Topic Navigation Map processing are:

To be added later when forms stabilised.