Quick Guide to HyTime Basics

[Mirrored from: http://info.admin.kth.se/SGML/Anvandarforening/Arbetsgrupper/HyTime/Reports/tr1v1.html]

Technical Report 1 from the HyTime Working Group, Swedish SGML User's Group, document version identifier TR1V1.

Credits and copyright

This document was written by Peter Bergström, EuroSTEP AB, Hasse Haitto, Synex Information AB, Erik Helander, Saab Service Partner AB, Anna Ran, Saab Service Partner AB, and Per-Åke Ling, Ericsson Utvecklings AB on behalf of the HyTime Working Group of the Swedish SGML User's Group. (C)1996 by the authors and the Swedish SGML User's Group (sgml@sunet.se).

Background

This Quick Guide to HyTime Basics is an easy introduction to HyTime Linking, to enable people and organisations to adapt HyTime linking concepts for their needs.

HyTime, formally known as "Hypermedia/Time-based Structuring Language" ISO/IEC 10744:1992, is a complex and powerful standard. In our experience, however, the standard is quite difficult to read, and it is hard to work out how to use even the most basic part of its linking functionality.

This document assumes basic knowledge of SGML. The structure of the document is as follows:

Introduction
Explanation of the concept of architectural form
Dissection of HyTime Links
A bit about semantics and terminology on links and locations
The contextual link explained, with examples
The independent link explained
HyTime Location Addressing
Converting SGML documents to HyTime

Introduction

HyTime is an enabling technology: you can describe virtually any kind of connected or temporal information. It covers many different aspects of linking, as well as a full repertoire of features for multimedia purposes, including virtual time, scheduling, synchronisation and so on.

Due to the broad scope of the standard, it is quite difficult to get into it and understand all the details of how links are created. People often ask "How do I convert my old documents to HyTime?" or "Does it have to be so complex to create links?" In this document we will try to answer those questions and give the reader a good basic knowledge of hyperlinks using HyTime, so that readers can continue exploring the standard on their own.

HyTime Is an Enabling Architecture

HyTime is described in the form of "cookbook" element forms that serve as recipes for general constructs. Because of this, HyTime is sometimes called a meta-DTD: its SGML constructs should be used as templates. Unlike an ordinary DTD or tag naming scheme, HyTime lets you define the names (generic identifiers, or GIs) of element types freely. This is possible through a specific HyTime attribute that is assigned the name of the corresponding HyTime element template. The HyTime attribute lets a HyTime-aware application recognize the element and apply to it any particular processing associated with the specific HyTime construct. This simple yet powerful idea is fundamental to HyTime; it is called an architectural form or enabling architecture. A HyTime engine can be regarded as an SGML parser that understands the processing required for HyTime attributes.

As you may freely add other attributes for your own purposes, the architectural form is rather like a super-class with single inheritance, to use object-oriented terminology. A recent change to HyTime (the so-called HyTime corrigendum) extends HyTime with multiple inheritance.

Links and Locations

Historically, many hypertext systems and notations have not differentiated between a link and its connected endpoints (the locations or anchors). Rather, the link information has been treated as a logical whole: any editing change affects the entire linking construct. Also, the hypertext terminology has not been used consistently, so that what constitutes a link anchor can be anything from an entire document to a selected span of a few words, or a chunk of information on a card, depending on the hypertext system. The fineness of granularity of the addressed information has thus varied.

HyTime draws a clear distinction between links and locations. A link is a reference between two or more locations. A location is the address of a potential anchor point, which is the actual, physical point where the link ends.

This step of indirection may at first look like an extra level of complexity, but as soon as you try to create documents with hyperlinks between them, you will notice that it is very hard to maintain and update hyperlinks without it. The locations in HyTime are one of the basic building blocks that make HyTime such a powerful standard.

HyTime brings about a consistent way of describing hyperlinks between any media. As HyTime builds upon SGML, it is especially powerful for hypermedia documents encoded using this standard.

HyTime Links

Linking in SGML is achieved by assigning ID attributes to elements and referencing these through an IDREF or IDREFS attribute. This kind of linking is restricted to work within one and the same document because of the way SGML is defined.

Although HyTime is based entirely on SGML, it transcends the limitations of SGML linking and extends it considerably. HyTime defines just two link constructs, one of which is a special case of the other. The richness of HyTime linking lies in the many ways one can describe locations in a document, often as a sequence of stepwise refined addresses known as location ladders.

Let's first have a look at a traditional SGML reference to a table in one and the same document:

  <p>The population in the major cities of Sweden is described
  in <xref ref="tab21">
  <tbl id="tab21">
  <tbltitle>Population in the major cities of Sweden</tbltitle>
  <body>.......

Contextual Link (clink)

The contextual link or clink (pronounced see-link) is the simpler of the HyTime link constructs. One of its anchors ("the context") is part of the document where the link markup resides. The other endpoint can be in the same or some other document. The clink is rather common as it aptly describes traditional cross-references.

Using clink, one of the anchors is part of the document contents.

The clink markup is defined as follows.

  <!element clink    -- Contextual link --
		     - O      (%HyBrid;)* >
  <!attlist clink    HyTime   NAME     clink
		     id       ID       #IMPLIED  -- Default: none --
		     linkend  -- Link end --
			      -- Constraint: No HyTime reftype constraints,
				 but application designers can constrain
				 element types with reftype attribute --
			      IDREF    #REQUIRED
  >

The SGML reference in the example above is easy to convert to a HyTime clink in a two step procedure. First, add a HyTime attribute to the element type xref in the DTD:

  <!ELEMENT xref - o EMPTY>
  <!ATTLIST xref
	    ref      IDREF   #REQUIRED
	    HyTime   NAME    "clink"
	    HyNames  NAMES   "linkend ref" >

The HyTime attribute tells the HyTime engine that this element is a HyTime clink element, and the HyNames attribute maps the ref attribute to the attribute name linkend, which the HyTime engine interprets and understands.
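
The attribute mapping described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the API of any real HyTime engine: the function name architectural_view and the dictionary representation of an element's attributes are assumptions made for this example, and attribute names are matched exactly although SGML names are normally case-insensitive.

```python
# Hypothetical helper (not a real HyTime engine API): show an element's
# attributes the way its architectural form would see them.

def architectural_view(attributes):
    """Return (form_name, renamed_attributes), or None for a non-HyTime element."""
    form = attributes.get("HyTime")
    if form is None:
        return None  # ordinary element: no architectural form declared

    # HyNames holds pairs: the architectural name followed by the local name.
    tokens = attributes.get("HyNames", "").split()
    local_to_arch = dict(zip(tokens[1::2], tokens[0::2]))

    view = {}
    for name, value in attributes.items():
        if name not in ("HyTime", "HyNames"):
            # Rename the local attribute if HyNames maps it; keep it otherwise.
            view[local_to_arch.get(name, name)] = value
    return form, view


# The xref element from the DTD above, with its fixed and supplied attributes.
xref = {"HyTime": "clink", "HyNames": "linkend ref", "ref": "tab21"}
print(architectural_view(xref))  # ('clink', {'linkend': 'tab21'})
```

The element keeps its own name and attribute names in the document; only the engine's view of it changes.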

Secondly, trigger the HyTime engine: Formally, the SGML declaration needs to contain additional data for this processing to take place; however, this requirement is not enforced by current end-user applications. One such requirement is the HyTime statement in the APPINFO section of the SGML declaration:

  APPINFO "HyTime"

Just as the SGML declaration provides information for the SGML parser, HyTime-specific declarations inform the HyTime engine which parts of HyTime will be used in the DTD. The HyTime declarations take the form of processing instructions prior to the DTD:

  <?HyTime VERSION "ISO/IEC 10744:1992" HYQCNT=32 >
  <?HyTime MODULE base >
  <?HyTime MODULE locs >
  <?HyTime MODULE links >

These declarations make all references behave as HyTime clinks, without changing anything in the document markup. The required HyTime declarations have been revised in the HyTime corrigendum, so there is now an alternative to processing instructions. How to make existing SGML documents conform to HyTime is explained in more detail below.

Simple clink

Below is a clink that emanates from the first paragraph and points to the element with the ID attribute whose value is id17.

  <P>This <clink linkend=id17>clink</clink> is
  linked to an element whose id is the value assigned to
  the linkend attribute.</P>
  ...
  <P ID=id17>This paragraph is a target of a clink from
  the previous paragraph.</P>

Independent Link (ilink)

The independent link or ilink (pronounced eye-link) permits the link data to be stored externally, separate from the document(s) that the link markup connects. One can thus make changes to the link data without modifying the documents being linked.

In this ilink, the ID attributes XYZ and ABC reference elements with addressing data locators, illustrating the separation of links from the addressing anchor.

HyTime Location Addressing

The HyTime links are complemented by rich addressing schemes that specify locations down to (virtually) any granularity. The location addresses can be defined using different methods. Two of the most common address entire elements:

Name Location (an element with a specific ID, or an entire document)
Tree Location (a path along the SGML tree structure)

To transcend the granularity of the markup, one can use a counting addressing scheme:

Data Location (address whole tokens of any type of data, e.g. words)

A dynamic method of addressing is through queries, where the exact location is not known in advance but is resolved in real time by matching against properties. HyTime defines a query language (HyQ), which has since been superseded by the DSSSL query language (SDQL).

The separation of links from anchors makes it possible to have links between any kind of media, such as musical notes, assuming an addressing mechanism for referencing within the notation.

Name Location

Instead of letting links point directly to target elements, one can point to a locator. HyTime thus supports indirection, where the ultimate link target is resolved in a series of steps.

The element type form named location address, or nameloc, contains one or several nmlist elements, each of which contains one or several names. The names are either entity names or element names (actually, ID attribute values). The nameloc element locates all of the named objects. A link to an entity is treated as a link to the root element of the document in that entity.

Typical declarations take the following form:

  <!ELEMENT nameloc ... >
  <!ATTLIST nameloc
	    HyTime   NAME   "nameloc"
	    id      ID      #IMPLIED >

  <!ELEMENT nmlist ... >
  <!ATTLIST nmlist
	    HyTime   NAME             "nmlist"
	    nametype (entity|element) element
	    docorsub ENTITY           #IMPLIED >

If the nametype attribute of nmlist has the value entity the element content is interpreted as entity names. Each name should then have a corresponding ENTITY declaration (unless it is implicitly defined by a #DEFAULT entity declaration). The target is interpreted as the root element of the named document.

If the nametype attribute of nmlist has the value element, its content is interpreted as element names, i.e. IDs, whose location depends on the document entity defined by the attribute docorsub. If docorsub is implied, it defaults to the document containing the nmlist element.

Cross-document reference

To make a cross-document reference to an entire document (i.e., an external entity), the document should contain the following markup:

  <nameloc id="locid">
  <nmlist nametype="entity">docref</nmlist>
  </nameloc>
  ...

The target document is identified by the entity docref. In the document instance, one would have markup such as:

  See also the <clink linkend="locid">related document</clink>
  ...

The clink spanning the string "related document" is a hypertext link to the docref entity, where the clink is resolved to the nameloc with the id locid. In the nameloc, the nametype attribute of the nmlist element has the value entity, so the content of the nmlist ("docref") is interpreted as an entity name.

Cross-document reference to a specific element

The next sample markup is a cross-document reference to an element with the ID eid in the docref entity:

  ...
  <nameloc id="locid">
  <nmlist nametype="element" docorsub="docref">
  eid
  </nmlist>
  </nameloc>
  ...
  See also the <clink linkend="locid">related section</clink>
  ...

The clink spanning "related section" points to the element with an ID attribute with the value eid in the document corresponding to the docref entity.
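
The two cross-document cases above can be modelled with a small Python sketch. It is a toy under stated assumptions rather than a HyTime implementation: documents are plain dictionaries, and the names DOCUMENTS, resolve_nmlist, resolve_nameloc and the document name thisdoc are all invented for this illustration.

```python
# Toy model: each entity name maps to a document with a root element and
# a table of ID -> element. Everything here is hypothetical sample data.
DOCUMENTS = {
    "thisdoc": {"root": "<book>", "ids": {}},
    "docref": {"root": "<report>", "ids": {"eid": "<section id=eid>"}},
}


def resolve_nmlist(names, nametype="element", docorsub=None, current="thisdoc"):
    """Resolve the names of one nmlist element to their targets."""
    if nametype == "entity":
        # Entity names locate the root element of the named document.
        return [DOCUMENTS[name]["root"] for name in names]
    # Element names are IDs, looked up in docorsub or in the current document.
    doc = DOCUMENTS[docorsub if docorsub is not None else current]
    return [doc["ids"][name] for name in names]


def resolve_nameloc(nmlists, current="thisdoc"):
    """A nameloc locates everything named by all of its nmlist elements."""
    targets = []
    for nm in nmlists:
        targets.extend(
            resolve_nmlist(nm["names"], nm.get("nametype", "element"),
                           nm.get("docorsub"), current)
        )
    return targets


# Reference to the whole docref document.
print(resolve_nameloc([{"names": ["docref"], "nametype": "entity"}]))
# Reference to the element with ID eid inside docref.
print(resolve_nameloc([{"names": ["eid"], "docorsub": "docref"}]))
```

In both cases the clink itself only carries the ID of the nameloc; the choice between a whole document and a single element is made entirely inside the locator.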

Multiple link endpoints

You can have any number of names in an nmlist element, and any number of nmlist elements in a nameloc element.

Just as SGML allows for multiple link endpoints, the same effect can be achieved using a clink to a nameloc:

  <nameloc ID=re-nmloc>
  <nmlist nametype=element>first.para also amazing</nmlist>
  </nameloc>
  ...
  <P>Here's a reference to <clink linkend=re-nmloc>three paragraphs</clink> in the
  section "Multiple IDREFS links"
  above. </P>

Tree Location

Nameloc addressing presupposes the existence of an ID attribute, so how do you address other elements? Because SGML markup is hierarchical (elements either contain one another entirely or are disjoint), the document can be viewed as a tree, with the topmost element being the root, all of the contents at the leaves, and the branches being the structure of the markup.

The tree structure can be addressed using the HyTime construct treeloc, whose content describes a path down the tree: each number selects a node by its position among its siblings. However, elements with mixed content (a content model that allows both #PCDATA and elements) make treeloc addressing more complex than is readily apparent: each portion of data content becomes a pseudo-element that counts as a node in the tree. The somewhat confusing handling of line breaks complicates the picture further. For this reason, treeloc addresses are best generated by programs.

A simple treeloc link

The example below creates a treeloc link to the element target. The treeloc addressing is necessary since the target element does not have an ID attribute.

  ...
  <chapter id=chp3>
  <title>Chapter 3</title>
  <para>The treeloc link ends
  <target>here!</target>
  </para>
  ...
  <treeloc id=treeloc.demo locsrc=chp3>
  <marklist>1 2 2</marklist>
  </treeloc>
  ...
  This is a <clink linkend=treeloc.demo>clink that uses treeloc addressing</clink>
  ...
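
How the marklist "1 2 2" in the example above reaches the target element can be traced with a short Python sketch. The tree is written out by hand as nested tuples, with plain strings standing in for pseudo-elements; the function name resolve_treeloc is invented here, and the sketch ignores the line-break handling mentioned earlier. It assumes that the first marker selects the root of the addressed tree (the chapter named by locsrc) and that each further marker selects a child by its one-based position.

```python
# Hand-built model of the chapter: (element name, list of children).
# Strings are data portions, which count as pseudo-elements in the tree.
chapter = ("chapter", [
    ("title", ["Chapter 3"]),
    ("para", ["The treeloc link ends ", ("target", ["here!"])]),
])


def resolve_treeloc(root, marklist):
    """Follow a marklist such as "1 2 2" from the root down the tree."""
    markers = [int(m) for m in marklist.split()]
    if not markers or markers[0] != 1:
        raise ValueError("assumed convention: the first marker is 1, the root")
    node = root
    for position in markers[1:]:
        if isinstance(node, str):
            raise ValueError("a pseudo-element has no children to descend into")
        _name, children = node
        node = children[position - 1]  # positions are one-based
    return node


# 1 = the chapter, 2 = its second child (para), 2 = para's second node (target)
print(resolve_treeloc(chapter, "1 2 2"))  # ('target', ['here!'])
```

Note that the last step only lands on target because the text before it counts as the first node of the paragraph, which is exactly the mixed-content complication described above.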

Data Location

Because they are based on the SGML markup, the granularity of the previous addressing mechanisms is equal to the granularity imparted by the content markup.

The data location address or dataloc architectural form is used to refer to content within elements, by counting tokens such as words: typically, an offset into the element, and a number of subsequent tokens.

The example below creates a link to the word "HERE" by addressing the fifth word of the paragraph and counting one word. The dataloc addressing is necessary since no element markup encapsulates the target.

  ...
  <para id=target>
  The dataloc link ends HERE
  </para>
  ...
  <dataloc id=dataloc.demo locsrc=target>
  <dimlist>5 1</dimlist>
  </dataloc>
  ...
  This is a <clink linkend=dataloc.demo>link that uses dataloc addressing</clink>
  to transcend the markup boundaries.
  ...

A dataloc element contains a dimlist element. The dimlist element contains two integers. The first integer gives the position of the first token of the target, counting from one; the second gives the size of the target (in number of tokens).
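
The word counting in the example can be checked with a minimal Python sketch. It assumes that tokens are whitespace-separated words and that the first integer is a one-based position, which is the reading under which "5 1" selects the word HERE; the function name resolve_dataloc is made up for this illustration.

```python
def resolve_dataloc(text, dimlist):
    """Return the tokens addressed by a two-integer dimlist such as "5 1"."""
    first, count = (int(n) for n in dimlist.split())
    words = text.split()  # assumption: tokens are whitespace-separated words
    start = first - 1     # one-based position -> zero-based index
    return words[start:start + count]


print(resolve_dataloc("The dataloc link ends HERE", "5 1"))  # ['HERE']
```

Changing the dimlist to "2 2" would address the two words "dataloc link" in the same paragraph.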

Typical declarations have the form

  <!ELEMENT dataloc  (dimlist*) >
  <!ATTLIST dataloc
	    id       ID     #IMPLIED
	    locsrc   IDREF  #IMPLIED
	    quantum  (norm) norm
	    HyTime   NAME   "dataloc" >

  <!ELEMENT dimlist  (#PCDATA) >
  <!ATTLIST dimlist
	    HyTime   NAME  "dimlist" >

The locsrc attribute specifies the start location of the dataloc address. Locsrc can in turn point at a HyTime location address. If locsrc is implied, the beginning of the source document is used.

HyTime specifies several other counting strategies, and ways of specifying the counting direction (such as counting backwards, from the end of the element).

As HyTime linking and addressing are based on indirection, one can use a sequence of locators to create durable links. A typical such location ladder might consist of a clink referring to a nameloc, addressing a treeloc, addressing a dataloc.
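
A location ladder can be pictured as a chain in which each rung narrows the result of the previous one: a nameloc finds an element, a treeloc a descendant of it, and a dataloc a word inside that. The Python sketch below models this under several assumptions: the table names ELEMENTS and LOCATORS, the locator IDs and the sample text are invented, rungs are chained through locsrc as in the treeloc and dataloc examples above, and the nameloc holds a single name for brevity.

```python
# Hypothetical sample data: one element with an ID, and three locators.
ELEMENTS = {
    "chp3": ("chapter", [
        ("title", ["Chapter 3"]),
        ("para", ["The link ends ", ("target", ["right HERE now"])]),
    ]),
}

LOCATORS = {
    "step1": {"kind": "nameloc", "names": ["chp3"]},
    "step2": {"kind": "treeloc", "locsrc": "step1", "marklist": [1, 2, 2]},
    "step3": {"kind": "dataloc", "locsrc": "step2", "dimlist": (2, 1)},
}


def text_of(node):
    """Concatenate the data content of a node."""
    if isinstance(node, str):
        return node
    return " ".join(text_of(child) for child in node[1])


def resolve(ref):
    """Resolve an ID that names either an element or a locator."""
    if ref in ELEMENTS:
        return ELEMENTS[ref]
    loc = LOCATORS[ref]
    if loc["kind"] == "nameloc":
        return resolve(loc["names"][0])  # single name, for brevity
    source = resolve(loc["locsrc"])      # climb down one rung first
    if loc["kind"] == "treeloc":
        node = source                    # first marker (1) is the source itself
        for position in loc["marklist"][1:]:
            node = node[1][position - 1]
        return node
    if loc["kind"] == "dataloc":
        first, count = loc["dimlist"]
        words = text_of(source).split()
        return " ".join(words[first - 1:first - 1 + count])
    raise ValueError("unknown locator kind: " + loc["kind"])


# A clink with linkend=step3 would end at this word.
print(resolve("step3"))  # HERE
```

If the chapter is later reorganised, only the affected rung needs updating; the clink and the other rungs stay as they are.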

Converting SGML documents to HyTime

There is no need to scratch your head and start counting hours before your documents are HyTime conformant with respect to linking within documents. Here we will show how easy it is to make existing documents HyTime conformant.

There are several ways to convert existing documents to HyTime; which one suits you is dependent on the nature of the source documents, and on your knowledge of HyTime. In the text that follows, we assume that we are dealing with SGML documents and links within single documents.

We will assume that the links in the documents are constructed using the standard ID-IDREF mechanism. This means that the target element of the link must contain an attribute of type ID. See the example below:

<!-- reference somewhere in the text -->
<para>For more information,
see <ref refto="explanation2"></ref> below.</para>

<!-- explanations in some other place in the document -->
<explanation explid="explanation1">Words...</explanation>
<explanation explid="explanation2">More words...</explanation>

The key to this simple conversion is that the actual change takes place in the DTD. You do not really have to edit the source (i.e. the documents), because you add fixed values to the attributes of the reference elements.

Here is a fragment of a DTD that could be used for the document above.

<!-- DTD fragment -->
<!ELEMENT text        - - (para*)                >
<!ELEMENT para        - - (#PCDATA | ref)*       >
<!ELEMENT ref         - - (#PCDATA)              >
<!ELEMENT explanation - - (#PCDATA)              >
<!ATTLIST ref
refto     IDREF  #REQUIRED             >
<!ATTLIST explanation
explid    ID     #REQUIRED             >

We will now add what is needed for the clink construction in the DTD.

<!-- DTD fragment -->
<!ELEMENT text        - - (para*)                >
<!ELEMENT para        - - (#PCDATA | ref)*       >
<!ELEMENT ref         - - (#PCDATA)              >
<!ELEMENT explanation - - (#PCDATA)              >

<!ATTLIST ref
refto     IDREF  #REQUIRED
HyTime    NAME   #FIXED "clink"
HyNames   NAMES  #FIXED "linkend refto">


<!ATTLIST explanation
explid    ID     #REQUIRED             >

As you see, the only changes are two extra lines in the attribute list of the element ref.

Today you have to add some application information between the SGML declaration and the DTD. This information can be added at the end of the file containing the SGML-declaration or it can be put before the reference to the DTD in the document instance if there is a reference. The APPINFO section in the SGML declaration is also changed.

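HyTime processing instructions
```
	  <?HyTime VERSION "ISO/IEC 10744:1992" HYQCNT=32 >
	  <?HyTime MODULE base >
	  <?HyTime MODULE locs >
	  <?HyTime MODULE links >
```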
SGML declaration
```
	  FORMAL  Yes
	  APPINFO "HyTime"
	
```

Please note, there are ongoing changes to how to submit the information listed above. (Refer to the HyTime corrigendum.)

The whole transformation also requires an application that understands the concepts of HyTime, or at least the clink construct, in order to resolve and use the linking functionality. There is no need for a fully compliant HyTime engine. In most cases, the capabilities of the API of your current SGML-aware system are enough.

Using a hub document and HyTime independent links (ilinks)

This section briefly points out that you can have all your documents as is, creating hyperlinks within (or between) any document using a HyTime hub document as a map of all the links.

Please refer to documents describing more about HyTime ilinks to see the correct syntax and examples. This section will only describe the ideas of hyperlinking from a hub document.

Since HyTime offers several different methods of addressing the "targets" of your links, you can always pinpoint any kind of information in the documents. It may be an element, a word or some pixels in a bitmap.

In this case we are interested in going from an existing pointer (an element) to the element it refers to. In the example above this would mean a reference from the element ref to the element explanation.

The HyTime hub document consists merely of the HyTime ilinks, which create the links between HyTime namelocs, which in turn point to the "targets" in the document(s).

A HyTime engine reads the hub document and processes the links, which in the end are presented by the application as hotspots, underlined words or whatever. All the specific HyTime processing will take place without us having to make any changes to the existing documents.
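
The hub idea can be pictured with a small Python sketch. It does not show ilink syntax and is not a HyTime engine: the tables DOCUMENTS, HUB_LOCATORS and HUB_LINKS, their contents and the function resolve_hub_links are all hypothetical, chosen only to show that the link data lives outside the documents it connects.

```python
# The existing documents, modelled as ID -> content. They are never modified.
DOCUMENTS = {
    "manual": {"sec2": "Section 2: Installation"},
    "faq": {"q7": "Q7: How do I install?"},
}

# The hub: locators say where an anchor is, links say which locators belong
# together. Both live outside the documents.
HUB_LOCATORS = {
    "loc.install": ("manual", "sec2"),
    "loc.question": ("faq", "q7"),
}
HUB_LINKS = [("loc.question", "loc.install")]


def resolve_hub_links(links, locators, documents):
    """Turn each link (a tuple of locator IDs) into the anchors it connects."""
    resolved = []
    for link in links:
        anchors = []
        for locator_id in link:
            doc_name, element_id = locators[locator_id]
            anchors.append(documents[doc_name][element_id])
        resolved.append(tuple(anchors))
    return resolved


for anchors in resolve_hub_links(HUB_LINKS, HUB_LOCATORS, DOCUMENTS):
    print(" <-> ".join(anchors))
```

Adding, removing or redirecting a link means editing only the hub tables, which mirrors the point that the linked documents themselves stay as they are.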

Conclusion

It is very easy to make your ID-IDREF links in your documents comply with the HyTime standard. There is not much work involved and your environment will not have to change.

The benefits are immediate: your documents and their links become more robust. Adding more HyTime constructs, such as HyTime ilinks, is a logical next step. With HyTime, your documents will be better prepared for the next century.

Definitions

HyTime: A standardized hypermedia structuring language for representing hypertext linking, temporal and spatial event scheduling, and synchronization. HyTime provides basic identification and addressing mechanisms and is independent of object data content notations, hyperlink types, processing and presentation functions, and other application semantics. Hyperlinks can be established to documents that conform to HyTime and those that do not, regardless of whether those documents can be modified. The full HyTime function supports "integrated open hypermedia" (IOH) - the "bibliographic model" of referencing that allows hyperlinks to anything, anywhere, at any time - but systems need support only the subset that is within their present capabilities.
[HyTime, ISO/IEC 10744:1992, clause 3.18]
Hypertext: Hypertext, in computer science, a metaphor for presenting information in which text, images, sounds, and actions become linked together in a complex, nonsequential web of associations that permit the user to browse through related topics, regardless of the presented order of the topics. These links are often established both by the author of a hypertext document and by the user, depending on the intent of the hypertext document. For example, traveling among the links to the word iron in an article might lead the user to the periodic table of the elements or a map of the migration of metallurgy in Iron Age Europe. The term hypertext was coined in 1965 by Ted Nelson to describe documents, as presented by a Computer, that express the nonlinear structure of ideas, as opposed to the linear format of books, film, and speech. The term hypermedia, more recently introduced, is nearly synonymous but emphasizes the nontextual components of hypertext, such as animation, recorded sound, and video.
[Microsoft (R) Encarta. Copyright (c) 1993 Microsoft Corporation. Copyright (c) 1993 Funk & Wagnall's Corporation]
Multimedia: Multimedia, the combination of sound, graphics, animation, and video. In the world of computers, multimedia is a subset of hypermedia, which combines the elements of multimedia with hypertext, which links the information.
[Microsoft (R) Encarta. Copyright (c) 1993 Microsoft Corporation. Copyright (c) 1993 Funk & Wagnall's Corporation]
Hypermedia: Hypermedia, in computer science, the integration of graphics, sound, video, or any combination into a primarily associative system of information storage and retrieval. Hypermedia, especially in an interactive format where choices are controlled by the user, is structured around the idea of offering a working and learning environment that parallels human thinking--that is, an environment that allows the user to make associations between topics rather than move sequentially from one to the next, as in an alphabetic list. Hypermedia topics are thus linked in a manner that allows the user to jump from subject to related subject in searching for information. For example, a hypermedia presentation on navigation might include links to such topics as astronomy, bird migration, geography, satellites, and radar. If the information is primarily in text form, the product is hypertext; if video, music, animation, or other elements are included, the product is hypermedia.
[Microsoft (R) Encarta. Copyright (c) 1993 Microsoft Corporation. Copyright (c) 1993 Funk & Wagnall's Corporation]
Hyperlink: An information structure that represents a relationship among two or more objects.
[HyTime, ISO/IEC 10744:1992, clause 3.15]