[This local archive copy mirrored from the canonical site: http://www.texcel.no/se97talk.htm; links may not have complete integrity, so use the canonical document at this URL if possible.]


Why Your Document Management System Should Care About Hyperlinks

Or, why you should care about a document management system if you care about hyperlinks

Presentation by Paula Angerstein

of Texcel Research, Inc.

at SGML/XML '97

on December 9, 1997

Why links

  • A link is some construct to represent a relationship between two or more things
  • Historical use is straightforward association of two data items, e.g., a cross-reference
  • Historical venue for links is hypermedia systems
  • Emerging use of links
    • to locate distributed objects
    • to specify dependencies
    • to associate “metadata” with data

Some common link requirements

  • Links to arbitrary data and to points within that data
  • Links within and across documents
  • Links with multiple endpoints
  • Links that carry with them some set of semantics, such as a type and behavior
  • Links into data that cannot be modified
  • Control over the direction of traversal of a link
  • Control over what types of objects a link can point to
  • Notification when a link end is invalidated or modified
  • Version history of a link
  • Context-sensitive links

Representing links

  • HTML
  • HyTime
  • XML Link

Well-accepted properties of links

  • Specification of the link itself usually via some combination of elements and/or attributes (link recognition)
  • Specification of how to find the endpoints (addressing)
  • What the link is for (role)
  • What to do when the link is “activated” (behavior)
  • Allowed types of things the link can point to
  • Allowed direction of traversal between link endpoints
  • Various other metadata about the link itself: a descriptor, who created it and when, system-specific instructions, etc.
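
The properties listed above can be pictured as a small data model. The following is a minimal sketch only; the class and field names (LinkEnd, Link, allowed_target_types, and so on) are illustrative assumptions and are not taken from HTML, HyTime, or XML Link.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, List, Optional


@dataclass(frozen=True)
class LinkEnd:
    """One endpoint of a link (hypothetical model, not from any standard)."""
    address: str                 # how to find the endpoint (addressing)
    role: Optional[str] = None   # what this end is for (role)
    traversable: bool = True     # may traversal arrive at this end?


@dataclass
class Link:
    """A link relating two or more ends, plus metadata about the link itself."""
    ends: List[LinkEnd]
    link_type: str = "cross-reference"                   # semantic type
    behavior: str = "replace"                            # action on activation
    allowed_target_types: FrozenSet[str] = frozenset()   # empty means "anything"
    metadata: Dict[str, str] = field(default_factory=dict)  # creator, date, etc.

    def __post_init__(self) -> None:
        # A link represents a relationship between two or more things.
        if len(self.ends) < 2:
            raise ValueError("a link needs at least two ends")
```

Keeping the direction flag on each end rather than on the link as a whole is one possible design choice: it lets a link with multiple endpoints allow traversal to some ends and not others.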

Link lifecycle

  • Typical users only use links: hypermedia systems present and traverse existing links
  • Somebody has to create and manage links: authoring and document management systems must do interesting things with links
  • Linking within constantly modified data presents some hard problems:
    • Addressing that works when linked data is modified
    • Ongoing validation that links are still valid
    • Automated synthesis of links

Why ID/IDREF doesn't cut it

  • Limited to one document
  • Limited to addressing an element
  • Not enough standard semantics
  • Can't maintain links independently of the data
  • Impossible if read-only data doesn't already have IDs
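
The one-document limit can be illustrated with a short check. This is a sketch under stated assumptions: no DTD is read, so the attribute names "id" and "idref" are simply assumed to be declared as ID and IDREF, and the sample documents are made up.

```python
import xml.etree.ElementTree as ET
from typing import List


def unresolved_idrefs(xml_text: str) -> List[str]:
    """Return idref values that match no id inside the same document.

    Assumption: attributes named "id" and "idref" play the ID and IDREF
    roles; a real parser would learn this from the DTD.
    """
    root = ET.fromstring(xml_text)
    # The only IDs visible to an IDREF are those in this one document.
    ids = {el.get("id") for el in root.iter() if el.get("id") is not None}
    return [
        el.get("idref")
        for el in root.iter()
        if el.get("idref") is not None and el.get("idref") not in ids
    ]
```

A reference to an element in a second document shows up as unresolved here, because ID/IDREF gives no way to say which document the target lives in.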

Fundamentals of HTML links

  • The “A” tag is a link
  • It has an “href” attribute whose value is a URL
  • That's it

This is an HTML link: <a href="http://www.texcel.no/texcel.htm">
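
Because an HTML link is only an “A” tag with an “href”, recognizing links takes very little code. The sketch below uses Python's standard html.parser module; the class name HrefCollector and the function extract_hrefs are illustrative choices.

```python
from html.parser import HTMLParser
from typing import List


class HrefCollector(HTMLParser):
    """Collect the href value of every <a> start tag."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs: List[str] = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names before calling this.
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value is not None:
                self.hrefs.append(value)


def extract_hrefs(html_text: str) -> List[str]:
    collector = HrefCollector()
    collector.feed(html_text)
    collector.close()
    return collector.hrefs
```

Everything the link offers (one target, one direction, no type) is visible in what the collector returns: a plain list of URLs.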

HTML link shortcomings

  • No links to spans of text and spans of content
  • No links into arbitrary data types
  • No links with multiple endpoints
  • No links independent of data
  • No control over the types of endpoints of a link
  • No control over the direction of traversal of a link

Fundamentals of HyTime links

  • Hypermedia/Time-based Structuring Language, defined in ISO standard 10744
  • In the standard, HyTime defines architectural forms: a set of “meta” element classes and attributes with standard semantics
  • When you write a DTD, you make an element a link by applying an architectural form via a “HyTime” attribute
  • End result is that instances of the element in a document are links
  • A link relates two or more link ends. Each link end is a locator to a piece of data known as an anchor.
  • A link end addresses the anchor via various mechanisms such as nameloc, treeloc, queryloc, and dataloc.
  • A contextual link gets one of its link ends from the link element's position in the document.
  • An independent link resides independently of any of its link ends.

This is a HyTime link: <clink hytime="clink" linkend="TexcelLogo">
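
Positional addressing in the style of treeloc can be sketched as a walk down the element tree. This is a simplified illustration, not the HyTime definition: it assumes a path of 1-based positions that starts with 1 for the root, and it counts only child elements. The function name resolve_tree_path is hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import List


def resolve_tree_path(root: ET.Element, path: List[int]) -> ET.Element:
    """Follow a list of 1-based child positions from the root element."""
    if not path or path[0] != 1:
        raise ValueError("path must start at the root, written as 1")
    node = root
    for position in path[1:]:
        children = list(node)  # child elements only; text is ignored
        if not 1 <= position <= len(children):
            raise LookupError("no child at position %d" % position)
        node = children[position - 1]
    return node
```

An address like this needs no ID on the target, so it can point into read-only data, but it silently locates a different element once a sibling is inserted ahead of the target.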

Shortcomings of HyTime

  • According to some people, it is not possible for HyTime to have shortcomings because it can do anything
  • This is its shortcoming

Fundamentals of XML links

  • Any element becomes an XML Link when it has an attribute named “xml-link”
  • The “href” attribute is the locator and is a URL
  • XML Link additionally standardizes the fragment id and query portions of a URL as either ID referencing or TEI Extended Pointers (XPointer)
  • Other attributes specify information about the link such as its role and behavior
  • Simple links are like HyTime contextual links; extended links are like HyTime independent links

This is an XML link: <simple xml-link="simple" href="file:///C|/texcel/im/lib/texcel.gif">
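
Link recognition by attribute can be sketched as follows. The code assumes the “xml-link” and “href” attribute names described above, uses only Python's standard library, and treats the text after “#” as an opaque fragment without interpreting it as an ID or an extended pointer. The function name find_xml_links and the sample document in the test are made up.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Optional
from urllib.parse import urldefrag


def find_xml_links(xml_text: str) -> List[Dict[str, Optional[str]]]:
    """List every element that carries an "xml-link" attribute."""
    links = []
    for el in ET.fromstring(xml_text).iter():
        kind = el.get("xml-link")
        if kind is None:
            continue  # not a link: the attribute, not the tag name, decides
        # Split the locator into the resource URL and the fragment part.
        url, fragment = urldefrag(el.get("href", ""))
        links.append({
            "element": el.tag,
            "kind": kind,
            "url": url,
            "fragment": fragment,
            "role": el.get("role"),
        })
    return links
```

Because the element name plays no part in recognition, any element type in any DTD can act as a link.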

Shortcomings of XML Links

  • No links into arbitrary data types
  • No control over the types of endpoints of a link
  • No control over the direction of traversal of a link

Links “inside” an SGML document management system

  • SGML document management systems typically have unique object identification for every SGML element
  • These repository identifiers (RIDs) make complex addressing unnecessary: link resolution is simple
  • Within its own domain, a system can provide efficient link storage and manipulation
  • When data goes out of this domain, links can be exported to a standard form
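
A minimal sketch of the idea, assuming a toy in-memory repository: every stored element receives a repository identifier, a link is just a pair of RIDs, and resolution is a single lookup. The Repository class, its method names, and the exported address form are illustrative assumptions, not the interface of any particular product.

```python
import itertools
from typing import Dict, Optional


class Repository:
    """Toy repository that assigns a RID to every element it stores."""

    def __init__(self) -> None:
        self._next_rid = itertools.count(1)
        self._objects: Dict[int, dict] = {}

    def add_element(self, document: str, content: str,
                    element_id: Optional[str] = None) -> int:
        rid = next(self._next_rid)
        self._objects[rid] = {
            "document": document, "id": element_id, "content": content,
        }
        return rid

    def resolve(self, rid: int) -> dict:
        # Inside the repository no address needs interpreting: one lookup.
        return self._objects[rid]

    def export_address(self, rid: int) -> str:
        """Turn a RID into a document#id address for use outside the system."""
        obj = self._objects[rid]
        if obj["id"] is None:
            raise ValueError("element has no ID to export")
        return "%s#%s" % (obj["document"], obj["id"])
```

The failure case in export_address is where automatic ID generation or a richer addressing scheme would have to step in when data leaves the repository.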

Link creation

  • Present candidates for link targets, e.g., via tree views, query results, views of content
  • Generate an address to a link target
  • Automatically generate ID values
  • Ensure links are only to allowed types
  • Associate link type and other information with the link
  • Update an independent link map
  • Automatically create links
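
Several of these steps can be sketched together: checking that the target is an allowed type, generating ID values where they are missing, and recording the link in an independent link map. Elements are modeled here as plain dictionaries, and the LinkMap class and its method names are hypothetical.

```python
from typing import Dict, List, Set


class LinkMap:
    """Independent link map: links are stored outside the linked data."""

    def __init__(self, allowed: Dict[str, Set[str]]) -> None:
        self.allowed = allowed   # link type -> permitted target element types
        self.links: List[dict] = []
        self._id_counter = 0

    def ensure_id(self, element: dict) -> str:
        """Return the element's ID, generating one if it has none."""
        if element.get("id") is None:
            # Simplification: a real system must also avoid ID collisions.
            self._id_counter += 1
            element["id"] = "gen%d" % self._id_counter
        return element["id"]

    def create_link(self, link_type: str, source: dict, target: dict) -> dict:
        # Refuse the link before changing anything if the type is not allowed.
        if target["type"] not in self.allowed.get(link_type, set()):
            raise ValueError(
                "%s links may not point to %s" % (link_type, target["type"]))
        link = {
            "type": link_type,
            "from": self.ensure_id(source),
            "to": self.ensure_id(target),
        }
        self.links.append(link)
        return link
```

Checking the type before generating IDs means a rejected link leaves both the data and the link map untouched.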

Link maintenance

  • Integrate with authoring systems to prevent deletion of link targets
  • Notify when link target contents are modified
  • Notify when an address locates a different target
  • Potentially recalculate addresses
  • Retrieve and update link metadata
  • Maintain an independent link map
  • Maintain context of link applicability
  • Trace link lifecycle
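
One way to notice deleted or modified targets is to store a fingerprint of each target when the link is created and compare it later. The sketch below assumes links and targets are held in simple dictionaries; the function names and report strings are illustrative.

```python
import hashlib
from typing import Dict


def fingerprint(content: str) -> str:
    """Stable digest of a target's content."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def record_targets(links: Dict[str, str],
                   targets: Dict[str, str]) -> Dict[str, str]:
    """Snapshot the fingerprint of each link's target (link id -> digest)."""
    return {
        link_id: fingerprint(targets[target_id])
        for link_id, target_id in links.items()
    }


def check_links(links: Dict[str, str], targets: Dict[str, str],
                snapshot: Dict[str, str]) -> Dict[str, str]:
    """Report links whose target is gone or has changed since the snapshot."""
    report = {}
    for link_id, target_id in links.items():
        if target_id not in targets:
            report[link_id] = "target deleted"
        elif fingerprint(targets[target_id]) != snapshot[link_id]:
            report[link_id] = "target modified"
    return report
```

A check like this runs after the fact; preventing deletion in the first place requires the authoring-system integration named in the first bullet.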

Link delivery

  • Real-time link traversal
  • Determine and export a web of linked data
  • Export links in a form optimized for the delivery system
  • Drive conversion and delivery processes
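
Determining a web of linked data amounts to following links outward from a starting object until nothing new turns up. A minimal sketch, assuming the links are available as a mapping from each object to the objects it links to; the function name linked_web and the object names in the test are made up.

```python
from collections import deque
from typing import Dict, List


def linked_web(start: str, links: Dict[str, List[str]]) -> List[str]:
    """Return every object reachable from start, in breadth-first order."""
    seen = {start}
    order = [start]
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for target in links.get(current, []):
            if target not in seen:   # also guards against link cycles
                seen.add(target)
                order.append(target)
                queue.append(target)
    return order
```

The resulting list is the set of objects that has to be exported together if every link in the delivered data is to have a target.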

There's more to links than viewing

  • Planning for links across the entire document lifecycle brings tangible benefits
  • Exploit the capabilities of your SGML document management system to support linking
  • Quality gains are certain to follow


Copyright © Texcel N.V. All rights reserved.