[Archive copy mirrored from: http://www.texcel.no/se97talk.htm]

Why you do (or don't) need HyTime in your document management system


Presentation by Paula Angerstein

of Texcel Research, Inc.

at SGML Europe '97

on May 14, 1997


What is HyTime?

  • Hypermedia/Time-based Structuring Language, defined in ISO standard 10744
  • Hypermedia: mostly addressing and linking techniques
  • Time-based: ways to encode information that describes things that happen over time
  • Additionally, there are generally applicable support features for SGML

Where is HyTime?

  • The standard has been around for 5 years
  • Technical Corrigendum (TC) very close to being published: some major rework and significant additions
  • Why don't you see HyTime in more products? It's big, powerful, and scary. It can be applied to simple problems, but doing so takes too much effort.
  • HyTime was “ahead of its time”. Market wants useful subsets: witness URLs and HTML links. XML taking this approach.

HyTime and document management

  • HyTime originally focused on static material; TC augments handling of dynamic data
  • HyTime usually thought of for delivery systems but there is an effect on document management and authoring
  • Linking is the most notable user requirement satisfied by HyTime

Some common link requirements

  • Links within and across documents
  • Links to an element or entity and occasionally to something else
  • Links that carry some set of semantics, such as role and behavior
  • Links into data that cannot be modified
  • Links that resolve into multiple anchors
  • Control over the direction of traversal of a link
  • Context-sensitive links
  • Version history of a link
  • Notification when a link end is invalidated or modified

Why ID/IDREF doesn't cut it

  • Limited to one document
  • Limited to addressing an element
  • Not enough standard semantics
  • Impossible if read-only data doesn't already have IDs
  • Can't maintain links independently of the data
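
The first two limitations, plus the read-only case, can be seen in a minimal Python sketch. It treats ID/IDREF resolution as a lookup in a per-document table of IDs. The element names, the `id` and `linkend` attribute names, and the `resolve_idref` helper are illustrative assumptions, and XML syntax stands in for SGML so that the standard library parser can read the samples.

```python
import xml.etree.ElementTree as ET

# Two separate documents. XML syntax stands in for SGML here.
DOC_A = '<doc><fig id="TexcelLogo"/><para>See <ref linkend="TexcelLogo"/></para></doc>'
DOC_B = '<doc><para>See <ref linkend="TexcelLogo"/></para><note>no ID here</note></doc>'


def build_id_table(root):
    """Map each ID value to its element, for one document only."""
    return {el.get("id"): el for el in root.iter() if el.get("id") is not None}


def resolve_idref(root, idref):
    """Resolve an IDREF against the IDs declared in the same document."""
    return build_id_table(root).get(idref)


root_a = ET.fromstring(DOC_A)
root_b = ET.fromstring(DOC_B)

# Within document A the reference resolves to the <fig> element.
target_in_a = resolve_idref(root_a, "TexcelLogo")

# The same reference in document B finds nothing: B has no way to
# name an ID that lives in A.
target_in_b = resolve_idref(root_b, "TexcelLogo")

# Nothing in B carries an ID, so no IDREF can address the <note>
# unless the data is edited to add one.
addressable_in_b = sorted(build_id_table(root_b))
```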

OK, how do I fix it?

  • HyTime: portable, ought to be interoperable
  • HTML: definitely interoperable, but too simple for many things
  • XML (eXtensible Markup Language): hey, this looks simple yet powerful! Can't wait until it's out there!
  • Use your document management system's capabilities: nicely integrated with other management of data, links made interoperable on export

Fundamentals of HyTime links

  • In the standard, HyTime defines architectural forms: a set of “meta” element classes and attributes with standard semantics
  • When you write a DTD, you make an element a link by applying an architectural form via a “HyTime” attribute
  • End result is that instances of the element in a document are links
  • A link relates two or more link ends. Each link end is a locator to a piece of data known as an anchor.
  • A link end addresses the anchor via various mechanisms such as nameloc, treeloc, queryloc, and dataloc.
  • A contextual link gets one of its link ends from the link element's position in the document.
  • An independent link resides independently of any of its link ends.

This is a HyTime link: <clink hytime="clink" linkend="TexcelLogo">
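
Addressing by tree position is what lets a link end reach data that has no IDs and cannot be modified. The following Python sketch is loosely modeled on the idea behind treeloc, not on the standard's exact syntax: a list of 1-based positions is walked from the root of the tree. The `resolve_tree_path` helper and the sample document are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

DOC = "<manual><title>Guide</title><chapter><para>One</para><para>Two</para></chapter></manual>"


def resolve_tree_path(root, path):
    """Walk from the root by 1-based child positions.

    The first number selects the root itself (so it must be 1); each
    later number selects the n-th child element of the current node.
    Returns None if the path runs off the tree.
    """
    if not path or path[0] != 1:
        return None
    node = root
    for position in path[1:]:
        children = list(node)
        if position < 1 or position > len(children):
            return None
        node = children[position - 1]
    return node


root = ET.fromstring(DOC)
anchor = resolve_tree_path(root, [1, 2, 2])   # root, its 2nd child, that child's 2nd child
missing = resolve_tree_path(root, [1, 3])     # the root has no 3rd child
```

The trade-off is fragility: inserting a sibling ahead of the anchor silently changes what the path resolves to, which is one reason link maintenance matters.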

Link requirements satisfied by HyTime

  • Links within and across documents
  • Links to an element or entity and occasionally to something else
  • Links that carry some set of semantics, such as role and behavior
  • Links into data that cannot be modified
  • Links that resolve into multiple anchors
  • Control over the direction of traversal of a link
  • Context-sensitive links
  • Version history of a link
  • Notification when a link end is invalidated or modified

Fundamentals of HTML links

  • The “A” tag is a link
  • It has an “href” attribute whose value is a URL
  • That's it

This is an HTML link: <a href="http://www.texcel.no/texcel.htm">

Link requirements satisfied by HTML

  • Links within and across documents
  • Links to an element or entity and occasionally to something else (only with system-dependent extensions)
  • Links that carry some set of semantics, such as role and behavior
  • Links into data that cannot be modified
  • Links that resolve into multiple anchors
  • Control over the direction of traversal of a link
  • Context-sensitive links
  • Version history of a link

Fundamentals of XML links

  • Any element becomes an XML link when it has an attribute named “xml-link”
  • The “href” attribute is the locator and is a URL
  • Additionally standardizes the fragment id and query portions of a URL as either ID referencing or TEI Extended Pointers
  • Other attributes specify information about the link such as its role and behavior
  • Simple links are like HyTime contextual links; extended links are like HyTime independent links

This is an XML link: <simple xml-link="simple" href="file:///C|/texcel/im/lib/texcel.gif">
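
As a rough Python sketch of the points above: a processor recognizes links by the “xml-link” attribute rather than by element name, then splits “href” into the resource URL and the fragment that addresses something inside it. The `find_links` helper, the `see` element, and the second sample URL with its fragment are illustrative assumptions; only a plain `#` fragment is handled, not TEI Extended Pointers.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urldefrag

DOC = """<doc>
  <simple xml-link="simple" href="file:///C|/texcel/im/lib/texcel.gif"/>
  <para>Plain text, not a link.</para>
  <see xml-link="simple" href="http://www.texcel.no/texcel.htm#TexcelLogo" role="illustration"/>
</doc>"""


def find_links(root):
    """Return (element name, resource URL, fragment, role) for every link element."""
    links = []
    for el in root.iter():
        if el.get("xml-link") is None:
            continue  # not a link: the attribute, not the tag name, decides
        url, fragment = urldefrag(el.get("href", ""))
        links.append((el.tag, url, fragment, el.get("role")))
    return links


links = find_links(ET.fromstring(DOC))
```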

Link requirements satisfied by XML

  • Links within and across documents
  • Links to an element or entity and occasionally to something else
  • Links that carry some set of semantics, such as role and behavior
  • Links into data that cannot be modified
  • Links that resolve into multiple anchors
  • Control over the direction of traversal of a link
  • Context-sensitive links
  • Version history of a link
  • Notification when a link end is invalidated or modified

What (at least one) DMS can do

  • SGML document management systems typically have unique object identification for every SGML element
  • These repository identifiers (RIDs) make complex addressing unnecessary: link resolution is simple RIDREF
  • Within its own domain, a DMS can provide unique ID generation and efficient link creation and management
  • When data goes out of this domain, links can be exported or translated to a standard form

    This is a RIDREF link: <link linkend="TexcelLogo">
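
A toy Python sketch of the idea, under stated assumptions: the `Repository` class, its method names, the `rid<n>` identifier format, and the `name#rid` export form are all hypothetical and do not describe any particular product. It shows every element receiving a repository identifier at check-in, resolution as a single lookup across documents, and translation to a URL-style reference on export.

```python
import itertools
import xml.etree.ElementTree as ET


class Repository:
    """Toy repository that gives every stored element a unique RID."""

    def __init__(self):
        self._next = itertools.count(1)
        self._by_rid = {}   # RID -> (document name, element)
        self._rid_of = {}   # id(element) -> RID

    def check_in(self, name, text):
        """Parse a document and assign a RID to each element, IDs or not."""
        root = ET.fromstring(text)
        for el in root.iter():
            rid = f"rid{next(self._next)}"
            self._by_rid[rid] = (name, el)
            self._rid_of[id(el)] = rid
        return root

    def rid_of(self, element):
        return self._rid_of[id(element)]

    def resolve(self, rid):
        """RIDREF resolution: one lookup, whichever document holds the target."""
        return self._by_rid[rid]

    def export_href(self, rid):
        """Translate a RIDREF to a URL-style reference for use outside the repository."""
        name, _ = self._by_rid[rid]
        return f"{name}#{rid}"


repo = Repository()
logo_doc = repo.check_in("logo.sgm", "<doc><fig/><para>text</para></doc>")
repo.check_in("guide.sgm", "<doc><para>See the logo.</para></doc>")

fig_rid = repo.rid_of(logo_doc[0])        # the <fig> has no ID in the source
doc_name, target = repo.resolve(fig_rid)
exported = repo.export_href(fig_rid)
```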

    Link requirements satisfied by repository-wide RID/RIDREF

    • Links within and across documents
    • Links to an element or entity and occasionally to something else
    • Links that carry some set of semantics, such as role and behavior (proprietary semantics)
    • Links into data that cannot be modified (only if the data already has IDs)
    • Links that resolve into multiple anchors
    • Control over the direction of traversal of a link
    • Context-sensitive links
    • Version history of a link

    How do I decide?

    • Only you know what your linking requirements are
    • You've got to think about how links affect your document management system:
      • Link creation
      • Link maintenance
      • Link delivery

    Link creation

    • How to highlight a link target: does the user interface make link targets visible?
    • How to address the link target: is an ID'ed element sufficient or do you have elements with no IDs, groups of elements, text spans, or non-SGML data?
    • How to make the link: what is the link type?
    • Do you need to attach additional information about the link?
    • Where to put the link: can you embed it in the data or do you need independent links?

    Link maintenance

    • How does change in a link target affect the link?
    • Do you need to edit links without editing the linked data?
    • Is link resolution dependent on the context in which the link is used?
    • Do you need version traceability of links in the context of the linked data?
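
Two of these questions, editing links without editing the linked data and reacting to a changed link target, can be sketched in Python with links held in a store of their own. The `LinkStore` class, its method names, and the anchor identifiers are hypothetical; a real system would also need versioning and context-dependent resolution, which this sketch leaves out.

```python
from collections import defaultdict


class LinkStore:
    """Independent links kept apart from the data they connect."""

    def __init__(self):
        self._links = {}                          # link name -> list of anchor ids
        self._links_by_anchor = defaultdict(set)  # anchor id -> names of links using it
        self.notices = []                         # (link name, anchor id, event)

    def add_link(self, name, anchors):
        self._links[name] = list(anchors)
        for anchor in anchors:
            self._links_by_anchor[anchor].add(name)

    def anchor_changed(self, anchor, event):
        """Record a notice for every link that has this anchor as a link end."""
        for name in sorted(self._links_by_anchor.get(anchor, ())):
            self.notices.append((name, anchor, event))

    def retarget(self, name, old, new):
        """Edit a link without touching the linked data."""
        self._links[name] = [new if a == old else a for a in self._links[name]]
        self._links_by_anchor[old].discard(name)
        self._links_by_anchor[new].add(name)

    def anchors(self, name):
        return list(self._links[name])


store = LinkStore()
store.add_link("see-logo", ["guide/para3", "logo/fig1"])
store.add_link("see-also", ["intro/para1", "logo/fig1"])

store.anchor_changed("logo/fig1", "deleted")          # both links are affected
store.retarget("see-logo", "logo/fig1", "logo/fig2")  # repair one of them
```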

    Link delivery

    • What is your browser expecting?
    • The document management system can package a web of linked data and generate the most efficient output
    • The document management system can drive conversion and delivery processes

    Making the decision

    • Two fundamental questions:
      • Is your web of linked documents well-bounded and controlled?
      • Is the linked data modifiable?
    • The closer to “yes”, the simpler your linking scheme can be

    The Technical Corrigendum

    • Publication planned within next few weeks
    • Clarification and better consistency
    • Generalizes and formalizes some of the major concepts
    • Additional annex for “SGML Extended Facilities”, planned to be moved to the SGML standard in next revision

    Generalized architectural forms

    • Enables association of a document architecture, or “rules for creating and processing documents”, with a DTD
    • Concept of a “meta-DTD” with constrained element classes and attributes
    • A real DTD references one or more meta-DTDs to make use of the definitions
    • Does not specify processing semantics, only makes specification of similar data consistent
    • Gives clues to document management systems about how to treat similar elements
    • HyTime has already standardized an architecture for linking; generalized architectural forms could lead to more areas of standardization

    Property set definitions

    • Mechanism for formally defining the object model for a notation
    • A property set and grove plan are defined for SGML
    • Supports the processes in HyTime and DSSSL, e.g., addressing and query, and makes them vastly more rigorous
    • Provides a way for document management systems to have a standard API to stored objects; should promote interoperability
    • Could lead to grove plans for other notations, e.g., CGM and PostScript, making this data transparently processable by grove-enabled document management systems

    Formal System Identifiers and Storage Managers

    • SGML has a rigorous description of formal public identifiers
    • To date, system identifiers have typically been treated as system-specific file names
    • A Formal System Identifier (FSI) standardizes the form of a system identifier
    • Maps entities to storage objects; supports one-to-many and many-to-one mappings
    • Some predefined locators: file names, URLs
    • Can declare a storage manager and then reference it in an FSI
    • Allows a document management system to define itself as a storage manager and use its own notation to map onto its data objects, e.g., a repository identifier or query
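
A much simplified Python sketch of how software might read an FSI: each piece names a storage manager in angle brackets, followed by the identifier that manager understands, and several pieces in a row map one entity to many storage objects. The parser ignores the attributes a real FSI may carry inside the brackets, and the `rid` storage manager, the registry, and the `describe` helper are hypothetical stand-ins for what a document management system might register.

```python
import re

# One storage object specification: <manager-name>storage-object-identifier
_SOS = re.compile(r"<([A-Za-z][A-Za-z0-9.-]*)>([^<]*)")


def parse_fsi(fsi):
    """Split a simplified FSI into (storage manager, identifier) pairs."""
    pairs = [(m.group(1).lower(), m.group(2).strip()) for m in _SOS.finditer(fsi)]
    if not pairs:
        raise ValueError(f"not a formal system identifier: {fsi!r}")
    return pairs


# Hypothetical registry: a DMS registers itself under the name "rid".
STORAGE_MANAGERS = {
    "osfile": lambda ident: f"open local file {ident}",
    "url": lambda ident: f"fetch {ident}",
    "rid": lambda ident: f"look up repository object {ident}",
}


def describe(fsi):
    """Say what each storage manager named in the FSI would be asked to do."""
    return [STORAGE_MANAGERS[name](ident) for name, ident in parse_fsi(fsi)]


one = describe("<url>http://www.texcel.no/texcel.htm")
many = describe("<osfile>front.sgm<osfile>body.sgm")   # one entity, two storage objects
from_repo = describe("<rid>TexcelLogo")
```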

    Did I answer the question?

    • No, only you can do that
    • HyTime has competitors for linking strategies
    • SGML Extended Facilities enable greater formal specification for SGML processing
    • As with all else, the market demand drives the products


    Copyright © 1997 Texcel N.V. Texcel and Texcel Information Manager are trademarks (TM) of Texcel N.V. All rights reserved. Information in this document is subject to change without notice. Other products and companies referred to herein are trademarks or registered trademarks of their respective companies or mark holders.