Why you do (or don't) need HyTime in your document management system
of Texcel Research, Inc.
at SGML Europe '97
on May 14, 1997
What is HyTime?
- Hypermedia/Time-based Structuring Language, defined in ISO/IEC 10744
- Hypermedia: mostly addressing and linking techniques
- Time-based: ways to encode information that describes things that
happen over time
- Additionally, there are generally applicable support features for SGML
Where is HyTime?
- The standard has been around for 5 years
- Technical Corrigendum (TC) very close to being published: some major rework
and significant additions
- Why don't you see HyTime in more products? It's big, powerful, and scary.
It can be applied to simple problems, but doing so takes too much effort.
- HyTime was ahead of its time. Market wants useful subsets:
witness URLs and HTML links. XML taking this approach.
HyTime and document management
- HyTime originally focused on static material; TC augments handling of
dynamic data
- HyTime usually thought of for delivery systems but there is an effect
on document management and authoring
- Linking is the most notable user requirement satisfied by HyTime
Some common link requirements
- Links within and across documents
- Links to an element or entity and occasionally to something else
- Links that carry some set of semantics, such as role and behavior
- Links into data that cannot be modified
- Links that resolve into multiple anchors
- Control over the direction of traversal of a link
- Context-sensitive links
- Version history of a link
- Notification when a link end is invalidated or modified
Why ID/IDREF doesn't cut it
- Limited to one document
- Limited to addressing an element
- Not enough standard semantics
- Impossible if read-only data doesn't already have IDs
- Can't maintain links independently of the data
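The first, second, and fourth limits can be illustrated with a minimal Python sketch. It assumes a hypothetical sample document and a hand-rolled resolver: the attribute names `id` and `linkend` are treated as ID and IDREF by convention only, since `xml.etree.ElementTree` does not validate them, and the name `LogoInOtherDoc` is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; "id" plays the role of an ID attribute
# and "linkend" the role of an IDREF attribute.
DOC = """
<manual>
  <figure id="TexcelLogo"/>
  <para>No ID on this element, so it cannot be a link target.</para>
  <xref linkend="TexcelLogo"/>
  <xref linkend="LogoInOtherDoc"/>
</manual>
"""

def resolve_idrefs(root):
    """Map each linkend value to its target element, or None if unresolved.

    Only IDs declared inside this one document are visible to the lookup.
    """
    ids = {el.get("id"): el for el in root.iter() if el.get("id") is not None}
    return {ref.get("linkend"): ids.get(ref.get("linkend"))
            for ref in root.iter("xref")}

root = ET.fromstring(DOC)
resolved = resolve_idrefs(root)
for name, target in resolved.items():
    print(name, "->", target.tag if target is not None else "unresolved")
```

The reference to an ID that lives in another document cannot be resolved, and the `para` element can never be a target because it carries no ID.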
OK, how do I fix it?
- HyTime: portable, ought to be interoperable
- HTML: definitely interoperable, but too simple for many things
- XML (eXtensible Markup Language): hey, this looks simple yet powerful!
Can't wait until it's out there!
- Use your document management system's capabilities: nicely integrated
with other management of data, links made interoperable on export
Fundamentals of HyTime links
- In the standard, HyTime defines architectural forms: a set of meta
element classes and attributes with standard semantics
- When you write a DTD, you make an element a link by applying an architectural
form via a HyTime attribute
- End result is that instances of the element in a document are links
- A link relates two or more link ends. Each link end is a
locator to a piece of data known as an anchor.
- A link end addresses the anchor via various mechanisms such as nameloc, treeloc, queryloc, and dataloc.
- A contextual link gets one of its link ends from the link element's
position in the document.
- An independent link resides independently of any of its link ends.
This is a HyTime link: <clink hytime="clink" linkend="TexcelLogo">
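The vocabulary above (link, link end, anchor, contextual versus independent) can be modelled in a few lines of Python. This is only a sketch of the concepts, not HyTime syntax: the class names, the `"self"` mechanism label, and the address strings are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class LinkEnd:
    """A locator to an anchor."""
    mechanism: str  # e.g. "nameloc", "treeloc", "queryloc", "dataloc"
    address: str    # mechanism-specific address (hypothetical string form)

@dataclass
class Link:
    link_ends: List[LinkEnd]
    host_element: Optional[str] = None  # set only for contextual links

def contextual_link(host_element: str, target: LinkEnd) -> Link:
    # One link end is implied by where the link element sits in the document.
    # "self" is an illustrative label, not a HyTime location form.
    return Link([LinkEnd("self", host_element), target], host_element)

def independent_link(*ends: LinkEnd) -> Link:
    # The link lives apart from all of its link ends, so every end is explicit.
    if len(ends) < 2:
        raise ValueError("a link relates two or more link ends")
    return Link(list(ends))

clink = contextual_link("para[3]/clink", LinkEnd("nameloc", "TexcelLogo"))
ilink = independent_link(LinkEnd("nameloc", "TexcelLogo"),
                         LinkEnd("treeloc", "1 2 4"))
print(clink)
print(ilink)
```

Keeping the link outside its anchors, as `independent_link` does, is what allows links into data that cannot be modified.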
Link requirements satisfied by HyTime
- Links within and across documents
- Links to an element or entity and occasionally to something else
- Links that carry some set of semantics, such as role and behavior
- Links into data that cannot be modified
- Links that resolve into multiple anchors
- Control over the direction of traversal of a link
- Context-sensitive links
- Version history of a link
- Notification when a link end is invalidated or modified
Fundamentals of HTML links
- The A tag is a link
- It has an href attribute whose value is a URL
- That's it
This is an HTML link: <a href="http://www.texcel.no/texcel.htm">
Link requirements satisfied by HTML
- Links within and across documents
- Links to an element or entity and occasionally to something else only
with system-dependent extensions
- Links that carry some set of semantics, such as role and behavior
- Links into data that cannot be modified
- Links that resolve into multiple anchors
- Control over the direction of traversal of a link
- Context-sensitive links
- Version history of a link
Fundamentals of XML links
- Any element becomes an XML link when it has an attribute named xml-link
- The href attribute is the locator and is a URL
- Additionally standardizes the fragment id and query portions of a URL
as either ID referencing or TEI Extended Pointers
- Other attributes specify information about the link such as its role and
behavior
- Simple links are like HyTime contextual links; extended links are like
HyTime independent links
This is an XML link: <simple xml-link="simple" href="file:///C|/texcel/im/lib/texcel.gif">
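As a rough illustration of the rules above, the following Python sketch finds every element that carries an `xml-link` attribute and splits its `href` into the URL and the fragment id. The sample document, the `seealso` element, the `#TexcelLogo` fragment, and the `role` value are assumptions made for the example; the attribute names follow the draft described on this slide.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urldefrag

# Hypothetical sample document.
DOC = """
<doc>
  <simple xml-link="simple" href="http://www.texcel.no/texcel.htm#TexcelLogo"/>
  <para>Plain text, not a link.</para>
  <seealso xml-link="simple" href="other.xml" role="background"/>
</doc>
"""

def find_links(root):
    """Return one record per element that has an xml-link attribute."""
    links = []
    for el in root.iter():
        kind = el.get("xml-link")
        if kind is None:
            continue  # any element can be a link, but only if it says so
        url, fragment = urldefrag(el.get("href", ""))
        links.append({
            "element": el.tag,
            "kind": kind,
            "url": url,
            "fragment": fragment or None,  # ID reference or extended pointer
            "role": el.get("role"),
        })
    return links

links = find_links(ET.fromstring(DOC))
for link in links:
    print(link)
```

The element name plays no part in the test, which is the point of the first bullet: linking behaviour comes from the attribute, not from a fixed tag such as HTML's A.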
Link requirements satisfied by XML
- Links within and across documents
- Links to an element or entity and occasionally to something else
- Links that carry some set of semantics, such as role and behavior
- Links into data that cannot be modified
- Links that resolve into multiple anchors
- Control over the direction of traversal of a link
- Context-sensitive links
- Version history of a link
- Notification when a link end is invalidated or modified
What (at least one) DMS can do
- SGML document management systems typically have unique object identification
for every SGML element
- These repository identifiers (RIDs) make complex addressing unnecessary:
link resolution is simple RIDREF
- Within its own domain, a DMS can provide unique ID generation and efficient
link creation and management
- When data goes out of this domain, links can be exported or translated
to a standard form
This is a RIDREF link: <link linkend="TexcelLogo">
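A toy repository makes the idea concrete. The sketch below is not the API of any real document management system: the `Repository` class, the `rid1`-style identifiers, the file name `logo.sgm`, and the `document#id` export form are all assumptions chosen for illustration.

```python
import itertools

class Repository:
    """Toy repository that gives every stored element a unique RID."""

    def __init__(self):
        self._counter = itertools.count(1)
        self._objects = {}  # RID -> description of the stored element

    def add(self, document, element, element_id=None):
        rid = f"rid{next(self._counter)}"
        self._objects[rid] = {"document": document, "element": element,
                              "id": element_id}
        return rid

    def resolve(self, rid):
        # Link resolution inside the repository is a plain lookup.
        return self._objects[rid]

    def export_link(self, rid):
        """Translate a RIDREF into a document-plus-ID address for export."""
        target = self._objects[rid]
        if target["id"] is None:
            raise ValueError("target has no ID; a richer address is needed")
        return f"{target['document']}#{target['id']}"

repo = Repository()
logo = repo.add("logo.sgm", "figure", "TexcelLogo")
para = repo.add("logo.sgm", "para")  # no ID, still addressable by RID
print(repo.resolve(para)["element"])
print(repo.export_link(logo))
```

Inside the repository both elements are equally easy to link to; only on export does the missing ID on the second element become a problem.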
Link requirements satisfied by repository-wide RID/RIDREF
- Links within and across documents
- Links to an element or entity and occasionally to something else
- Links that carry some set of semantics, such as role and behavior
(proprietary semantics)
- Links into data that cannot be modified (only if the data already has
IDs)
- Links that resolve into multiple anchors
- Control over the direction of traversal of a link
- Context-sensitive links
- Version history of a link
How do I decide?
- Only you know what your linking requirements are
- You've got to think about how links affect your document management system:
- Link creation
- Link maintenance
- Link delivery
Link creation
- How to highlight a link target: does the user interface make link
targets visible?
- How to address the link target: is an ID'ed element sufficient or do you
have elements with no IDs, groups of elements, text spans, or non-SGML data?
- How to make the link: what is the link type?
- Do you need to attach additional information about the link?
- Where to put the link: can you embed it in the data or do you need independent
links?
Link maintenance
- How does change in a link target affect the link?
- Do you need to edit links without editing the linked data?
- Is link resolution dependent on the context in which the link is used?
- Do you need version traceability of links in the context of the linked
data?
Link delivery
- What is your browser expecting?
- Document management system can package a web of linked data and generate
the most efficient output
- Document management system can drive conversion and delivery processes
Making the decision
- Two fundamental questions:
- Is your web of linked documents well-bounded and controlled?
- Is the linked data modifiable?
- The closer to yes, the simpler your linking scheme can be
The Technical Corrigendum
- Publication planned within next few weeks
- Clarification and better consistency
- Generalizes and formalizes some of the major concepts
- Additional annex for SGML Extended Facilities, planned to
be moved to the SGML standard in next revision
Generalized architectural forms
- Enables association of a document architecture (rules for creating
and processing documents) with a DTD
- Concept of a meta-DTD with constrained element classes and
attributes
- A real DTD references one or more meta-DTDs to make use of the definitions
- Does not specify processing semantics, only makes specification
of similar data consistent
- Gives clues to document management systems about how to treat similar
elements
- HyTime has already standardized an architecture for linking; generalized
architectural forms could lead to more areas of standardization
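The mapping from a real DTD to a meta-DTD can be pictured as a lookup from element types to architectural forms. The Python sketch below assumes a hypothetical DTD in which the element types `xref` and `seealso` both fix a `hytime` attribute to `clink`; it models the idea only and does not parse real DTD syntax.

```python
# Hypothetical DTD, reduced to the fixed attribute values it declares
# for each element type.
DTD_FIXED_ATTRS = {
    "xref":    {"hytime": "clink"},
    "seealso": {"hytime": "clink"},
    "para":    {},
}

def architectural_form(element_type, arch_attr="hytime"):
    """Return the architectural form an element type derives from, if any."""
    return DTD_FIXED_ATTRS.get(element_type, {}).get(arch_attr)

def elements_with_form(form, arch_attr="hytime"):
    """List every element type in the DTD that derives from the given form."""
    return sorted(name for name in DTD_FIXED_ATTRS
                  if architectural_form(name, arch_attr) == form)

print(elements_with_form("clink"))
print(architectural_form("para"))
```

A system that understands the `clink` form can treat `xref` and `seealso` alike without knowing either element name in advance.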
Property set definitions
- Mechanism for formally defining the object model for a notation
- A property set and grove plan are defined for SGML
- Supports the processes in HyTime and DSSSL, e.g., addressing and query,
and makes them vastly more rigorous
- Provides a way for document management systems to have a standard API
to stored objects; should promote interoperability
- Could lead to grove plans for other notations, e.g., CGM and PostScript,
making this data transparently processable by grove-enabled document management
systems
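A heavily simplified picture of a grove is a tree of nodes, each with a class name and a set of named properties. The Python sketch below assumes a made-up miniature property set (`gi`, `id`, `content`, `char`) rather than the full SGML property set, and the `Node` class and `walk` function are hypothetical.

```python
class Node:
    """A grove node: a class name plus named properties."""

    def __init__(self, class_name, **properties):
        self.class_name = class_name
        self.properties = properties

def walk(node, subnode_props=("content",)):
    """Yield the node and every node reachable through subnode properties."""
    yield node
    for prop in subnode_props:
        for child in node.properties.get(prop, []):
            yield from walk(child, subnode_props)

# Hypothetical grove for a tiny document.
grove = Node("element", gi="manual", content=[
    Node("element", gi="figure", id="TexcelLogo", content=[]),
    Node("element", gi="para", content=[Node("datachar", char="H")]),
])

figures = [n for n in walk(grove) if n.properties.get("gi") == "figure"]
classes = [n.class_name for n in walk(grove)]
print(figures[0].properties["id"])
print(classes)
```

Because `walk` knows nothing about SGML, the same traversal would work on a grove built from any other notation that has a property set.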
Formal System Identifiers and Storage Managers
- SGML has a rigorous description of formal public identifiers
- To date, system identifiers typically treated as system-specific file
names
- A Formal System Identifier (FSI) standardizes the form of a system identifier
- Maps entities to storage objects; supports one-to-many and many-to-one
mappings
- Some predefined locators: file names, URLs
- Can declare a storage manager and then reference it in an FSI
- Allows a document management system to define itself as a storage manager
and use its own notation to map onto its data objects, e.g., a repository
identifier or query
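The storage manager idea can be sketched as a registry keyed by the name at the front of the identifier. The Python below assumes a deliberately reduced FSI shape, `<manager>identifier`, with no attributes and only one storage object; the `rid` storage manager and the returned tuples are hypothetical.

```python
import re

# Reduced FSI shape assumed for this sketch: <manager>identifier
FSI_PATTERN = re.compile(r"^<(?P<manager>[A-Za-z][\w.-]*)>(?P<sid>.*)$", re.S)

STORAGE_MANAGERS = {}

def storage_manager(name):
    """Decorator that declares a storage manager under the given name."""
    def register(fn):
        STORAGE_MANAGERS[name.lower()] = fn
        return fn
    return register

@storage_manager("osfile")
def os_file(sid):
    return ("file", sid)

@storage_manager("url")
def url(sid):
    return ("url", sid)

@storage_manager("rid")  # hypothetical manager defined by a DMS
def repository_object(sid):
    return ("repository-object", sid)

def resolve_fsi(system_id):
    """Dispatch a system identifier to the storage manager it names."""
    match = FSI_PATTERN.match(system_id)
    if match is None:
        # Informal system identifier: treat it as a plain file name.
        return ("file", system_id)
    manager = match.group("manager").lower()
    if manager not in STORAGE_MANAGERS:
        raise LookupError(f"no storage manager declared for {manager!r}")
    return STORAGE_MANAGERS[manager](match.group("sid"))

print(resolve_fsi("<url>http://www.texcel.no/texcel.htm"))
print(resolve_fsi("<rid>rid42"))
print(resolve_fsi("texcel.gif"))
```

Registering `rid` alongside the predefined managers is the step that lets a document management system present its own objects as ordinary SGML entities.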
Did I answer the question?
- No, only you can do that
- HyTime has competitors for linking strategies
- SGML Extended Facilities enable greater formal specification for SGML
processing
- As with all else, the market demand drives the products
For more information about Texcel products and
services: info@texcel.no
Copyright © 1997 Texcel
N.V. Texcel and Texcel Information Manager are
trademarks (TM) of Texcel N.V. All
rights reserved. Information in this document is
subject to change without notice. Other products
and companies referred to herein are trademarks
or registered trademarks of their respective
companies or mark holders.