[Cache from http://www.sun.com/xml/developers/xlink/ 2001-01-08. See this canonical URL/version if possible.]

XML Linking: State of the Art

Eve Maler, Senior XML Standards Architect
Sun Microsystems XML Technology Center


Back in the early days -- that is, two years ago -- XML was most often compared to HTML, and in the eyes of many, XML came up short. HTML was a simple language anyone could learn; XML had complexities that could confuse developers. HTML had built-in formatting; XML needed a stylesheet to be displayed as anything other than raw code. HTML had hyperlinking functionality with the <A HREF> tag; XML didn't even give you a linking starter kit for embedding hyperlinks into XML in a standardized way.

Today, we know that XML is scalable and flexible in ways that would stretch HTML to the breaking point, which has allowed XML to become the "universal solvent" for all data, not just the narrative information that HTML was originally designed to hold. However, if XML is to capture one of the most important features of the web, it still needs to offer a standardized way to do linking. The goal of the XML Linking Working Group of the World Wide Web Consortium is to provide exactly this, and we're closing in on our goal.

This paper describes the features, benefits, and basic technical details of XML linking technologies.

Linking Today

The web has an ever-growing set of information resources interconnected by links. Every resource has a URI, or Uniform Resource Identifier, which enables you to find it on the web.

These resources are of many types. For example, in addition to HTML, there are graphic files in various formats and, increasingly, XML. Each type of resource must adhere to the rules of its own data format, and cannot speak the language of any other type. For example, HTML files must contain tags in angle brackets, whereas GIF files are encoded entirely differently.

However, for any resource type that can contain URIs, a resource of that type has the ability to use URIs to link to other resources of any type. Thus, an HTML file can link to an image file, allowing a picture to be presented as if it were embedded directly in the HTML content. Similarly, some non-HTML formats (such as mail messages) are allowed to contain linked regions within them so that users can click on a region to go to an HTML document.

One strength of web linking as it exists today is that it can associate resources of arbitrary types. Another strength is that it can associate arbitrary individual resources. If you want to create a link, you just need write access to the starting point (the file in which the link is coded); you don't need someone's permission to make their resource an ending point.

Most pre-web hypermedia systems had a goal of ensuring that links never break (that is, lead to the wrong thing or to nothing). Compared to these systems, the architecture of the web is unusual in that it tolerates broken links; the worldwide system doesn't grind to a halt if some resource isn't where it's supposed to be. Seen this way, the web's tolerance of broken links, while sometimes an annoyance, is actually a strength.

Linking, XML Style

XML was designed to build on the popular features of HTML while offering a more flexible, scalable data format. Likewise, XML linking is designed to take all the good parts from HTML linking while adding much more powerful functionality.

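The XML Linking Working Group has created two specifications to meet the two main requirements of XML linking: XLink, for describing links between resources, and XPointer, for addressing into the internal structure of XML documents.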

Code Example

This example illustrates many of the features of XLink and XPointer.

<?xml version="1.0"?>
<!DOCTYPE doc [
<!ATTLIST bib ident ID #IMPLIED>
]>
<doc xmlns:xlink="http://www.w3.org/1999/xlink">
<head>
<!-- The head wrapper and the xlink:label values are assumed names;
     the labels tie each arc to its locators. -->
<title>Contrived Linking Example</title>
<extendedlink xlink:type="extended">
  <loc xlink:type="locator"
       xlink:label="cite"
       xlink:href="#xpointer(//body/para[1]/citetitle[1])" />
  <loc xlink:type="locator"
       xlink:label="bib"
       xlink:href="#xml" />
  <arc xlink:type="arc"
       xlink:from="cite"
       xlink:to="bib"
       xlink:show="new"
       xlink:actuate="onRequest" />
</extendedlink>
<extendedlink xlink:type="extended">
  <loc xlink:type="locator"
       xlink:label="mention"
       xlink:href="#xpointer(string-range(//text(),'prolog'))" />
  <loc xlink:type="locator"
       xlink:label="definition"
       xlink:href="#/1/2/5" />
  <arc xlink:type="arc"
       xlink:from="mention"
       xlink:to="definition"
       xlink:show="new"
       xlink:actuate="onRequest" />
</extendedlink>
</head>
<body>
<para>The syntax of XML is defined by
<citetitle>Extensible Markup Language (XML) 1.0</citetitle>.
In this contrived example, we note that it includes some
productions:</para>

<prod num="1">document ::= prolog element Misc*</prod>

<para>Here are a couple more:</para>

<prod num="23">XMLDecl ::= '&lt;?xml' VersionInfo EncodingDecl?
SDDecl? S? '?&gt;'</prod>

<prod num="22">prolog ::= XMLDecl? Misc* (doctypedecl Misc*)?</prod>

<para>We might refer to them in our document: see
<simplelink xlink:type="simple"
            xlink:href="#xpointer(//prod[@num='22'])"
            xlink:show="replace"
            xlink:actuate="onRequest">production 22</simplelink>.
</para>

<bib ident="xml">Tim Bray, Jean Paoli, and C. M. Sperberg-McQueen,
editors. <citetitle>Extensible Markup Language (XML) 1.0</citetitle>.
World Wide Web Consortium, 1998. (See
<!-- the rest of this reference is truncated in the source -->)</bib>
</body>
</doc>


XLink

Despite the strengths of web linking, HTML links have some limitations.

For example, it can be a challenge to keep all the links pointing to the right places on a large web site with handcrafted content; if the file system is reorganized, someone has to go back and edit all the documents containing links that point to files that have moved.

Also, not all resources can contain links. If you're a professor reviewing a student's paper online and you want to annotate a particular sentence so that the student can read the paper and click on your link to see your comments, you're out of luck because you don't have write access to the paper. If you were a film professor reviewing a student's video, your luck may be doubly bad because the video format may not allow the insertion of links at any point.

XLink helps solve the challenges presented in both of the above scenarios. How? In HTML, you have to surround the starting point with an <A> element and then provide a URI for the ending point, so you need write permission on the resource containing the starting point to do any linking at all. In XLink, you simply provide a URI reference for both the starting point and the ending point -- no permissions are required for either one, and there's no need to edit the starting points to fix them when the ending points get moved around.

Because XLink allows this, we can expect "content" to be created that consists solely of huge databases of links, with no content actually created by the link authors.

Following are details on how XLink works.

The Model Underlying XLink

The three main types of items in the XLink universe are:

The types of metadata you can apply to these components include both machine-processable roles modeled on RDF's (Resource Description Framework) notion of properties and human-readable titles. You might use titles, for example, to provide mouse-over text that helps a user choose the correct link.

XLink's Markup Design

In order to include XLink features in an XML document, you need to express them in terms of elements and attributes. XLink uses a somewhat unusual vocabulary design: It consists only of attributes, so that you can designate whatever elements you want to be linking elements. In this way, XLink is an "enabling" vocabulary; you don't use it by itself, but rather incorporate it into your own vocabulary. (For this reason, there is no normative DTD for XLink.) The XLink vocabulary is in a namespace with the name http://www.w3.org/1999/xlink.

As an example, to create an XLink "arc" element, you would use the XLink type attribute with a value of "arc" on the desired element, here called myArc:

<myArc xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="arc" ... />

The possible XLink types are simple, extended, resource, locator, arc, title, and none. For each type, there are rules about what other XLink attributes may appear on that element, and about what values those attributes may have. In the code example, the arc element has xlink:from and xlink:to attributes. These attributes have an effect only if they are on an element with an xlink:type attribute value of arc.
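As a rough sketch of how several of these types might sit together on elements of your own vocabulary, consider the following. Every element name, label value, and file name here is invented for illustration; only the xlink attributes come from XLink, and the use of xlink:label to name participants assumes the form of the specification in which from and to refer to labels.

```xml
<!-- Hypothetical vocabulary: only the xlink attributes are defined by XLink. -->
<seeAlso xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="extended">
  <!-- title-type element: a human-readable description of the link -->
  <caption xlink:type="title">Related reading</caption>
  <!-- resource-type element: a local participant, contained in the link -->
  <here xlink:type="resource" xlink:label="start">this section</here>
  <!-- locator-type element: a remote participant, found through a URI reference -->
  <there xlink:type="locator" xlink:label="end"
         xlink:href="glossary.xml#linkbase" />
  <!-- arc-type element: xlink:from and xlink:to have an effect only here -->
  <go xlink:type="arc" xlink:from="start" xlink:to="end" />
</seeAlso>
```

Because the type is carried by an attribute, the same markup would work equally well if seeAlso, here, there, and go had any other names.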

Kinds of Links and Arcs

The extended-type element provides a fairly close mapping of the XLink model to actual syntax. Extended links allow you to specify an arbitrary number of participants and arcs, where any one participant, starting or ending, may be local or remote. Groups of extended links stored together are called linkbases (link databases).

The local/remote distinction in any one arc turns out to be important: If its starting point is local (contained in the link), then the link itself contains all the information required for traversing the arc. This is called an outbound arc because it "emanates from" the linking element when traversed. If the arc's starting point is remote but its ending point is local, it is called an inbound arc. If both the starting point and the ending point are remote, the arc is called a third-party arc. In the code example, the two extended links contain only third-party arcs.
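The distinction can be sketched with a single extended link that holds one local and one remote participant. The element names, the label values, and the file notes.xml are assumptions made for this illustration.

```xml
<footnoteLink xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="extended">
  <!-- Local participant: its content lives inside the linking element. -->
  <marker xlink:type="resource" xlink:label="local">[1]</marker>
  <!-- Remote participant: reached through a URI reference. -->
  <note xlink:type="locator" xlink:label="remote" xlink:href="notes.xml#n1" />
  <!-- Outbound arc: local starting point, remote ending point. -->
  <go xlink:type="arc" xlink:from="local" xlink:to="remote" />
  <!-- Inbound arc: remote starting point, local ending point. -->
  <back xlink:type="arc" xlink:from="remote" xlink:to="local" />
</footnoteLink>
```

An arc whose from and to values both named locator-type elements would be a third-party arc.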

Of course, in the cases of inbound and third-party arcs, XLink processors need to be told where they can find the link information, because it doesn't reside in the same place as the starting point. If they can't find the link, a user viewing the resource containing the starting point will never know the link is there -- the starting point just looks like undistinguished content. XLink defines a way for link processors to hunt down linkbases that might be relevant for a document, so that the starting points can be traversed.
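For the professor's annotations described earlier, a linkbase might look something like the sketch below. The file names, element names, and label values are hypothetical, and the XPointer into the paper assumes that the paper is marked up with section and para elements.

```xml
<!-- comments.xml: a hypothetical linkbase kept by the professor. -->
<comments xmlns:xlink="http://www.w3.org/1999/xlink">
  <comment xlink:type="extended">
    <!-- Starting point: a passage in the student's paper, which is never edited. -->
    <passage xlink:type="locator" xlink:label="passage"
             xlink:href="paper.xml#xpointer(//section[2]/para[3])" />
    <!-- Ending point: the professor's remark, stored in another file. -->
    <remark xlink:type="locator" xlink:label="remark"
            xlink:href="remarks.xml#r1" />
    <!-- Third-party arc: both ends are remote from this linking element. -->
    <go xlink:type="arc" xlink:from="passage" xlink:to="remark"
        xlink:show="new" xlink:actuate="onRequest" />
  </comment>
</comments>
```

In the XLink Recommendation, a document directs processors to such a file with an arc whose xlink:arcrole value is http://www.w3.org/1999/xlink/properties/linkbase; whether a given processor honors that mechanism is an assumption you would need to check.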

As you may have noticed, simple-type elements seem to play multiple parts in the model. They provide a more convenient and compact syntax for a very common kind of extended link, one which has exactly two participants and one outbound arc. This is identical to the linking structure of HTML <A> and <IMG> links.
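The correspondence can be sketched by writing the same link both ways. The element names, label values, and the target spec.xml are assumptions for illustration, and the equivalence is approximate rather than a formal expansion rule.

```xml
<examples xmlns:xlink="http://www.w3.org/1999/xlink">
  <!-- Simple link: the element is the local starting point; href names the remote end. -->
  <cite xlink:type="simple" xlink:href="spec.xml">the specification</cite>

  <!-- Roughly the same link spelled out as an extended link:
       two participants and one outbound arc. -->
  <cite xlink:type="extended">
    <text xlink:type="resource" xlink:label="local">the specification</text>
    <target xlink:type="locator" xlink:label="remote" xlink:href="spec.xml" />
    <go xlink:type="arc" xlink:from="local" xlink:to="remote" />
  </cite>
</examples>
```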

Traversal Behavior Options

The example shows two cases of assigning traversal behavior to an arc. The arc created by the arc element, when traversed, is supposed to show the ending point in a new window on a user's (or application's) request. The arc created by the simplelink element is supposed to show the ending point in the same window as the starting point, but again only when requested.

The xlink:show attribute controls the options for presentation of the ending point, and the xlink:actuate attribute controls the options for the type of event that kicks off the traversal. Each has a finite list of allowed values. The attributes have two values in common: none, which specifies that the presentation or actuation is not controlled at all by XLink processors, and other, which specifies that XLink processors should examine other (non-XLink) attributes on the element in order to discover link-processing instructions.

Other values for the xlink:show attribute are new and replace, explained above, and embed, which embeds a presentation of the ending point directly into the presentation context (window, pane, or whatever) of the resource containing the starting point. This value lets you achieve the effect of the HTML <IMG> element.

Other values for the xlink:actuate attribute are onRequest, explained above, and onLoad. You can use different combinations of attribute values to achieve different effects. For example, with a combination of replace and onLoad, the link produces the effect of an automatic redirect.
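A few such combinations are sketched below on simple links. The element names and target files are invented for illustration; the attribute values are the ones described above.

```xml
<page xmlns:xlink="http://www.w3.org/1999/xlink">
  <!-- embed + onLoad: the picture appears in place, much like HTML <IMG>. -->
  <figure xlink:type="simple" xlink:href="diagram.gif"
          xlink:show="embed" xlink:actuate="onLoad" />
  <!-- replace + onLoad: behaves like an automatic redirect. -->
  <moved xlink:type="simple" xlink:href="new-location.xml"
         xlink:show="replace" xlink:actuate="onLoad" />
  <!-- new + onRequest: opens in a new window when the user asks for it. -->
  <ref xlink:type="simple" xlink:href="glossary.xml"
       xlink:show="new" xlink:actuate="onRequest">glossary</ref>
</page>
```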


XPointer

When you provide a URI reference that has as its target an HTML document, you have the opportunity to identify a spot within the document to which a browser should scroll. Essentially, you're addressing into the HTML document, not just to it.

However, pointing inside an HTML document works only if the creator of the HTML document was kind enough to provide an <A NAME> or <A ID> for you to use; you can't point to locations that weren't so identified. Also, the best you can do with HTML is point to the spot where such an identifier occurs, not a whole "chunk" such as a paragraph or division. That is because the structure of the average HTML file is not very regular; even though there is a formal DTD specification for HTML, browsers don't require that you comply with it. Broken HTML files make it hard to point to a particular region without ambiguity about which region was actually intended.

XPointer takes advantage of XML's inherent structure to allow addressing into any portion of an XML document. Programmatically, you just use the structure to provide guideposts in your description of the desired content: "Give me the first paragraph in the section of the paper that has ID 'compareAndContrast'," or "Give me the second through the last of the item elements in the purchase order."
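The two requests quoted above might be written as XPointers along the following lines. This is a sketch only: the file names and the element names para, purchaseOrder, and item are assumptions about the target documents, and the first expression assumes that the paragraphs are direct children of the identified section.

```xml
<pointers xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="extended">
  <!-- First para child of the element whose ID is compareAndContrast. -->
  <loc xlink:type="locator"
       xlink:href="paper.xml#xpointer(id('compareAndContrast')/para[1])" />
  <!-- Second through last item elements in the purchase order. -->
  <loc xlink:type="locator"
       xlink:href="order.xml#xpointer(//purchaseOrder/item[position() >= 2])" />
</pointers>
```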

XPointer gets most of its power from XPath, which deals with whole nodes such as elements and attributes. What XPointer adds to the mix is the ability to address arbitrary ranges of content, even if they don't form whole nodes. For example, in our college professor scenario, if the professor wants to comment on the last sentence in a paragraph, XPointer can address that passage even if the sentence has no distinguishing markup around it.

Following are details on how XPointer works.

XPointer and URIs

The combination of a URI and a string beginning with a crosshatch (#) character is called a URI reference, and the string after the crosshatch is called a fragment identifier. The defining standard for each MIME type has the opportunity to define the fragment identifier language to be used in URI references that point into resources of that MIME type. For example, HTML's fragment identifier language consists of a simple string that matches the value of some <A NAME> or <A ID> attribute in the document.

XML's structure allows for a much finer granularity of referencing, and so the fragment identifier language for resources of MIME types text/xml and application/xml is correspondingly more complex. XML's fragment identifier language is defined by the XPointer specification.

Keep in mind that XPointers are used when your URI points to a document of type text/xml or application/xml. Therefore, XPointers can actually appear inside any kind of document that can contain URIs -- not just XML documents.

The XPointer Language

The following five XPointer fragment identifiers appear in the code example (though not in the order shown here):

  1. #xml
  2. #/1/2/5
  3. #xpointer(//prod[@num='22'])
  4. #xpointer(//body/para[1]/citetitle[1])
  5. #xpointer(string-range(//text(),'prolog'))

These XPointers illustrate much of the variety you can find in the XPointer language. To keep things simple, each is attached to a null URI (that is, the URI field before the crosshatch is empty), which means that each refers to a location in the current document.

The first XPointer is an example of a bare-name XPointer. It points to the bib element. Bare-name XPointers function very much like HTML fragment identifiers, except that they find any element that has an attribute of type ID (as opposed to an HTML attribute called NAME or ID). In this way, it is also like XML's attribute type IDREF; if you are pointing into the current document, bare names differ from the syntax of IDREF only in the addition of the crosshatch character.

The second is an example of a child sequence XPointer. It counts element children to locate the desired element. The /1 refers to the doc element, the /2 refers to the body element, which is the second child element inside doc, and the /5 refers to the desired prod element, which is fifth inside body. A child sequence can begin with either /1 or an ID value.
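Both starting styles are sketched below as locators that could be placed in the example document. The wrapper element name refs is invented; the second child sequence is not in the original example and is shown only to illustrate starting from an ID value.

```xml
<refs xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="extended">
  <!-- Starts at the document element: doc, then its second child element,
       then that element's fifth child element. -->
  <loc xlink:type="locator" xlink:href="#/1/2/5" />
  <!-- Starts at the element whose ID is xml (the bib element),
       then takes its first child element, the citetitle. -->
  <loc xlink:type="locator" xlink:href="#xml/1" />
</refs>
```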

The last three XPointers show the full syntax, which starts with the keyword xpointer and is followed by a parenthesized expression. The expression is a superset of XPath; in fact, the third and fourth XPointers contain expressions that are directly usable as XPath expressions. The third XPointer locates the prod element whose num attribute value is 22. The fourth XPointer locates the first citetitle element inside the first para element anywhere inside the body element.

The final XPointer demonstrates a feature that is unique to the XPointer language. String ranges allow for the targeting of substrings in XML content that are otherwise not marked up. This XPointer actually targets all the references to the word prolog in the document. The XLink in which the XPointer is used creates arcs from each of them to the prod element in which the prolog construct is defined.

Not shown in the code example is another unique feature, a range. Ranges allow for the targeting of arbitrary ranges of XML content whether or not they contain whole nodes. To locate a range with an XPointer, you supply two inner XPointers as parameters to the range-to() function for the starting and ending points of the range. Following is an example of a range that stretches all the way from the beginning of the prod element with the num value of 1 to the end of the prod element with the num value of 22, including both the paragraph and the production that are in between:
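A sketch of such an expression follows, written as the href of a locator. It assumes the XPath-style form used in the XPointer Candidate Recommendation, in which range-to() is applied to the starting location and takes the ending location as its argument; the loc element name is borrowed from the code example.

```xml
<loc xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="locator"
     xlink:href="#xpointer(//prod[@num='1']/range-to(//prod[@num='22']))" />
```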


The Status of XML Linking

As of this writing, things are moving quickly in the XML linking world. XLink and XPointer (along with XML Base, another specification owned by the XML Linking WG) are in the W3C Candidate Recommendation phase, during which the W3C actively seeks implementation experience. After this phase, specifications typically move on to a Proposed Recommendation phase and then reach Recommendation status. Also, several W3C Notes have been published on technical matters related to XML linking, and one more is expected. The following sources are already publicly available:

The Author:

As an XML Standards Architect in Sun's XML Technology Center, Eve Maler specializes in the development of XML-related standards and vocabularies.

Eve was a charter member of the World Wide Web Consortium working group that created XML, and currently serves as Sun's voting representative to the W3C. She co-chairs its XML Linking working group and edits the XLink and XPointer specifications. Eve is co-author of Developing SGML DTDs: From Text to Model to Markup, the only book available on a methodology for designing DTDs.