Namespaces and URNs

From Tue Aug  4 16:15:06 1998
Date:     Tue, 04 Aug 1998 16:03:57 -0500
To:       XML Dev <>
From:     "W. Eliot Kimber" <>
Subject:  Re: Namespaces and URNs

At 04:31 PM 8/4/98 -0400, John Cowan wrote:

>RFC 1737 says vaguely that ISBNs, ISO public identifiers (what are
>those, please?) and UPC product codes "seem to satisfy the functional
>requirements" of URNs.

ISO public identifiers are a form of universal name designed to meet the requirement of SGML documents to reference storage objects without referencing system-specific locations. They are formally defined in ISO 8879 (the SGML standard) and in ISO 9070, which defines various registration authorities for public IDs (of which the ISBN is one). The SGML standard does not define how public IDs are to be resolved into system IDs--that is left up to particular systems. The SGML Open (now OASIS) entity catalog mechanism provides a way to do this resolution. It is supported by most SGML tools, including SP and its derivatives (and, I presume, EXPAT, but I haven't tried it). Public IDs predate the Internet and are independent of any type of system, rather than being specific to a particular access method or networking infrastructure.
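As a rough illustration of the catalog idea described above, here is a minimal Python sketch that reads catalog entries of the form PUBLIC "public id" "system id" and resolves a public ID to a system ID. The function names, the example public IDs, and the file paths are all hypothetical, and the sketch handles only quoted PUBLIC entries; it is not a full implementation of the catalog format and does not reflect how SP or any other tool does the lookup.

```python
import re

# Matches entries of the form: PUBLIC "public id" "system id"
# (only this one entry type is handled in this sketch)
_PUBLIC_ENTRY = re.compile(r'PUBLIC\s+"([^"]*)"\s+"([^"]*)"')


def normalize_public_id(public_id):
    """Collapse runs of whitespace so equivalent public IDs compare equal."""
    return " ".join(public_id.split())


def parse_catalog(text):
    """Build a public-ID -> system-ID table from catalog text.

    If the same public ID appears twice, the first entry wins.
    """
    catalog = {}
    for public_id, system_id in _PUBLIC_ENTRY.findall(text):
        catalog.setdefault(normalize_public_id(public_id), system_id)
    return catalog


def resolve(catalog, public_id, fallback_system_id=None):
    """Return the mapped system ID, or the fallback if there is no entry."""
    return catalog.get(normalize_public_id(public_id), fallback_system_id)


# Hypothetical catalog content, for illustration only.
EXAMPLE_CATALOG = '''
PUBLIC "-//Example Corp//DTD Report//EN" "dtds/report.dtd"
PUBLIC "-//Example Corp//ENTITIES Logos//EN" "/usr/local/sgml/logos.ent"
'''

if __name__ == "__main__":
    catalog = parse_catalog(EXAMPLE_CATALOG)
    print(resolve(catalog, "-//Example Corp//DTD Report//EN"))
```

The point of the sketch is that the document itself carries only the public ID; the catalog, which lives outside the document, is the only thing that changes when the storage objects move.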

Note that there is nothing magical about public IDs or URNs that distinguishes them from system IDs or URLs except the expectations that they generate: if you claim that something is a URN then there is an expectation that the name will be persistent. If you claim that something is a system ID, then there is no expectation that the name will be persistent. However, whether you call it a URN or URL (or a public ID or a system ID), it is up to the owner of the *name* to ensure that it is persistent.

Thus, URLs can be just as persistent as URNs, but nobody expects them to be. URN resolution is presumably always indirect (because persistence cannot be ensured in the general case without some form of indirection), but URLs can be just as indirect, so from an implementation standpoint, there's no useful difference between URNs and URLs.

SGML made the distinction between "public" IDs and "system" IDs based on the idea that some things would be "published" and thereby made public, so you would need a way to name them that wasn't in terms of any particular storage system. However, it turns out that without an infrastructure for publishing things, the distinction between public and non-public doesn't make much sense.

However, the distinction between "can expect it to be persistent" and "can't expect it to be persistent" is a useful one. In particular, the requirement to support URNs forces you to put some sort of indirection mechanism in place, something you can get by without for URLs. The problem, of course, is that some people do need the names they own to be persistent and are therefore forced to build (or buy, as with PURLs) the infrastructure to manage name persistence. SGML assumed that data objects would move between different types of systems and therefore had to have some system-independent way of referring to things. The Web *is* a storage system and therefore doesn't need system-independent storage identifiers.

This is a subtle but important distinction: if your data only operates on the Web, then URLs are all you need. But if your data operates in a number of environments (or may be moved from one environment to another over its lifetime), then URLs are just one of an infinite number of system-specific addressing methods. You need to protect your data from all of them by using system-independent names in your data and providing mappings of the moment as part of your data management infrastructure.
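To make "mappings of the moment" concrete, the following Python sketch keeps one system-independent name in the data and a separate mapping table per environment. Everything here is an assumption made for illustration: the environment labels, the urn:x-example name, the locations, and the locate function are hypothetical and do not correspond to any existing resolution service.

```python
# One table per environment; the data itself only ever contains the name.
# All names and locations below are hypothetical.
MAPPINGS = {
    "web": {
        "urn:x-example:manual:1998": "http://www.example.com/docs/manual.sgm",
    },
    "local-archive": {
        "urn:x-example:manual:1998": "/archive/docs/manual.sgm",
    },
}


def locate(name, environment, mappings=MAPPINGS):
    """Map a system-independent name to its current location in one environment."""
    try:
        return mappings[environment][name]
    except KeyError:
        raise LookupError(
            "no mapping for %r in environment %r" % (name, environment)
        ) from None


if __name__ == "__main__":
    # The reference stored in the document never changes...
    reference = "urn:x-example:manual:1998"
    # ...only the table consulted at resolution time does.
    for env in ("web", "local-archive"):
        print(env, "->", locate(reference, env))
```

Moving the data to a new environment then means adding one more table, not rewriting the references inside the documents.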

In my opinion, the Internet should provide a persistent name infrastructure analogous to (but more general than) DNS so that each server doesn't have to provide its own way to manage persistent names. And I don't want to have to pay for it on a per-name basis--I want it to be an infrastructure whose cost is borne by the community at large, just the way the rest of the Internet infrastructure is.

Note that it is the job of the owner of the resource to ensure that the resource is persistent, not that all of the names that might refer to it are persistent (although there is a general supposition that the owner of the resource might also maintain a name for the resource).

In other words, the owner of a name for a resource and the owner of the resource itself may not be the same entity.



<Address HyTime=bibloc>
W. Eliot Kimber, Senior Consulting SGML Engineer
ISOGEN International Corp.
2200 N. Lamar St., Suite 230, Dallas, TX 75202.  214.953.0004
