[This local archive copy is from the official and canonical URL, http://www.oasis-open.org/html/spec.htm; please refer to the canonical source document if possible.]
Copyright © 1999 by OASIS–Organization for the Advancement of Structured Information Standards
5 July 1999
The OASIS Registry and Repository Technical Committee of OASIS, the Organization for the Advancement of Structured Information Standards (formerly SGML Open), seeks to specify operation of a registry for some set of SGML- or XML-related entities, including but not limited to DTDs and schemas, with appropriate interfaces that enable searching on the contents of a repository of those entities. The registry and repository shall interoperate and cooperate with other registries and repositories compliant with this specification and respond to requests for entities by their identifiers. The specification, which is the primary deliverable, is to be implemented in a prototype registry and repository.
This is version 0.1.5 of this specification, an update of the initial version with some added material. Comments more than welcome, to tallen[at]sonic.net.
This document was written in Norm Walsh's Simplified Docbook XML DTD, and converted to HTML with the aid of his XSL style sheet. Output in RTF and other formats can be obtained by using James Clark's XT and XP tools and Norm Walsh's DSSSL style sheets for Docbook.
Please be sure to cite the version of the specification along with relevant section title when making comments.
Organization of this Document. The body of this document includes sections dealing with the registry and the repository, and a section on “Other Considerations” that apply to both. The section dealing with the registry is mostly normative, and the section dealing with the repository is mostly nonnormative, as it covers issues that may vary according to implementation. Some of the “Other Considerations” are normative, also. It has been suggested that normative text should be separated from nonnormative; doing that might produce a specification plus a white paper on how the OASIS prototype registry and repository works. For now, it's all here, with sections intended to be normative labelled as such.
The words “must,” “must not,” “required,” “shall,” “shall not,” “should,” “should not,” “recommended,” “may,” and “optional” in this document are to be interpreted as described in RFC 2119.
As XML comes into use on the Web, DTDs, schemas, style sheets, and reusable public text will be referred to by identifier, rather than being packaged with actual documents. It is critically necessary to be able to retrieve the referred-to entities, and in the Web context, it is preferable to be able to do this automatically. It is also vital for users to be able to locate DTDs and schemas for the document types they want to create by consulting an interface to metadata about those DTDs and schemas.
Objective and Deliverables. The objective of the OASIS Registry and Repository Technical Committee is to develop a specification for interoperable registries and repositories for SGML- and XML-related entities, including but not limited to DTDs and schemas, with an interface that enables searching on the contents of a repository of those entities, and to construct a prototype registry and repository. The registry and repository are to be designed to interoperate and cooperate with other registries and repositories compliant with this specification. The prototype is intended to serve as a model for an extensible and distributed network of registries and repositories; the specification is viewed as the primary deliverable.
Chronology and Progress to Date. The first meeting of the OASIS Registry and Repository Technical Committee was at the OASIS Technical Committee meeting in Chicago, on 15 November 1998, where interest was stated, a rough outline of the problem agreed upon, and the chairman, Terry Allen, elected. Discussion began in January 1999 with the distribution by the chairman of a preliminary briefing package, scenarios, business case, and other documents. These were revised in February 1999. Separately, Terry Allen has worked on a DTD for ISO 11179 (for registry metadata) and distributed it. (Distribution has been to the e-mail discussion list firstname.lastname@example.org.) A second meeting was held at the OASIS Technical Committee meeting in San Jose, on 8 March 1999. The third meeting was in Granada on 23 April 1999, where the OASIS Registry and Repository Technical Committee held a useful discussion on practical matters and reiterated general support for the project.
The Technical Committee has agreed that it will use ISO 11179, “Specification and Standardization of Data Elements,” as the underlying registry metadata format, and that the core functional requirements for the repository are that it return content in response to a request by URN, URL, PI, or FPI, and return content in response to a request made through an interface to the registry.
Remaining Project Schedule
Initial Technical Specification: 15 June 1999
Business Case Document: 15 July 1999 (rescheduled from June 15)
Revised Technical Specification: 15 August 1999
Completed Technical Specification: 15 September 1999
Prototype Operational: 15 September 1999
Contributors. The chairman of the OASIS Registry and Repository Technical Committee is Terry Allen of Commerce One, Inc. Norbert Mikula of DataChannel chairs the OASIS Technical Committee of which the OASIS Registry and Repository Technical Committee is a part. Names of individuals and organizations contributing to the development of the specification (and to the prototype) go here. So far, I have Jon Bosak (Sun), Carla Corkern (ISOGEN?), Robin Cover (ISOGEN?), Ron Daniel (Datafusion), Eduardo Gutentag (Sun), Mike MacKay (Novell), Michael Mealling (Network Solutions), Bill Smith (Sun), Norm Walsh (Arbortext), and Terry Allen (Commerce One). This is not a complete list of OASIS Registry and Repository Technical Committee members, and perhaps it ought to be.
The following design principles have been observed:
The Registry and Repository Technical Specification shall employ existing standards and specifications where possible, avoiding specifications that are not stable. OASIS must be prepared to track developments such as ANSI X3.285, which is the proposed revision of Part 3 of ISO 11179, and the W3C's XML Schema specification so that they can be considered for use when they are mature.
The normative part of the Registry and Repository Technical Specification shall be as small as reasonable.
The normative part of the Registry and Repository Technical Specification shall be complete enough that registries and repositories conformant to it can interoperate in an extensible and distributed network.
The normative part of the Registry and Repository Technical Specification shall be extensible; in particular, it shall be possible to extend the registration information schema or DTD without inhibiting interoperability among registries. (This point is called out because the registration information schema or DTD is likely to be a normative part of this specification).
Immediate needs should be satisfied first. A repository offers opportunities for the application of many kinds of technologies; OASIS should focus on providing DTDs and schemas, and an interface to their metadata, before proceeding to other matters.
The registry shall be user-friendly.
The Registry and Repository Technical Specification shall be vendor-neutral.
The Registry and Repository Technical Specification shall use XML by preference for encoding of information and documents.
The first complete and finished version of the Registry and Repository Technical Specification shall be delivered quickly.
TO BE DISCUSSED: should the Registry and Repository Technical Specification be neutral as to transport mechanism, or should use of HTTP be assumed?
The registry and repository shall be scalable.
These scenarios involve both users retrieving something from the repository and contributors registering something in the registry, which may involve depositing something in the repository.
A user or user agent retrieves an XML-related entity such as a DTD automatically over the Web, as a result of some use of it in an XML context.
Motivation. Unless everything needed for parsing and displaying a document under all circumstances is packaged with the document itself, the document must refer to something (DTD, style sheet, public text) by identifier. It is necessary to be able to retrieve the referred-to entity, and in the Web context, it is preferable to be able to do this automatically.
Example A. A user is sent a document the DOCTYPE declaration of which refers to a DTD by unique identifier (URN, PI, or FPI). His parser tells him it can't find the DTD, so he goes out and retrieves it manually from a repository (he doesn't need the registry interface because he has a unique ID but he does need to know where to find the repository).
Example B. A user clicks on a link to the stockmarket news and his browser receives an XML document the DOCTYPE declaration of which refers to a DTD by unique identifier; his browser, which has no copy locally, retrieves it automatically from the repository.
A creator of an XML-related entity deposits it for service to the public, at some range of accessibility from archival (retrieval rate could be slow) to utility (retrieval rate must be fast, large number of connections should be supported, round-the-clock uptime with failover, etc.).
Motivation. Many creators of XML entities lack the facilities to serve them reliably; even those that can do so may not wish to deal with the burden.
Example A. An IETF working group decides that a DTD that is part of their specification, but which the IETF has no facilities to serve, should be available from a public Web server with high bandwidth, and doesn't want to have to maintain the server. It sends the DTD to a repository and the repository serves it, as in Scenario 1.
Example B. A consortium or consultancy wishes its DTDs to be available for inspection and display. It deposits the DTDs in a repository and provides appropriate metadata for the repository's registry interface. The owner of the repository undertakes to make them available (but not with a high guaranteed quality of service).
Example C. Rosetta Net, a (real life) consortium of hardware vendors and suppliers, develops DTDs and sets of text values used in their content, all expected to be in heavy demand, the text values to change frequently. It deposits the DTDs and the initial set of text values in a repository, contracts for a regular update schedule and the highest available quality of service, and the repository undertakes to serve them, update them as agreed, push updates to subscribers, and maintain high quality of service for retrieval requests. Rosetta Net doesn't need a registry interface for this purpose because everything is to happen automatically, but it provides appropriate registry metadata so that the DTDs can be browsed and searched.
Example D. The Air Transport Association, which maintains important DTDs but makes them available only to its members, wishes to offload the work of supplying those DTDs. It deposits the DTDs in a repository, contracts for service as in Example C, and in addition arranges that the DTDs are listed in the registry interface but are available only when an appropriate credential is presented in connection with a request for them. (This is an application of access control.)
The owner of an XML-related entity, or another repository, registers the entity in the OASIS-sponsored registry, but does not deposit the entity itself.
Motivation. Registries can interoperate to increase usability, but the actual storage location of an entity alone should not restrict the content of a registry.
Example A. A company wishes to make its DTDs visible in the OASIS-sponsored registry, but prefers to serve them itself. It submits appropriate registry documents to the OASIS-sponsored registry, including a pointer to the address from which it serves the DTDs, and agrees with the OASIS-sponsored registry that it will supply timely update information and that the OASIS-sponsored registry will update its records and interface in a timely manner.
Example B. A special-purpose registry wishes to make its content visible in the OASIS-sponsored registry, while maintaining that content in its own repository. It submits appropriate registry documents to the OASIS-sponsored registry, including a pointer to its repository, and agrees with the OASIS-sponsored registry that it will supply timely update information and that the OASIS-sponsored registry will update its records and interface in a timely manner.
A user ready to compose an XML document searches for a DTD that covers the subject of the document.
Motivation. Every day in newsgroups and e-mail discussion lists such as comp.text.sgml, comp.text.xml, and xml-dev people ask whether there is a DTD for some subject area or functional purpose. The number of such queries will grow if XML is widely adopted. Somehow they have to be answered if wheel reinvention is to be minimized.
Example A. A user is about to write his resume, and wants to use XML. He goes to a registry and looks in a subject hierarchy (or taxonomy) to find a resume DTD (this is browsing, not searching). The subject hierarchy interface displays three appropriate listings, he chooses among them on the basis of their descriptions, downloads the DTD he chose from the repository, manually adds it to his SO catalog, and sets to work with vi and SP.
Example B. A user is about to write his resume, and wants to use XML. He goes to a registry and uses its search engine to find a resume DTD (this is searching, not browsing). The search interface returns three hits, he chooses among them on the basis of their descriptions, downloads from the repository the DTD he chose, and loads it into his XML writing tool. The interface also provides a time-to-live value, showing him how long he can expect his resume DTD to be served by the repository.
Example C. A homeowner is about to advertise his house for sale, and opens his verboprocessor. He says "take a memo: real estate for sale" and the verboprocessor automatically contacts a registry to find an appropriate XML DTD (there is one already for real estate listings). He dictates the text of his ad without knowing anything about XML, and the verboprocessor sends it to all real estate listing services it can locate. (In this scenario the verboprocessor uses a registry to find something in a repository.)
Example D. An XML application designer needs a component to represent the list of names of French provinces, so he consults a registry. The registry interface indicates that the list is available as a tab-delimited list in ASCII, as an XML schema datatype declaration, and as a parameter entity declaration in DTD syntax. He chooses the parameter entity declaration format by clicking something in the interface, and the repository returns it.
NOTE: while it does not seem too useful at this stage, attention should be paid to SC32 WG2's 1999-04-20 draft “Metadata Query Service: An Object Technology Extension to the ISO/IEC 11179 Specification and Standardization of Data Elements, Part 3, Basic Attributes”, which has both use cases and IDL for “behavioral aspects of a data registry” (p. v).
NOTE: There are additional scenarios in ISO 11179.
On the basis of the above scenarios, the following functional requirements have been identified:
(Registry) Register contents of the repository (and potentially other repositories) using standardized administrative metadata.
(Registry) Apply both controlled vocabulary (for taxonomic view) and uncontrolled vocabulary (for searching) for subject matter of registered entities. One type of controlled vocabulary will be types of XML and SGML entities, which is a list OASIS can and should construct (starting with but not limited to the XML Information Set, including such things as “documentation” and “user's guide”).
(Repository) Return content in response to a request by URN, URL, PI, or FPI. That is, a user should be able to request an XML-related entity by PI, FPI, URL, or URN (note that some entities may have multiple unique identifiers) and get the entity as the response. (This requirement can be viewed as a subrequirement of the next requirement.)
Return content in response to a request made through an interface to the registry.
TO BE DISCUSSED: should the OASIS-sponsored repository permit requests by URL, URN, PI, or FPI to be batched together so that multiple entities may be requested at the same time? If so, for the initial prototype or not?
TO BE DISCUSSED: should the OASIS-sponsored registry and repository support automatic execution of queries, as is envisioned in Scenario 4C? If so, the Requirements for DAV Searching and Locating work may be useful.
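As a rough illustration of the repository requirement above (return content in response to a request by URN, URL, PI, or FPI), the following Python sketch keeps one stored entity reachable under several identifiers. The class name, method names, identifiers, and DTD content are all hypothetical; this specification does not yet define any concrete API. The batched form is included only because batching is raised above as an open question.

```python
"""Minimal sketch: one stored entity reachable under several identifiers."""


class Repository:
    def __init__(self):
        self._entities = {}   # internal key -> entity content
        self._index = {}      # identifier (URN, URL, PI, or FPI) -> internal key

    def deposit(self, key, content, identifiers):
        """Store an entity and register every identifier that names it."""
        self._entities[key] = content
        for identifier in identifiers:
            self._index[identifier] = key

    def resolve(self, identifier):
        """Return the entity named by the identifier, or None if unknown."""
        key = self._index.get(identifier)
        return None if key is None else self._entities[key]

    def resolve_batch(self, identifiers):
        """Batched form: one result per requested identifier."""
        return {identifier: self.resolve(identifier) for identifier in identifiers}


# Hypothetical identifiers for a single resume DTD.
repo = Repository()
repo.deposit(
    "resume-dtd",
    "<!ELEMENT resume (#PCDATA)>",
    [
        "urn:x-example:dtd:resume",                  # URN (made up)
        "http://repository.example.org/resume.dtd",  # URL (made up)
        "-//Example//DTD Resume 1.0//EN",            # FPI (made up)
    ],
)
```

A request by any of the three identifiers returns the same content, which is the behaviour the note about multiple unique identifiers calls for.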
Tentative functional requirements for registry and repository interoperability include
Ability to hand off queries on subject matter to other repositories: for example, if a registry is queried for DTDs for some particular subject, it might forward the query to another registry known to contain entities relevant to that subject matter.
Ability to accept such queries when they are handed off by other registries.
Ability to show content in the registry interface that is held in a repository other than that associated with the registry: for example, a registry might display the content of a repository operated by a different authority in its own registry interface.
Ability to redirect requests for content (that is, requests by URL, URN, PI, or FPI) to other repositories where the content is stored physically.
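The tentative interoperability requirements above can be sketched in Python as follows. This is only an illustration under stated assumptions: the Registry class, its attribute names, the result tuples, and the loop guard are invented for the example, and nothing here is a protocol defined by this specification.

```python
"""Sketch: redirecting identifier requests and forwarding subject queries."""


class Registry:
    def __init__(self, name):
        self.name = name
        self.local = {}             # identifier -> content held locally
        self.remote = {}            # identifier -> peer that physically stores it
        self.subjects = {}          # subject -> list of local identifiers
        self.peers_by_subject = {}  # subject -> peer known to cover that subject

    def request(self, identifier):
        """Return ('content', data), ('redirect', peer name), or ('not-found', None)."""
        if identifier in self.local:
            return ("content", self.local[identifier])
        if identifier in self.remote:
            return ("redirect", self.remote[identifier].name)
        return ("not-found", None)

    def query(self, subject, _seen=None):
        """Answer locally, then hand the query off to a peer known for the subject."""
        seen = _seen if _seen is not None else set()
        seen.add(self.name)  # guard against two registries forwarding in a loop
        hits = list(self.subjects.get(subject, []))
        peer = self.peers_by_subject.get(subject)
        if peer is not None and peer.name not in seen:
            hits.extend(peer.query(subject, seen))
        return hits


# Hypothetical setup: a special-purpose registry holds the real estate DTD.
oasis = Registry("oasis")
special = Registry("special")
special.local["urn:x-example:dtd:realestate"] = "<!ELEMENT listing (#PCDATA)>"
special.subjects["real estate"] = ["urn:x-example:dtd:realestate"]
oasis.remote["urn:x-example:dtd:realestate"] = special
oasis.peers_by_subject["real estate"] = special
special.peers_by_subject["real estate"] = oasis  # mutual hand-off is allowed
```

The `seen` set is one simple way to keep mutually cooperating registries from forwarding the same query to each other indefinitely; a real design would need to settle this explicitly.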
TO BE DISCUSSED: Does OASIS want to try for interoperability in the prototype? If so, information about the content of other registries and repositories can be exchanged using ICE. How would this information be aggregated with information about the content of the OASIS-sponsored registry and repository?
The operation of the OASIS-sponsored registry will follow that laid out in ISO 11179 as closely as feasible. (Note that there is a pointer to ISO 11179 online in the appendix below.) The procedure for registration of data elements is described in Part 6 of ISO 11179, and for purposes of this specification any XML-related entity can be registered using the same procedure (whether these entities are termed “data elements” or something else–other sorts of entities are described in ISO 11179 and its proposed revision, X3.285–is to be determined).
Any registry must provide a way to obtain information about the registration authorities it supports.
ISO 11179 describes the roles of the Registration Authority (RA), Submitting Organization (SO), and Responsible Organizations (RO). Note that a Repository Operator (a term and role not found in ISO 11179, which speaks only of registries) might not be the same as the Registration Authority.
Registration requires that administrative metadata be provided for every entity registered (which includes everything deposited in the repository). This metadata must be formatted in an XML document of a type to be specified, and associated with the entity to be registered, either by indicating where it is to be found (if it is not to be deposited in the repository or the RA has agreed to fetch it from that location) or by packaging it with the entity and identifying it according to the addressing mechanism provided by the packaging technology.
ISO 11179 posits the use of an International Registration Data Identifier (IRDI), specified in ISO/IEC 6523, and bureaucratic infrastructure to support assignment of various components of it. This infrastructure seems not to exist, and the OASIS Registry and Repository Technical Committee will not seek to implement it as specified, because it appears that URNs will serve the same purpose.
One of the uses of the IRDI is that the RA assigns one to an application for registration of a data element; in the same process the Registration Status of the entity to be registered is established by the RA. It is clear that the registry requires metadata beyond that supplied by the SO.
The consensus of the OASIS Registry and Repository Technical Committee is that in registry metadata we need not name a person as the OASIS Registrar.
Metadata for entities registered in the OASIS-sponsored registry and for entities deposited in the OASIS-sponsored repository shall be represented in the form of XML documents. These documents need not be the storage format for the information they contain, but are the normative representation of that information–their semantics and syntax are the API to that information.
ISO 11179 is the obvious choice for registry metadata. The existing standard is specified in English, so it lacks a concrete syntax. The proposed revision of its Part 3, ANSI X3.285, “Metamodel for the Management of Shareable Data,” is specified in English and accompanied by UML diagrams. The ISO 11179 working group is exploring the possibility of generating an XMI representation of the metamodel automatically from the UML. (ISO/IEC JTC1 / SC 32 / WG 2 has a proposal for a work item on XML for ISO 11179.) The OASIS Registry and Repository Technical Committee has decided not to use this approach, at least for the present, for the following reasons.
XMI has a very weak DTD, and is suitable only as a transfer syntax among UML tools. Many semantics important within a UML model are not tokenized by generation of an XMI representation (no markup for describing them results from generating the XMI).
The metamodel's object-oriented design is very much easier to express in an XML schema language that supports inheritance than in DTD syntax, but no XML schema language is ready for use by OASIS.
It is not clear that the full complexity of X3.285 is useful for the OASIS-sponsored registry; it is thought that those not actually involved in its development may find it difficult to learn and use; and it is suspected that it may not yet have been thought out as thoroughly as it should be, since its developers disagree about some aspects of its use.
X3.285 is still undergoing revision and is not entirely stable.
Consequently the registry, which does not require the complexity of X3.285, will use the existing standard rather than the metamodel and await developments from SC32 WG2. A DTD or DTDs (see next section) is required for this purpose. Terry Allen is exploring the possibilities of a set of DTDs for ISO 11179 that makes sense to both the OASIS Registry and Repository Technical Committee and SC32 WG2 members.
The OASIS Registry and Repository Technical Committee should track revision of ISO 11179 and the outcome of the W3C's XML Schema activity so that the registration document type may be updated when and as appropriate.
ISO 11179 describes both a core set of administrative metadata and the roles of the RA and SO in providing the values for that set of metadata. Some of it is supplied by the SO and some by the RA; the set of information that the user of the registry would wish to be provided is a combination of both. The following outline is a straw proposal for how to implement the ISO 11179 roles in XML documents.
SO sets up business relations with RA and is given a unique identifier by RA. This unique identifier could be a URN. RA constructs an XML document, identified by its own URN, containing SO's contact information, a reference to the terms of business between the two parties (such as intellectual property arrangements and copyright), and perhaps a list of SO's submissions and URNs for any other parties relevant to relations between SO and RA (such as the RO) and the corresponding information-about documents. This document need not be public. RA's-information-about-SO DTD required. The documents and the parties they concern have separate URNs so that the same parties can do business with different mixes of other parties and on various terms. The existence of these information-about documents is essential to avoid the need to repeat the information they contain in multiple places.
RA sends a copy of information-about-SO document to SO for approval, SO approves it. Acknowledgement or approval DTD required?
SO sends to RA a package containing:
Cover letter containing information sufficient to determine the identity of the SO, RA, and any other relevant parties, an indication of the intent of the package (that is, whether it is a submission, a revision, a cancellation, etc.), and a list of submissions, which are either contained in the package or are pointed to. Cover letter DTD required.
The submissions themselves (if any). Submission DTD required. In each submission, for each data element dictionary or data element record, the identity of administrative parties (SO, RO, RA) is indicated by use of the URNs for the documents in which the RA maintains information about these parties. (Doing so allows a submission in which there is a mix of parties, reduces the size of the record, and so reduces the impulse to construct document types that allow inheritance of administrative metadata, which has been found to be workable but cumbersome.) Each record must have an XML ID to provide a target for the RA's added metadata to link to, but this need not be the same ID as the identifier the RA assigns to the record in the context of the RA's registry.
The method of packaging these contents remains to be determined. A proposal for doing so using MIME Multipart/Related was presented by Terry Allen at the SGML '97 convention, entitled “Package or Perish.”
RA deposits the submissions in a repository.
For each data element dictionary or data element record, the RA constructs a record for the metadata it supplies, including the status of the entity and its identifier in the registry (and possibly including taxonomic information for use in populating the registry interface). Added metadata DTD required. This document need not be public, but should be sent to SO for approval and information (the SO needs to know what the status and identifiers are). Note that while the RA could simply augment the submission, doing so would destroy an integrity check such as a digital signature.
SO approves the added metadata (or has already waived this step as part of setting up business relations).
RA constructs (or can construct on the fly) a composite metadata document combining RA's and SO's metadata about the registered entity or entities. This document may refer to SO's submission document (providing an integrity check), and constitutes the metadata that is served to the public. Composite metadata DTD required.
The RA's registry interface is updated from relevant new metadata (drawn from the composite metadata document or the SO's and RA's metadata documents, as desired) and the registered entities are available for public access.
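The later steps of the straw proposal above (SO submission, RA added metadata, composite metadata) might be sketched as follows in Python. Every element name, attribute name, URN, and status value here is a placeholder invented for the example, since the submission, added metadata, and composite metadata DTDs are still to be written; the SHA-256 digest merely stands in for whatever integrity check (such as a digital signature) is eventually chosen.

```python
import hashlib
import xml.etree.ElementTree as ET

# SO's submission: one record with an XML ID, and URNs pointing at the
# information-about documents for the administrative parties (all made up).
SUBMISSION = (
    '<submission>'
    '<record id="r1" so="urn:x-example:party:so1" ro="urn:x-example:party:ro1">'
    '<name>Resume DTD</name>'
    '</record>'
    '</submission>'
)

# RA's added metadata lives in a separate document that links to the record
# by its ID, so the SO's submission is never altered.
ADDED = (
    '<added-metadata>'
    '<entry target="r1" status="placeholder-status" '
    'identifier="urn:x-example:registry:0001"/>'
    '</added-metadata>'
)


def build_composite(submission_xml, added_xml):
    """Combine SO and RA metadata without modifying the SO's submission."""
    submission = ET.fromstring(submission_xml)
    added = ET.fromstring(added_xml)
    entries = {entry.get("target"): entry for entry in added.findall("entry")}

    composite = ET.Element("composite-metadata")
    # Digest of the untouched submission, usable as a simple integrity check.
    digest = hashlib.sha256(submission_xml.encode("utf-8")).hexdigest()
    composite.set("submission-sha256", digest)

    for record in submission.findall("record"):
        entry = entries.get(record.get("id"))
        if entry is None:
            continue  # the RA has not yet supplied metadata for this record
        item = ET.SubElement(composite, "entity", {
            "identifier": entry.get("identifier"),
            "status": entry.get("status"),
            "so": record.get("so"),
            "ro": record.get("ro"),
        })
        ET.SubElement(item, "name").text = record.findtext("name")
    return composite


composite = build_composite(SUBMISSION, ADDED)
```

Keeping the RA's additions in their own document, and only combining the two when the public view is built, is what lets the digest of the original submission remain valid.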
The human-readable interface to a registry should be constructed from the content of the registration documents, perhaps with some augmentation.
Users should be warned that links to entities made through the interface may be fragile, and that links to entities should be made only by means of unique identifiers.
Some taxonomy for the subject matter of entities in the OASIS-sponsored registry is required. Dewey has been suggested (it is owned by the OCLC, although some part of it may be available freely); LCSH has been advised against (on the ground that it has no compact top layer, so is difficult to navigate). Terry Allen has essayed a provisional taxonomy of the material in Robin Cover's SGML and XML pages, which may show that a custom taxonomy for the top several levels can be used to stitch together various taxonomies for specific subject areas. Here is that provisional taxonomy:
GOVERNMENT
    Commerce
    Patents
    Tax
MILITARY
BUSINESS
LAW
SCIENCE
    Astronomy and Physics
    Biology and Biotechnology
    Chemistry
    Electrical Engineering
    Engineering
    Geology
    Mathematics
    Medicine
    Meteorology
HUMANITIES
    History
    Religious Studies
INDUSTRY (Manufacturing; could also have Service and Financial)
    Air Transport
    Electronics
    Motor Vehicles
    Process Engineering
    Railroad
    Telecommunications
COMPUTING
    Specifications
    Applications
ARTS
    Literature (see below, s.v. WORDS AND STUFF, "General and Electronic
    Music
    Visual Art
WORDS AND STUFF
    General and Electronic Texts, Archives
    Linguistics
    Information Management and Library Science
    Publishing
    Books, Journals, Articles, Documents
    Format-neutral
    Hypertext
    News Industry
    Resumes
    Style Sheets
    Theses
Clearly this doesn't work quite right (aside from its other faults). Ron Daniel has suggested employing another taxonomic axis, “genre”, with these very provisional suggested headings:
standards and specifications
data and observations
reports, memos, etc.
business forms, RFPs, ... (this one can probably stand subdivision)
dictionaries, thesauri, etc.
The combination of the two axes (subject and genre) appears promising.
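To make the two-axis idea concrete, here is a small Python sketch of an index keyed by subject and genre together. The function names, the slash-separated subject paths, and the identifiers are assumptions made for the example; the headings are borrowed from the provisional lists above.

```python
from collections import defaultdict

# (subject path, genre) -> identifiers of registered entities
index = defaultdict(list)


def classify(identifier, subject, genre):
    """File an entity under one subject heading and one genre heading."""
    index[(subject, genre)].append(identifier)


def browse(subject=None, genre=None):
    """Return identifiers matching either axis or both; None matches anything."""
    return sorted(
        identifier
        for (s, g), identifiers in index.items()
        if (subject is None or s == subject or s.startswith(subject + "/"))
        and (genre is None or g == genre)
        for identifier in identifiers
    )


# Hypothetical entries.
classify("urn:x-example:dtd:cml", "SCIENCE/Chemistry", "data and observations")
classify("urn:x-example:spec:chem", "SCIENCE/Chemistry", "standards and specifications")
classify("urn:x-example:dtd:resume", "WORDS AND STUFF/Resumes", "business forms")
```

Browsing by subject alone walks the hierarchy (a request for SCIENCE includes SCIENCE/Chemistry), while adding a genre narrows the result without needing a separate subject branch for each kind of document.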
TO BE DISCUSSED: What should the interface for the OASIS-sponsored registry look like? As an example, see the U.S. EPA's Environmental Data Registry interface. Carla Corkern may be able to volunteer someone. OASIS probably wants to provide indices by
intellectual property owner
author or corporate author, if not the same as intellectual property owner
For each index some interface mechanisms are required, such as an alphabet wheel and an alphabetical list of indexed items.
Eventually the OASIS-sponsored registry will encounter interface issues with respect to non-Latin scripts and multiple languages (not to mention the difficulties of indexing Unicode). The OASIS Registry and Repository Technical Committee should search for a reliable source of information on how to deal with these issues, which are orthogonal to our main purpose.
The same piece of information may exist in multiple formats. Multiplicity begets complexity. For the purpose of the prototype OASIS-sponsored registry it may be assumed that the same Submitting Organization registers all the formats of the same piece of intellectual property, so that it is possible to maintain one registration document, but this will not always be the case. The Repository Operator may (if permitted by the Submitting Organization) transform a registered entity into another format. (TODO: specify how the list of transform-formats would be maintained; a pointer from or to the registration document?)
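One possible shape for a single registration record covering several formats is sketched below in Python. The class, field names, format labels, and file names are all hypothetical, and how the list of transform-formats is actually maintained remains the open TODO noted above.

```python
from dataclasses import dataclass, field


@dataclass
class Registration:
    identifier: str                      # identifier of the registered entity
    submitting_organization: str         # URN of the SO's information-about document
    transforms_permitted: bool = False   # has the SO allowed format conversion?
    formats: dict = field(default_factory=dict)  # format name -> location
    derived: set = field(default_factory=set)    # formats produced by the operator

    def add_format(self, name, location):
        """Record a format supplied by the Submitting Organization."""
        self.formats[name] = location

    def add_transform(self, name, location):
        """Record a format produced by the Repository Operator, if permitted."""
        if not self.transforms_permitted:
            raise PermissionError("Submitting Organization has not permitted transforms")
        self.formats[name] = location
        self.derived.add(name)


# Hypothetical example: a list registered as tab-delimited text, with a
# parameter entity form derived by the Repository Operator.
reg = Registration(
    "urn:x-example:list:provinces",
    "urn:x-example:party:so1",
    transforms_permitted=True,
)
reg.add_format("tab-delimited", "provinces.txt")
reg.add_transform("parameter-entity", "provinces.ent")
```

Tracking derived formats separately keeps it visible which representations came from the Submitting Organization and which were generated by the Repository Operator.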
Taking Docbook and OASIS as examples, OASIS would want to be able to serve the following (from discussion at the Granada meeting)
the registry document (which is essentially what was once discussed in the URN context under the term URC)
the whole DTD set, including documentation, wrapped in MIME
the whole DTD set, including documentation, zipped
the whole DTD set, without documentation, wrapped in MIME
the whole DTD set, without documentation, zipped
the documentation only
each module separately
a single-file version of the DTD, with all parameter entities resolved
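As an illustration of how several of the packagings listed above could be generated from one stored DTD set, the following Python sketch produces MIME-wrapped and zipped forms, with or without documentation. The file names and contents are invented, and the use of multipart/related with Content-Location headers is an assumption for the example rather than a packaging decision made by the committee.

```python
import io
import zipfile
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

# Hypothetical DTD set: file name -> (content, is_documentation)
DTD_SET = {
    "example.dtd": ('<!ENTITY % mod SYSTEM "module.mod">\n%mod;\n', False),
    "module.mod": ("<!ELEMENT para (#PCDATA)>\n", False),
    "README.txt": ("Documentation for the example DTD set.\n", True),
}


def select(with_docs):
    """Choose the files for a 'with documentation' or 'without' package."""
    return {name: text for name, (text, is_doc) in DTD_SET.items()
            if with_docs or not is_doc}


def as_mime(files):
    """Wrap the files in a single multipart/related message."""
    package = MIMEMultipart("related")
    for name, text in files.items():
        part = MIMEText(text, "plain", "utf-8")
        part.add_header("Content-Location", name)  # names the part within the package
        package.attach(part)
    return package.as_string()


def as_zip(files):
    """Return the same files as the bytes of a zip archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, text in files.items():
            archive.writestr(name, text)
    return buffer.getvalue()
```

For instance, `as_mime(select(with_docs=False))` gives the DTD set without documentation wrapped in MIME, and `as_zip(select(with_docs=True))` gives the whole set zipped.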
Among other repository content, it is agreed that OASIS would want to serve the ISO entity sets for SGML, which, it is argued, can be done legally at least within the U.S.
The same is true, it is asserted, for the technical content of ISO 4217 (currency codes). If we were to come to an agreement with ISO to serve the whole standard, we would want to be able to serve the following (again from Granada discussion)
the whole standard, in whatever format ISO publishes it in electronically (I know they don't do that, but pretend)
the whole standard in HTML
the technical content pretty-printed in HTML
the technical content as an experimental datatype declaration per XML Schema (making it available directly for use in processing)
the technical content as an experimental parameter entity (making it available directly for use in processing)
What should such a parameter entity be named? The use of URNs has been suggested, and it might be necessary to construct an OASIS URN namespace. (Another possibility is that some URL could be constructed that would be valid as a parameter entity NAME.) Here's what such a parameter entity and its URN might look like:
<!ENTITY % urn:x-oasis:x-iso:stds:4217:parameter-entity "AFA | ALL | ... | ZWD">
Issues here include the use of colons as separators (would using more than one be forbidden in this context by the XML Language specification?), the need to indicate the format of the content ("parameter-entity"), and how to show that the segment of the x-oasis namespace in use is itself experimental ("x-iso"); Bill Smith has suggested that OASIS could use the x- prefix anywhere within x-oasis to so indicate. Of course the rules for use of an x-oasis URN namespace should be constructed before it is deployed.
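On the question of colons, a quick check against the grammar may help discussion. The sketch below tests a candidate name against ASCII-only approximations of two productions: Name from XML 1.0, which admits colons, and NCName from Namespaces in XML, which does not. The regular expressions are simplifications (the real productions also admit many non-ASCII characters), and the function names are illustrative only.

```python
import re

# ASCII-only approximation of the XML 1.0 "Name" production:
# first character is a letter, '_' or ':'; later characters may
# also be digits, '.', or '-'.
XML_NAME = re.compile(r"[A-Za-z_:][A-Za-z0-9._:\-]*")

# ASCII-only approximation of "NCName" (Namespaces in XML):
# the same as Name, but with no colons anywhere.
XML_NCNAME = re.compile(r"[A-Za-z_][A-Za-z0-9._\-]*")


def is_xml_name(candidate: str) -> bool:
    """True if the whole string matches the simplified Name pattern."""
    return XML_NAME.fullmatch(candidate) is not None


def is_ncname(candidate: str) -> bool:
    """True if the whole string matches the simplified NCName pattern."""
    return XML_NCNAME.fullmatch(candidate) is not None


# The experimental entity name proposed in the text above.
ENTITY_NAME = "urn:x-oasis:x-iso:stds:4217:parameter-entity"

if __name__ == "__main__":
    print("XML 1.0 Name:", is_xml_name(ENTITY_NAME))
    print("NCName:      ", is_ncname(ENTITY_NAME))
```

Under these simplified rules the proposed name passes as an XML 1.0 Name but not as an NCName, which suggests that namespace-aware processors might object to it even where plain XML 1.0 parsers do not.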
The line between functions of a registry and functions of a repository is not clear: the two are logical views of a system that can be implemented as a number of different components. One can imagine a repository that did no more than return entities requested by identifier, all other functions being handled by the registry.
What should repose in an OASIS-sponsored repository?
At the November 1998 meeting in Chicago, it was agreed universally that DTDs are in scope. Schemas, style sheets, data elements, and “name spaces” were mentioned, too. In subsequent discussion it has become clear that very little can be ruled out of scope, at least by format, and that functionality is more important than scope of content. Scope seems to be a business issue rather than a technical one.
Resolution of requests by URL and URN is discussed in RFC 2483, URI Resolution Services Necessary for URN Resolution. As this is an experimental specification, use of it should not be made normative, but the typology of requests, results, errors, and security considerations is well considered and will be the basis for the implementation of the OASIS-sponsored registry and repository (perhaps with the addition of a request for both “resource” and metadata). The specification has no concrete syntax.
If OASIS wishes to support protocols in addition to HTTP, some protocol-independent syntax must be found (RDF has been suggested, by Michael Mealling) and mapped into each protocol supported.
If OASIS is willing to limit the protocols supported by the OASIS-sponsored registry and repository to HTTP, it would be sensible to use the syntax proposed in RFC 2169, "A Trivial Convention for using HTTP in URN Resolution" (THTTP). (Note that the terminology in RFC 2169 is out of synch with the later RFC 2483, but that the semantics are the same.) Section 2.0 of RFC 2169 reads:
The general approach used to encode resolution service requests in THTTP is quite simple:
    GET /uri-res/<service>?<uri> HTTP/1.0
For example, if we have the URN "urn:foo:12345-54321" and want a URL, we would send the request:
    GET /uri-res/N2L?urn:foo:12345-54321 HTTP/1.0
The request could also be encoded as an HTTP 1.1 request. This would look like:
    GET /uri-res/N2L?urn:foo:12345-54321 HTTP/1.1
    Host: <whatever host we are sending the request to>
Responses from the HTTP server follow standard HTTP practice. Status codes, such as 200 (OK) or 404 (Not Found) shall be returned. The normal rules for determining cachability, negotiating formats, etc. apply.
To use this syntax in general, one would follow the pattern (cast as a URL rather than a full HTTP request):
To obtain an entity such as the Docbook DTD (the URN is imaginary):
To obtain the composite metadata document for the Docbook DTD (the URN is again imaginary):
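As a rough illustration of the THTTP convention quoted above, the following sketch builds request URLs of the form /uri-res/<service>?<uri>. The host name registry.example.org and the URN urn:x-example:docbook:dtd:3.1 are invented for the example and do not correspond to any deployed service; N2R (URN to resource) and N2C (URN to description) are service names taken from RFC 2169.

```python
from urllib.parse import quote


def thttp_url(host: str, service: str, uri: str) -> str:
    """Build a THTTP-style resolution URL: http://<host>/uri-res/<service>?<uri>.

    Colons are left unescaped so that the URN stays readable, as in the
    RFC 2169 examples; other reserved characters are percent-encoded.
    """
    return f"http://{host}/uri-res/{service}?{quote(uri, safe=':')}"


# Hypothetical host and URN, for illustration only.
HOST = "registry.example.org"
DOCBOOK_URN = "urn:x-example:docbook:dtd:3.1"

if __name__ == "__main__":
    # Request the entity itself (URN to resource).
    print(thttp_url(HOST, "N2R", DOCBOOK_URN))
    # Request the metadata describing the entity (URN to description).
    print(thttp_url(HOST, "N2C", DOCBOOK_URN))
```

A client would then issue an ordinary HTTP GET on the resulting URL and interpret the status code as described in the passage quoted from RFC 2169.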
TO BE DISCUSSED: shall we add types of request parallel to those for resolution of URNs, for the case of PIs and FPIs?
TO BE DISCUSSED: does OASIS wish to attempt URN resolution via DNS at this time?
RFC 2483 defines an “I2C” request (section 4.5), for resolution of a URL or URN to a description of a resource, which can be understood in this context as a request for an entity's registration document. (The “I2CS” request, section 4.6, allows a request for multiple documents, and this would be useful should the registry hold its own information about an entity separate from the information submitted by the SO.)
TO BE DISCUSSED: it would seem that when an entity having a unique identifier is returned in response to a request stemming from searching or browsing, such that the requestor does not possess a unique identifier for the entity, the unique identifier must be returned along with but separate from the entity (else the requestor cannot refer to the entity properly). This case extends the typology of RFC 2483, and to meet it, some packaging mechanism is required. As the list above shows, it was thought desirable at the Granada meeting to be able to return both MIME and zip packages when what is requested consists of multiple entities; RFC 2483 specifies the use of MIME multipart/alternative for the return of multiple entities. Is yet more packaging required to contain the unique identifiers? How can unique identifiers be packaged within a zip file? Do we really want to return zip files anyway? Should a catalog be returned along with anything requested? NOTE that a packaged entity is not immediately available for use by any existing SGML or XML application; in scenarios requiring automatic resolution by unique identifier, say, for a browser, it will not be possible to provide the unique identifier (but then the application presumably has it and used it to request the entity). (Is that right?)
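To make the packaging question concrete, here is one possible approach, offered only as a sketch for discussion: each entity travels as a separate part of a MIME multipart package, and the part's Content-Location header carries the entity's unique identifier. The choice of multipart/mixed and of Content-Location are assumptions of this example, not decisions of the committee, and the URNs are invented.

```python
from email import message_from_bytes
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def package_entities(entities: dict) -> bytes:
    """Wrap {identifier: text} pairs in one multipart/mixed package.

    Each part records its identifier in a Content-Location header
    (an assumption of this sketch), separate from the entity body.
    """
    package = MIMEMultipart("mixed")
    for identifier, text in entities.items():
        part = MIMEText(text, "plain", "utf-8")
        part["Content-Location"] = identifier
        package.attach(part)
    return package.as_bytes()


def unpack_entities(raw: bytes) -> dict:
    """Recover the {identifier: text} pairs from a package."""
    message = message_from_bytes(raw)
    entities = {}
    for part in message.walk():
        if part.is_multipart():
            continue  # skip the outer container itself
        body = part.get_payload(decode=True).decode("utf-8")
        entities[part["Content-Location"]] = body
    return entities


if __name__ == "__main__":
    # Hypothetical identifiers and content, for illustration only.
    sample = {
        "urn:x-example:docbook:dtd": "<!ELEMENT book (chapter+)>",
        "urn:x-example:docbook:doc": "Documentation for the DTD.",
    }
    print(unpack_entities(package_entities(sample)) == sample)
```

The sketch does not address zip packages, where no comparable per-member header exists; a catalog or manifest file inside the archive would presumably be needed there.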
Business relations between the Submitting Organization, the Registration Authority, and the Repository Operator are potentially complex–too complex to specify here–and should be established out of band. Issues include ownership of intellectual property rights for entities in the repository, and for the registry's interface to them.
The registry and repository shall have published policies relating to their provision of intellectual property notices for entities in the repository; that is, whether the interface to the registry or repository warns of the existence of copyright notices, asserted licenses, or other intellectual property restrictions or encumbrances, or leaves it to the user to discover them.
The registry and repository shall have published policies relating to their use of methods to guarantee the integrity of entities in the repository and metadata in the registry; for example, does the repository employ digital signatures to ensure against corruption?
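As a minimal illustration of an integrity check, the sketch below records a SHA-256 digest of an entity at registration time and compares it on retrieval. Note that this is a plain digest, not a digital signature: it detects accidental corruption but does not by itself establish who produced the entity. The function names, and the idea of storing the digest in the registration document, are assumptions of the example.

```python
import hashlib
import hmac


def entity_digest(entity: bytes) -> str:
    """Return the SHA-256 digest of an entity as a hexadecimal string."""
    return hashlib.sha256(entity).hexdigest()


def entity_is_intact(entity: bytes, recorded_digest: str) -> bool:
    """Compare a freshly computed digest with the one recorded earlier."""
    return hmac.compare_digest(entity_digest(entity), recorded_digest)


if __name__ == "__main__":
    original = b"<!ELEMENT book (chapter+)>"
    recorded = entity_digest(original)  # stored, say, with the metadata
    print(entity_is_intact(original, recorded))
    print(entity_is_intact(original + b" ", recorded))
```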
Security of some sort is required for all functions of the registry and repository, and so should not be considered separately. Security should be sufficient to engender confidence in the registry and repository.
The complete content of both the registry and repository should be backed up offsite, and the backup tested. Some plan should be made for reconstituting the registry and repository from the backup should the original site be rendered inoperable.
The registry and repository shall have published policies relating to their plans for continuing in operation and the outcomes to be expected should they cease operation or should business relationships with the owners of their content change. A point of departure for describing archival longevity is the “Reference Model for an Open Archival Information System” (OAIS), which is a draft ISO standard.
The registry and repository shall have published policies relating to the privacy of users and the sale or other distribution of usage information.
ISO 11179 defines a data element status value, “certified” (Part 6, p. 9) for a “recorded data element [that] has met the quality requirements specified in this and other parts of ISO/IEC 11179.”
TO BE DISCUSSED: is quality control a requirement for all registries? is it something the OASIS-sponsored registry should engage in? If so, the registry should provide metadata about what specifications an entity conforms to and who did the testing to determine that conformance. (XML validity vs. well-formedness falls under this heading.)
I have no idea what sort of conformance requirements should be stated, if any, but I'm providing this placeholder.
Some registries or repositories may require payment for use of their services.
Inevitably, the format of the registry documents will be revised. Procedures should be established, or at least envisioned, for moving records to a new format, informing SOs of the change, and ensuring seamless transition of services.
Various specifications describe additional, generally domain-specific functionality for registries and repositories, chiefly in the area of searching or automatic processing (the ECO Working Group, the XML/EDI Group). The OASIS-sponsored registry may provide some such added value by cataloguing entities according to their character as XML or SGML entities. Another form of added value is the capability to push content to subscribers.
Consequently, the OASIS-sponsored registry and repository should support an API such that value-added services can be built on top of it.
(In this section there can be a cumulative wish list of such added value functions if desired.)
It may be wise to contemplate–as a separate effort–a means of recording user preferences among registries and/or repositories, so that a user agent may be instructed to consult them in some particular order.
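One way to picture such a preference list is as an ordered sequence of resolvers that a user agent tries in turn until one returns the entity. The sketch below is purely illustrative: the resolver names, the in-memory dictionaries standing in for real registries, and the URN are all invented.

```python
from typing import Callable, Iterable, Optional, Tuple

# A resolver takes an identifier and returns the entity, or None if unknown.
Resolver = Callable[[str], Optional[bytes]]


def resolve_in_order(
    identifier: str, preferences: Iterable[Tuple[str, Resolver]]
) -> Optional[Tuple[str, bytes]]:
    """Consult each (name, resolver) pair in the user's preferred order.

    Returns the name of the first registry that knows the identifier,
    together with the entity, or None if no registry can resolve it.
    """
    for name, resolver in preferences:
        entity = resolver(identifier)
        if entity is not None:
            return name, entity
    return None


# Two stand-in registries backed by dictionaries (hypothetical content).
local_cache = {"urn:x-example:docbook:dtd": b"local copy"}
remote_registry = {
    "urn:x-example:docbook:dtd": b"remote copy",
    "urn:x-example:iso4217": b"currency codes",
}

PREFERENCES = [("local", local_cache.get), ("remote", remote_registry.get)]

if __name__ == "__main__":
    print(resolve_in_order("urn:x-example:docbook:dtd", PREFERENCES))
    print(resolve_in_order("urn:x-example:iso4217", PREFERENCES))
    print(resolve_in_order("urn:x-example:unknown", PREFERENCES))
```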
Glossed here are relevant terms, including acronyms, with entries for some specifications relevant to the registry and the repository.
"Requirements for DAV Searching and Locating" http://www.ietf.org/internet-drafts/draft-ietf/dasl-requirements-01.txt (note mismatch of version and URL)
GILS. Government Information Locator Service.
ISO 11179 is online at http://www.sdct.itl.nist.gov/~ftp/l8/11179/ . The home page of the relevant committee is http://sdct-sunsrv1.ncsl.nist.gov/~ftp/l8/sc32wg2/projects/11179content/content-home.htm with a link to an HTML representation of the standard. It is proposed to replace Part 3 of 11179 with ANSI X3.285, “Metamodel for the Management of Shareable Data”, which you can find in HTML at http://www.lbl.gov/~olken/X3L8/drafts/Metamodel/MetaModel_ToC.html and in Word and PDF format (filenames beginning dpX3-285) at ftp://sdct-sunsrv1.ncsl.nist.gov/x3l8/x3l8docs/x3.285/docs/ .
RA. Registration Authority (ISO 11179).
Repository. A location or set of distributed locations where documents pointed at by a registry reside, and from which they can be retrieved by conventional (http, ftp) means, perhaps with an additional authentication/permissions layer.
RO. Responsible Organization (ISO 11179).
SO. Submitting Organization (ISO 11179).
This is a list of IETF (and other) documents relating to URNs (Uniform Resource Names), originally drawn up by Murray Altheim of Sun and updated by Terry Allen. The documents he thinks most important are marked with an asterisk.
Charter of the current IETF WG
Requests For Comments
*Uniform Resource Identifiers (URI): Generic Syntax (RFC 2396)
*Resolution of Uniform Resource Identifiers using the Domain Name System (RFC 2168)
A Trivial Convention for using HTTP in URN Resolution (RFC 2169)
Architectural Principles of Uniform Resource Name Resolution (RFC 2276) (expresses the author's point of view, which is not consensus)
Using Existing Bibliographic Identifiers as Uniform Resource Names (RFC 2288)
Internationalized Uniform Resource Identifiers (IURI)
*URI Resolution Services Necessary for URN Resolution (RFC 2483)
*Resolution of Uniform Resource Identifiers using the Domain Name System
*The Naming Authority Pointer (NAPTR) DNS Resource Record
URN Namespace Definition Mechanisms
*A URN Namespace for IETF Documents, approved by the IESG for publication as an Informational RFC.
Assignment Procedures for the URI Resolution using DNS (RFC 2168)
Requirements for Human Friendly Identifiers
An Architecture for Supporting Human Friendly Identifiers
In this document, “XML-related entity” means any XML or SGML entity necessary for the processing of an XML document, or documentation of such an entity.