W3C Publishes Approved TAG Finding on Associating Resources with Namespaces.
W3C has published "Associating Resources with Namespaces" as an Approved TAG Finding from the W3C Technical Architecture Group (TAG). The document addresses the question of how ancillary information (schemas, stylesheets, documentation) can be associated with an XML namespace. It offers guidance on how a namespace document can be optimally designed for humans and machines such that information at the namespace URI conforms to web architecture good practice.
This TAG finding addresses TAG issue 'namespaceDocument-8': "What should a namespace document look like?" The issue was raised on January 14, 2002 by Tim Bray in reference to a 1998 Web architecture document that said "The namespace document (with the namespace URI) is a place for the language publisher to keep definitive material about a namespace. Schema languages are ideal for this." Bray: "I disagree quite strongly. Schema languages as they exist today represent bundles of declarative syntactic constraints. This is a small subset of 'definitive material'. RDDL represents my current thinking as to what a 'namespace document' ought to be like..."
The chartered mission of the W3C TAG is to "document and build consensus around principles of Web architecture and to interpret and clarify these principles when necessary; to resolve issues involving general Web architecture brought to the TAG; and to help coordinate cross-technology architecture developments inside and outside W3C." The TAG consists of eight persons elected (by the W3C Advisory Committee) or appointed, and a Chair. The W3C Team appoints the Chair of the TAG, and three TAG participants are appointed by the Director. The TAG has three public mailing lists, including an 'Announce' list for publication of URIs for TAG minutes, IRC logs, meeting summaries, findings, new issues, resolved issues, and drafts of architecture documents. The TAG Charter recognizes that no set of documents (including Draft, Approved, and Archival Findings) "will ever answer all the hard questions, so interpretation and subsequent refinement of the W3C architecture will certainly be necessary." The TAG has published thirteen (13) "Approved Findings" since May 2002.
The new TAG finding on "Associating Resources with Namespaces" defines a conceptual model for identifying related resources that is simple enough to garner community consensus as a reasonable abstraction. It demonstrates how RDDL 1.0 is one possible concrete syntax for this model, and shows how other concrete syntaxes could be defined and identified in a way that would preserve the model. The finding also provides guidance on the use of identifiers for individual terms within an XML namespace. Finally, it discusses the use of a namespace URI as a suitable "key" for the nature of a resource encoded in an XML vocabulary, or for the purpose of a resource.
As described in the 2004 Web Architecture Recommendation, the purpose of an XML namespace is to allow the deployment of XML vocabularies (e.g., in which element and attribute names are defined) in a global environment and to reduce the risk of name collisions in a given document when vocabularies are combined. However, as documented in the TAG Finding, names in a namespace can, in theory at least, be defined to identify any thing or any number of things. The names in a namespace form a collection:
- sometimes it is a collection of element names, e.g., DocBook and XHTML
- sometimes it is a collection of attribute names, e.g., XLink
- sometimes it is a collection of functions, e.g., XQuery 1.0 and XPath 2.0 Data Model
- sometimes it is a collection of properties, e.g., FOAF
- sometimes it is a collection of concepts, e.g., WordNet
- ... and many other uses which are likely to arise
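The collision-avoidance role of namespaces can be illustrated with a minimal Python sketch. The two vocabularies, their example.org namespace URIs, and the element names below are invented for illustration; the point is only that two elements sharing the local name "title" remain distinguishable once each is qualified by its namespace.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespaces: neither URI comes from the TAG finding.
BOOK_NS = "http://example.org/ns/book"
PERSON_NS = "http://example.org/ns/person"

DOC = """<book xmlns="http://example.org/ns/book"
      xmlns:p="http://example.org/ns/person">
  <title>Web Architecture</title>
  <p:author><p:title>Dr.</p:title><p:name>A. Author</p:name></p:author>
</book>"""

root = ET.fromstring(DOC)

# ElementTree expands qualified names to "{namespace-uri}local-name",
# so the two "title" elements never collide.
book_title = root.find(f"{{{BOOK_NS}}}title").text
person_title = root.find(f".//{{{PERSON_NS}}}title").text

print(book_title)
print(person_title)
```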
The TAG finding on "Associating Resources with Namespaces" observes, moreover, that there's "no requirement that the names in a namespace only identify items of a single type; elements and attributes can both come from the same namespace as could functions and concepts or any other homogeneous or heterogeneous collection you can imagine. The names in a namespace can, in theory at least, be defined to identify any thing or any number of things. Given the wide variety of things that can be identified, it follows that an equally wide variety of ancillary resources may be relevant to a namespace. A namespace may have documentation (specifications, reference material, tutorials, etc., perhaps in several formats and several languages), schemas (in any of several forms), stylesheets, software libraries, applications, or any other kind of related resource. The names in a namespace likewise may have a range of information associated with them."
How might information be associated with namespace names or terms? A namespace document is described in the Architecture of the World Wide Web, Volume One, within the section "XML-Based Data Formats", as a URI-addressable information resource that contains useful information, machine-usable and/or human-usable, about terms in a namespace. A namespace document documents a namespace: the namespace as a whole and, in some cases, the terms/names in that namespace.
The Web Architecture Recommendation's description of a namespace document anticipates that the information documenting the namespace might be of interest to both humans and machines. It therefore recommends a good practice for namespace documents: "The owner of an XML namespace name SHOULD make available material intended for people to read and material optimized for software agents in order to meet the needs of those who will use the namespace vocabulary."
Specifically, a person might want to:
- understand the purpose of the namespace
- learn how to use the markup vocabulary in the namespace
- find out who controls it and associated policies
- request authority to access schemas or collateral material about it
- report a bug or situation that could be considered an error in some collateral material
and a processor might want to:
- retrieve a schema, for validation
- retrieve a style sheet, for presentation
- retrieve ontologies, for making inferences
The RDF model is used in the TAG finding to describe how to integrate the semantics of terms in a namespace into the semantic web. The document is quick to clarify, however, that this use of RDF for modelling the abstraction does not place any onus on implementors to use RDF technologies to locate ancillary resources, nor does it require that authors writing namespace documents understand semantic web technologies or RDF. "Directing humans or machines to related resources is by no means the only kind of information about a namespace that a namespace document might provide. For humans, ordinary language, and for machines, GRDDL and RDFa may be used to provide additional relevant information, for example the type and intended use of the things identified by individual names in the namespace. The RDF model defined [however] does not constrain whether or how such additional information may be provided."
Section 3 of the TAG finding ("Namespace Document Formats") describes two formats deployed explicitly to address the question of namespace documents, and a third format which can be seen as simultaneously providing a unified view of these two formats and also providing a model to make other new formats available. A RDDL 1.0 document encodes the nature and purpose of the related resource in a rddl:resource element. RDDL 2.0 proposes encoding the nature and purpose of the related resource directly on the HTML a element. A third approach (GRDDL) "provides a mechanism for gleaning resource descriptions from XML. Employing GRDDL allows an author to associate a transformation with a document; the result of applying that transformation is an RDF model. For well-known transformation URIs, an application can be written to extract the data directly from the source markup without actually running an XSLT transformation. When an application wants to support arbitrary GRDDL transformations, a pair of well-known GRDDL transformation URIs for RDDL 1.0 and RDDL 2.0 allows one to unify both RDDL variants and the GRDDL case. Stylesheets which will actually produce RDDL-models from RDDL 1.0 and from RDDL 2.0 are available."
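As a rough illustration of the RDDL 1.0 encoding, the following Python sketch reads rddl:resource elements from a small resource directory and lists each related resource with its nature and purpose. It assumes the RDDL 1.0 convention of carrying the nature in the xlink:role attribute and the purpose in xlink:arcrole. The sample document, its example.org URIs, and the function name related_resources are made up for illustration and are not taken from the TAG finding.

```python
import xml.etree.ElementTree as ET

RDDL_NS = "http://www.rddl.org/"
XLINK_NS = "http://www.w3.org/1999/xlink"

# Hypothetical namespace document; the example.org resources are invented.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:rddl="http://www.rddl.org/"
      xmlns:xlink="http://www.w3.org/1999/xlink">
  <body>
    <rddl:resource xlink:type="simple"
        xlink:href="http://example.org/ns/schema.xsd"
        xlink:role="http://www.w3.org/2001/XMLSchema"
        xlink:arcrole="http://www.rddl.org/purposes#schema-validation">
      <p>A W3C XML Schema for the vocabulary.</p>
    </rddl:resource>
    <rddl:resource xlink:type="simple"
        xlink:href="http://example.org/ns/style.xsl"
        xlink:role="http://www.w3.org/1999/XSL/Transform"
        xlink:arcrole="http://www.rddl.org/purposes#transformation">
      <p>An XSLT stylesheet for the vocabulary.</p>
    </rddl:resource>
  </body>
</html>"""


def related_resources(rddl_text):
    """Return one dict per rddl:resource element: href, nature, purpose."""
    root = ET.fromstring(rddl_text)
    found = []
    for res in root.iter(f"{{{RDDL_NS}}}resource"):
        found.append({
            "href": res.get(f"{{{XLINK_NS}}}href"),
            "nature": res.get(f"{{{XLINK_NS}}}role"),      # what the resource is
            "purpose": res.get(f"{{{XLINK_NS}}}arcrole"),  # what it is for
        })
    return found


for entry in related_resources(SAMPLE):
    print(entry["purpose"], "->", entry["href"])
```

A processor looking for a schema to validate against could then filter the returned list on the purpose it needs, without having to interpret the human-readable prose around each entry.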
Section 4 of the TAG finding ("Namespace URIs and Namespace Documents") addresses the topic of responses to requests for resources which are not information resources. Namespace documents are descriptive documents, but namespaces themselves are not information resources — so what should be done? "How should namespace documents, as opposed to namespaces themselves, be identified and retrieved?" The TAG finding concludes that "broadly speaking, there are two distinct patterns of namespace naming: one is virtually universal for namespaces identifying names in XML document vocabularies, and one at least dominant for namespaces identifying constituents of Semantic Web ontologies. Different solutions to the namespace document identification/retrieval problem are appropriate in these two cases."
In "the XML language case" (where namespaces are used to distinguish the element and attribute names in one XML language from all others), the TAG recommends that servers respond with an HTTP 303 redirection from the namespace URI to a related URI for the namespace document. In "the Semantic Web case" (where a namespace URI ending with a hash (#) is sometimes used to identify the namespace), varying kinds of 302 or 303 redirection may be used. "When both human-readable and RDF-format descriptions of a namespace are available, with the latter being derived from the former via GRDDL, it is good practice to make the GRDDL-derived description available at its own URI. Whether this is done using a static copy or by applying the GRDDL-specified transformation on demand is an implementation detail."
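The 303 pattern for the XML language case might be sketched as follows. This is a minimal illustration rather than a server implementation: the function respond, the NAMESPACE_DOCUMENTS table, and the /ns/inventory paths are hypothetical names chosen for the example, and the function only computes the status code and headers a server would send.

```python
from http import HTTPStatus

# Hypothetical mapping from namespace URI paths to namespace document paths.
NAMESPACE_DOCUMENTS = {
    "/ns/inventory": "/ns/inventory.html",
}


def respond(path):
    """Return (status, headers) for a GET request on the given path."""
    target = NAMESPACE_DOCUMENTS.get(path)
    if target is not None:
        # The namespace itself is not an information resource, so the
        # server redirects to the document that describes it.
        return HTTPStatus.SEE_OTHER, {"Location": target}
    if path in NAMESPACE_DOCUMENTS.values():
        # The namespace document is an ordinary retrievable page.
        return HTTPStatus.OK, {"Content-Type": "text/html"}
    return HTTPStatus.NOT_FOUND, {}


status, headers = respond("/ns/inventory")
print(int(status), headers["Location"])
```

Keeping the namespace URI and the namespace document URI distinct in this way lets the same namespace URI later point at a different or additional description without changing the identifier used in documents.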
In Sections 6 and 7 of the TAG finding, Nature Keys and Purposes from RDDL 1.0 are presented. In this model, for any ancillary resource related to a namespace, "we say that each of these resources has a nature and serves a purpose." For XML vocabularies, the namespace URI is often suitable as a key for the nature of a resource encoded in that vocabulary. For other resources, the URI of a media type or normative specification is appropriate. "Purpose" encodes a relationship between a namespace, another resource, and the nature of that resource. For example, with respect to a particular namespace, the purpose of a W3C XML Schema might be validation of documents in that namespace..."
Key specifications referenced in this TAG finding include:
Resource Directory Description Language (RDDL). Cited as "RDDL 1.0." February 18, 2002. Edited by Jonathan Borden (The Open Healthcare Group) and Tim Bray (Antarcti.ca Systems). "A RDDL document, called a Resource Directory, provides a package of information about some target, including: (1) Human-readable descriptive material about the target; (2) A directory of individual resources related to the target, each directory entry containing descriptive material and linked to the resource in question... The Resource Directory Description Language is an extension of XHTML Basic 1.0 with an added element named resource. This element serves as an XLink to the referenced resource, and contains a human-readable description of the resource and machine readable links which describe the purpose of the link and the nature of the resource being linked to..."
Resource Directory Description Language (RDDL) 2.0. January 18, 2004. Edited by Jonathan Borden (The Open Healthcare Group) and Tim Bray (Antarctica Systems). "This document is a working draft that contains substantial input from the W3C Technical Architecture Group, produced in connection with the work on its issue namespaceDocument-8. It is the consensus of the TAG that RDDL is a suitable format for use as a "Namespace Document", that is to say as a representation yielded by dereferencing a URI in use as an XML Namespace Name. While this document has no official standing, it is the intention of the TAG to seek guidance from the W3C membership and the larger community on the question of whether and how to progress this document and the use of RDDL."
Architecture of the World Wide Web, Volume One. W3C Recommendation. 15-December-2004. Edited by Ian Jacobs (W3C) and Norman Walsh (Sun Microsystems, Inc). Developed by W3C's Technical Architecture Group (TAG). Latest version URI: http://www.w3.org/TR/webarch/. "The World Wide Web uses relatively simple technologies with sufficient scalability, efficiency and utility that they have resulted in a remarkable information space of interrelated resources, growing across languages, cultures, and media. In an effort to preserve these properties of the information space as the technologies evolve, this architecture document discusses the core design components of the Web. They are identification of resources, representation of resource state, and the protocols that support the interaction between agents and resources in the space. We relate core design components, constraints, and good practices to the principles and properties they support."
Namespaces in XML 1.1. W3C Recommendation. 04-February-2004. Edited by Tim Bray (Textuality), Dave Hollander (Contivo, Inc), Andrew Layman (Microsoft), and Richard Tobin (University of Edinburgh and Markup Technology Ltd). "XML namespaces provide a simple method for qualifying element and attribute names used in Extensible Markup Language documents by associating them with namespaces identified by IRI references. Documents [...] containing multiple markup vocabularies, pose problems of recognition and collision. Software modules need to be able to recognize the elements and attributes which they are designed to process, even in the face of 'collisions' occurring when markup intended for some other software package uses the same element name or attribute name. These considerations require that document constructs should have names constructed so as to avoid clashes between names from different markup vocabularies..."
RDFa Primer: Bridging the Human and Data Webs. W3C Working Draft. 20-June-2008. "Today's web is built predominantly for human consumption. Even as machine-readable data begins to appear on the web, it is typically distributed in a separate file, with a separate format, and no correspondence between the human and machine versions. As a result, web browsers can provide only minimal assistance to humans in parsing and processing web data: browsers only see presentation information. We introduce RDFa, which provides a set of HTML attributes to augment visual data with machine-readable hints. We show how to express simple and more complex datasets using RDFa, and in particular how to turn the existing human-visible text and links into machine-readable data without repeating content. This document provides only a Primer to RDFa." The normative specification of RDFa can be found in RDFa in XHTML: Syntax and Processing. A Collection of Attributes and Processing Rules for Extending XHTML to Support RDF.
"HTML documents contain significant amounts of structured data, which is largely unavailable to tools and applications. When publishers can express this data more completely, and when tools can read it, a new world of user functionality becomes available, letting users transfer structured data between applications and web sites, and allowing browsing applications to improve the user experience: an event on a web page can be directly imported into a user's desktop calendar; a license on a document can be detected so that users can be informed of their rights automatically; a photo's creator, camera setting information, resolution, location and topic can be published as easily as the original photo itself, enabling structured search and sharing.
RDFa is a specification for attributes to be used with languages such as HTML and XHTML to express structured data. The rendered, hypertext data of XHTML is reused by the RDFa markup, so that publishers don't need to repeat significant data in the document content. This document only specifies the use of the RDFa attributes with XHTML. The underlying abstract representation is RDF, which lets publishers build their own vocabulary, extend others, and evolve their vocabulary with maximal interoperability over time. The expressed structure is closely tied to the data, so that rendered data can be copied and pasted along with its relevant structure..."
Gleaning Resource Descriptions from Dialects of Languages (GRDDL). W3C Recommendation. 11-September-2007. Edited by Dan Connolly (W3C). "GRDDL is a mechanism for Gleaning Resource Descriptions from Dialects of Languages. This GRDDL specification introduces markup based on existing standards for declaring that an XML document includes data compatible with the Resource Description Framework (RDF) and for linking to algorithms (typically represented in XSLT), for extracting this data from the document. The markup includes a namespace-qualified attribute for use in general-purpose XML documents and a profile-qualified link relationship for use in valid XHTML documents. The GRDDL mechanism also allows an XML namespace document (or XHTML profile document) to declare that every document associated with that namespace (or profile) includes gleanable data and for linking to an algorithm for gleaning the data. A corresponding GRDDL Use Case Working Draft provides motivating examples. A GRDDL Primer demonstrates the mechanism on XHTML documents which include widely-deployed dialects known as microformats. A GRDDL Test Cases document illustrates specific issues in this design and provides materials to aid in test-driven development of GRDDL-aware agents..."
The W3C Technical Architecture Group (TAG), originally chartered in 2001, was formed to "document and build consensus around principles of Web architecture and to interpret and clarify these principles when necessary." The current Technical Architecture Group (TAG) Charter (27-October-2004), documentation within the W3C Process, the TAG Web site, and the mailing list archives provide full details about the TAG's composition and activities.
As of 2008-06-26, the W3C TAG has nine members (five elected, three appointed, and one chair):
- Tim Berners-Lee (W3C - Chair)
- Ashok Malhotra (Oracle)
- Noah Mendelsohn (IBM)
- David Orchard (BEA)
- T. V. Raman (Google)
- Jonathan Rees (Science Commons)
- Henry Thompson (U. of Edinburgh)
- Norm Walsh (Sun [--> Mark Logic])
- Stuart Williams (HP)
- [As of 1 February 2008, Dan Connolly provides staff support for the TAG]
"There are a number of architectural principles that underlie the development of the World Wide Web. Some of these are well-known; others are less well-known or accepted. It is important for the growth and interoperability of the Web that these principles be documented and generally agreed to.
Web architectural principles are debated, developed, and documented both inside and outside of W3C. For instance, W3C Working Groups use the Recommendation track to build consensus around principles that fall within the scope of the Working Group's charter and expertise. The W3C Team has published architecture documents as informal Web pages on the W3C site or as W3C Notes...
As W3C has grown, there have been more frequent requests (from W3C Members and other parties) for documentation of architectural principles that cross multiple technologies. People ask, "How do W3C technologies fit together? What basics must people know before they start developing a new technology?" Some discussions and debates within W3C have highlighted the need for documented architectural principles as well as a process for resolving disagreements about architecture...
To improve the effectiveness of Working Groups, to reduce misunderstandings and overlapping work, and to improve the consistency of Web technologies developed inside and outside W3C, the Consortium established the Technical Architecture Group (TAG)..." [excerpted from the Charter]
"...Web architecture refers to the underlying principles that should be adhered to by all Web components, whether developed inside or outside W3C. The architecture captures principles that affect such things as understandability, interoperability, scalability, accessibility, and internationalization.
For understandability, it is important that specifications be built on a common framework. This framework will provide a clearer picture of how specifications for Web technology work together.
For interoperability, there are some principles that cross Working Group boundaries to allow technical specifications to work together. For example, W3C has adopted an architectural principle that XML should be used for the syntax of Web formats unless there is a truly compelling reason not to... This principle allows broad applicability of generic XML tools and is more likely to lead to general protocol elements that are useful for multiple purposes.
For scalability, it is important to base current work on wide applicability and future extensibility. For example, it is a common principle in designing specifications to avoid single points of control (e.g., a single registry that all specification writers or developers must use).
W3C's Web Accessibility Initiative and Internationalization Activity are already producing Architectural Recommendations in the areas of accessibility and internationalization, respectively..." [excerpted from the Charter]