Cover Pages: XML Daily Newslink: Wednesday, 04 November 2009

A Cover Pages Publication http://xml.coverpages.org/
Provided by OASIS and Sponsor Members
Edited by Robin Cover

This issue of XML Daily Newslink is sponsored by:
Sun Microsystems, Inc. http://sun.com

Headlines

Social Meaning on the Web: From Wittgenstein to Search Engines
Updated Draft IETF IRI Working Group Charter
An IRI/URI Namespace for International Object Identifiers (OIDs)
Last Call Working Draft: Evaluation and Report Language (EARL) 1.0 Schema
Pointer Methods in RDF 1.0
The Evolution of Online Identity
surfrdf: Object RDF Mapper to Surf the Semantic Web

Social Meaning on the Web: From Wittgenstein to Search Engines
Harry Halpin and Henry S. Thompson, IEEE Intelligent Systems

One could hypothesize that the biggest question for the Web is whether multiple agents in a decentralized information space can share meaning via the use of uniform resource identifiers (URIs). On the hypertext Web, this bet was trivial; most of the time a URI would identify a Web page by virtue of allowing access to the Web page itself. However, even in the Web's earliest stages, URIs were for more than just accessing Web pages: they united the previous disparate protocols of the Internet into a single seamless and smooth space of information, where any network-accessible object could be given a URI.

The goal of the Semantic Web is to give URIs to "real objects and imaginary concepts" as well as to the "relationships between them." However, there is a fly in the ointment: a Web browser cannot simply access a real object like the Eiffel Tower via HTTP! So, the original question of what a URI identifies, which we could answer earlier by trivially accessing a Web page, transforms into the open question of how agents can determine what non-Web-accessible thing a URI on the Semantic Web identifies in a decentralized manner. This is the defining problem for the evolution of the Web into the Semantic Web...

There are two opposing, yet plausible, stories about how these URIs in the Semantic Web get their meaning. In the first story, as advocated by Berners-Lee and others with a background in Web architecture, a URI gets its meaning from its owner. This seems to be a plausible enough story, since in the original hypertext Web, this is precisely how URIs worked: the owner of the URI had the authority to host Web pages or other network-accessible objects on the host whose name began the URI itself.

However, the Semantic Web presents a disturbing question: what, if anything, should be accessible from these URIs for real-world things and imaginary concepts? Obviously, a straightforward response would be to host some sort of accurate description that is accessible (perhaps via redirection) from the URI, such as a picture of the Eiffel Tower and some data about the Eiffel Tower in a language like RDF (Resource Description Framework, the primary knowledge representation language of the Semantic Web). The success of linked data illustrates that the hosting of these descriptions over HTTP is critical. However, there are edge cases. What if the description is ambiguous? What if there are multiple descriptions for what appears to be the same thing? [...]

Updated Draft IETF IRI Working Group Charter
Larry Masinter, Posting to 'public-iri' Discussion List

An update to the Draft Charter for a possible IETF IRI Working Group has been published. Internationalized Resource Identifiers (IRIs), as defined in IETF Request for Comments #3987, are "a complement to the Uniform Resource Identifier (URI). An IRI is a sequence of characters from the Universal Character Set (Unicode/ISO 10646). A mapping from IRIs to URIs is defined, which means that IRIs can be used instead of URIs, where appropriate, to identify resources." IRIs extend the syntax of URIs to a much wider repertoire of characters, because "the infrastructure for the appropriate handling of characters from local scripts is now widely deployed in local versions of operating system and application software. Software that can handle a wide variety of scripts and languages at the same time is increasingly common. Also, increasing numbers of protocols and formats can carry a wide range of characters... For many people, handling Latin characters (A-Z) is as difficult as handling the characters of other scripts is for those who use only the Latin alphabet. Many languages with non-Latin scripts are transcribed with Latin letters. These transcriptions are now often used in URIs, but they introduce additional ambiguities..."
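The IRI-to-URI mapping mentioned above can be illustrated with a minimal Python sketch. It is a simplification, not the full RFC 3987 algorithm: the function name `iri_to_uri` and the `_SAFE` character set are assumptions made for illustration, userinfo is ignored, and the host is converted with Python's built-in `idna` codec (which implements the older IDNA 2003 rules) rather than being percent-encoded.

```python
from urllib.parse import quote, urlsplit, urlunsplit

# Characters left untouched: URI delimiters, plus '%' so existing escapes survive.
_SAFE = "/?:@!$&'()*+,;=%-._~"


def iri_to_uri(iri: str) -> str:
    """Map an IRI to a URI (simplified sketch; userinfo is not handled)."""
    parts = urlsplit(iri)
    host = parts.hostname or ""
    # Internationalized host names become ASCII "xn--" labels.
    netloc = host.encode("idna").decode("ascii")
    if parts.port is not None:
        netloc += f":{parts.port}"
    # Elsewhere, non-ASCII characters are UTF-8 encoded and percent-escaped.
    path = quote(parts.path, safe=_SAFE)
    query = quote(parts.query, safe=_SAFE)
    fragment = quote(parts.fragment, safe=_SAFE)
    return urlunsplit((parts.scheme, netloc, path, query, fragment))


if __name__ == "__main__":
    print(iri_to_uri("http://bücher.example/straße?q=café"))
```

The split between host handling and the rest of the IRI mirrors the charter's plan to treat internationalized domain names separately from general syntax and comparison.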

The proposed IETF IRI Working Group is currently scoped to produce a new version of the RFC 3987 IRI specification, a new version of RFC 2368 ("The mailto URL scheme"), and a new version of RFC 4395 ("Guidelines and Registration Procedures for New URI Schemes"). The update to RFC 3987 may be split into separate documents in order to focus review: (1) Handling of Internationalized domain names in IRIs (Best Common Practice); (2) Internationalization Considerations in IRIs: guidelines for BIDI, character ranges to avoid, special considerations (Best Common Practice); (3) Syntax, parsing, comparison of IRIs (Standards Track).

The primary focus of the IETF IRI Working Group is to resolve conflicting uses, requirements and best practices for internationalized URL/URI/IRI and various other forms, among many specifications and committees, while moving toward consistent use of IRIs among the wide range of Internet applications that use them.

A primary role of this working group is to bring together a core group to resolve conflicts between various documents in preparation. For the working group to succeed, it needs agreement from the affected communities, which include, for example: IDNA Requirements for Domain Names in Identifiers (IETF IDNABIS Working Group), HTTPBIS definition of 'http:' URI scheme, EAI Email Address Internationalization, Unicode Consortium (IDNA, TR 46), HTML5 definition of 'URLs', W3C XML Core, W3C I18N Core (W3C Internationalization Core Group), W3C TAG, and ICANN (Internet Corporation for Assigned Names and Numbers).

An IRI/URI Namespace for International Object Identifiers (OIDs)
John Larmouth and Olivier Dubuisson (eds), IETF Internet Draft

IETF has published a revised Internet Draft of An IRI/URI Namespace for International Object Identifiers (OIDs). The document is a product of the joint ISO/IEC and ITU-T ASN.1 and OID group.

The draft defines the IRI/URI scheme for International Object Identifiers, where the syntax and semantics of the IRI are specified using the International Object Identifier tree specified in ITU-T X.660. ITU-T Recommendation X.660 'Information technology - Open Systems Interconnection - Procedures for the operation of OSI Registration Authorities - General procedures and top arcs of the ASN.1 Object Identifier tree' defines a generic registration-hierarchical-name tree and the specific form of this RH-name tree called an ASN.1 Object Identifier (OID) tree, including registration of the top-level arcs of the OID tree. It specifies procedures that are referenced by other parts of the ITU-T Rec. X.660 series - ISO/IEC 9834 multi-part standard for the operation of International Registration Authorities.

"This scheme can be used by any specification requiring an IRI or URI based on the international OID tree to identify an object or to retrieve information associated with that object. The 'oid' IRIs are used for two purposes. The first is identification of objects such as XSD or ASN.1 specifications, where the only operation is obtaining the identified object from a repository, known by context. In this case, there will normally be only a single 'oid' IRI value that will identify the object in the repository. The second is the retrieval of information associated with any node of the object identifier tree using the OID Resolution System (ORS)..."

Last Call Working Draft: Evaluation and Report Language (EARL) 1.0 Schema
Shadi Abou-Zahra and Michael Squillace (eds), W3C Technical Report

W3C announced the publication of several new or updated working draft documents relating to the "Evaluation and Report Language (EARL) 1.0" specification. "EARL is a machine-readable format for expressing test results. The primary motivation for developing EARL is to facilitate the processing of test results, such as those generated by Web accessibility evaluation tools, using a vendor-neutral and platform-independent format."

A Last Call Working Draft has been published for "Evaluation and Report Language (EARL) 1.0 Schema." Use of this schema "enables any person, software application, or organization to assert test results for any test subject tested against any set of criteria. The test subject might be a Web site, an authoring tool, a user agent, or some other entity. The set of criteria may be accessibility guidelines, formal grammars, or other types of quality assurance requirements. Thus, EARL is flexible with regard to the contexts in which it can be applied."
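To make the idea of "asserting a test result for a test subject against a criterion" concrete, here is a small Python sketch that writes one EARL-style assertion as N-Triples using only the standard library. The helper names (`earl_assertion`, `_term`) and the example.org URIs are hypothetical, and the EARL term names used (`Assertion`, `assertedBy`, `subject`, `test`, `result`, `TestResult`, `outcome`, `passed`) reflect one reading of the vocabulary; they should be checked against the schema draft itself.

```python
EARL = "http://www.w3.org/ns/earl#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def _term(value: str) -> str:
    """Render a blank node label as-is and wrap anything else as an IRI."""
    return value if value.startswith("_:") else f"<{value}>"


def earl_assertion(assertor: str, subject: str, test: str,
                   outcome: str = "passed") -> list[str]:
    """Return N-Triples lines for a single EARL-style assertion."""
    a, r = "_:assertion", "_:result"  # blank nodes for assertion and result
    triples = [
        (a, RDF_TYPE, EARL + "Assertion"),
        (a, EARL + "assertedBy", assertor),  # who or what ran the test
        (a, EARL + "subject", subject),      # the thing that was tested
        (a, EARL + "test", test),            # the criterion tested against
        (a, EARL + "result", r),
        (r, RDF_TYPE, EARL + "TestResult"),
        (r, EARL + "outcome", EARL + outcome),
    ]
    return [f"{_term(s)} {_term(p)} {_term(o)} ." for s, p, o in triples]


if __name__ == "__main__":
    for line in earl_assertion(
        "http://example.org/tools/checker",       # hypothetical evaluation tool
        "http://example.org/page.html",           # hypothetical test subject
        "http://example.org/guidelines#alt-text"  # hypothetical criterion
    ):
        print(line)
```

Because the assertor, subject, and test are all plain URIs, the same shape serves for a Web site, an authoring tool, or a user agent, which is the flexibility the draft describes.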

A First Public Working Draft is available for the "Evaluation and Report Language (EARL) 1.0 Guide." The objectives of this document are: (1) To provide an introduction to the use of EARL and its associated vocabularies in different scenarios; (2) To clarify its key concepts and their representation in the classes and properties of the formal ontology; (3) To explain the usage of different portions of the EARL vocabulary; (4) To show how to aggregate and process EARL reports; (5) To demonstrate how to extend and customize EARL.

Related working drafts are also updated: HTTP Vocabulary in RDF 1.0, Representing Content in RDF 1.0, and Evaluation and Report Language (EARL) 1.0 Requirements.

See also: the W3C news item

Pointer Methods in RDF 1.0
Carlos Iglesias and Michael Squillace (eds), W3C Technical Report

W3C has published a revision of the specification for Pointer Methods in RDF 1.0, updating the working draft of 2009-03-10. The document has been produced by members of the Evaluation and Repair Tools Working Group (ERT WG) as part of the W3C Web Accessibility Initiative (WAI) Technical Activity. The Working Group encourages feedback about this document by developers and researchers who have interest in software-supported evaluation and validation of Web sites, and by developers and researchers who have interest in Semantic Web technologies for content description, annotation, and adaptation. In particular, feedback from the groups involved in the W3C Semantic Web Activity, especially the Semantic Web Coordination Group, the Semantic Web Deployment Working Group, the Semantic Web Interest Group, and the POWDER Working Group, would be greatly appreciated.

Abstract: "This specification contains a framework for representing pointers (entities that permit identifying a portion or segment of a piece of content), making use of the Resource Description Framework (RDF). It also describes a number of specific types of pointers that permit portions of a document to be referred to in different ways. When referring to a specific part of, say, a piece of Web content, it is useful to be able to have a consistent manner by which to refer to a particular segment of a Web document, to have a variety of ways by which to refer to that same segment, and to make the reference robust in the face of changes to that document. The technical content in this specification is part of EARL, but can be reused in other contexts too...

The document introduces a vocabulary constructed using the Resource Description Framework (RDF), to enable certain parts within a document, particularly HTML and XML documents, to be pointed to in an accurate way. The document introduces a series of RDF classes and properties that can be used to point to parts of a document in different ways. Note that some pointers may be more appropriated to operate on the character or byte serialization of the resources and others for structured documents, such as XML documents, where character or byte based pointing mechanisms may be considered a bad practice..."
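The distinction drawn above between character or byte based pointers and structural pointers can be seen in a short Python sketch. It does not use the RDF vocabulary from the specification; the sample document and function names are invented for illustration, and the standard library's limited XPath support stands in for a structural pointer.

```python
import xml.etree.ElementTree as ET

DOC = "<p>café <b>bar</b></p>"  # sample content, assumed for illustration


def char_offset(text: str, target: str) -> int:
    """Offset of target counted in characters."""
    return text.index(target)


def byte_offset(text: str, target: str) -> int:
    """Offset of target counted in bytes of the UTF-8 serialization."""
    return text.encode("utf-8").index(target.encode("utf-8"))


def structural_lookup(text: str, path: str) -> str:
    """Resolve a simple path expression against the parsed document."""
    return ET.fromstring(text).find(path).text


if __name__ == "__main__":
    # The same location has different offsets depending on the serialization.
    print(char_offset(DOC, "<b>"), byte_offset(DOC, "<b>"))

    # After an edit, a stored offset goes stale but the path still resolves.
    edited = DOC.replace("café", "un café")
    stale = char_offset(DOC, "<b>")
    print(repr(edited[stale:stale + 3]), structural_lookup(edited, "./b"))
```

This is one reason to keep several kinds of pointer to the same segment: offsets are precise for a fixed serialization, while structural pointers tend to survive edits elsewhere in the document.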

The Evolution of Online Identity
Scott Charney, IEEE Security and Privacy

"When I look out on the horizon to think about emerging Internet trends, I think that as a society we are beginning to see changes that can improve how we manage our identities online. In large part, these changes are necessary because, to reduce online crime, we must significantly improve how we authenticate ourselves on various computer systems...

If we agree that an identity metasystem's benefits outweigh its risks, the challenge is to create this IPP-based identity metasystem. Such a system requires five components. First, for consumers to obtain robust digital credentials, we need organizations capable of conducting IPP. The IPP locations must be ubiquitous but can be either public or private institutions. For example, public (or quasi-public) institutions that currently engage in IPP activities include the Department of Motor Vehicles, which issues not just driver's licenses but identification cards; post offices, which proof identities for passports; schools...

Second, we need organizations to manage identity claims, including revoking certificates when credentials are lost. In some cases, the IPP entity might also issue and manage the IT infrastructure necessary to transmit claims and revoke certificates. Third, we need easy-to-use formats that are supported by widely available technology. For example, magnetic stripes are familiar to consumers, and the security issues associated with such technology might not be problematic if the only data encoded on the stripe is meant to be public—such as data signed with a private key that is meant to be shared and then verified with a public key...

Fourth, we need to ensure social, political, economic, and information technology alignment. For example, at the same time consumers obtain such certificates, governments and businesses must build the infrastructure necessary to consume such identities, and policy makers must create a regulatory framework that advances—or at least does not inhibit—the identity metasystem. Fifth, it must be remembered that criminals are creative, adaptive, and persistent. Therefore, any identity metasystem must have a carefully constructed and comprehensive threat model..."

surfrdf: Object RDF Mapper to Surf the Semantic Web
Peteris Caune, SIOC-Dev Posting

Members of the surfrdf Project have announced the release of SuRF 1.0.0 Beta software. This version includes some significant changes and improvements in interface, thus the major version number shift.

"SuRF is an 'Object - RDF' Mapper based on the popular rdflib python library. It exposes RDF triple sets as sets of resources and integrates them into the Object Oriented paradigm of Python in a similar manner as the ActiveRDF does for Ruby.

New features in the surfrdf 1.0.0 Beta version: (1) Improved resource querying. Can mix any of these features together (filter resources by attribute values; filter resources using SPARQL filter expressions; limit, offset, order ascending/descending; specify graph/context where resources should be loaded from and later saved to; eager-load resource attributes). (2) Improved attribute querying. All the querying features available at resource level are also available at attribute level. (3) Growing amount of documentation and examples. Still big gaps there but the situation is improving..."

