Cover Pages: XML Daily Newslink: Monday, 22 March 2010

A Cover Pages Publication http://xml.coverpages.org/
Provided by OASIS and Sponsor Members
Edited by Robin Cover

This issue of XML Daily Newslink is sponsored by:
Microsoft Corporation http://www.microsoft.com

Headlines

OASIS Public Review: Telecom SOA Requirements Version 1.0
Open Data Protocol (OData): Welcome to the New OData.org
The Salmon Protocol for Unifying Web Conversations
Using International Standard Book Numbers as Uniform Resource Names
Working Draft for Symmetric Key Services Markup Language (SKSML) V1.0
Is SPIN the Schematron of RDF?
W3C Widget Configuration and Packaging
The 'application/tei+xml' Media Type
Museum Data Exchange: Learning How to Share

OASIS Public Review: Telecom SOA Requirements Version 1.0
Enrico Ronco (ed), OASIS Public Review Draft

Members of the OASIS SOA for Telecom (SOA-Tel) Technical Committee have released an approved Committee Draft of Telecom SOA Requirements Version 1.0 for public review through May 18, 2010. Previously, the TC produced the document Telecom SOA Use Cases and Issues Version 1.0. This OASIS TC was chartered in November 2008 to "identify gaps in standards coverage for using Service Oriented Architecture (SOA) techniques in a telecom environment; particularly for Telecom operators/providers. The combined term 'provider/operator' means a company that utilizes a telecoms network to provide service to the subscriber community, and they may or may not own the network assets or services they are providing."

The "Requirements" specification "collects requirements to address technical issues and gaps of SOA standards (specified by OASIS and other SDOs) utilized within the context of Telecoms. For each of the issues within such document, specific requirements are provided. Where possible, non prescriptive solution proposals to the identified issues and requirements are also described, in order to possibly assist those Technical Committees (within OASIS and other SDOs) responsible for the development and maintenance of the SOA related standards."

Section 2 of this "Requirements" document presents 'Issues on Intermediaries Handling': "Some existing specifications upon which Service Oriented Architectures are currently based on and implemented (such as W3C's WS-Addressing, W3C's SOAP, OASIS's WS-Notification) do not consider the presence of intermediaries in the specified message exchange patterns (in the transactions between the actors that implement the services), or they don't consider the possible situations in which such intermediaries can be involved. For this reason, intermediaries handling within SOA implementations is currently achieved via workarounds or proprietary solutions..." Section 3 presents 'Issues on Security'; Section 4 presents 'Issues on Management'; Section 5 presents 'Issues on SOA collective standards usage'.

The Telecom SOA Use Cases and Issues Version 1.0 document was the first deliverable produced within the OASIS SOA for Telecom (SOA-TEL) TC. Its objective is to collect potential technical issues and gaps of SOA standards (specified by OASIS and other SDOs) utilized within the context of Telecoms.

Open Data Protocol (OData): Welcome to the New OData.org
Staff, OData.org Web Site Announcement

Developers of the Open Data Protocol (OData) have announced the launch of a new web site with access to: "(1) The OData SDK, including sample OData services, client libraries for most platforms, server libraries for the .NET Framework and great samples. (2) An overview of the OData protocol, both from the technology and the scenario points of view, as well as the official specifications themselves. (3) A comprehensive set of articles on how to get started with OData across platforms and languages. (4) A representative set of OData producers for you to use to test your client-side tools and to get a feel for the range of uses that OData supports. (5) A collection of OData consumer tools and technologies for you to use against existing OData producers or your own new ones. (6) The OData blog managed by the Microsoft Data Services team and containing OData-related articles and links for OData everywhere. (7) A growing list of Frequently Asked Questions about OData-related tools and technologies...

Another key goal of this site is to foster the OData community, and to be as open and responsive to community suggestions as possible. To make this a reality we plan to create and host a Mailing list and archive, as well as a publicly editable Wiki. But we want your opinion, is this the right thing to do? At the same time we are looking to engage with IETF and W3C to explore how to get broader adoption of the OData extensions and conventions..."

"The Open Data Protocol (OData) is an open protocol for sharing data. It provides a way to break down data silos and increase the shared value of data by creating an ecosystem in which data consumers can interoperate with data producers in a way that is far more powerful than currently possible, enabling more applications to make sense of a broader set of data. Every producer and consumer of data that participates in this ecosystem increases its overall value... OData follows many of the principles of REST, where REST is a software architectural style for distributed hypermedia systems like the World Wide Web... The simplest OData service can be implemented as simply as a static file that follows the OData ATOM or JSON payload conventions... A specific client library is not necessary to consume an OData feed because all interactions with an OData feed are done using URIs to address resources and standard HTTP verbs (GET, POST, PUT, DELETE, etc) to act on those resources. Therefore, any platform with a reasonably complete HTTP stack is enough to make communicating with a data service simple. That said, a number of client libraries are available which allow for development at a higher level of abstraction..."
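As a rough illustration of the point that an OData feed can be addressed with nothing more than a URI and an HTTP GET, the following Python sketch builds a query URL using only the standard library. The service root, the "Products" entity set, and the helper name odata_url are hypothetical and do not come from the announcement; $filter and $top are OData system query options.

```python
from urllib.parse import quote, urlencode

# Hypothetical service root, used only for illustration.
SERVICE_ROOT = "https://example.org/odata.svc"


def odata_url(entity_set, **options):
    """Build a GET URL for an entity set with OData system query options."""
    # System query options carry a "$" prefix in the URI ($filter, $top, ...).
    params = {"$" + name: str(value) for name, value in options.items()}
    # Percent-encode values but leave the "$" prefix readable.
    query = urlencode(params, quote_via=quote, safe="$")
    url = f"{SERVICE_ROOT}/{entity_set}"
    return f"{url}?{query}" if query else url


if __name__ == "__main__":
    print(odata_url("Products", filter="Price gt 20", top=5))
```

Any HTTP client could then issue a GET against the resulting URL; no OData-specific library is involved in this sketch.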

Rationale: "There is a vast amount of data available today and data is now being collected and stored at a rate never seen before. Much, if not most, of this data however is locked into specific applications or formats and difficult to access or to integrate into new uses. Public data is often unfortunately held private or needlessly buried behind random, inefficient, and cumbersome interfaces. Open Data is a general movement for all types of data, scientific or other, enabling data to be accessible programmatically for public or commercial use. The Open Data Protocol (OData) provides a way to unlock your data and free it from silos that exist in applications today, making it easy for data to be shared in a manner that follows the philosophy of Open Data. OData enables a new level of data integration across a broad range of clients, servers, services, and tools..."

The Salmon Protocol for Unifying Web Conversations
John Panzer, Salmon Protocol Announcement

Developers of the community specification for The Salmon Protocol have announced the release of a stable (reviewable) version of the protocol in draft format. The Salmon Protocol is "an open, simple, standards-based solution that lets aggregators and sources unify the conversations. It focuses initially on public conversations around public content...

Salmon is in fact based on and compatible with AtomPub. Salmon greatly enhances interoperability and usability by specifying a distributed identity mechanism for identifying the author and intermediary involved, provides a discovery mechanism, and specifies how things should be linked together. By not requiring pre-registration or federation but still allowing for verifiable identification, it provides a usable, level playing field for all parties involved...

Conversations are becoming distributed and fragmented on the Web. Content is increasingly syndicated and re-aggregated beyond its original context. Technologies such as RSS, Atom, and PubSubHubbub allow for a real time flow of updates to readers, but this leads to a fragmentation of conversations. The comments, ratings, and annotations increasingly happen at the aggregator and are invisible to the original source...

As updates and content flow in real time around the Web, conversations around the content are becoming increasingly fragmented into individual silos. Salmon aims to define a standard protocol for comments and annotations to swim upstream to original update sources—and spawn more commentary in a virtuous cycle. It's open, decentralized, abuse resistant, and user centric..."

Using International Standard Book Numbers as Uniform Resource Names
Maarit Huttunen and Alfred Hoenes (eds), IETF Internet Draft

An initial version -00 (*bis) IETF Internet Draft has been published for the specification Using International Standard Book Numbers as Uniform Resource Names. From the abstract: "The International Standard Book Number, ISBN, is a widely used identifier for monographic publications. Since 2001, there has been a URN (Uniform Resource Names) namespace for ISBNs. The namespace registration was performed in RFC 3187 and applies to the ISBN as specified in the original ISO Standard 2108-1992. To allow for further growth in use, the successor ISO Standard, ISO 2108-2005, has defined an expanded format for the ISBN, known as 'ISBN-13'. This document replaces RFC 3187 and defines how both the old and new ISBN standard can be supported within the URN framework and the syntax for URNs defined in RFC 2141. An updated namespace registration is included, which describes how both the old and the new ISBN format can share the same namespace."
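To make the relationship between the two ISBN formats concrete, here is a minimal Python sketch that converts a 10-digit ISBN to its 13-digit form (prefix 978, recomputed check digit) and wraps either form in a URN. The function names are illustrative, and the choice to emit the ISBN exactly as supplied after the "URN:ISBN:" prefix is an assumption made for this example; the draft itself is the authority on the permitted syntax.

```python
def isbn13_check_digit(first12):
    """Check digit for the first 12 digits of an ISBN-13 (weights 1, 3, 1, 3, ...)."""
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(first12))
    return str((10 - total % 10) % 10)


def to_isbn13(isbn):
    """Return the 13-digit form of an ISBN, without hyphens."""
    digits = isbn.replace("-", "").replace(" ", "").upper()
    if len(digits) == 13:
        return digits
    if len(digits) != 10:
        raise ValueError("expected a 10- or 13-character ISBN")
    # Drop the old check digit, add the 978 prefix, recompute the check digit.
    body = "978" + digits[:9]
    return body + isbn13_check_digit(body)


def isbn_urn(isbn):
    """Format an ISBN (old or new style) as a URN in the ISBN namespace."""
    return "URN:ISBN:" + isbn


if __name__ == "__main__":
    print(isbn_urn("0-306-40615-2"))
    print(isbn_urn(to_isbn13("0-306-40615-2")))
```

The sketch does not validate the incoming check digit; a resolver would presumably want to do so before attempting resolution.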

This draft version is the outcome of work started in 2008 and brought to the IETF as a contribution to a much larger effort to revise the basic URN RFCs, in order to bring them in alignment with the current URI Standard (IETF STD 63, RFC 3986), ABNF, and IANA guidelines, and to establish a modern URN resolution system for bibliographic identifiers...

As a rule, ISBNs identify finite, manageably-sized objects, but these objects may still be large enough that resolution into a hierarchical system is appropriate. The materials identified by an ISBN may exist only in printed or other physical form, not electronically. In such a case, the URN:ISBN resolver should nevertheless be able to supply bibliographic data, possibly including information about where the physical resource is stored in the owning institution's holdings. There may be other resolution services supplying a wide variety of information resources or services related to the identified books. National libraries shall be among the organizations providing persistent URN resolution services for monographic publications, independent of their form.

Authors' note: "This work is part of the PersID project to establish an international network of stable resolution services for Persistent Identifiers (URNs), in particular for Bibliographic identifiers. To give this work a more solid ground, updates to the basic URN-related documents (RFC 2141 and RFC 3406) and URN namespace-specific RFCs are planned, and brought to the IETF, in order to bring these documents in alignment with current IETF Full Standards and IANA procedures. The ultimate goal is to establish a dedicated 'urnbis' WG in the IETF, hopefully by this summer, with an ambitious schedule to bring revised documents initially targeting PS / BCP status to the IESG, with the goal of fast progression of RFC 2141-bis on the Standards Track."

Working Draft for Symmetric Key Services Markup Language (SKSML) V1.0
Anil Saldhana (et al, eds), OASIS Working Draft

Members of the OASIS Enterprise Key Management Infrastructure (EKMI) Technical Committee have updated the Symmetric Key Services Markup Language (SKSML) Version 1 specification based upon feedback from OASIS members. This TC was chartered to "define symmetric key management protocols, including those for: requesting a new or existing symmetric key from a server; requesting policy information from a server related to caching of keys on the client; sending a symmetric key to a requestor, based on a request; sending policy information to a requestor, based on a request; other protocol pairs as deemed necessary."

SKSML is "an XML-based messaging protocol, by which applications executing on computing devices may request and receive symmetric key-management services from centralized key-management servers, securely, over networks. Applications using SKSML are expected to either implement the SKSML protocol, or use a software library—called the Symmetric Key Client Library (SKCL)—that implements this protocol. SKSML messages are XML messages that can be transported safely between the server and the client either using strong Transport layer security and/or XML encryption or within a SOAP layer, protected by a Web Services Security (WSS) header...

SKSML uses XML for encapsulating its requests and responses and can thus be used on any platform that supports this protocol. Using a scheme that concatenates unique Domain identifiers (Private Enterprise Numbers issued by the IANA), unique SKS Server identifiers within a domain and unique Key identifiers within an SKS server, SKSML creates Global Key Identifiers (GKID) that can uniquely identify symmetric keys across the internet. SKSML relies on RSA cryptographic key-pairs and digital certificates enabling XML encryption (confidentiality) and XML signatures (for authenticity and message integrity). Using secure key-caching enabled through centrally-defined policies, SKSML supports the request and receipt of KeyCachePolicy elements by clients for the use of symmetric encryption keys even when the client is disconnected from the network and an SKS server.

SKSML provides significant flexibility for defining policies on how symmetric encryption keys may be used by client applications. The KeyUsePolicy element allows Security Officers to define which applications may use a specific key, days and times of use, location of use, purpose of use, key-sizes, encryption algorithms, etc..."
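The Global Key Identifier scheme described above (domain identifier, then server identifier, then key identifier) can be sketched as a small Python value type. The class name, the integer field types, and the hyphen separator are assumptions made for illustration; the SKSML specification defines the actual lexical form.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GlobalKeyId:
    """Three-part key identifier: domain, server within domain, key within server."""

    domain_id: int  # IANA Private Enterprise Number of the domain
    server_id: int  # SKS server identifier, unique within the domain
    key_id: int     # key identifier, unique within the SKS server

    def __str__(self):
        # Assumed serialisation: the three parts joined with hyphens.
        return f"{self.domain_id}-{self.server_id}-{self.key_id}"

    @classmethod
    def parse(cls, text):
        """Parse the assumed 'domain-server-key' form back into its parts."""
        domain, server, key = (int(part) for part in text.split("-"))
        return cls(domain, server, key)


if __name__ == "__main__":
    gkid = GlobalKeyId(10514, 1, 235)
    print(gkid)
    print(GlobalKeyId.parse(str(gkid)) == gkid)
```

Because each part is unique within the scope of the part before it, the concatenation is unique across domains without any central key registry.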

See also: the OASIS EKMI TC overview

Is SPIN the Schematron of RDF?
Bob DuCharme, Blog

"Christian Fuerber and Martin Hepp have published a paper titled 'Using SPARQL and SPIN for Data Quality Management on the Semantic Web' for the 2010 Business Information Systems conference in Berlin. TopQuadrant's Holger Knublauch designed SPIN, or the SPARQL Inferencing Notation, as a SPARQL-based way to express constraints and inferencing rules on sets of triples, and Fuerber and Hepp have taken a careful, structured look at how to apply it to business data.

I knew that data quality was a specific discipline within IT, but I hadn't looked at it very closely. This paper gives a nice overview of the area before moving on to describing their work. It also describes the value that a systematic approach to data quality can bring to semantic web applications, but I don't think anyone needs any convincing there; it's often the first issue people bring up when they hear about the very idea of Linked Data on the web."

Abstract for 'Using SPARQL and SPIN for Data Quality': "The quality of data is a key factor that determines the performance of information systems, in particular with regard to: (1) the amount of exceptions in the execution of business processes and (2) the quality of decisions based on the output of the respective information system. Recently, the Semantic Web and Linked Data activities have started to provide substantial data resources that may be used for real business operations. Hence, it will soon be critical to manage the quality of such data. Unfortunately, we can observe a wide range of data quality problems in Semantic Web data.

In this paper, we (a) evaluate how the state of the art in data quality research fits the characteristics of the Web of Data, (b) describe how the SPARQL query language and the SPARQL Inferencing Notation (SPIN) can be utilized to identify data quality problems in Semantic Web data automatically and this within the Semantic Web technology stack, and (c) evaluate our approach..."

See also: the conference paper

W3C Widget Configuration and Packaging
Nathan Good, IBM developerWorks

"The W3C Widget Packaging and Configuration specification is an emerging specification for configuring, packaging, and deploying widgets. W3C widgets are components that are made up of HTML, cascading style sheets (CSS), JavaScript files, and other resources such as images. You can use widgets in devices for small applications such as calendars, weather reports, chats, and more.

One advantage of using widgets rather than normal Web applications is that they can be downloaded once and used many times after that, just like non-Web applications that are installed on a device. This allows users to save on bandwidth because the only data they transfer is the data used by the widget and not the widget files themselves. Widgets often provide a rich user experience, such as interactive calendars and even games. You can use widgets in mobile devices, where the advantage of downloading the widget once and using it over and over can save on data transfer costs.

As of January 2010, the W3C 'Widget Packaging and Configuration' specification is in candidate recommendation state. This means that the W3C believes the specification is in a stable state and encourages developers to create implementations of the specification. The goal of the W3C widget specification is to propose a standard method for building and packaging widgets. There are currently many different vendors that have widgets, and almost all of them implement their own proprietary application program interface (API) and packaging format.

This article introduces the W3C Packaging and Configuration specification, showing you how you can package HTML, CSS, and JavaScript files into a widget that can be deployed to a device that implements the W3C widget specification. Because this is an emerging specification, the implementation choices for devices that render the widgets are limited. If you want to see the widgets in action, you need to download some extra applications if you don't already have them installed..."
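As a minimal sketch of the packaging step the article describes, the Python below zips a configuration document and a start page into a single widget archive using only the standard library. The widget id, name, and HTML content are placeholders invented for this example; the config.xml file at the archive root and the http://www.w3.org/ns/widgets namespace follow the W3C specification.

```python
import io
import zipfile

# Placeholder configuration document; id, version and name are illustrative.
CONFIG_XML = """<?xml version="1.0" encoding="UTF-8"?>
<widget xmlns="http://www.w3.org/ns/widgets" id="http://example.org/hello" version="0.1">
  <name>Hello Widget</name>
  <content src="index.html"/>
</widget>
"""

INDEX_HTML = "<!DOCTYPE html>\n<html><body><p>Hello from a widget.</p></body></html>\n"


def build_widget():
    """Return the bytes of a Zip archive holding config.xml and a start file."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        # The configuration document sits at the root of the archive.
        archive.writestr("config.xml", CONFIG_XML)
        archive.writestr("index.html", INDEX_HTML)
    return buffer.getvalue()


if __name__ == "__main__":
    # Widget packages conventionally use the .wgt file extension.
    with open("hello.wgt", "wb") as handle:
        handle.write(build_widget())
```

CSS, JavaScript, and image files would be added to the archive in the same way before deploying it to a widget runtime.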

The 'application/tei+xml' Media Type
Laurent Romary and Sigfrid Lundberg (eds), IETF Internet Draft

Members of the Text Encoding Initiative (TEI) consortium have published an initial level -00 specification to define the 'application/tei+xml' Media Type. The memo defines the 'application/tei+xml' media type for markup languages defined in accordance with the Guidelines for Text Encoding and Interchange. This new media type is being defined in order to increase the possibilities for generic XML processing of TEI XML-encoded information.
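One way to picture the "generic XML processing" benefit: software that has never heard of TEI can still recognise the '+xml' suffix and hand the payload to an XML parser. The helper below is a sketch under that assumption; its name and the exact set of accepted types are choices made for this example rather than anything taken from the draft.

```python
def is_xml_media_type(media_type):
    """Return True if a media type should be treated as XML by a generic processor."""
    # Drop parameters such as "; charset=utf-8" and normalise case.
    essence = media_type.split(";", 1)[0].strip().lower()
    return essence in ("text/xml", "application/xml") or essence.endswith("+xml")


if __name__ == "__main__":
    for candidate in ("application/tei+xml; charset=utf-8", "text/plain"):
        print(candidate, is_xml_media_type(candidate))
```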

TEI is a consortium which "collectively develops and maintains a standard for the representation of texts in digital form. Its chief deliverable is a set of Guidelines which specify encoding methods for machine-readable texts, chiefly in the humanities, social sciences and linguistics. Since 1994, the TEI Guidelines have been widely used by libraries, museums, publishers, and individual scholars to present texts for online research, teaching, and preservation. In addition to the Guidelines themselves, the Consortium provides a variety of supporting resources, including resources for learning TEI, information on projects using the TEI, TEI-related publications, and software developed for or adapted to the TEI..."

Like other markup languages, the TEI language defines a 'tag set' of XML 'elements' that are used to encode texts, along with 'attributes' used to modify the elements. Because the TEI Guidelines seek to provide a framework for encoding (in theory) any genre of text from any period in any language, the full TEI tag set is extremely rich, consisting of nearly 500 elements (by comparison, DocBook has around 400, XHTML 1.0 around 90). In practice, most TEI users routinely use a much smaller subset of the full language. For example, the documentation section you are reading was composed in TEI using about 30 unique tags.

Markup elements in the TEI tag set fall into two broad categories, those used to capture 'metadata' about the text being encoded (authorship and responsibility, bibliographical information, manuscript description, revision history, etc.), and those used to encode the structural features of the document itself, such as sections, headings, paragraphs, quotations, highlighting, and so on...

Museum Data Exchange: Learning How to Share
Guenter Waibel, Ralph LeVan, Bruce Washburn; D-Lib Magazine

"The Museum Data Exchange is a project funded by the Andrew W. Mellon Foundation, which brought together nine art museums and OCLC Research to model data sharing in the museum community. The project created tools to extract CDWA Lite XML records out of collections management systems and share them via OAI-PMH. OCLC Research harvested 900K records from participating museums and analyzed them for standards conformance and interoperability. This article describes the free or open source tools; lessons learned in harvesting museum data; findings from the data analysis; and the state of data sharing and its applications in the museum community...

The Museum Data Exchange (MDE) project outlined in this paper attempts to lower the barrier for adoption of this data sharing strategy by providing free tools to create and share CDWA Lite XML descriptions, and helped model data exchange with nine participating museums... The suite of tools which emerged as part of the MDE project includes COBOAT and OAICatMuseum 1.0. COBOAT is a metadata publishing tool developed by Cogapp (a digital media company headquartered in the UK) as a by-product of many museum contracts which required accessing and processing data from collections management systems. It transfers information between databases (such as collections management systems) and different formats. As part of this project, Cogapp created an open-source plug-in module which trained the tool to convert data into CDWA Lite XML.

OAICatMuseum 1.0 is an OAI-PMH data content provider supporting CDWA Lite XML which allows museums to publish the data extracted with COBOAT. While COBOAT and OAICatMuseum can be used separately, they do make a handsome pair: COBOAT creates a MySQL database containing the CDWA Lite XML records, which OAICatMuseum makes available to harvesters... Beyond CDWA Lite, OAICatMuseum also offers Dublin Core for harvesting, as mandated by the OAI-PMH specification. The application creates Dublin Core from the CDWA Lite source data on the fly via a stylesheet..."
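For readers unfamiliar with OAI-PMH, a harvester asks a data provider such as the one described above for records with a plain HTTP request. The Python sketch below builds a ListRecords request URL; the base URL is hypothetical, the function name is illustrative, and "oai_dc" is the Dublin Core metadata prefix that every OAI-PMH repository must support. The prefix a given repository uses for CDWA Lite is not stated in the article and would need to be discovered from that repository.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL for a repository base URL."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec is not None:
        # Optional selective harvesting by set.
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical repository address, used only for illustration.
    print(list_records_url("https://museum.example.org/oai"))
```

A harvester would fetch this URL and follow any resumption tokens in the XML response to page through the full record set.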

The schema is defined in "CDWA Lite: Specification for an XML Schema for Contributing Records via the OAI Harvesting Protocol." CDWA Lite is "an XML schema to describe core records for works of art and material culture based on the Categories for the Description of Works of Art (CDWA) and Cataloging Cultural Objects: A Guide to Describing Cultural Works and Their Images (CCO). The purpose of this schema is to describe a format for core records for works of art and material culture, based on the data elements and guidelines contained in the CDWA and CCO. (CCO is based on a subset of the CDWA categories and VRA Core.) CDWA Lite records are intended for contribution to union catalogs and other repositories using the Open Archives Initiative (OAI) harvesting protocol. Elements 1 through 19 in this schema are for descriptive metadata, based on CDWA and CCO..."

See also: the XML Schema

