Cover Pages: XML Daily Newslink: Wednesday, 01 September 2010

A Cover Pages Publication http://xml.coverpages.org/
Provided by OASIS and Sponsor Members
Edited by Robin Cover

This issue of XML Daily Newslink is sponsored by:
Microsoft Corporation http://www.microsoft.com

Headlines

Open Publishing Distribution System (OPDS) Catalogs Version 1.0 Release
Using the Common Information Model for Building Semantic Services
OASIS Transformational Government Framework Technical Committee
W3C Publishes Voice Extensible Markup Language (VoiceXML) 3.0
OGC Issues CFP for OWS Shibboleth/SAML Interoperability Experiment
K-Anonymity Privacy Protection Model Needs a Little Help
IESG Last Call Review for 'application/tei+xml' Media Type
ActiveState Adds Python Modules: GUIs, Databases, Cryptography

Open Publishing Distribution System (OPDS) Catalogs Version 1.0 Release
Peter Brantley, Blog

"The open ebook community and the Internet Archive are pleased to announce the release of the first production version of the Open Publishing Distribution System (OPDS) Catalog format for digital content. OPDS Catalogs are an open standard designed to enable the discovery of digital content from any location, on any device, and for any application.

Based on the widely implemented Atom Syndication Format, OPDS Catalogs have been developed since 2009 by a group of ebook developers, publishers, librarians, and booksellers interested in providing a lightweight, simple, and easy-to-use format for developing catalogs of digital books, magazines, and other content. OPDS is a profile of the Atom Syndication Format (IETF RFC 4287) that allows ebook publishers to share URLs for ebooks and the metadata about them.

If you are already familiar with Atom, OPDS basically provides some new link relations that let OPDS-aware clients identify URLs from which ebooks can be downloaded. While it doesn't specifically leverage an RDF serialization, it is fundamentally about linking library-land data on the web, and typed links at that... An OPDS Catalog is a set of one or more Atom Feeds, which are themselves listings of Atom Entries. The Atom Feeds that make up all OPDS Catalogs can be divided into two groups: Navigation Feeds, which create a hierarchy for browsing, and Acquisition Feeds, which list available electronic publications. Each Atom Entry in an Acquisition Feed includes basic metadata about the publication, the publication's format, and how the publication can be acquired. These included Atom Entries may be minimal subsets of the complete metadata, with a link to a more extensive, standalone representation URI...

OPDS Catalogs are the first component of the Internet Archive's BookServer Project, a framework supporting open standards for discovering, lending, and vending books and other digital content on the web... OPDS Catalogs, which are easily produced from simple descriptive metadata, can be harvested by search engines and aggregated by online retailers; their design supports independent reading systems, bookstores, the development of portable bookshelves, and other applications facilitating the use of digital materials..."
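The Acquisition Feed structure described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference client: the sample feed, its URLs, and the function name acquisition_links are hypothetical, and the sketch assumes that acquisition links are identified by a link relation beginning with http://opds-spec.org/acquisition.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "{http://www.w3.org/2005/Atom}"
# Assumed prefix of the OPDS link relations that mark acquisition links.
ACQUISITION_REL = "http://opds-spec.org/acquisition"

# Hypothetical, hand-written Acquisition Feed used only for illustration.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example Acquisition Feed</title>
  <entry>
    <title>Example Book</title>
    <id>urn:example:book-1</id>
    <link rel="http://opds-spec.org/acquisition"
          type="application/epub+zip"
          href="http://example.org/books/1.epub"/>
    <link rel="alternate"
          type="application/atom+xml;type=entry"
          href="http://example.org/books/1.atom"/>
  </entry>
</feed>
"""


def acquisition_links(feed_xml):
    """Return (title, media type, URL) for every acquisition link in the feed."""
    root = ET.fromstring(feed_xml)
    results = []
    for entry in root.findall(ATOM_NS + "entry"):
        title = entry.findtext(ATOM_NS + "title", default="")
        for link in entry.findall(ATOM_NS + "link"):
            # Only typed links with an acquisition relation point at a download.
            if link.get("rel", "").startswith(ACQUISITION_REL):
                results.append((title, link.get("type"), link.get("href")))
    return results


for title, media_type, href in acquisition_links(SAMPLE_FEED):
    print(title, media_type, href)
```

The second link in the sample entry (rel="alternate") is skipped by the filter; it stands in for the link to a fuller, standalone representation of the entry.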

Using the Common Information Model for Building Semantic Services
Boris Lublinsky, InfoQ

"One of the requirements for a successful SOA implementation is semantic interoperability of the service's messages. This typically requires basing service messaging/interfaces on the industry-specific Common Information Model (CIM). Because the CIM is abstract, in the majority of cases it has to be extended, for reasons ranging from the need to include additional data elements for integration to adding elements required for carrying out business logic by the Service component..."

According to a recent developerWorks presentation: "SOA-based architectural styles get complex when combined with the Common Information Model (CIM) because in reality the designer needs to extend the CIM for various reasons, ranging from including additional data elements for integration to including elements required for carrying out business logic by the Service component. When the impact of such extensions on the core CIM is understood, it becomes easy to manage...

In a typical SOA-based environment, there exists the core and extended CIM. The extended CIM in turn can be dichotomized as weaker and weakest CIM. Each layer possesses its own category of entry points, has unique characteristics and needs to be maintained separately. Note that a service may typically encompass entities from all these three layers...

[For example] the core CIM layer comprises the raw industry-published models, such as the TeleManagement Forum's Shared Information Data model (SID) in the telecommunication industry and the IBM financial services model named the IFW in the finance industry, which contain the abstract entities with attributes and relationships. The weaker CIM layer encompasses the extended CIM entities which convey pure business meaning for existence -- 'business meaning' because the modeler needs to always visualize the manifestation of these elements in the Service interface and not include them unless otherwise warranted from a business perspective, and further represent them through a well-abstracted terminology. In the weakest CIM layer, objects, attributes, interfaces, and methods are required purely for integration purposes: for example, the message-header-related entities, entities which are required for controlling the results, and entities connecting the service operations and the weaker/core CIM layers..."

See also: the developerWorks article

OASIS Transformational Government Framework Technical Committee
Staff, OASIS Announcement

OASIS Members have submitted a draft TC charter to establish the OASIS Transformational Government Framework Technical Committee. According to the 'Statement of Purpose': "At a time when virtually every government is now an 'e-government' — with websites, e-services and e-government strategies proliferating around the world, even in the least developed countries—it is now clear that Information and Communication Technology is no magic bullet. Duplicated IT expenditure, wasted resources, no critical mass of users for online services, and limited impact on core public policy objectives—this has been the reality of many countries' experience of e-Government...

The work of the TC will be to define an overall framework that encompasses these rules, principles and processes. And supporting the framework there will be a need for the TC to identify and provide some use cases and guidance on adoption as there is more than one entry point into making the transformation.

The major deliverable will be a Framework for Transformational Government. Included in this Framework will be: (1) a Transformational Government Reference Model; (2) definitions of a series of policy products necessary to implement the change; (3) a value chain for citizen service transformation; (4) a series of guiding principles; (5) a business model for change; (6) a delivery roadmap; (7) a checklist of critical success factors. Supporting this Framework will be a number of Use Cases and other guidance advice on its adoption.

Anticipated audience or users of the work include: government and other public sector officials responsible for eGovernment policy, strategy, and implementation; other public or global institutions that provide advice and guidance on implementing eGovernment Programmes; providers of software and services to Governments..."

W3C Publishes Voice Extensible Markup Language (VoiceXML) 3.0
Scott McGlashan, Daniel Burnett, Rahul Akolkar (et al, eds), W3C Technical Report

Members of the W3C Voice Browser Working Group have published an updated Working Draft for Voice Extensible Markup Language (VoiceXML) Version 3.0. VoiceXML is used to create interactive media dialogs that feature synthesized speech, recognition of spoken and DTMF key input, telephony, mixed initiative conversations, and recording and presentation of a variety of media formats including digitized audio and digitized video. In this Working Draft a 'Revised Legacy' profile description is provided to match current WG thinking, Section 5.4 'SIV Resource' is removed (since it is now covered along with the recognition resource in section), and the Event Model of Section 4.4 has been revised to match the WG members' current thinking about DOM events as the underlying model for all flow control. Open issues are highlighted in the diff-marked version of the specification.

VoiceXML 3.0 "explains the core of VoiceXML 3.0 as an extensible framework: how semantics are defined, how syntax is defined and how the two are connected together. In this document, the 'semantics' are the definitions of core functionality, such as might be used by an implementer of VoiceXML 3.0. The definitions are represented as English text, SCXML syntax, and/or state chart diagrams. The term 'syntax' refers to XML elements and attributes that are an application author's programming interface to the functionality defined by the 'semantics'.

Within the Core document, all the functionality of VoiceXML 3.0 is grouped into modules of related capabilities. Modules can be combined together to create complete profiles (languages). This document describes how to define both modules and profiles. In addition to describing the general framework, this document explicitly defines a broad range of functionality, several modules and two profiles...

In Version 3.0, the Voice Browser Working Group has developed the detailed semantic descriptions of VoiceXML functionality that versions 2.0 and 2.1 lacked. The semantic descriptions clarify the meaning of the VoiceXML 2.0 and 2.1 functionalities and how they relate to each other. Detailed semantics for new functionality are now defined: new functions include, for example, speaker identification and verification, video capture and replay, and a more powerful prompt queue. These semantic descriptions for these new functions are also represented in this document as English text, UML state chart visual diagrams and/or textual SCXML representations... Organization of functionality into modules makes it easier to understand what happens when modules are combined or new ones are defined. In contrast, VoiceXML 2.0 and 2.1 had a single global semantic definition (the FIA), which made it difficult to understand what would happen if certain elements were removed from the language or if new ones were added..."

OGC Issues CFP for OWS Shibboleth/SAML Interoperability Experiment
Staff, Open Geospatial Consortium Announcement

"The Open Geospatial Consortium has issued a call for participation in the OGC Web Services (OWS) Shibboleth Interoperability Experiment (IE). This OGC Interoperability Experiment is designed to advance best practice for implementing standards on federated security in transactions involving geospatial data and services.

The Interoperability Experiment, initiated by OGC Members Cadcorp, EDINA, and Snowflake Software, will demonstrate use of Security Assertion Markup Language (SAML) with OGC Web Services, including use of Shibboleth. This IE will build on practices from the European Spatial Data Infrastructure Network (ESDIN) project and on results from previous OGC initiatives on authentication.

Shibboleth is an open source software package released by the Internet2 Consortium based on the SAML standard from OASIS. European National Mapping Agencies and leading European universities have been advancing the use of Shibboleth in operational spatial data infrastructures as part of the ESDIN project.

A 'birds of a feather' informal discussion meeting will take place during the week of the OGC Technical Committee meeting in Toulouse, France, 20-23 September 2010. The virtual kickoff meeting will take place on September 30, 2010. To finalize this activity, a best practice report will be presented at the OGC Technical Committee meeting in Sydney, Australia, 29 November - 03 December 2010..."

See also: The Shibboleth System

K-Anonymity Privacy Protection Model Needs a Little Help
Loring Wirbel, ACM News

"A new method of providing anonymity to large data sets has raised excitement in realms as diverse as social networks and medical records. But the 'K-anonymity' protection model, in which so-called 'nonkey attributes' (gender, Zip code, birth date, etc.) are suppressed or generalized, appears to need a little help.

Some researchers propose using it in conjunction with complementary methods such as 'L-diversity.' A group based at Stanford University has proposed adding a clustering method to K-anonymity to broaden its appeal in modern networks. In fact, other derivatives of K-anonymity already are being proposed for real-time privacy enablers in social networks such as Facebook...

The problem addressed by K-anonymity arose due to the unexpected power of large-scale data mining. In order to identify epidemics or predict purchasing patterns, researchers often publish charts with survey results listing 'quasi-identifiers,' or information such as gender that was not considered central to learning a user's identity. It turns out that when multiple quasi-identifiers are displayed, a unique individual corresponding to that set can be found in 87 percent of cases...

Meanwhile, a group at Cornell University has defined a way to re-define queries to databases to preserve diversity by modeling background knowledge as a probability space... Will any of these methods have near-term payoff? [Researchers] at The University of Colorado at Boulder already have developed a K-anonymity tool for the Facebook API called 'Social-K,' in which K-anonymity rules are modified to adjust to the real-time nature of privacy tools within Facebook. The CU Boulder researchers were looking for the special case of Facebook to Netflix links..."
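The basic K-anonymity idea described in this item, generalizing quasi-identifiers until every combination of values is shared by at least K records, can be illustrated with a short Python sketch. The records, field names, and generalization rules below are invented for illustration and are not taken from any of the research groups mentioned.

```python
from collections import Counter


def k_of(records, quasi_identifiers):
    """Return the size of the smallest group of records that share the
    same values for all quasi-identifiers (the table's k)."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(groups.values())


def generalize(record):
    """Coarsen two quasi-identifiers: keep only the first three ZIP digits
    and only the year of birth. These rules are illustrative assumptions."""
    return {
        **record,
        "zip": record["zip"][:3] + "**",
        "birth": record["birth"][:4],
    }


# Hypothetical survey records; 'diagnosis' is the sensitive attribute.
RECORDS = [
    {"gender": "F", "zip": "94301", "birth": "1975-03-02", "diagnosis": "flu"},
    {"gender": "F", "zip": "94305", "birth": "1975-11-20", "diagnosis": "asthma"},
    {"gender": "M", "zip": "80302", "birth": "1982-06-14", "diagnosis": "flu"},
    {"gender": "M", "zip": "80309", "birth": "1982-01-30", "diagnosis": "diabetes"},
]
QUASI_IDENTIFIERS = ["gender", "zip", "birth"]

raw_k = k_of(RECORDS, QUASI_IDENTIFIERS)
generalized_k = k_of([generalize(r) for r in RECORDS], QUASI_IDENTIFIERS)
print("k before generalization:", raw_k)
print("k after generalization:", generalized_k)
```

In the raw table every record is unique on its quasi-identifiers, so k is 1; after generalization each combination is shared by two records, so the table is 2-anonymous. The sketch does nothing about the weakness that motivates complements such as L-diversity: a group can satisfy K-anonymity while all of its members share the same sensitive value.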

IESG Last Call Review for 'application/tei+xml' Media Type
Laurent Romary and Sigfrid Lundberg (eds), IETF Internet Draft

The Internet Engineering Steering Group (IESG) has received a request to consider the specification for the 'application/tei+xml' media type as an IETF Informational RFC. The IESG plans to make a decision in the next few weeks, and solicits final comments on this action; please send substantive comments to the IETF mailing lists by 2010-09-30.

"The Text Encoding Initiative Consortium is an international organization whose mission is to develop and maintain guidelines for the digital encoding of literary and linguistic texts. The Consortium publishes the Text Encoding Initiative Guidelines for Electronic Text Encoding and Interchange: an international and interdisciplinary standard that is widely used by libraries, museums, publishers, and individual scholars to represent all kinds of textual material for online research and teaching...

The TEI Guidelines are the most significant output of the TEI Consortium's work, and the TEI is committed to disseminating them widely. They are published online at the TEI web site as HTML, PDF, and XML source, and also in print through the University of Virginia Press. The Guidelines are published in English, but their central components are being translated into a number of languages including French, Spanish, German, Chinese, and Japanese in the short term, and extending to include Hindi, Italian, Polish, Romanian, and Slovenian in the longer term..."

Registration of the 'application/tei+xml' media type is intended to increase the possibilities for generic XML processing of TEI documents on the Internet. The IETF I-D defines the 'application/tei+xml' media type in accordance with IETF RFC 3023. By virtue of TEI XML content being XML, it has the same considerations when sent as 'application/tei+xml' as does XML in general... According to the TEI Guidelines, "att.internetMedia in the TEI Infrastructure module provides attributes for specifying the type of a computer resource using a standard taxonomy. In addition to global attributes, the 'mimeType' attribute (MIME media type) specifies the applicable multimedia internet mail extension (MIME) media type: status is mandatory when applicable and data type is 'data.word'." This attribute class "provides attributes for describing a computer resource, typically available over the internet, according to standard taxonomies. At present only a single taxonomy is supported, the Multipurpose Internet Mail Extensions Media Type system. This system of typology of media types is defined by the Internet Engineering Task Force in RFC 2046. The list of types is maintained by the Internet Assigned Numbers Authority..."
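One practical consequence of following the '+xml' suffix convention is that a generic processor can recognize TEI content as XML without knowing anything about TEI. A minimal Python sketch of such a check follows; the function name is_xml_media_type is hypothetical, and the logic is a simplification that looks only at the type/subtype part of the value and ignores parameter semantics.

```python
def is_xml_media_type(media_type):
    """Return True if the media type is one of the generic XML types or
    uses the '+xml' structured suffix. Parameters such as charset are ignored."""
    # Drop any parameters (";charset=...") and normalize case.
    essence = media_type.split(";", 1)[0].strip().lower()
    if essence in ("text/xml", "application/xml"):
        return True
    return essence.endswith("+xml")


for value in (
    "application/tei+xml",
    "application/tei+xml; charset=utf-8",
    "application/atom+xml",
    "application/pdf",
):
    print(value, "->", is_xml_media_type(value))
```

A dispatcher built on a check like this could route any '+xml' payload to a general-purpose XML parser first and apply TEI-specific handling only when the full type matches 'application/tei+xml'.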

ActiveState Adds Python Modules: GUIs, Databases, Cryptography
Staff, ActiveState Announcement

"ActiveState, dynamic language experts offering solutions for Perl, Python, and Tcl, is adding key Python open source packages to its ActivePython Business, Enterprise, and OEM Editions specifically to help enterprise developers. Python modules have been added for Graphical User Interface (GUI) development, secure connections with a wider range of proprietary and open source databases and incorporation of core cryptographic capabilities to ensure secure, authenticated connections to databases, servers and web services.

The newly released modules include M2Crypto, PyQt, wxPython, popular database connectors for PostgreSQL, MySQL, pyODBC and well known proprietary databases. In sum, these additions provide enterprise developers with highly popular, widely used modules for securely extending and enhancing Python development projects in the enterprise and across the cloud.

M2Crypto is one of the primary Python tools for providing security and is the most complete Python wrapper for OpenSSL. OpenSSL is a cryptographic library that provides implementations of the industry's best-regarded algorithms including encryption algorithms such as 3DES ('triple DES'), AES and RSA, as well as message digest algorithms and message authentication codes. ActivePython Business and Enterprise Editions include service level guarantees to support business-critical systems.

PyQt and wxPython are two of the most popular GUI Toolkits in the Python developer community. PyQt is a blending of the Python programming language and the successful Qt library and is extremely useful as a rapid prototyping tool for applications. PyQt implements around 300 classes and over 5,750 functions and methods..."
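The message digests and message authentication codes mentioned in the M2Crypto description above can be illustrated without installing any add-on package. The sketch below deliberately uses Python's standard hashlib and hmac modules rather than M2Crypto itself, so it shows the concepts and not the M2Crypto API; the key and message are made-up values.

```python
import hashlib
import hmac

# Hypothetical shared secret and message, for illustration only.
SECRET_KEY = b"example-shared-secret"
MESSAGE = b"SELECT balance FROM accounts WHERE id = 42"


def digest(message):
    """Message digest: a fixed-size fingerprint that anyone can compute."""
    return hashlib.sha256(message).hexdigest()


def sign(key, message):
    """Message authentication code: a fingerprint that also depends on a
    secret key, so only key holders can produce or check it."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify(key, message, tag):
    """Check a tag using a constant-time comparison."""
    return hmac.compare_digest(sign(key, message), tag)


tag = sign(SECRET_KEY, MESSAGE)
print("digest:", digest(MESSAGE))
print("tag valid for original message:", verify(SECRET_KEY, MESSAGE, tag))
print("tag valid for altered message:", verify(SECRET_KEY, MESSAGE + b" ", tag))
```

The digest alone detects accidental corruption, while the keyed tag is what supports the authenticated connections the announcement refers to, since an attacker without the key cannot recompute it for an altered message.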

