Cover Pages: XML Daily Newslink: Wednesday, 29 December 2010

Newsletter Archive: http://xml.coverpages.org/newsletterArchive.html
A Cover Pages Publication http://xml.coverpages.org/
Provided by OASIS and Sponsor Members
Edited by Robin Cover

This issue of XML Daily Newslink is sponsored by:
Microsoft Corporation http://www.microsoft.com

Headlines

More from James Clark on MicroXML
OASIS Announces Public Review of OpenDocument Specification
IETF Internet Draft: JSON Web Token (JWT)
Two Voice Specifications Updated: VoiceXML 3.0 and SCXML
Introducing the HTML/XML Task Force
CouchDB for Creating Offline Web Applications on Mobile/Stationary Devices

More from James Clark on MicroXML
James Clark, Blog

"There's been lots of useful feedback to my previous post [about MicroXML as part of the future of XML, where I see a number of different possible directions: (a) XML 2.0, something that is intended to replace XML 1.0, but has a high degree of backward compatibility with XML 1.0; (b) XML.next, something that is intended to be a more functional replacement for XML, but is not designed to be compatible, however, rich enough that there would presumably be a way to translate JSON or XML into it; (c) MicroXML, a subset of XML 1.0 that is not intended to replace XML 1.0, but is intended for contexts where XML 1.0 is, or is perceived as, too heavyweight]...

I summarize my current thinking: It's important to be clear about the objectives. First of all, MicroXML is not trying to replace or change XML. If you love XML just as it is, don't worry: XML is not going away. Relative to XML, my objectives for MicroXML are that it be (1) Compatible: any well-formed MicroXML document should be a well-formed XML document; (2) Simpler and easier: easier to understand, easier to learn, easier to remember, easier to generate, easier to parse; (3) HTML5-friendly, thus easing the creation of documents that are simultaneously valid HTML5 and well-formed XML.

JSON is a good, simple, extensible format for data. But there's currently no good, simple, extensible format for documents. That's the niche I see for MicroXML. Actually, extensible is not quite the right word; generalized (in the SGML sense) is probably better: I mean something that doesn't build-in tag-names with predefined semantics. HTML5 is extensible, but it's not generalized.

There are a few technical changes that I think are desirable. Namespaces: It's easier to start simple and add functionality later, rather than vice-versa, so I am inclined to start with the simplest thing that could possibly work: no colons in element or attribute names (other than xml:* attributes); 'xmlns' is treated as just another attribute. This makes MicroXML backwards compatible with XML Namespaces, which I think is a big win. DOCTYPE declaration: [...] the goal of HTML5-friendliness has to be balanced against the goal of simple and easy and, in this case, I think simple and easy wins. For the same reason, I would leave the DOCTYPE declaration out of the data model. Here's an updated grammar..."
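The naming rule described above (no colons in element or attribute names other than xml:* attributes, with 'xmlns' treated as an ordinary attribute) can be illustrated with a minimal Python sketch. It is not a MicroXML parser and does not implement the grammar from the blog post. The function name colon_violations is hypothetical, and the sketch assumes the input is already well-formed XML.

```python
import xml.parsers.expat


def colon_violations(document: str) -> list:
    """List element and attribute names that break the no-colon rule."""
    problems = []

    def on_start(name, attrs):
        # Element names may not contain a colon at all.
        if ":" in name:
            problems.append("element name: " + name)
        # Attribute names may contain a colon only in the xml:* form.
        for attr in attrs:
            if ":" in attr and not attr.startswith("xml:"):
                problems.append("attribute name: " + attr)

    # ParserCreate() without a namespace separator leaves namespace
    # processing off, so names arrive exactly as written in the document
    # and 'xmlns' shows up as just another attribute.
    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = on_start
    parser.Parse(document, True)
    return problems
```

A document such as `<doc xmlns="http://example.org/ns" xml:lang="en"><p/></doc>` passes this check, while prefixed names such as `a:doc` or `xmlns:a` are reported.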

OASIS Announces Public Review of OpenDocument Specification
Staff, OASIS Announcement

The Open Document Format for Office Applications (OpenDocument) Version 1.2 specification (base document and Parts 1-3) is available for public review through January 01, 2011. Editors for this Committee Specification Draft 06 / Public Review Draft 02 include Michael Brauer, David A. Wheeler, Patrick Durusau, Eike Rathke, Robert Weir, and Dennis Hamilton. The specification was produced by members of the OASIS Open Document Format for Office Applications (OpenDocument) TC.

Open Document Format for Office Applications (OpenDocument) Version 1.2 "specifies the characteristics of an XML-based application-independent and platform-independent digital document file format, as well as the characteristics of software applications which read, write and process such documents. This standard is applicable to document authoring, editing, viewing, exchange and archiving, including text documents, spreadsheets, presentation graphics, drawings, charts and similar documents commonly used by personal productivity software applications. This standard has three parts, in addition to the base document.

This standard, for illustrative purposes, describes functionality using terminology common in desktop computing environments that contain a display terminal, keyboard and mouse, attached to a computer hosting an operating system with a graphical user interface which includes user interface controls such as input controls, command buttons, selection boxes, etc. However, the standard is not limited to such environments. It also supports the use of alternative computing environments, other form factors, non-GUI consumers and producers, and the use of assistive technologies, using analogous user interface operations.

Part 1 ('OpenDocument Schema') defines an XML schema for office documents. Office documents includes text documents, spreadsheets, charts and graphical documents like drawings or presentations, but is not restricted to these kinds of documents. The XML schema for OpenDocument is designed for transformations using XSLT and processing with XML-based tools. Part 2, 'Recalculated Formula (OpenFormula) Format', defines a formula language for OpenDocument documents. OpenFormula is a specification of an open format for exchanging recalculated formulas between office applications, in particular, formulas in spreadsheet documents. OpenFormula defines data types, syntax, and semantics for recalculated formulas, including predefined functions and operations. Part 3, 'Packages', defines the package format for OpenDocument documents. A package file stores the XML content of a document as separate parts together with associated binary data as file entries in a single package file. These file entries may be compressed to further reduce the storage taken by the package. This package is a Zip file... A package may contain multiple sub documents, but only a single document can be contained in the root of the package..."
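Because the package is described as a Zip file whose entries may or may not be compressed, a general-purpose Zip library should be enough to inspect one. The Python sketch below builds a tiny in-memory package and lists its entries. The entry names (content.xml, Pictures/logo.png) and both function names are illustrative assumptions, not taken from the announcement, and the sample is not a valid OpenDocument file.

```python
import io
import zipfile


def list_package_entries(package_bytes: bytes) -> list:
    """Return (name, is_compressed) for each file entry in a Zip package."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        return [
            (info.filename, info.compress_type != zipfile.ZIP_STORED)
            for info in package.infolist()
        ]


def build_sample_package() -> bytes:
    """Build a tiny in-memory Zip with hypothetical entry names."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        # XML content stored as a compressed entry.
        package.writestr("content.xml", "<doc/>",
                         compress_type=zipfile.ZIP_DEFLATED)
        # Associated binary data stored without compression.
        package.writestr("Pictures/logo.png", b"\x89PNG",
                         compress_type=zipfile.ZIP_STORED)
    return buffer.getvalue()
```

Calling `list_package_entries(build_sample_package())` reports one compressed XML entry and one stored binary entry, which mirrors the "separate parts together with associated binary data" layout in the quoted text.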

See also: the OASIS announcement

IETF Internet Draft: JSON Web Token (JWT)
Michael Jones, Dirk Balfanz, John Bradley (et al, eds), IETF Internet Draft

IETF has published an initial level -00 Standards Track Internet Draft for the JSON Web Token (JWT) specification. JSON Web Token (JWT) defines a token format that can encode claims transferred between two parties. The claims in a JWT are encoded as a JSON object that is digitally signed.

Details: "JSON Web Token (JWT) is a simple token format intended for space constrained environments such as HTTP Authorization headers and URI query parameters. JWTs encode the claims to be transmitted as a JSON object as defined in RFC 4627 ('The application/json Media Type for JavaScript Object Notation') that is base64url encoded and digitally signed. The suggested pronunciation of JWT is the same as the English word 'jot'... JWT is a string consisting of three JWT Token Segments: the JWT Envelope Segment, the JWT Claim Segment, and the JWT Crypto Segment, in that order, with the segments being separated by period ('.') characters.

As per RFC 4627, the JSON object consists of zero or more name/value pairs (or members), where the names are strings and the values are arbitrary JSON values. These members are the claims represented by the JWT. The JSON object is base64url encoded to produce the JWT Claim Segment. An accompanying base64url encoded JSON envelope object describes the signature method used. The names within the object must be unique. The names within the JSON object are referred to as Claim Names. The corresponding values are referred to as Claim Values.

JWTs contain a signature that ensures the integrity of the content of the JSON Claim Segment. This signature value is carried in the JWT Crypto Segment. The JSON Envelope object MUST contain an "alg" parameter, the value of which is a string that unambiguously identifies the algorithm used to sign the JWT Claim Segment to produce the JWT Crypto Segment... The members of the JSON object represented by the Decoded JWT Claim Segment contain the claims. Note however, that the set of claims a JWT must contain to be considered valid is context-dependent and is outside the scope of this specification. There are three classes of JWT Claim Names: Reserved Claim Names, Public Claim Names, and Private Claim Names..."
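The three-segment layout described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than an implementation of the draft: the "alg" value "HS256" (HMAC SHA-256), the removal of base64 padding, the choice to sign only the encoded Claim Segment, and the names b64url, make_jwt and decode_claims are all assumptions made for this example.

```python
import base64
import hashlib
import hmac
import json


def b64url(data: bytes) -> str:
    """base64url-encode bytes, dropping '=' padding (an assumption here)."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def make_jwt(claims: dict, key: bytes) -> str:
    # Envelope object: "alg" identifies the signature algorithm.
    # "HS256" is an assumed value used only for illustration.
    envelope = {"alg": "HS256"}
    envelope_segment = b64url(json.dumps(envelope).encode("utf-8"))
    claim_segment = b64url(json.dumps(claims).encode("utf-8"))
    # Assumption: the signature covers the encoded Claim Segment only,
    # following the wording quoted above.
    signature = hmac.new(key, claim_segment.encode("ascii"),
                         hashlib.sha256).digest()
    crypto_segment = b64url(signature)
    # Envelope, Claim and Crypto segments, in that order, joined by periods.
    return ".".join([envelope_segment, claim_segment, crypto_segment])


def decode_claims(token: str) -> dict:
    """Decode the Claim Segment without verifying the signature."""
    claim_segment = token.split(".")[1]
    padded = claim_segment + "=" * (-len(claim_segment) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))
```

For example, `make_jwt({"user": "alice"}, b"secret")` yields a string with exactly two periods, and `decode_claims` recovers the original claims object from the middle segment. A real consumer would verify the Crypto Segment before trusting any claim.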

See also: the HTML version

Two Voice Specifications Updated: VoiceXML 3.0 and SCXML
Staff, W3C Announcement

The W3C Voice Browser Working Group has published updated Working Drafts for two specifications: Voice Extensible Markup Language (VoiceXML) 3.0 and State Chart XML (SCXML): State Machine Notation for Control Abstraction.

"VoiceXML 3.0 is a modular XML language for creating interactive media dialogs that feature synthesized speech, recognition of spoken and DTMF key input, telephony, mixed initiative conversations, and recording and presentation of a variety of media formats including digitized audio, and digitized video. Its major goal is to bring the advantages of Web-based development and content delivery to interactive voice response applications.

The updated VoiceXML 3.0 WD explains the core of VoiceXML 3.0 as an extensible framework that describes how semantics are defined, how syntax is defined, and how the two are connected together. In this document, the 'semantics' are the definitions of core functionality, such as might be used by an implementer of VoiceXML 3.0. The definitions are represented as English text, SCXML syntax, and/or state chart diagrams. The term 'syntax' refers to XML elements and attributes that are an application author's programming interface to the functionality defined by the 'semantics'. Within this document, all the functionality of VoiceXML 3.0 is grouped into modules of related capabilities. Modules can be combined together to create complete profiles (languages). The document also describes how to define both modules and profiles. In addition to describing the general framework, this document explicitly defines a broad range of functionality, several modules and two profiles..."

SCXML ('State Chart Extensible Markup Language') is a general-purpose event-based state machine language that can be used in many ways, including with VoiceXML. It provides a generic state-machine based execution environment based on CCXML and Harel State Tables. SCXML can be used as a high-level dialog language controlling VoiceXML 3.0's encapsulated speech modules (voice form, voice picklist, etc.), or as a voice application metalanguage, where in addition to VoiceXML 3.0 functionality, it may also control database access and business logic modules. SCXML can also serve as a multimodal control language in the MultiModal Interaction framework, combining VoiceXML 3.0 dialogs with dialogs in other modalities including keyboard and mouse, ink, vision, haptics, etc. It may also control combined modalities such as lipreading (combined speech recognition and vision), speech input with keyboard as fallback, and multiple keyboards for multi-user editing. Similarly, it may serve as the state machine framework for a future version of CCXML..."
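The general idea of an event-based state machine can be shown with a small Python sketch. It is not SCXML syntax and covers none of SCXML's features beyond flat states and event-triggered transitions. The state and event names are hypothetical, loosely modeled on the "speech input with keyboard as fallback" case mentioned above.

```python
# Transition table: (current state, event) -> next state.
# All state and event names are hypothetical.
TRANSITIONS = {
    ("listening", "speech.recognized"): "confirming",
    ("listening", "speech.nomatch"): "keyboard_fallback",
    ("keyboard_fallback", "key.entered"): "confirming",
    ("confirming", "user.yes"): "done",
    ("confirming", "user.no"): "listening",
}


def run(events, state="listening"):
    """Feed events through the table; unmatched events leave the state unchanged."""
    for event in events:
        state = TRANSITIONS.get((state, event), state)
    return state
```

Here `run(["speech.nomatch", "key.entered", "user.yes"])` walks from speech input through the keyboard fallback to the final state.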

Introducing the HTML/XML Task Force
Norman Walsh, W3C Announcement

A W3C announcement was posted for a new HTML/XML Task Force mailing list. This task force was created by the W3C TAG as a way to focus attention on the TAG issue 'HTML-XML-Divergence-67' and was announced by Tim Berners-Lee during his presentation at the Technical Plenary in October 2010.

The task force initially consists of members Robin Berjon, Michael Champion, James Clark, John Cowan, Michael Kay, Yves Lafon (W3C staff contact), Noah Mendelsohn, Henri Sivonen, and Norman Walsh (Task Force Chair).

Walsh writes in the notice: "Clearly there's a lot of room for discussion about precisely how the task force should even consider addressing the problem presented to it. There is, in fact, room for discussion about the precise nature of the problem, though Tim clearly outlined some of the more apparent issues in his TPAC slides.

The task force had its first meeting and began to wrestle with those questions as a necessary first step towards progress... Hopefully this is the beginning of a process that will lead us to a place of mutual understanding and consensus about how to build (at least parts of) the web of the future..."

CouchDB for Creating Offline Web Applications on Mobile/Stationary Devices
Dietmar Krueger, IBM developerWorks

One of the greatest challenges for mobile applications is the synchronicity of data. An interesting solution to the problem is to use the NoSQL database CouchDB. [Apache] CouchDB, a document-oriented database, is an alternative to SQL databases. With CouchDB you can use cloud functions on mobile devices, work offline with a locally deployed application on a local data storage, and share data with the rest of the cloud when going online again. In this article, learn the CouchDB concepts by creating and deploying a sample application.

This article is a technical presentation on creating offline applications with CouchDB. A prototype of a simple inventory management application demonstrates the CouchDB technology with JSON storage and standard synchronization facilities. A similar application based on HTML5 concepts was introduced in a previous article covering the creation of offline Web applications on mobile devices with HTML5, but synchronization was not addressed. For this article, I migrated the application using storage and standard synchronization facilities of the CouchDB environment.

There are four major components of the sample application architecture of the CouchDB and the HTML5/SQL solution from the previous article: HTML, JavaScript, local data storage, and remote data storage. The HTML component is the core of the HTML5 and CouchDB application. It has the model role and contains the displayed data and the (default) render information. The HTML elements of the page are organized in a hierarchy of the HTML Document Object Model (DOM) tree. The JavaScript component contains the controller functions of the HTML5 and CouchDB application. HTML elements are bound via event handlers to JavaScript functions. JavaScript can access the HTML DOM tree of the application with all user interface elements and can use it as data input for computation...

As to local data storage: the SQL database of the HTML5 application is based on a schema and uses joins to combine data from multiple tables. The data storage of a CouchDB application has no schemas; the documents are stored and retrieved as JSON documents. There is no need to assemble data using joins. For remote data storage, the application infrastructure consists of a network of data storage nodes replicating to each other. In the world of relational SQL databases, it is necessary to write or manage complicated replication infrastructure. In the NoSQL CouchDB architecture, a default replication framework is provided. Actually performing the merge of conflicting documents is an application-specific function..."
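Since merging conflicting documents is left to the application, a merge step for an inventory document might look like the Python sketch below. The field names (name, quantity, modified), the merge policy, and the function name merge_item are hypothetical and do not come from the article; the sketch also does not call any CouchDB API. It only shows what an application-specific merge of two JSON document revisions could do.

```python
def merge_item(local: dict, remote: dict) -> dict:
    """Resolve two conflicting revisions of one inventory document.

    Hypothetical policy: take descriptive fields from the revision with the
    later 'modified' timestamp, but keep the lower 'quantity' so stock is
    never overstated.
    """
    # ISO 8601 timestamps in the same format and zone compare correctly as strings.
    if local["modified"] >= remote["modified"]:
        newer, older = local, remote
    else:
        newer, older = remote, local
    # Fields in the newer revision override those in the older one.
    merged = {**older, **newer}
    merged["quantity"] = min(local["quantity"], remote["quantity"])
    return merged
```

A different application could just as reasonably sum quantities or ask the user to choose, which is why the policy stays in application code.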

