Cover Pages: XML Daily Newslink: Monday, 22 November 2010

Newsletter Archive: http://xml.coverpages.org/newsletterArchive.html
A Cover Pages Publication http://xml.coverpages.org/
Provided by OASIS and Sponsor Members
Edited by Robin Cover

This issue of XML Daily Newslink is sponsored by:
Microsoft Corporation http://www.microsoft.com

Headlines

What REST is Really About
IETF SIXPAC: SIP Integration with XMPP in Presence Aware Clients
W3C First Public Working Draft: HTML5 Web Messaging
AT&T Ups the Ante in Speech Recognition
MarkLogic Toolkits Enable Collaboration and Sharing in Microsoft Office
First IETF Draft: OAuth Client Instance Extension
Solve Cloud-Related Big Data Problems with MapReduce
People Power Rides 'Internet of Things' to Smart Grid

What REST is Really About
Bob DuCharme, Blog

"I had thought that 'RESTful' meant 'easily accessible with an HTTP GET, even when something isn't HTML'. Shortly after a RESTafarian pointed out that there was more to it than that, I went to Brian Sletten's excellent presentation 'REST: Information Architecture for the 21st Century' at the Semantic Technologies conference and I learned a lot more about what being RESTful implies. During the presentation
I asked Brian whether Roy Fielding's 2000 doctoral thesis that originally laid out what REST was all about was readable, for a PhD thesis, and he assured me that it was. He was right. Anyone with a basic understanding of software architecture issues can and should read Fielding's thesis... Keep in mind that this was published ten years ago, about a century in Internet time. It's more relevant than ever, and I recommend that you put it high on your reading list."

Excerpt: "REST components perform actions on a resource by using a representation to capture the current or intended state of that resource and transferring that representation between components. A representation is a sequence of bytes, plus representation metadata to describe those bytes. Other commonly used but less precise names for a representation include: document, file, and HTTP message entity, instance, or variant... REST is defined by four interface constraints: identification of resources; manipulation of resources through representations; self-descriptive messages; and, hypermedia as the engine of application state...

[I now better appreciate the important role of content negotiation in REST:] This abstract definition of a resource... provides generality by encompassing many sources of information without artificially distinguishing them by type or implementation [and] allows late binding of the reference to a representation, enabling content negotiation to take place based on characteristics of the request. Finally, it allows an author to reference the concept rather than some singular representation of that concept, thus removing the need to change all existing links whenever the representation changes, assuming the author used the right identifier..."
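The late binding described in the excerpt can be illustrated with a minimal sketch of server-side content negotiation. This is a simplified illustration, not code from Fielding's thesis or from any HTTP library: the function names and the REPRESENTATIONS table are hypothetical, and the matching ignores the specificity rules that a full HTTP implementation applies to the Accept header.

```python
def parse_accept(header):
    """Parse an HTTP Accept header into (media_type, q) pairs, highest q first."""
    prefs = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # default quality when no q parameter is given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        prefs.append((media_type, q))
    # sorted() is stable, so entries with equal q keep their header order
    return sorted(prefs, key=lambda p: p[1], reverse=True)


def negotiate(accept_header, available):
    """Return the first available media type the client accepts, or None."""
    for media_type, q in parse_accept(accept_header):
        if q <= 0:
            continue
        if media_type == "*/*":
            return available[0]
        if media_type.endswith("/*"):
            prefix = media_type[:-1]  # "text/*" becomes "text/"
            for candidate in available:
                if candidate.startswith(prefix):
                    return candidate
        elif media_type in available:
            return media_type
    return None


# One resource identifier, several representations (all hypothetical).
REPRESENTATIONS = {
    "text/html": "<p>Report for 2010</p>",
    "application/json": '{"report": 2010}',
}
```

A link to the resource stays the same whichever representation is chosen: `negotiate("application/json, text/html;q=0.8", list(REPRESENTATIONS))` selects the JSON form for one client, while a browser sending `text/html` would receive the HTML form from the same identifier.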

See also: Roy Fielding's dissertation [TOC]

IETF SIXPAC: SIP Integration with XMPP in Presence Aware Clients
Peter Saint-Andre, IETF Draft Charter Proposal

A draft charter for an IETF Working Group 'SIXPAC (SIP Integration with XMPP in Presence Aware Clients)' has been published along with the launch of a new IETF non-WG mailing list. According to the proposal, the SIXPAC WG will define a small number of SIP and XMPP extensions to solve several use cases in dual-stack endpoints.

Overview: "Both the Session Initiation Protocol (SIP) and the Extensible Messaging and Presence Protocol (XMPP) are widely deployed technologies for real-time communication over the Internet. In order to offer a complete suite of features as well as communication across multiple networks, several user-oriented software applications support both SIP and XMPP, and more software developers have expressed interest in building such 'dual-stack' solutions. Unfortunately, it is difficult to provide a good end-user experience in such applications because SIP and XMPP are not 'aware' of each other.

Because both SIP and XMPP are easily extended through new SIP headers and XMPP elements, it should be possible to provide tighter integration within dual-stack SIP/XMPP user agents to improve the user experience. Any such extensions should meet the following criteria: be completely optional and backwards-compatible for all endpoints, and work without changes to deployed infrastructure such as existing SIP and XMPP servers, B2BUAs, firewalls, etc.

The proposed SIXPAC WG will define a small number of SIP and XMPP extensions for dual-stack endpoints: (1) Including SIP-based availability states in XMPP presence—limited to basic presence and availability states only, not the full range of PIDF extensions; (2) Correlating an XMPP IM session with a SIP voice/video session, and vice-versa; (3) Advertising a SIP account address over XMPP and an XMPP account address over SIP..."

W3C First Public Working Draft: HTML5 Web Messaging
Ian Hickson (ed), W3C Technical Report

Members of the W3C Web Applications Working Group have published the First Public Working Draft for HTML5 Web Messaging. This specification defines two mechanisms for communicating between browsing contexts in HTML documents.

Implementors "should be aware that this specification is not stable. Implementors who are not taking part in the discussions are likely to find the specification changing out from under them in incompatible ways. Vendors interested in implementing this specification before it eventually reaches the Candidate Recommendation stage should join the aforementioned mailing lists and take part in the discussions...

Web browsers, for security and privacy reasons, prevent documents in different domains from affecting each other; that is, cross-site scripting is disallowed. While this is an important security feature, it prevents pages from different domains from communicating even when those pages are not hostile. This section introduces a messaging system that allows documents to communicate with each other regardless of their source domain, in a way designed to not enable cross-site scripting attacks.

Messages in server-sent events, Web sockets, cross-document messaging, and channel messaging use the message event... Authors should check the 'origin' attribute to ensure that messages are only accepted from domains that they expect to receive messages from. Otherwise, bugs in the author's message handling code could be exploited by hostile sites. Furthermore, even after checking the 'origin' attribute, authors should also check that the data in question is of the expected format. Otherwise, if the source of the event has been attacked using a cross-site scripting flaw, further unchecked processing of information sent using the 'postMessage()' method could result in the attack being propagated into the receiver. Authors should not use the wildcard keyword (*) in the 'targetOrigin' argument in messages that contain any confidential information, as otherwise there is no way to guarantee that the message is only delivered to the recipient to which it was intended..."
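The two receiver-side checks recommended above (origin first, then data format) are independent of the browser API, so they can be sketched as a plain function. The sketch below models the logic in Python rather than in browser script; the allowlist, the JSON message shape, and the function name are assumptions made for illustration and do not come from the Working Draft.

```python
import json

# Hypothetical allowlist of origins this receiver expects messages from.
TRUSTED_ORIGINS = {"https://example.com"}


def accept_message(origin, data):
    """Return the parsed payload if the message passes both checks, else None."""
    # Check 1: only accept messages from origins we expect.
    if origin not in TRUSTED_ORIGINS:
        return None
    # Check 2: a trusted sender may itself have been attacked, so the data
    # must still match the expected format before any further processing.
    try:
        payload = json.loads(data)
    except (TypeError, ValueError):
        return None
    if not isinstance(payload, dict):
        return None
    if payload.get("type") != "greeting":
        return None
    if not isinstance(payload.get("text"), str):
        return None
    return payload
```

In a browser, the same two checks would run inside the handler for the message event, reading the event's 'origin' and 'data' attributes.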

AT&T Ups the Ante in Speech Recognition
Marguerite Reardon, CNET News.com

"If you've ever been frustrated using a voice activated customer agent or have scratched your head while reading an unintelligible voice-to-text message, AT&T says help is on the way. The company, which has invested more than one million research hours over the past 20 years in speech and language recognition technology, says that it's developed technologies that will not only make these traditional voice activated services more accurate but will extend voice activation to other modes of communication.

Earlier this week, AT&T Labs researchers showed off some of the technologies they've been working on at their labs here. Most of the applications showcased are not yet ready for prime-time commercial use. Researchers said they have no idea when these services will find their way into products. But bits and pieces are already in products developed by AT&T and the company's partners.

Voice activated remotes already exist. But AT&T's technology goes far beyond what's currently available today, said Michael Johnston, a principal researcher at AT&T Labs. Many of these other applications respond to prerecorded commands. AT&T's application not only identifies words, but it also uses other principles of language such as syntax and semantics to interpret and understand the meaning of the request. The system is designed to get more accurate over time as it learns the speech patterns of large numbers of users.

In addition to understanding and correctly interpreting language, AT&T is also developing voice technology that mimics natural voices. Its AT&T Natural Voices technology builds on text-to-speech technology to enable any communication to be spoken in a variety of languages, including English, German, Spanish, French, or Italian, when text is processed through the AT&T cloud-based service..."

MarkLogic Toolkits Enable Collaboration and Sharing in Microsoft Office
Staff, MarkLogic Announcement

"Organizations around the world are constantly creating new content with Microsoft Office. Unfortunately, much of this content falls to the wayside after being used for its original purpose and loses value. MarkLogic Corporation has announced the availability of updated toolkits that allow content authors to quickly access, reuse, and repurpose content created within Microsoft Office. MarkLogic toolkits are available for: (1) Microsoft Word: allows for intelligent information authoring and dynamic assembly for reuse when creating new content. (2) Microsoft Excel: search across spreadsheets and workbooks for text, formulas, and metadata to improve information reuse and discovery. (3) Microsoft PowerPoint: easily create new custom presentations by searching and retrieving information that already exists in your library of presentations, documents, and spreadsheets.

'Many content authors only want to use the tools they are comfortable with,' said Ken Chestnut, vice president of product marketing, MarkLogic. 'With the updated MarkLogic toolkits, content authors can find and reuse content from other documents while allowing them to stay within Microsoft Office. This allows them to spend more time authoring and less time trying to navigate across multiple user interfaces'...

The toolkits and connectors contain customizable applications for Microsoft Word, Excel, and PowerPoint that enable authors to tag, search, and reuse previously created content. Developers can then build rich applications for Microsoft Office to extend the functionality of the software suite. Once a document has been created, it is stored within MarkLogic Server and can be searched and updated to increase collaboration and content sharing between different teams within an organization.

Microsoft Office 2007 features a new native XML file format (known as Office Open XML) for its core applications, Word, Excel, and PowerPoint. Custom-built for high performance XML processing, MarkLogic Server is ideal for Office 2007 files. MarkLogic Toolkit for Word simplifies working with Office Open XML for granular search, dynamic assembly, transformation, and delivery with MarkLogic Server. By leveraging the underlying XML, information applications built with MarkLogic Server and Toolkit for Word can round-trip Word documents without complicated and error-prone conversion between formats..."
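To show what "the underlying XML" means in practice: an Office Open XML Word file is a ZIP package whose main body is stored as word/document.xml. The sketch below reads paragraph text from such a package using only the Python standard library. It is not the MarkLogic Toolkit API; the helper names are hypothetical, and make_minimal_package builds a stripped-down stand-in for testing, not a file that Word would open.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace of the main WordprocessingML vocabulary.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def extract_text(docx_bytes):
    """Return the text of each paragraph found in word/document.xml."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as package:
        root = ET.fromstring(package.read("word/document.xml"))
    paragraphs = []
    for p in root.iter(f"{{{W_NS}}}p"):
        # A paragraph's text is spread over one or more w:t elements.
        runs = [t.text or "" for t in p.iter(f"{{{W_NS}}}t")]
        paragraphs.append("".join(runs))
    return paragraphs


def make_minimal_package(texts):
    """Build a tiny stand-in package holding only word/document.xml.

    The texts are inserted without XML escaping, so use plain words only.
    """
    body = "".join(f"<w:p><w:r><w:t>{t}</w:t></w:r></w:p>" for t in texts)
    xml = f'<w:document xmlns:w="{W_NS}"><w:body>{body}</w:body></w:document>'
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as package:
        package.writestr("word/document.xml", xml)
    return buf.getvalue()
```

Because the content is already XML, an XML-aware store can index and query individual paragraphs without first converting the document to another format.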

First IETF Draft: OAuth Client Instance Extension
Justin Richer (ed), IETF Internet Draft

An initial level -00 IETF Internet Draft was published for the OAuth Client Instance Extension specification. This document defines two client instance extension parameters for OAuth 2.0 user authorization requests.

Details: "This extension to the OAuth 2 protocol defines two additional parameters that a client can include in requests to the user authorization endpoint which can be used to identify an instance of a given client and distinguish it from other instances of the same client that a user may authorize. These are intended to aid in the user experience and to be used in addition to the client identifier as defined in OAuth 2...

A given client identifier may represent more than one access grant for a given user within a system protected by OAuth. For example, a user may authorize the same installed client on both a laptop and a desktop computer. Each of these would have the same client identifier but be issued different tokens and will have been granted access separately. This extension is intended to allow the two client instances to identify themselves to the authorization server in a way that the user could later differentiate which tokens belonged to which copy of the client.

An OAuth client capable of using the web-server flow could allow the user to interact with it through another means such as email or SMS. In this case, the OAuth client is a single entity with a single client ID which in turn could have multiple distinct grants per user. For example, in an email-proxied system, a user could grant access to the email proxy using multiple separate email addresses. In each of these, the client is the proxy itself, but the grant is being made on behalf of a particular email account. This extension is intended to allow the proxy client to identify to the authorization server which address is being requested..."
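The laptop and desktop example can be sketched as two authorization requests that share a client identifier but carry different instance values. In the sketch below, response_type, client_id, and redirect_uri are standard OAuth 2.0 request parameters; the names instance_name and instance_description are placeholders chosen for illustration and should not be read as the parameter names defined by the draft. The endpoint URL and client values are likewise hypothetical.

```python
from urllib.parse import urlencode, urlparse, parse_qs


def build_authorization_url(endpoint, client_id, redirect_uri,
                            instance_name, instance_description):
    """Build a user authorization request URL carrying instance parameters."""
    params = {
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        # Placeholder names: consult the draft for the actual parameters.
        "instance_name": instance_name,
        "instance_description": instance_description,
    }
    return endpoint + "?" + urlencode(params)


# Same installed client on two machines: one client ID, two instances.
laptop_url = build_authorization_url(
    "https://auth.example.com/authorize", "client-123",
    "https://client.example.com/cb", "laptop", "Work laptop")
desktop_url = build_authorization_url(
    "https://auth.example.com/authorize", "client-123",
    "https://client.example.com/cb", "desktop", "Home desktop")
```

An authorization server that stored these values alongside each grant could later show the user which token belongs to which copy of the client.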

Solve Cloud-Related Big Data Problems with MapReduce
Noah Gift, IBM developerWorks

At times, you need to be able to access more physical and virtual resources to achieve complex compute-intensive results, but setting up a grid system within an organization can face resource, logistical, and technical hurdles; even some political ones. Cloud computing comes to the rescue in this case. It also combines perfectly with the MapReduce function for handling lots of Big Data computations by making it both transparent and irrelevant where two numbers get added together.

The MapReduce programming model was developed at Google... One of the reasons for the success of the MapReduce system is that it is designed to be a simple paradigm for writing code that needs to be massively parallel. It was inspired by the functional programming aspects of Lisp and other functional languages. A key selling point for MapReduce is its ability to abstract the operational parallelization semantics (how parallel programming works) away from the developer.

This is great if you work at a company that has thousands of machines lying around, but that is almost never the case. And even in the case of an organization that has spare capacity, there are often many technical, political, and logistical hurdles to overcome to set up a grid in that organization. Suddenly, cloud computing becomes not only an obvious but a compelling idea. With cloud, as a developer, you can write a script that provisions any number of machines, runs a MapReduce job, and then be charged only for the time you used on each system. This time could be 10 minutes or 10 months, but it is just as simple in either case.

There is no shortage of cloud-based MapReduce options available both as open source and commercial offerings. You can easily take the lessons from this article and apply them to petabytes of logfiles; that is the essence of why the MapReduce abstraction is a useful tool, especially in a cloud environment..."
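The MapReduce model described in this item can be illustrated with a minimal single-process word count. This is a sketch of the programming model only, not code from the developerWorks article or from any MapReduce framework: the function names are hypothetical, and the phases run sequentially here, whereas a real framework would distribute the map and reduce calls across machines.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group intermediate values by key, as a framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values emitted for one key."""
    return key, sum(values)


def word_count(lines):
    # Each map_phase call depends only on its own line, which is what lets
    # a framework run the calls in parallel; here they run one after another.
    intermediate = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(intermediate).items())
```

The developer writes only map_phase and reduce_phase; grouping, scheduling, and placement of the work are left to the framework, which is the abstraction the article credits for the model's success.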

People Power Rides 'Internet of Things' to Smart Grid
Martin LaMonica, CNET News.com

The best path to energy-efficient electronics is connecting them to the Internet, according to People Power. The Silicon Valley-based company has launched a system that uses embedded networking chips and Internet software, called the Energy Services Platform, to monitor and control plugged-in devices for better efficiency. It says it's working with some business partners and expects its products to be available in the first quarter of next year.

People Power is targeting manufacturers with its networking chip which can be embedded in electronic devices for remote control and monitoring. There are dozens of companies seeking to reduce waste in electronics with energy monitoring and control technologies, with many developing home energy management systems made available through utilities.

People Power, by contrast, is targeting electronics manufacturers, such as Japanese office equipment and appliance manufacturers. It has developed an embeddable networking module that will connect equipment, such as TVs and copy machines, to the Internet over a wireless network. Once connected, electronics can be monitored for power consumption and controlled to improve efficiency. A person could, for example, view how much electricity different plugged-in devices use, turn them on and off from a smartphone, and schedule when to turn them off..."

According to the announcement: "People Power is working with some of the world's leading companies—Ricoh Innovations, Inc., Texas Instruments, D-Link, and others—to reduce energy waste and help their customers save money. ESP works with OSIAN (Open Source IPv6 Automation Network)-enabled devices that have incorporated People Power's SuRF Module, as well as with devices that are Wi-Fi or ZigBee-enabled, to offer an end-to-end solution for energy management. The technology is unique in that it can scale up to support the massive traffic requirements of the 'Internet of Things' while still providing real-time monitoring and control of energy down to the plug level. The company is working with manufacturers to embed its technology into appliances, power strips, electronics and office equipment—all of which will work seamlessly with People Power's new cloud-based Energy Services Platform..."

