Cover Pages: XML Daily Newslink: Thursday, 10 June 2010

A Cover Pages Publication http://xml.coverpages.org/
Provided by OASIS and Sponsor Members
Edited by Robin Cover

This issue of XML Daily Newslink is sponsored by:
Oracle Corporation http://www.oracle.com

Headlines

W3C Invites Comment on XSLT Version 2.1 Requirements and Use Cases
The Ten Most Common XSLT Programming Mistakes
Last Call Review: Mathematical Markup Language (MathML) Version 3.0
IESG Approves Common YANG Data Types Specification as Proposed Standard
Toward a Saner Standards Process
Cisco, With Newly Merged Tandberg, Pushes Telepresence Standard
Why Bill Gates Is Embracing Clean Technology

W3C Invites Comment on XSLT Version 2.1 Requirements and Use Cases
Petr Cimprich (ed), W3C Technical Report

Technical comment is invited for the W3C First Public Working Draft of Requirements and Use Cases for XSLT 2.1. This specification was produced by members of the W3C XSL Working Group, which is part of the XML Activity. The Working Group expects to eventually publish this document as a Working Group Note. The Working Group requests that errors in the document be reported using W3C's public Bugzilla system; please use multiple Bugzilla entries (or, if necessary, multiple email messages) if you have more than one comment to make... XSLT Version 2.1 is a language for transforming XML documents into other XML documents, and constitutes a revised version of the XSLT 2.0 Recommendation published on 23-January-2007. The primary purpose of the changes in this version of the language is to enable transformations to be performed in streaming mode, where neither the source document nor the result document is ever held in memory in its entirety. XSLT 2.1 is designed to be used in conjunction with XPath 2.1. XSLT shares the same data model as XPath 2.1, which is defined in the Data Model, and it uses the library of functions and operators. XPath 2.1 and the underlying function library introduce a number of enhancements, for example the availability of higher-order functions. Some of the functions that were previously defined in the XSLT 2.0 specification, such as the format-date and format-number functions, are now defined in the standard function library to make them available to other host languages. XSLT 2.1 also includes optional facilities to serialize the results of a transformation, by means of an interface to the serialization component...

The WD document provides a representation of requirements and use cases for XSL Transformations (XSLT) Version 2.1, published as a W3C Working Draft on 11-May-2010. The Requirements lists enhancements requested over time that may be addressed in XSLT 2.1. The document is organized in three major sections: Requirements, Real-World Scenarios, and Tasks. Sample Data are provided in document Appendices. There are sixteen (16) Requirements, including Enabling Streamable Processing; Modes and Schema-awareness; Composite Keys; The 'xsl:analyze-string' Instruction Applied to an Empty Sequence; Context Item for a Named Template; Traditional Hebrew Numbering; Separate Compilation of Stylesheet Modules; The 'start-at' Attribute of 'xsl:number'; Allowing 'xsl:variable' before 'xsl:param'; Combining 'group-starting-with' and 'group-ending-with'; Improvements to Schema for Stylesheets; Setting Initial Template Parameters; Invoking XQuery from XSLT; Enhancement to Sorting and Grouping; Enhancement to Conditional Modes; Default Initial Template.

Real World Scenarios illustrate cases where real users reach the limits of existing XML transformation standards. These use cases are elaborated in the form of short stories, and include: Transforming MPEG-21 BSDL; Validation of SOAP Digital Signatures; Transformation of the RDF Dump of the Open Directory; Transformations on a Cell Phone; XSL FO Multiple Extraction/Processing; EFT/EDI Transformation.

Tasks are examples of relatively simple transformations whose definitions in XSLT 2.0 are not easy, straightforward or even possible. Some of these tasks are difficult solely because one or more of the input or output XML documents is so large that the entire document cannot be held in memory. Other difficulties are related to merging and forking documents, restricted capabilities to iterate and the lack of common constructs (dynamic evaluation of expressions, try/catch). The transformation task illustrating troubles with huge XML documents can be defined in XSLT 2.0; the processor can even recognize that there is no need to keep the entire document in memory and can run the transformation in a memory-efficient way in some cases. But there is no guarantee of this behavior. New facilities suggested for XSLT 2.1 aim to guarantee that a transformation is processed in a streaming manner. Enumerated tasks include Splitting Flat Data; Splitting Nested Data; Joining; Concatenation; Adding Children; Renaming and Counting Nested Elements; Renaming and Counting Nested Elements and Counting Other Elements; Filtering According to Attribute; Filtering According to Child; Histogram; Hierarchical to Flat; Flat to Hierarchical; CSV Result; Local Sorting; Resolving References; Multiple Extraction/Processing; Grouping; Iterations; Making Explicit Sections; Merging Sorted Sequences.
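
For readers unfamiliar with the streaming idea, the following is a minimal sketch in Python rather than XSLT. It is not taken from the Working Draft and does not show any XSLT 2.1 syntax; it only illustrates a "Filtering According to Attribute" style task handled one element at a time instead of loading the whole document first. The function name, element names, and sample data are hypothetical.

```python
import io
import xml.etree.ElementTree as ET


def count_matching(source, tag, attr, value):
    """Count <tag> elements whose attribute `attr` equals `value`.

    The input is read incrementally with iterparse, so elements are
    inspected as soon as their end tag has been seen.
    """
    count = 0
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == tag:
            if elem.get(attr) == value:
                count += 1
            # Release the element's text, attributes and children.
            # The emptied element itself stays attached to its parent.
            elem.clear()
    return count


# Hypothetical sample data standing in for a very large document.
sample = io.BytesIO(
    b"<orders>"
    b"<order status='open'/><order status='closed'/><order status='open'/>"
    b"</orders>"
)
print(count_matching(sample, "order", "status", "open"))  # prints 2
```

The point of the comparison is the one made above: a general-purpose processor may or may not run a given transformation this economically, whereas the proposed XSLT 2.1 facilities aim to make the streaming behavior something the stylesheet author can rely on.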

The Ten Most Common XSLT Programming Mistakes
Michael Kay, Blog

"In response to a user recently, I told him he had fallen into the most common elephant trap for XSLT users. Rather than being annoyed, which I half expected, he thanked me and asked me if I could tell him what the next most common elephant traps were. Although some of us have been helping users avoid these traps for many years, I don't recall seeing a list of them, so I thought I would spend half an hour compiling my own list...

(1) Matching elements in the default namespace... If the source document contains a default namespace declaration 'xmlns="something"', then every time you refer to an element name in an XPath expression or match pattern, you have to make it clear you are talking about names in that namespace. (2) Using relative paths: 'xsl:apply-templates' and 'xsl:for-each' set the context node; within the 'loop', paths should be written to start from this context node... (3) Variables hold values, not fragments of expression syntax... Some people imagine that a variable reference '$x' is like a macro, expanded into the syntax of an XPath expression by textual substitution—rather like variables in shell script languages. It isn't: you can only use a variable where you could use a value...

(4) Template rules and 'xsl:apply-templates' are not an advanced feature to be used only by advanced users. They are the most basic fundamental construct in the XSLT language. Don't keep putting off the day when you start to use them. If you aren't using them, you are making your life unnecessarily difficult... (5) XSLT takes a tree as input, and produces a tree as output. Failure to understand this accounts for many of the frustrations beginners have with XSLT. XSLT can't process things that aren't represented in the tree produced by the XML parser (CDATA sections, entity references, the XML declaration) and it can't generate these things in the output either... (6) Namespaces are difficult. There are no easy answers to getting them right: this probably needs another article of its own; the key is to understand the data model for namespaces... (7) Don't use disable-output-escaping: Some people use it as magic fairy dust; they don't know what it does, but they hope it might make things work better... This attribute is for experts only, and experts will only use it as an absolute last resort...

(8) The 'xsl:copy-of' instruction creates an exact copy of a source tree, namespaces and all. If you want to copy a tree with changes, then you can't use 'xsl:copy-of'. Instead, use the identity-template coding pattern... (9) Don't use <xsl:variable name="x"><xsl:value-of select="y"/></xsl:variable>. Instead use <xsl:variable name="x" select="y"/>... The latter is shorter to write, and much more efficient to execute, and in many cases it's correct where the former is incorrect. (10) When you need to search for data, use keys. As with template rules, don't put off learning how to use keys or dismiss them as an advanced feature...."
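
The first trap in the list, matching elements in a default namespace, is not specific to XSLT. As a rough illustration only, the sketch below reproduces it with the path support in Python's standard `xml.etree.ElementTree` module; the sample document and the namespace URI `http://example.com/ns` are made up for the example and do not come from the blog post.

```python
import xml.etree.ElementTree as ET

# A source document with a default namespace declaration.
doc = ET.fromstring(
    '<catalog xmlns="http://example.com/ns">'
    "<book>XSLT</book><book>XPath</book>"
    "</catalog>"
)

# Unqualified name: selects nothing, because every element in the
# document belongs to the declared namespace.
unqualified = doc.findall("book")

# Bind a prefix to the same namespace URI and use it in the path.
ns = {"c": "http://example.com/ns"}
qualified = doc.findall("c:book", ns)

print(len(unqualified), len(qualified))  # prints 0 2
```

In a stylesheet the usual remedy is the same in spirit: declare a prefix for the source namespace and use it in match patterns and path expressions.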

See also: the Saxonica home page

Last Call Review: Mathematical Markup Language (MathML) Version 3.0
David Carlisle, Patrick Ion, Robert Miner (eds), W3C Technical Report

The W3C Math Working Group has published a Last Call Working Draft for Mathematical Markup Language (MathML) Version 3.0. This document was produced as part of the W3C Math Activity. The Last Call period starts on 10-June-2010 and ends on 01-July-2010. The Math WG hopes that, following the clear disposition of any comments this draft may provoke, it will put forward the MathML 3.0 specification for Proposed Recommendation status. This Last Call Public Working Draft specifies a final version of the Mathematical Markup Language (MathML) Version 3.0 which has been under active development. It follows the immediately previous Candidate Recommendation draft dated 15-December-2009. During the Candidate Recommendation review period of testing and implementation, the Math Working Group, with feedback from outside, came up with enough corrections of typos and clarifications, and a reformulation of an algorithm, that there is an impact, albeit small, on the normative aspects of the specification. So the Math Working Group puts forward for active review this Second Last Call draft. The Working Group believes it has completed its work on revising MathML with this document.

Mathematical Markup Language (MathML) Version 3.0 is an XML application for describing mathematical notation and capturing both its structure and content. The goal of MathML is to enable mathematics to be served, received, and processed on the World Wide Web, just as HTML has enabled this functionality for text. MathML can be used to encode both mathematical notation and mathematical content. About thirty-eight of the MathML tags describe abstract notational structures, while about another one hundred and seventy provide a way of unambiguously specifying the intended meaning of an expression.

This specification of the markup language MathML is intended primarily for a readership consisting of those who will be developing or implementing renderers or editors using it, or software that will communicate using MathML as a protocol for input or output. It is not a User's Guide but rather a reference document.

Additional chapters discuss how the MathML content and presentation elements interact, and how MathML renderers might be implemented and should interact with browsers. Finally, this document addresses the issue of special characters used for mathematics, their handling in MathML, their presence in Unicode, and their relation to fonts. While MathML is human-readable, authors typically will use equation editors, conversion programs, and other specialized software tools to generate MathML. Several versions of such MathML tools exist, both freely available software and commercial products, and more are under development...

IESG Approves Common YANG Data Types Specification as Proposed Standard
Juergen Schoenwaelder (ed), IETF Approved Specification

The Internet Engineering Steering Group (IESG) has announced the approval of Common YANG Data Types as an IETF Proposed Standard. The document was produced by members of the IETF NETCONF Data Modeling Language Working Group. Consensus was reached among all interested parties before requesting the publication of this document. There are multiple independent implementations of YANG today, both commercial and freely-available code and verification tools. David Partain is the IETF document shepherd; Dan Romascanu is the responsible Area Director.

The specification "introduces a collection of common data types to be used with the YANG data modeling language, derived from the built-in YANG data types. The definitions are organized in several YANG modules. The 'ietf-yang-types' module contains generally useful data types. The 'ietf-inet-types' module contains definitions that are relevant for the Internet protocol suite. The derived types are generally designed to be applicable for modeling all areas of management information..."

Additionally, on June 09, 2010, the IESG announced the publication of the companion core specification as an IETF Proposed Standard: YANG: A Data Modeling Language for the Network Configuration Protocol (NETCONF). That document describes the syntax and semantics of the YANG language, how the data model defined in a YANG module is represented in the Extensible Markup Language (XML), and how NETCONF operations are used to manipulate the data. YANG is a data modeling language used to model configuration and state data manipulated by the Network Configuration Protocol (NETCONF), NETCONF remote procedure calls, and NETCONF notifications. All YANG definitions are specified within a module that is bound to a particular XML Namespace, which is a globally unique URI. A NETCONF client or server uses the namespace during XML encoding of data...

YANG models the hierarchical organization of data as a tree in which each node has a name, and either a value or a set of child nodes. YANG provides clear and concise descriptions of the nodes, as well as the interaction between those nodes. YANG structures data models into modules and submodules. A module can import data from other external modules, and include data from submodules. The hierarchy can be augmented, allowing one module to add data nodes to the hierarchy defined in another module. This augmentation can be conditional, with new nodes appearing only if certain conditions are met. YANG models can describe constraints to be enforced on the data, restricting the appearance or value of nodes based on the presence or value of other nodes in the hierarchy. These constraints are enforceable by either the client or the server, and valid content MUST abide by them... YANG modules can be translated into an equivalent XML syntax called YANG Independent Notation (YIN), allowing applications using XML parsers and XSLT scripts to operate on the models. The conversion from YANG to YIN is lossless, so content in YIN can be round-tripped back into YANG...
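
The tree model described above (each node has a name and either a value or a set of child nodes, encoded in XML under a module's namespace) can be pictured with a small Python sketch. This is not YANG tooling and does not follow the full NETCONF XML encoding rules; the `Node` class, the `to_xml` helper, the node names, and the namespace URI `urn:example:interfaces` are all hypothetical, chosen only to make the shape of the data concrete.

```python
from dataclasses import dataclass, field
from typing import List, Optional
import xml.etree.ElementTree as ET


@dataclass
class Node:
    """A named node holding either a value or child nodes (illustrative only)."""
    name: str
    value: Optional[str] = None
    children: List["Node"] = field(default_factory=list)

    def __post_init__(self):
        # Enforce the "either a value or a set of child nodes" rule.
        if self.value is not None and self.children:
            raise ValueError(f"{self.name}: a node holds a value or children, not both")


def to_xml(node, namespace):
    """Encode the tree as XML elements qualified by the given namespace URI."""
    elem = ET.Element(f"{{{namespace}}}{node.name}")
    if node.value is not None:
        elem.text = node.value
    for child in node.children:
        elem.append(to_xml(child, namespace))
    return elem


# Hypothetical data: one interior node with two value-bearing children.
interface = Node("interface", children=[Node("name", "eth0"), Node("mtu", "1500")])
root = to_xml(interface, "urn:example:interfaces")
print(ET.tostring(root, encoding="unicode"))
```

Because the result is ordinary namespaced XML, generic XML parsers and XSLT scripts can operate on it, which is the same property the text attributes to the YIN form of a module.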

Toward a Saner Standards Process
Simon St. Laurent, O'Reilly Radar

"[...] Many standards processes, though not as many as I'd like, actually are culling and cleaning prior work. XML was culling and cleaning SGML. XSL was culling and cleaning DSSSL. XLink was culling and cleaning HyTime. Both of those processes, though imperfect, succeeded in producing something smaller and more usable. In the case of XML, that smaller and more usable transformed the way computing works. In the case of XLink, they produced a spec, but it's rarely used. XSL wound up somewhere in the middle.

There's a political problem here, however. While we need software developers to experiment in real code, letting us figure out which things work in reality, those same developers generally want to maximize the return on their work. Often that means the reason they would invest in a standards process is to steer it to do what they want...

My proposal combines the necessary role of software developers with a standardization process run by the direct consumers of that software, not its creators. That means two phases of development: (1) Invention: A very loosely-directed phase which opens with a call for proposals, possibly a meeting that generates a loose description of the work to be done. Developers can band together and form alliances to build work that answers to that description. Hopefully, multiple groups will take up the challenge, producing alternatives for exploration. (2) Selection: A formal group of customers - customers who don't work for the implementers and inventors - evaluates the results of the development phase to figure out what pieces work most easily. They may be able to standardize in a single round, or they may have to select some parts while leaving others open for further development and later standardization...

There are pieces of this idea already in the works. The early XML crowd was definitely built around people whose livings depended on using markup rather than creating software. In the CSS world, vendors experiment pretty freely with new possibilities as clearly-marked extensions, allowing developers to try them out and determine how well they work before committing to their broad use. The W3C makes a point of trying to include companies that use their specifications as well as companies that build software around their specifications. Boeing has been a canonical example of that in the XML world..."

Cisco, With Newly Merged Tandberg, Pushes Telepresence Standard
Stephen Lawson, InfoWorld

"Cisco and Tandberg made their big post-merger entrance at the InfoComm conference in Las Vegas, promoting an interoperability protocol that will come on a product in July 2010 and introducing some other new products. The $3.4 billion merger that closed in April 2010 brought together Cisco's fast-growing TelePresence Meeting System business with Tandberg's established lineup of lower end products. It also merged two of the biggest players in video collaboration, presumably carving out a path toward greater interoperability among many of the systems already installed in enterprises...

Cisco is pushing for even broader compatibility across the industry with the Telepresence Interoperability Protocol (TIP), which in July 2010 will ship in a product for the first time. Cisco developed TIP before the Tandberg acquisition closed and had already begun licensing it free to other vendors, among them Tandberg. Now it is delivering the protocol on its Tandberg TelePresence Server...

TIP is designed to make multiscreen high-definition videoconferencing platforms work together to the point that they know on which screen to place each incoming video stream. It's the first protocol for doing this among multiple vendors' systems... TIP can also make the multiple streams used in such sessions appear as one stream so they can better traverse security mechanisms such as firewalls and session border controllers, without being broken up...

Cisco appears to be serious about making TIP an industrywide standard: In July, the company will make it an open-source project, and by August 2010, it will submit TIP to a standards body. It has not yet chosen the body, but when that entity crafts a standard out of TIP, Cisco will adopt it..."

Why Bill Gates Is Embracing Clean Technology
Patrick Thibodeau, ComputerWorld

"It should not be a surprise that tech industry titans, including Microsoft's chairman and former CEO, Bill Gates, are pushing the U.S. to dramatically increase spending for clean energy technology. The IT links to clean technology get stronger by the day, and if the push in Washington by Gates and venture capitalist John Doerr to get the federal government to triple its research and development spending succeeds, Silicon Valley is certain to benefit...

[According to Gene Wang, CEO of People Power, his] 18-month-old company has developed what it calls an Open Source Home Area Network (OSHAN) that connects household appliances and other energy-using devices to a Web-based portal for energy tracking. It has a kit, available for $150, called SuRF, for Sensor Ultra-Radio Frequency, that developers can use to create wireless energy sensors...

Wang's company, based in Palo Alto, Calif., is funded through venture capital but also with federal help, through a U.S. Small Business Innovation Research-Small Business Technology Transfer (SBIR-STTR) grant from the U.S. Department of Energy...

John Doerr, Bill Gates and other business leaders were due to meet Thursday with President Barack Obama. Wang said China, in particular, is focusing billions of dollars on green technologies... the U.S. also has an obligation to the world to adopt clean tech, because of its huge energy consumption on a per-capita basis...the U.S. has been investing heavily in smart grid and smart meter technology, but Wang said the focus is too narrow: 'Smart meters actually remove jobs; smart meters really benefit the utilities'..."

