Cover Pages: XML Daily Newslink: Wednesday, 07 May 2008

A Cover Pages Publication http://xml.coverpages.org/
Provided by OASIS and Sponsor Members
Edited by Robin Cover

This issue of XML Daily Newslink is sponsored by:
IBM Corporation http://www.ibm.com

Headlines

Full Validation of Atom Feeds Containing Extensions
IBM Launches Its Own Data Center Alliance
Canonical XML Version 1.1 Released as a W3C Recommendation
New Release: Apache Axis2 Java and Apache Axis2/C Version 1.4
Solution Deployment Descriptor (SDD), Part 1: An Emerging Standard for Deployment Artifacts
Java Platform To Get Modularity, OSGi Support
Sun Launches OpenSolaris
Google: Unicode Conquers ASCII on the Web
Microsoft DAISY XML Add-In and DAISY Pipeline Support Accessibility
JavaOne: Sun Rolls Out JavaFX
Don't Show Me Problems Show Me Answers, And Don't Show Me Them Either!

Full Validation of Atom Feeds Containing Extensions
Makoto Murata and Hisashi Miyashita, Technical Report

MURATA Makoto writes: "I wrote a note about full validation of Atom feeds containing extensions such as OpenSearch and Google Calendar. Hisashi Miyashita, my co-author, implemented an NVDL validator for such validation." The RELAX NG schema (hereafter atom.rnc) in RFC 4287 (The Atom Syndication Format) does not provide full validation of atom feeds containing extensions. Rather, the schema focuses on top-level constructs of atom feeds; it skips extension elements and attributes, even when extension elements further contain constructs of atom feeds. Some specifications for atom extensions (e.g., RFC 4685, Atom Threading Extensions) provide schemas for extension elements and attributes. These extension schemas focus on extension elements and attributes, and are typically written in RELAX NG. However, such extension schemas are not referenced from atom.rnc. As a result, these schemas do not provide full validation of atom feeds containing extensions. They are useful for documentation, but they are not usable for validating atom feeds. One might wonder whether atom.rnc and extension schemas can be combined to form a single RELAX NG schema against which atom feeds containing extensions are fully validated. To the best of our knowledge, our earlier work is the only example of such combined schemas. We combined a variation of atom.rnc and three schemas for atom extensions, thereby successfully providing full validation. However, we do not believe that this all-in-one approach provides a reliable basis for full validation of atom and its extensions. The all-in-one approach requires that (1) schema authors understand schema customization techniques (e.g., the combine feature of RELAX NG) very well, (2) they avoid pitfalls caused by wildcards, and (3) they understand customization points of all schemas to be combined.
In this document, we advocate the use of Namespace-based Validation Dispatching Language (ISO/IEC 19757-4) for full validation of atom feeds containing extensions. Schema authors for atom extensions first create schemas dedicated to the extensions. They then create NVDL scripts for combining these schemas and atom.rnc. Controlled by NVDL scripts, the NVDL engine decomposes atom feeds containing extension elements or attributes into (1) extension-free atom and (2) extensions so that (1) and (2) are validated separately. As an example, an NVDL script for Google Calendar is presented. This NVDL script reveals that embedded atom entries in Google Calendar XML documents have validation errors.
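
The decomposition step described above can be illustrated with a small Python sketch. This is not NVDL and not the authors' validator: it only shows the idea of separating extension-free Atom from extension subtrees by namespace, so that each part could then be handed to its own schema. The sample feed, the function names, and the choice to ignore extension attributes and Atom content nested inside extensions are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"

# Hypothetical sample feed with one Google Data extension element.
FEED = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:gd="http://schemas.google.com/g/2005">
  <title>Example calendar</title>
  <entry>
    <title>Meeting</title>
    <gd:when startTime="2008-05-07T10:00:00Z"/>
  </entry>
</feed>"""


def namespace_of(tag):
    """Return the namespace URI of an ElementTree tag such as '{uri}local'."""
    return tag[1:].split("}", 1)[0] if tag.startswith("{") else ""


def split_by_namespace(element, extensions):
    """Detach non-Atom subtrees from `element`, collecting them in `extensions`."""
    for child in list(element):
        if namespace_of(child.tag) == ATOM_NS:
            split_by_namespace(child, extensions)
        else:
            element.remove(child)
            extensions.append(child)


root = ET.fromstring(FEED)
extensions = []
split_by_namespace(root, extensions)

# `root` now holds extension-free Atom; `extensions` holds the detached subtrees.
atom_only = ET.tostring(root, encoding="unicode")
extension_tags = [ext.tag for ext in extensions]
print(extension_tags)
```

A real NVDL engine does considerably more, including dispatching each fragment to the schema named in the NVDL script and handling Atom constructs embedded inside extension elements.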

See also: NVDL references

IBM Launches Its Own Data Center Alliance
Jacqueline Emigh, BetaNews

In a world already populated by the Green Grid Alliance and other industry groups oriented to energy efficiency, IBM has just rolled out a data center alliance with some similar interests. As explained in an interview, Rich Lechner, VP of IBM's Enterprise Systems, said that his company's new alliance program for enterprise data centers will work hand-in-hand with other industry groups, including Green Grid, DMTF (Distributed Management Task Force), and SNIA (Storage Networking Industry Association). "IBM was itself a founding member of the Green Grid Alliance," Lechner noted. IBM also belongs to DMTF and SNIA, and many of the members of IBM's own new data center alliance are affiliated with one or more of the other groups. The Green Grid Alliance, which also numbers Sun and Microsoft among its more than 150 members, is using the DMTF's WBEM (Web-based Enterprise Management) standard for distributed computing as the basis for interfaces it is creating as part of its own standardized technology for managing energy use across multiple vendors' platforms. But Green Grid officials have also been careful to point out that, with their own special focus on green computing, they view their work as separate and distinct from that of either the DMTF or SNIA. Lechner acknowledged that the main reason behind IBM's own new alliance is to ensure interoperability, especially between IBM and major third-party data center partners such as Sun, Novell, Red Hat, VMware, Juniper Networks, Citrix, Emulex, and Eaton: "We'll identify key standards, define implementations, and integrate them into our point of view and product plans... vendors that support energy efficiency standards can enable end-to-end management and monitoring of power and cooling of hardware in the data center.
Such standards pave the way to product interoperability for greater data center efficiency by setting caps on energy use, shifting resources to meet business requirements, or adjusting workloads to avoid higher billing rates."

See also: the IBM announcement

Canonical XML Version 1.1 Released as a W3C Recommendation
John Boyer and Glenn Marcy (eds), W3C Technical Report

W3C announced that the XML Core Working Group has published "Canonical XML Version 1.1" as a W3C Recommendation. The term 'canonical XML' refers to XML that is in canonical form. The XML canonicalization method is the algorithm defined by this specification that generates the canonical form of a given XML document or document subset. The term XML canonicalization refers to the process of applying the XML canonicalization method to an XML document or document subset. Canonical XML Version 1.1 is a revision to Canonical XML Version 1.0 to address issues related to inheritance of attributes in the XML namespace when canonicalizing document subsets, including the requirement not to inherit 'xml:id', and to treat 'xml:base' URI path processing properly. Any XML document is part of a set of XML documents that are logically equivalent within an application context, but which vary in physical representation based on syntactic changes permitted by XML 1.0 and Namespaces in XML 1.0. This specification describes a method for generating a physical representation, the canonical form, of an XML document that accounts for the permissible changes. Except for limitations regarding a few unusual cases, if two documents have the same canonical form, then the two documents are logically equivalent within the given application context. Note that two documents may have differing canonical forms yet still be equivalent in a given context based on application-specific equivalence rules for which no generalized XML specification could account. Canonical XML Version 1.1 is applicable to XML 1.0 and defined in terms of the XPath 1.0 data model. It is not defined for XML 1.1. A companion "Implementation Report C14N 1.1" has been published to demonstrate fulfillment of the Candidate Recommendation Exit Criteria for Canonical XML 1.1. 
Test documents have been developed with a range of usages of attributes in the XML namespace, and correct and compatible results shown for these tests by at least two implementations.
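
As a rough illustration of what "same canonical form" means, the following Python sketch canonicalizes two physically different but logically equivalent documents. One caveat matters here: the standard-library function `xml.etree.ElementTree.canonicalize` (Python 3.8 and later) implements C14N 2.0, not the Canonical XML 1.1 method described above, so this demonstrates only the general idea. The sample documents are made up.

```python
from xml.etree.ElementTree import canonicalize

# Two physical representations of the same logical content:
# different attribute order, quoting, whitespace inside the tag,
# and empty-element syntax versus a start/end tag pair.
doc_a = '<item b="2" a="1"/>'
doc_b = "<item   a = '1'\n   b='2'></item>"

# NOTE: canonicalize() applies C14N 2.0, used here as a stand-in
# for the Canonical XML 1.1 algorithm discussed in the text.
canon_a = canonicalize(doc_a)
canon_b = canonicalize(doc_b)

print(canon_a)
print(canon_a == canon_b)
```

Comparing the canonical strings (or hashes of them) is the usual way such a form is put to work, for example when signing XML.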

New Release: Apache Axis2 Java and Apache Axis2/C Version 1.4
Deepal Jayasinghe, Blog

Members of the Apache Software Foundation Axis2 development team announced the release of Apache Axis2 1.4 and Apache Axis2/C version 1.4.0. Based on the Axis2 architecture, there are two implementations of the Apache Axis2 Web services engine: Apache Axis2/Java and Apache Axis2/C. "Apache Axis2 is the core engine for Web services. It is a complete re-design and re-write of the widely used Apache Axis SOAP stack, built on the lessons learnt from Apache Axis. Just over 8 months since the 1.3 release, we are very proud to announce the release of Apache Axis2 version 1.4. Apache Axis2 is a complete re-design and re-write of the widely used Apache Axis engine and is a more efficient, more scalable, more modular and more XML-oriented Web services framework. It is carefully designed to support the easy addition of plug-in 'modules' that extend its functionality for features such as security and reliability. Modules supporting WS-Security/Secure-Conversation (Apache Rampart), WS-Trust (Apache Rahas), WS-Reliable Messaging (Apache Sandesha) and WS-Eventing (Apache Savan) will be available soon after the Apache Axis2 1.4 release. Apache Axis2 Version 1.4 comes with performance improvements and a number of bug fixes over the 1.3 release. The Axis2 Maven Main Repository has the latest jars as well."

Solution Deployment Descriptor (SDD), Part 1: An Emerging Standard for Deployment Artifacts
Julia McCarthy and Brent Miller, IBM developerWorks

The Solution Deployment Descriptor (SDD) is an emerging standard from the Organization for the Advancement of Structured Information Standards (OASIS). SDD defines the format for metadata about the requirements, inputs, and results of the deployment artifacts that are processed to deploy software resources. SDD also defines the format for metadata about the aggregation of multiple deployment artifacts into a solution. This metadata can be used to guide deployment as well as a variety of other deployment-related activities, such as deployment planning. Fundamentally, SDD provides a standard way to encode and externalize deployment information. One Solution Deployment Descriptor consists of two paired XML descriptor files, a package descriptor, and a deployment descriptor, that contain deployment metadata; that is, descriptive information about deployment artifacts and the aggregation of those artifacts. The package descriptor describes the identity and contents of the deployment package. The deployment package consists of a set of artifacts used to perform deployment lifecycle operations on a group of related resources that make up a solution. Artifact processing results in the deployment of software. The deployment descriptor describes the inputs, requirements, variability, and results of the deployment artifacts. This article presents the concepts behind SDD and describes what SDD is. It discusses SDD and provides a high-level overview of the support provided by the SDD for the expression of deployment-related knowledge through standardized, externalized metadata. The article is written for those who want to understand where the SDD fits into the deployment world and those who intend to learn the details of the SDD standard, especially those who are involved in developing, integrating, or deploying software. A general knowledge of the current state of software deployment technology, especially in complex environments, is assumed. 
Future articles will describe in more detail how to use the SDD. In these articles you will explore the elements of the SDD that realize the concepts described in this introductory article.

Java Platform To Get Modularity, OSGi Support
Paul Krill, InfoWorld

Upcoming versions of the Java platform will be fitted with capabilities such as flexibility, OSGi support, and modularity, Sun Microsystems officials announced at the JavaOne conference in San Francisco. Road maps were detailed for Java Platform, Enterprise Edition (Java EE) 6 and Java Platform, Standard Edition (Java SE) 7. Java SE serves as a base Java platform, with the Enterprise version adding enterprise-level capabilities. With Java EE 6, Sun seeks to increase flexibility in implementing the platform. With version 6, profiles will be created based on specific needs, such as a Web profile focused on Web developers, said Roberto Chinnici, Java EE platform lead at Sun. The Web profile is not fully defined yet, but will feature technologies that appear in the vast majority of Web applications. Other profiles are expected, such as a telecommunications profile that features SIP (Session Initiation Protocol) services. Profiles can be created by filing a Java Specification Request as part of the Java Community Process. Scripting languages will be made first-class citizens on the Java platform as well. Web development will be made easier through annotations across Web APIs. Developers should see a reduced need to edit web.xml descriptors. Third-party libraries will self-register, removing a common source of errors for developers. Another feature of version 6 is an API for REST-ful (Representational State Transfer) Web services. Also planned is a pruning process by which certain parts of the Java platform can be made optional, according to Chinnici: "The typical candidate is those technologies that have been superseded effectively by new ones... for EE 6, the theme is what I like to call 'rightsizing,' which essentially means making the platform the right size for you."

Sun Launches OpenSolaris
Henry Kingman, Linux-Watch

Sun Microsystems officially launched OpenSolaris (OS) today. Available pre-built as a combo live/install CD, the initial "2008.05" binary distribution download of the OS features a GNOME user interface, highly fault tolerant ZFS root filesystem, IPS package management, and "DTrace" tuning tools. The news came at Sun's CommunityOne warmup for its big annual JavaOne show this week. Debian creator Ian Murdock, now vice president for developer and community marketing at Sun, delivered the opening keynote. At the show, Sun also announced that OpenSolaris computing services will be offered by Amazon via its EC2 (Elastic Compute Cloud) service suite, which will also continue to offer MySQL following Sun's acquisition of MySQL A.B., one of the companies that offers commercial licenses for the open source database. Features include: (1) ZFS used for root filesystem: A fault-tolerant filesystem claimed to support multiple simultaneous drive failures. ZFS is a new kind of file system that provides simple administration, transactional semantics, end-to-end data integrity, and immense scalability. ZFS is not an incremental improvement to existing technology; it is a fundamentally new approach to data management. ZFS presents a pooled storage model that completely eliminates the concept of volumes and the associated problems of partitions, provisioning, wasted bandwidth and stranded storage. Thousands of file systems can draw from a common storage pool, each one consuming only as much space as it actually needs. The combined I/O bandwidth of all devices in the pool is available to all filesystems at all times. (2) IPS (Image Packaging System) package management: pkg(5), the image packaging system, is an attempt to design and implement a software delivery system with interaction with a network repository as its primary design goal.
Other key ideas are: safe execution for zones and other installation contexts, use of ZFS for efficiency and rollback, preventing the introduction of incorrect or incomplete packages, and efficient use of bandwidth. (3) DTrace: The binaries in the OpenSolaris release are built with support for this profiling/analyzing tool, which is also included in the distribution. OpenSolaris is licensed under Sun's CDDL license, accepted as an "open source" license by the Open Source Initiative (OSI), despite incompatibilities with many other open source licenses, such as the GPL. Sun also licensed Java under the CDDL for many years, before offering its Java runtimes and development kits under the GPL more recently. Said Stephen Lau, OpenSolaris Governing Board member: "OpenSolaris provides an ideal environment for students, developers and early adopters looking to learn and gain experience with innovative technologies like ZFS, Zones and DTrace. And yes, it uses bash by default."

See also: the announcement

Google: Unicode Conquers ASCII on the Web
Stephen Shankland, CNET News.com

"I picture it happening this way. The Roman alphabet is on the run, pursued by a much larger army of Arabic characters with long scimitar-like ligatures, Chinese characters that look like throwing stars, and European peasant letters bristling with umlauts, cedillas, and tildes. Unicode has overtaken ASCII as the most popular character encoding scheme on the World Wide Web, [according to Mark Davis, Google's senior international software architect]. Also vanquished at almost exactly the same time was the Western European encoding. Unicode is a character encoding standard that gracefully accommodates dozens of languages as well as Roman characters with diacritical marks. ASCII, a tried-and-true, decades-old standard, is limited to 128 or 256 characters and has a hard time extending beyond the range of a century-old Remington typewriter. Google's a fan of Unicode Web sites. When it processes data from Web sites, it converts it into Unicode first if it's not already there. That improves international search abilities." Davis (blog): "Google has just begun supporting Unicode 5.1, less than one month after it was released. It's now available in search, so people speaking languages such as Malayalam can now search for words containing the new characters in Unicode 5.1. Web pages can use a variety of different character encodings, like ASCII, Latin-1, or Windows 1252, or Unicode. Most encodings can only represent a few languages, but Unicode will handle anything from Chinese to French to Arabic. We have long used Unicode as the internal format for all the text we search: any other encoding is first converted to Unicode for processing..."
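
The "convert everything to Unicode first" step can be sketched in a few lines of Python. This is only an illustration of the general technique, not a description of Google's pipeline: the sample byte strings, the `to_unicode` helper, and the assumption that each page's declared encoding is known and correct are all hypothetical.

```python
# Hypothetical page fragments, each paired with its declared encoding.
raw_pages = [
    (b"plain ASCII text", "ascii"),
    (b"caf\xe9", "latin-1"),
    (b"\x93quoted\x94", "windows-1252"),
    ("\u0d2e\u0d32\u0d2f\u0d3e\u0d33\u0d02".encode("utf-8"), "utf-8"),  # "Malayalam" in Malayalam script
]


def to_unicode(data: bytes, declared_encoding: str) -> str:
    """Decode bytes into a Python str (a sequence of Unicode code points)."""
    return data.decode(declared_encoding)


# After this step every page is in one internal format, whatever its source encoding.
texts = [to_unicode(data, enc) for data, enc in raw_pages]

# ASCII, by contrast, cannot represent text outside its 128 characters.
try:
    texts[1].encode("ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

print(texts)
print(ascii_failed)
```

In practice the declared encoding is often missing or wrong, so real systems add detection and error handling on top of this step.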

See also: XML and Unicode

Microsoft DAISY XML Add-In and DAISY Pipeline Support Accessibility
Staff, Microsoft Announcement

Microsoft Corp. has joined with industry and advocacy group leaders worldwide to launch new software that will make it easier for anyone to create documents and content that will be accessible for blind and print-disabled individuals. The new 'Save as DAISY XML' add-in, designed for Microsoft Office Word 2007, Word 2003 and Word XP, allows users to save Open XML-based text files into DAISY XML, the foundation of the globally accepted DAISY Standard for reading and publishing navigable multimedia content. The add-in was created through an open source project with Microsoft, Sonata Software Ltd. and the Digital Accessible Information SYstem (DAISY) Consortium and can be downloaded by Microsoft Office Word users for free. Also released today is the newest version of the DAISY Pipeline, a free downloadable transformation suite that supports the seamless conversion of DAISY XML into DAISY Digital Talking Book (DTB) format. Together these technologies provide a comprehensive solution for converting text documents into accessible formats for people with print disabilities. Information about other technologies that can convert DAISY XML into DAISY DTB format and other products that support the DAISY standard is available on the DAISY Web site. This new 'Save as DAISY XML' tool presents the opportunity for organizations and independent software vendors to consider ways in which the technology may be employed to meet the needs of those not yet served by text-only or audio-only formats. Corporations such as insurance agencies, healthcare providers and companies that publish training manuals require a method to deliver fully accessible documents to their customers and employees with different needs. The open source nature of the Open XML to DAISY XML translation project enables technologists to utilize the source code and other resources for their own applications. 
As Open XML adoption continues to expand across the software industry for use on various platforms, including Linux, Windows, Mac OS and the Palm OS, solution providers interested in creating their own Open XML to DAISY XML translators can reference information available through the SourceForge open source project site.

JavaOne: Sun Rolls Out JavaFX
Dan Farber, CNET News.com

Following a flurry of T-shirts catapulted by Java creator James Gosling and a hot dance troupe performance, 75 hours of JavaOne got under way this week. Sun Microsystems' software chief, Rich Green, took the stage to talk about consumers... Enterprises have to recognize that the enterprise moat barriers are coming down, he added, with consumers driving innovation. As part of Sun's effort to enable consumers to innovate, Green introduced JavaFX, a rich Internet application environment set to compete with Adobe Systems' AIR and Microsoft's Silverlight. He showed a JavaFX application with Flickr and Twitter feeds running in Facebook within the browser, and then he dragged it out of the browser to the desktop. The same application also was shown running on a Java-enabled phone via JavaFX Mobile. Sun is hoping to tap into 2.2 billion mobile devices and the vast majority of desktop PCs that are Java-enabled. JavaFX was shown running on Google's Android mobile platform. Green noted that 85 percent of cell phones, 91 percent of desktops, and 100 percent of all Blu-ray Disc players will run JavaFX. Sun also plans to deliver JavaFX from the cloud and to gather instrumented user action data via JavaFX that goes back to developers. It could be used for advertising or to provide information to customers... According to Stewart's ZDNet blog, "JavaFX is just one part (albeit a very snazzy part) of many enhancements to the Java runtime which includes the Java Update 10 browser plugin that would enable JavaFX developers to target the browser with animations and vector art. But JavaFX is part of a larger Java ecosystem and is in some ways a lynchpin to allow developers and designers to create RIA experiences across a lot of devices. As Coté mentions, this is a lot like Adobe's Open Screen Project and I think it shows an industry trend of moving towards a more cohesive multi-dimensional platform.
Java has been down this road before so anyone counting them out isn't giving them enough credit. They have a LONG way to go especially when you look at Adobe's RIA strengths and Microsoft's very enthusiastic entry into the space. But I think JavaFX will be a breath of fresh air for people and will help in expanding the RIA footprint further."

See also: the ZDNet blog

Don't Show Me Problems Show Me Answers, And Don't Show Me Them Either!
Rick Jelliffe, O'Reilly Blog

Alex Brown is [...] trying to faithfully fulfill his normal committee responsibilities, which include checking through standards. Alex has long been involved in Data Quality issues for publishing professionally, and has been very involved in the development of ISO DSDL at SC34, which includes RELAX NG and Schematron. So what is it that Alex found about ODF [validation errors] that has caused the fuss? It is quite technical, but the gist is this, as I understand it: if a schema is not itself valid, no documents can be formally valid against it. When the invalid part of the schema is only detected at run-time when exercised by a particular instance document structure, and the document does not contain such a triggering instance, the implementation may report that the document is valid, but that is a false positive. And you may look at the schema and say 'I know what was intended, and the false positive is in fact correct against the intent of the schema' but this is lucky accident, i.e. hacking, not formal validity. The particular issue is quite interesting because it relates to an area in a W3C Schema standard where the user requirements for XSD could not be supported by the facet model used, and where XSD fudges it. OASIS RELAX NG also, to an extent, inherited this problem. The problem is with attributes of type ID in the ODF schema. Alex Brown has provided a very simple fix, which I hope gets adopted into ODF 1.2. The problem with IDs is this. XML inherits ID type attributes from SGML. They have various constraints, which include that they are XML names (tokens), that their values are unique within the document, and that an element can only have one ID attribute... Alex has found the fix for ODF, but I think RELAX NG and XSD could well have some extra clarification text (non-normative) to stop basic mistakes. If a schema, whether DTD, XSD or RELAX NG, says something is an ID, it has all the semantics of an XML ID...
Alex Brown is right that the schema has a flaw, and right to point it out and offer a fix; Rob Weir is right that it is unnecessary for this to be a static error (the positive point I would infer from his over-reacting blog), but wrong that the way to fix it is to turn off validating that constraint...
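
Two of the ID constraints mentioned above (unique values within the document, and at most one ID attribute per element) can be checked with a short Python sketch. This is neither the ODF schema nor Alex Brown's fix: the `check_ids` function, the sample document, and the idea of passing in the set of attribute names that a schema declares as type ID are assumptions for illustration, and the XML-name check on the values is omitted.

```python
import xml.etree.ElementTree as ET


def check_ids(xml_text, id_attributes):
    """Return a list of problems with ID-typed attributes.

    `id_attributes` stands in for the attribute names that some schema
    declares as type ID (hypothetical; a real validator reads the schema).
    """
    problems = []
    seen = set()
    for element in ET.fromstring(xml_text).iter():
        ids = [(name, value) for name, value in element.attrib.items()
               if name in id_attributes]
        # Constraint: an element can only have one ID attribute.
        if len(ids) > 1:
            problems.append(f"<{element.tag}> has {len(ids)} ID attributes")
        # Constraint: ID values are unique within the document.
        for _name, value in ids:
            if value in seen:
                problems.append(f"duplicate ID value {value!r}")
            seen.add(value)
    return problems


SAMPLE = '<doc><p id="a"/><p id="a"/><q id="b" ref-id="c"/></doc>'
problems = check_ids(SAMPLE, {"id", "ref-id"})
print(problems)
```

The second problem reported for the sample is the kind of conflict at issue: it arises only when one element carries two attributes that are both typed as ID, so a document that never does so passes without exercising the flaw.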

See also: the smoke test

