Cover Pages: XML Daily Newslink: Friday, 05 October 2007

A Cover Pages Publication http://xml.coverpages.org/
Provided by OASIS and Sponsor Members
Edited by Robin Cover

This issue of XML Daily Newslink is sponsored by:
EDS http://www.eds.com

Headlines

Thinking XML: Firefox 2.0 and XML
Integrity Without Confidentiality
W3C Last Call Working Draft for XHTML Role Attribute Module
Apache Abdera Version 0.3.0
Updated Draft TAG Finding: Associating Resources with Namespaces
A Practical Application of SOA
IBM Updates Its SOA-Enhanced Product Portfolio
Mathematical Markup Language (MathML) 3.0: Updated Working Draft
IE 7 Update Drops WGA Validation Requirement

Thinking XML: Firefox 2.0 and XML
Uche Ogbuji, IBM developerWorks

This article explains how the latest Firefox release updates XML processing. Web browsers are perhaps the hottest sort of software right now, given their emerging role as the new application platform. These are particularly exciting times for software development, what with the re-emergence of dynamic HTML technologies as Asynchronous JavaScript + XML (Ajax), the revival of Microsoft Internet Explorer development, and more. The relentless pace of development in the Mozilla project has since led to the release of Firefox 2.0, building on the Gecko 1.8.1 Web rendering engine. Some of the developments in Firefox 2.0 touch on XML processing. (1) Less control over Web feeds: If you host a Web feed such as RSS or Atom you might include an XSLT stylesheet processing instruction (PI) in order to turn that feed into some other representation for the user. In Firefox 1.5, the browser dutifully loads [the XSLT] and displays the results; you have to view source to see the actual XML. In Firefox 2.0, the browser ignores the stylesheet PI and uses a custom Firefox view... After considerable debate in the user community the Firefox developers decided to stand their ground, and as things stand, the behavior will be the same in future Firefox versions; the new behavior is similar to that of Internet Explorer and Apple Safari. (2) Microsummaries, also called Live Titles, are a neat new feature in Firefox 2.0 where you instruct the browser to substitute some useful content from a Web site in place of its title, particularly in bookmarks. A Web site can offer a microsummary, or the user can create one. The latter case is known as a "microsummary generator"; it requires XML and XSLT processing on the part of the user. (3) SAX: There is now a SAX parser framework for the XPCOM component system of Mozilla. This should allow people to develop extensions that process XML efficiently, if none of the other higher level processing technologies are suitable. 
XPCOM integration means you can handle SAX events with C++ or JavaScript code, or with any other language with XPCOM bindings. (4) OpenSearch: OpenSearch is an XML standard developed at the Amazon A9 incubator. It provides several XML formats and other conventions to describe and use search engines. Firefox has always had strong support for extensible search engine plug-ins, and version 2.0 introduces OpenSearch support so that search features can be extended using facilities that are also compatible with Internet Explorer and other browsers. Firefox supports OpenSearch 1.1, which is presently in beta, so it's possible that updates will be required to keep compatibility with Firefox and OpenSearch. Even more significant XML features will come in Firefox 3.0, which is in alpha testing. Expect a full release in the first half of 2008. It includes some very significant bug fixes and new features for XML processing.
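
The SAX and OpenSearch items above can be illustrated together with a minimal sketch. It uses Python's standard xml.sax module rather than the Mozilla XPCOM SAX framework the article describes, so it shows only the general event-driven style, not the Firefox API. The description document, the example.org URL template, and the handler and function names are hypothetical; the namespace URI is the one used by OpenSearch 1.1.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_namespaces

OPENSEARCH_NS = "http://a9.com/-/spec/opensearch/1.1/"

# Hypothetical description document for an imaginary search engine.
DESCRIPTION = """<?xml version="1.0" encoding="UTF-8"?>
<OpenSearchDescription xmlns="http://a9.com/-/spec/opensearch/1.1/">
  <ShortName>Example Search</ShortName>
  <Description>Search example.org</Description>
  <Url type="text/html" template="http://example.org/search?q={searchTerms}"/>
</OpenSearchDescription>
"""


class OpenSearchHandler(ContentHandler):
    """Collects the short name and URL templates as SAX events arrive."""

    def __init__(self):
        super().__init__()
        self.short_name = ""
        self.templates = []
        self._in_short_name = False

    def startElementNS(self, name, qname, attrs):
        uri, local = name
        if uri != OPENSEARCH_NS:
            return
        if local == "ShortName":
            self._in_short_name = True
        elif local == "Url":
            # Attributes without a namespace are keyed by (None, localname).
            self.templates.append(
                (attrs.get((None, "type")), attrs.get((None, "template")))
            )

    def characters(self, content):
        # Text may arrive in several chunks, so accumulate it.
        if self._in_short_name:
            self.short_name += content

    def endElementNS(self, name, qname):
        if name == (OPENSEARCH_NS, "ShortName"):
            self._in_short_name = False


def parse_description(xml_text):
    """Stream the document through a namespace-aware SAX parser."""
    parser = xml.sax.make_parser()
    parser.setFeature(feature_namespaces, True)
    handler = OpenSearchHandler()
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(xml_text.encode("utf-8")))
    return handler


result = parse_description(DESCRIPTION)
print(result.short_name, result.templates)
```

Because the handler keeps only the fields it needs, no document tree is built in memory, which is the efficiency argument usually made for SAX.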

Integrity Without Confidentiality
James Clark, James Clark's Random Thoughts Blog

People often focus on confidentiality as being the main goal of security on the Web; SSL is portrayed as something that ensures that when we send a credit card number over the web, it will be kept confidential between us and the company we're sending it to. I would argue that integrity is at least as important, if not more so. I'm thinking of integrity in a broad sense, as covering both ensuring that the recipient receives the sender's bits without modification and that the sender is who the recipient thinks it is. I would also include non-repudiation: the sender shouldn't be able to deny that they sent the bits. Consider books in the physical world. There are multiple mechanisms that allow us to trust in the integrity of the book [...] In the digital world, if we want to rely on something published on a web site, it's hard to know what to do. We can hope the web site believes in the philosophy that Cool URIs don't change; unfortunately such web sites are a minority. We can download a local copy, but that doesn't prove that the web site was the source of what we downloaded. What's needed is the ability to download and store something locally that proves that a particular entity was a valid representation of a particular resource at a particular time. SSL is based on using a handshake to create a secure channel between two endpoints. In order to provide the necessary proof, you would have to store all the data exchanged during the session. It would work much better to have something message-based, which would allow each request and response to be separately secured. Another crucial consideration is caching. Caching is what makes the web perform. SSL is the mainstay of security on the Web. Unfortunately there's the little problem that if you use SSL, then you lose the ability to cache. A key step to making caching useable with security is to decouple integrity from confidentiality. 
A shared cache isn't going to be very useful if each response is specific to a particular recipient. On the other hand there's no reason why you can't usefully cache responses that have been signed to guarantee their integrity. I think this is one area where HTTP can learn from WS-Security, which has message-based security and cleanly separates signing (which provides integrity) from encryption (which provides confidentiality). But of course WS-* doesn't have the caching capability that HTTP provides (and I think it would be pretty difficult to fix WS-* to do caching as well as HTTP does). My conclusion is that there's a real need for a cache-friendly way to sign HTTP responses. Being able to sign HTTP requests would also be useful, but that solves a different problem.
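
The point about decoupling integrity from confidentiality can be sketched as follows. This is an illustration only, not a scheme proposed in the post: the signature covers the resource URL, a timestamp, and a digest of the body, and the body itself stays in the clear so a shared cache can hand the same bytes to every client. One loud assumption: an HMAC with a shared secret stands in for the signature so that the example runs on the standard library alone. An HMAC gives neither non-repudiation nor third-party proof, so a real design would need a public-key signature. The key, URL, timestamp, and function names are all hypothetical.

```python
import hashlib
import hmac

# Assumption: a shared secret stands in for the publisher's private key.
SIGNING_KEY = b"hypothetical-publisher-key"


def sign_response(url, timestamp, body):
    """Sign the URL, the time, and a digest of the body (not the channel)."""
    digest = hashlib.sha256(body).hexdigest()
    message = "\n".join([url, timestamp, digest]).encode("utf-8")
    return hmac.new(SIGNING_KEY, message, hashlib.sha256).hexdigest()


def verify_response(url, timestamp, body, signature):
    """Recompute the signature and compare in constant time."""
    expected = sign_response(url, timestamp, body)
    return hmac.compare_digest(expected, signature)


# A toy shared cache: one signed response is reused for every client.
shared_cache = {}


def origin_fetch(url):
    body = b"<p>Terms of service, revision 7</p>"
    timestamp = "2007-10-05T12:00:00Z"
    return {
        "timestamp": timestamp,
        "body": body,
        "signature": sign_response(url, timestamp, body),
    }


def cached_fetch(url):
    if url not in shared_cache:
        shared_cache[url] = origin_fetch(url)
    return shared_cache[url]


url = "http://example.org/terms"
first = cached_fetch(url)   # goes to the origin
second = cached_fetch(url)  # served from the shared cache
print(verify_response(url, second["timestamp"], second["body"], second["signature"]))
```

Nothing in the signed response is specific to a recipient, which is what makes it cacheable; storing the body, timestamp, and signature together is also what a client would keep as a local record.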

W3C Last Call Working Draft for XHTML Role Attribute Module
Mark Birbeck, Shane McCarron, et al. (eds), W3C Technical Report

W3C announced the release of a Last Call Working Draft for "XHTML Role Attribute Module: A Module to Support Role Classification of Elements." The document was produced by the W3C XHTML 2 Working Group as part of the HTML Activity. The XHTML Role Attribute defined in the Working Draft specification allows the author to annotate XML Languages with machine-extractable semantic information about the purpose of an element. Use cases include accessibility, device adaptation, server-side processing, and complex data description. This attribute can be integrated into any markup language based upon XHTML Modularization. This module is designed to be used to help extend the scope of XHTML-family markup languages into new environments. It has been developed in conjunction with the accessibility community and other groups to make it easier to describe the semantic meaning of XHTML-family document content. XHTML Role Attribute Module is not a stand-alone document type. It is intended to be integrated into other host languages such as XHTML. A conforming XHTML Role Attribute Module document is a document that requires only the facilities described as mandatory in this specification and the facilities described as mandatory in its host language. Compact URIs: In order to allow for the scoped expression of role values, the specification uses a superset of QNames that allows the contraction of all URIs. These Compact URIs are called CURIEs. XHTML role attribute takes as its value one or more whitespace separated CURIEs. Any non-qualified value MUST be interpreted in the XHTML namespace, and MUST be taken from the list defined in this section. The attribute describes the role(s) the current element plays in the context of the document. This can be used, for example, by applications and assistive technologies to determine the purpose of an element. 
This could allow a user to make informed decisions on which actions may be taken on an element and activate the selected action in a device independent way. It could also be used as a mechanism for annotating portions of a document in a domain specific way (e.g., a legal term taxonomy)... XHTML Modularization 1.1 describes an abstract modularization of XHTML and implementations of the abstraction using XML Document Type Definitions (DTDs), and XML Schemas. This modularization provides a means for subsetting and extending XHTML, a feature needed for extending XHTML's reach onto emerging platforms. This specification is intended for use by language designers as they construct new XHTML Family Markup Languages. This specification does not define the semantics of elements and attributes, only how those elements and attributes are assembled into modules, and from those modules into markup languages. This second version of this specification includes several minor updates to provide clarifications and address errors found in the first version. It also provides an implementation using XML Schemas.
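
The CURIE handling described above (a whitespace-separated list, with non-qualified values falling back to an XHTML default) might be sketched like this. The function name and the Dublin Core prefix mapping are hypothetical, and the default vocabulary URI is an assumption about what the draft means by "the XHTML namespace". The sketch does not check non-qualified values against the draft's list of predefined roles.

```python
# Assumption: non-qualified role values resolve against this vocabulary URI.
XHTML_VOCAB = "http://www.w3.org/1999/xhtml/vocab#"


def expand_roles(role_value, prefixes):
    """Expand each CURIE in a role attribute value to a full URI."""
    expanded = []
    for token in role_value.split():  # split on any run of whitespace
        prefix, separator, reference = token.partition(":")
        if not separator:
            # Non-qualified value: interpret it in the XHTML vocabulary.
            expanded.append(XHTML_VOCAB + token)
        elif prefix in prefixes:
            # Qualified value: concatenate the mapped URI and the reference.
            expanded.append(prefixes[prefix] + reference)
        else:
            raise ValueError("undeclared CURIE prefix: " + prefix)
    return expanded


# Hypothetical prefix declarations in scope on the element.
in_scope = {"dc": "http://purl.org/dc/elements/1.1/"}
roles = expand_roles("navigation  dc:creator", in_scope)
print(roles)
```

Unlike a QName, the part after the colon is not restricted to an XML name, which is why plain string concatenation is enough here.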

See also: XHTML Modularization 1.1

Apache Abdera Version 0.3.0
Staff, Apache Incubator PMC Announcement

Members of the Apache Abdera Project have announced the release of Apache Abdera version 0.3.0, featuring an open source Atom implementation. This release updates Abdera 0.2.2, published in February 2007. Binary and source distributions are available for download; builds are available for Java 1.5 and Java 1.4.2. The goal of the Apache Abdera project is to build a functionally-complete, high-performance implementation of the IETF Atom Syndication Format (IETF RFC 4287) and Atom Publishing Protocol specifications. The Atom Publishing Protocol (APP) is an application-level protocol for publishing and editing Web resources. The protocol is based on HTTP transfer of Atom-formatted representations. The protocol supports the creation of Web Resources and provides facilities for Collections (Sets of Resources, which can be retrieved in whole or in part), for Services (Discovery and description of Collections), and for Editing (Creating, editing, and deleting Resources). The IESG has approved "The Atom Publishing Protocol" specification as a Proposed Standard. Abdera Version 0.3.0 provides significant enhancements. From 0.3.0 Release Summary: (1) Support for the Atompub final draft; (2) Refactored and simplified Server framework; (3) Refactored and simplified AbderaClient; (4) ExtensionFactory can now provide the MIME type for extension elements; (5) Improved extensibility; (6) Updated dependencies; (7) XPath support improvements; (8) Geotagging extensions; (9) Simple Sharing extensions; (10) WSSE Authentication; (11) Experimental Bidi extensions; (12) Experimental Atompub features extensions; (13) Feed paging extensions; (14) Feed license extensions; (15) XML Encryption with Diffie-Hellman key exchange; (16) Spring integration support; (17) Extensions now packaged in separate jars for modular distribution. Abdera is an effort undergoing incubation at the Apache Software Foundation (ASF), sponsored by the Apache Incubator PMC. 
Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects.
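
As an illustration of the "creating" step in the Atom Publishing Protocol summary above, the sketch below builds an Atom entry and prepares the HTTP POST that would submit it to a collection. It uses only the Python standard library, not Abdera's Java API, and it never sends the request. The collection URI, entry values, and function names are hypothetical.

```python
import urllib.request
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
COLLECTION_URI = "http://example.org/blog/entries"  # hypothetical collection

# Serialize Atom elements with a default namespace instead of a prefix.
ET.register_namespace("", ATOM_NS)


def atom(tag):
    """Return the namespace-qualified name of an Atom element."""
    return "{%s}%s" % (ATOM_NS, tag)


def build_entry(title, author, content, entry_id, updated):
    """Build an Atom entry document and return it as UTF-8 bytes."""
    entry = ET.Element(atom("entry"))
    ET.SubElement(entry, atom("title")).text = title
    ET.SubElement(entry, atom("id")).text = entry_id
    ET.SubElement(entry, atom("updated")).text = updated
    author_element = ET.SubElement(entry, atom("author"))
    ET.SubElement(author_element, atom("name")).text = author
    content_element = ET.SubElement(entry, atom("content"), {"type": "text"})
    content_element.text = content
    return ET.tostring(entry, encoding="utf-8")


def build_create_request(entry_bytes):
    """Prepare (but do not send) the POST that creates a member resource."""
    return urllib.request.Request(
        COLLECTION_URI,
        data=entry_bytes,
        method="POST",
        headers={"Content-Type": "application/atom+xml;type=entry"},
    )


entry_bytes = build_entry(
    "Atom-Powered Robots Run Amok",
    "Jane Doe",
    "Some text.",
    "urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a",
    "2007-10-05T12:00:00Z",
)
request = build_create_request(entry_bytes)
print(request.get_method(), request.full_url)
```

On success, a server implementing the protocol would typically answer with 201 Created and a Location header identifying the new member, which the client can later fetch, update, or delete.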

See also: Atom references

Updated Draft TAG Finding: Associating Resources with Namespaces
Norman Walsh and Henry S. Thompson (eds)

Members of the W3C Technical Architecture Group (TAG) have published a revised version of the Draft TAG Finding on "Associating Resources with Namespaces." The draft responds to problems some commentators had with the proposed RDF model. The document addresses the question of how ancillary information (schemas, stylesheets, documentation, etc.) can be associated with a namespace. The names in a namespace form a collection: (1) Sometimes it is a collection of element names -- DocBook and XHTML, for example; (2) sometimes it is a collection of attribute names -- XLink, for example; (3) sometimes it is a collection of functions -- XQuery 1.0 and XPath 2.0 Data Model; (4) sometimes it is a collection of properties -- FOAF; (5) sometimes it is a collection of concepts (WordNet), and many other uses are likely to arise. There's no requirement that the names in a namespace only identify items of a single type; elements and attributes can both come from the same namespace as could functions and concepts or any other homogeneous or heterogeneous collection you can imagine. The names in a namespace can, in theory at least, be defined to identify any thing or any number of things. Given the wide variety of things that can be identified, it follows that an equally wide variety of ancillary resources may be relevant to a namespace. A namespace may have documentation (specifications, reference material, tutorials, etc., perhaps in several formats and several languages), schemas (in any of several forms), stylesheets, software libraries, applications, or any other kind of related resource. The names in a namespace likewise may have a range of information associated with them. A user encountering a namespace might want to find any or all of these related resources. In the absence of any other information, a logical place to look for these resources, or information about them, is at the location of the namespace URI itself. 
The question remains: how can we best provide both human and machine readable information at the namespace URI such that we can achieve the good practice identified by web architecture? One early attempt was RDDL; RDDL 1.0 is an XLink-based vocabulary for connecting a namespace document to related resources and identifying their nature and purpose. This finding therefore attempts to address the problem by considering it in a more general fashion. We: define a conceptual model for identifying related resources that is simple enough to garner community consensus as a reasonable abstraction for the problem; show how RDDL 1.0 is one possible concrete syntax for this model; show how other concrete syntaxes could be defined and identified in a way that would preserve the model.
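
To make the RDDL 1.0 approach mentioned above concrete, here is a minimal sketch that reads a namespace document and lists each related resource with its nature and purpose. The sample document and the function name are hypothetical. The sketch assumes the usual RDDL 1.0 convention that xlink:role carries the nature and xlink:arcrole carries the purpose of a resource.

```python
import xml.etree.ElementTree as ET

RDDL_NS = "http://www.rddl.org/"
XLINK_NS = "http://www.w3.org/1999/xlink"

# Hypothetical namespace document: XHTML for people, rddl:resource for machines.
NAMESPACE_DOC = """<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:rddl="http://www.rddl.org/"
      xmlns:xlink="http://www.w3.org/1999/xlink">
  <body>
    <h1>Example vocabulary</h1>
    <rddl:resource xlink:role="http://www.w3.org/2001/XMLSchema"
                   xlink:arcrole="http://www.rddl.org/purposes#schema-validation"
                   xlink:href="example.xsd">
      <p>A W3C XML Schema for the vocabulary.</p>
    </rddl:resource>
    <rddl:resource xlink:role="http://www.w3.org/1999/XSL/Transform"
                   xlink:arcrole="http://example.org/purposes#to-html"
                   xlink:href="example.xsl">
      <p>A stylesheet that renders the vocabulary as HTML.</p>
    </rddl:resource>
  </body>
</html>"""


def related_resources(document_text):
    """Return (nature, purpose, href) for every rddl:resource element."""
    root = ET.fromstring(document_text)
    found = []
    for resource in root.iter("{%s}resource" % RDDL_NS):
        found.append((
            resource.get("{%s}role" % XLINK_NS),     # nature (assumed mapping)
            resource.get("{%s}arcrole" % XLINK_NS),  # purpose (assumed mapping)
            resource.get("{%s}href" % XLINK_NS),
        ))
    return found


resources = related_resources(NAMESPACE_DOC)
for nature, purpose, href in resources:
    print(href, nature, purpose)
```

The same (nature, purpose, resource) triples could be carried by a different concrete syntax, which is the kind of abstraction the finding's conceptual model is meant to capture.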

See also: additional TAG findings

A Practical Application of SOA
Scott M. Glen and Jens Andexer, IBM developerWorks

In Darwinian terms, SOA is the natural evolution of previous distributed architectural styles, such as Distributed Component Object Model (DCOM), Common Object Request Broker Architecture (CORBA), and Enterprise JavaBeans (EJB), but embraces standards, particularly those based around XML, to provide a greater degree of interoperability. There's also an explicit emphasis on business alignment, which wasn't prevalent with previous architectural incarnations. This lets SOA provide an ideal platform for business process-driven development, enabling business analysts to truly participate in the software development life cycle -- one of its biggest differentiators. However, simply adopting SOA alone doesn't guarantee a successful project, and some projects should not adopt an SOA approach at all. Thankfully we seem to be learning from those painful lessons of the past. For example, the knowledge gained in creating patterns and associated antipatterns that emerged from J2EE experiences has been used to construct similar best practices around SOA. IBM has been successful in developing reusable patterns and blueprints for SOA applications and industry-specific models that aid in architectural decision making and provide methodologies for service identification. Using such artifacts can have a dramatic impact on the costs of introducing an SOA. As with any new technology, there are associated start-up costs, and SOA can appear to be additionally front loaded. The emphasis on reuse and flexibility comes at a cost, and this can provide little motivation to evangelize SOA at a project level, where the benefits won't necessarily accrue to the project. Pattern-based accelerators and off-the-shelf assets can help reduce lead time, but ultimately there needs to be a degree of investment in an initial SOA project. 
Evidence does, however, suggest that organizations embarking on the SOA journey will see those costs returned in the medium term as subsequent SOA projects across the enterprise extend and reuse elements of the service catalogue, reducing their development costs.

IBM Updates Its SOA-Enhanced Product Portfolio
Antone Gonsalves, InformationWeek

IBM has introduced upgrades of its various software products that ensure the integrity of business processes within service-oriented architectures. Among the updated products is the WebSphere Process Server, which enables transactions to recover reliably when the target applications or servers are unavailable. Other enhanced products include the WebSphere Message Broker and MQ, which feature Web services support. IBM also upgraded its Tivoli Composite Application Manager for SOA, which provides information on service flows across the SOA environment; WebSphere DataPower XML Security Gateway, which provides hardware-based security for XML and Web services; and the IBM Information Server, which enables information to be provided as reusable services. IBM also introduced SOA configurations to reduce deployment time when reusing legacy and packaged applications in an SOA environment. The configurations provide best practices and step-by-step implementation guides. To support data governance, IBM introduced Optim, a new offering from the acquisition of Princeton Softech, which IBM completed in September. Optim captures and manages data at the business-record level and helps protect confidential data in complex application environments. IBM also updated its professional services and tools to support its SOA governance environment, including the WebSphere Service Registry and Repository, Rational Asset Manager, Rational Tester for SOA Quality, and Rational Performance Tester extension for SOA Quality. These offerings help customers manage services, assets, and processes. IBM has added support for Web 2.0 capabilities in some of its SOA offerings, including the latest releases of WebSphere Commerce, WebSphere Message Broker, and WebSphere Portal. The new capabilities enable users to create Web 2.0 interfaces, remix content, and more easily access services.

See also: the announcement

Mathematical Markup Language (MathML) 3.0: Updated Working Draft
David Carlisle, Patrick Ion, Robert Miner (eds), W3C Technical Report

W3C's Math Working Group has published an updated Working Draft for the "Mathematical Markup Language (MathML) Version 3.0" specification. MathML is an XML application for describing mathematical notation and capturing both its structure and content. The goal of MathML is to enable mathematics to be served, received, and processed on the World Wide Web, just as HTML has enabled this functionality for text. MathML can be used to encode both mathematical notation and mathematical content. About thirty-five of the MathML tags describe abstract notational structures, while about one hundred and seventy more provide a way of unambiguously specifying the intended meaning of an expression. Additional chapters discuss how the MathML content and presentation elements interact, and how MathML renderers might be implemented and should interact with browsers. Finally, this document addresses the issue of special characters used for mathematics, their handling in MathML, their presence in Unicode, and their relation to fonts. While MathML is human-readable, in all but the simplest cases, authors use equation editors, conversion programs, and other specialized software tools to generate MathML. Several versions of such MathML tools exist, and more, both freely available software and commercial products, are under development. This specification of the markup language MathML is intended primarily for a readership consisting of those who will be developing or implementing renderers or editors using it, or software that will communicate using MathML as a protocol for input or output. It is not a User's Guide but rather a reference document.
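
The distinction between notation and meaning can be seen in a small sketch that generates both encodings of "x squared" with Python's standard library. The helper and function names are hypothetical; the element names (msup, mi, mn for presentation; apply, power, ci, cn for content) are those of earlier MathML versions and are assumed to carry over to the 3.0 draft.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# Serialize MathML elements with a default namespace instead of a prefix.
ET.register_namespace("", MATHML_NS)


def m(tag, text=None, *children):
    """Create a MathML element with optional text and child elements."""
    element = ET.Element("{%s}%s" % (MATHML_NS, tag))
    element.text = text
    element.extend(children)
    return element


def presentation_x_squared():
    # Notation: an identifier x with the number 2 as a superscript.
    return m("math", None, m("msup", None, m("mi", "x"), m("mn", "2")))


def content_x_squared():
    # Meaning: the power function applied to x and 2.
    return m("math", None,
             m("apply", None, m("power"), m("ci", "x"), m("cn", "2")))


presentation = ET.tostring(presentation_x_squared(), encoding="unicode")
content = ET.tostring(content_x_squared(), encoding="unicode")
print(presentation)
print(content)
```

The presentation form says only how the expression is laid out, so a superscript 2 could in principle be an index or a footnote mark; the content form states explicitly that a power is meant.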

IE 7 Update Drops WGA Validation Requirement
Peter Galli, eWEEK

Microsoft is making its Internet Explorer 7 browser available to all Windows XP users, even those using pirated software, and installation will no longer require that the operating system first be validated as genuine. The company said the move is about security and ecosystem safety, because if even one user in a network is not using the security enhancements provided in IE 7, that user places the entire network at risk. The update was available beginning October 4, 2007 for Windows XP Service Pack 2, Windows XP 64-Bit Edition and Windows Server 2003 Service Pack 1 users. The update also comes hot on the heels of the news that Microsoft has allowed its OEM and retail partners to offer Windows XP for an additional five months, until June 30, 2008, after receiving complaints that customers are not ready to switch to Vista. Microsoft has also made some changes to IE 7 for Windows XP that Steve Reynolds, a program manager for Microsoft, said were requested by customers. The menu bar is now visible by default; the how-to section of the IE 7 online tour has been updated; and when the browser is first opened, users will be presented with a new overview. Reynolds said: "Microsoft takes its commitment to help protect the entire Windows ecosystem seriously, and we're taking a step to help make consumers safer online. We feel the security enhancements to Internet Explorer 7 are significant enough that it should be available as broadly as possible, and this means making it available to all users of IE 7-compatible Windows operating systems. We've also included a new MSI installer that simplifies deployment for IT administrators in enterprises."

See also: InfoWorld

