Cover Pages: XML Daily Newslink: Wednesday, 07 March 2007

A Cover Pages Publication http://xml.coverpages.org/
Provided by OASIS and Sponsor Members
Edited by Robin Cover

This issue of XML Daily Newslink is sponsored by:
IBM Corporation http://www.ibm.com

Headlines

Architectural Vision for HTML/XHTML2/Forms Chartering
W3C Relaunches HTML Activity
Oracle's Kurian on the Next Application Platform
Enterprise SOA the Apache Way
Liberty Alliance Offers Guide for Identity Roll-Out
PHP5's XML Parsing Techniques for Large or Complex XML Documents
Swift for Grid Workflow Management
The Tools of the Master Forecaster

Architectural Vision for HTML/XHTML2/Forms Chartering
Dan Connolly, Chris Lilley, Tim Berners-Lee; W3C Vision Document

W3C's discussion around the re-chartering of the HTML-related work was extensive. In the interest of providing a convenient summary, this document discusses the overall architectural vision behind the chartering of these groups, and how they fit into the wider pattern of the Interaction Domain and the overall Web Architecture. W3C has in general assumed that XML is the correct way forward and that implementations will fall into line as necessary over time. For the mobile market, and for non-HTML client technologies like SMIL, SVG, MathML, Timed-Text and so forth, this has indeed happened. For the desktop browser market, however, tag soup markup has persisted much longer than we would have expected or hoped. There are several ways to approach this situation, given that pretending the situation does not exist is not acceptable: (1) Try to force users and implementers to greater adoption of the existing XHTML 1.x; (2) Create a new language, with a different media type, which is more extensible, more accessible, has richer semantics, and so forth; older user agents which do not understand this format will not request it, and will reject it; (3) Create independent but related languages for different audiences. This has a clear and obvious drawback relative to a single language, and yet can be considered especially if XML forms a common parsing model. It would have been possible (and there were some calls for this) for the primarily desktop oriented, consumer oriented language to have only a tag-soup serialization. However, that would certainly have a negative and divisive effect on the Web architecture. Gratuitous incompatibilities with XML should be strenuously avoided. Instead, the charter calls for two equivalent serializations to be developed by the HTML WG, corresponding to a single DOM (or infoset, though tag soup cannot be considered to have an infoset currently, while it can have a DOM).
This ensures that decisions are not made which would preclude an XML serialization. It allows the two serializations to be inter-converted automatically. Having new language features, there is an incentive for content authors to use it; and having client-side implementations means that there is the possibility to really use it. Of these, W3C has chosen the third approach. If this new HTML-family format is widely used, and if it can be reliably converted to XML if it is not already serialized in that form (reliably meaning not only that formatting is the same but the structure is the same, and the semantics are not altered) then XML-based workflows can create and consume this content. Meanwhile, enterprise-strength needs are met by XHTML2, which includes XForms. The two formats are differentiated by deployment strategy and expected field of use.
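
As a rough illustration of "two equivalent serializations ... corresponding to a single DOM", the sketch below builds one in-memory tree and writes it out in both an XML syntax and a classic-HTML-style syntax. It uses Python's standard xml.etree.ElementTree purely as a stand-in; the sample markup is made up, and ElementTree's simple HTML output method is an assumption of convenience, not the parsing or serialization rules the HTML WG is chartered to define.

```python
import xml.etree.ElementTree as ET

# Build one in-memory tree (standing in for the "single DOM").
para = ET.Element("p")
para.text = "Line one"
br = ET.SubElement(para, "br")
br.tail = "line two"

# Serialize the same tree two ways.
as_xml = ET.tostring(para, encoding="unicode", method="xml")
as_html = ET.tostring(para, encoding="unicode", method="html")

print(as_xml)   # XML syntax: the empty element is self-closed
print(as_html)  # HTML-style syntax: the void element has no end tag

# The XML form parses back into a tree with the same structure and text.
reparsed = ET.fromstring(as_xml)
```

Because both strings come from the same tree, converting between them is mechanical, which is the property the charter relies on for XML-based workflows.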

W3C Relaunches HTML Activity
Staff, W3C Announcement

W3C has announced a new Charter for the HTML Working Group, extending through 2010. The group will maintain and produce incremental revisions to the HTML specification, which includes the series of specifications previously published as XHTML version 1. Both XML and 'classic HTML' syntaxes will be produced. The Working Group will define conformance and parsing requirements for 'classic HTML', taking into account legacy implementations; the Group will not assume that an SGML parser is used for 'classic HTML'. New publications and milestones include: (1) A language evolved from HTML4 for describing the semantics of documents and applications on the World Wide Web; this will be a complete specification, not a delta specification; (2) An extensible, serialized form of such a language, using XML; (3) A serialized form of such a language using a defined, non-XML syntax compatible with the 'classic HTML' parsers of existing Web browsers; (4) Document Object Model (DOM) interfaces providing APIs for such a language; (5) Forms and common UI widgets such as progress bars, datagrids, menus, and other controls; (6) APIs for the manipulation of linked media, editing APIs and user-driven WYSIWYG editing features. The HTML WG is encouraged to provide a mechanism to permit independently developed vocabularies such as Internationalization Tag Set (ITS), Ruby, and RDFa to be mixed into HTML documents. Whether this occurs through the extensibility mechanism of XML, whether it is also allowed in the classic HTML serialization, and whether it uses the DTD and Schema modularization techniques, is for the HTML WG to determine. The Group will create a comprehensive test suite for the HTML specification. The Group will ensure that validation tools are available, possibly from third parties, for the HTML specification. 
Validation does not mean DTD validation; validation using schemas (such as W3C XML Schema, RelaxNG, Schematron) and validation which is tolerant of extensions in other namespaces (for example using ISO DSDL/NVDL) is encouraged, as well as automated checking of items from the specification prose. The Group will monitor, track, and encourage implementation of HTML, both during Candidate Recommendation and afterwards, to encourage adoption. In addition to the new HTML and XHTML 2 Working Groups, W3C is also pleased to recharter the Hypertext Coordination Group and charter the Forms Working Group. The Forms Working Group will continue work on the XForms architecture, which has seen significant adoption in a variety of platforms.

See also: the announcement

Oracle's Kurian on the Next Application Platform
Colleen Frye, TheServerSide.com

Thomas Kurian, senior vice president of development for Oracle middleware platform products, is delivering a keynote at the Java Symposium on the next application platform. In this interview he talks about some of the key elements of that platform, including POJO-based development, orchestration, integration with open source frameworks like Spring and 'a la carte' Java EE 5 compliance. Kurian: "One of the primary goals of the EJB 3.0 specification, which includes JPA, was simplification. One of the biggest and most striking simplifications in EJB 3.0 is the use of Plain Old Java Objects, or POJOs, instead of container-generated components. This allows Java developers to build EJB 3.0 applications in their favorite Java IDE and deploy to a Java EE 5 application server, without any additional compilation or preprocessing. The new JPA specification embraces the POJO approach that has been the preferred one in object-relational mapping products like Oracle TopLink. It incorporates the best of commercial and open source object-relational mapping frameworks while embracing the EJB 3.0 approach of intelligent defaults to reduce the amount of work a developer needs to do to access relational databases from Java. The orchestration layer improves on the current generation of application platforms in key areas of performance, security, and management, and it's based on the Service Component Architecture (SCA) and Service Data Objects (SDO) specifications we co-developed with other industry players. It uses a normalized messaging format between application service components, bypassing the marshalling and unmarshalling of data and letting us optimize for in-memory communication where possible, all great performance improvements. It also enables a consistent security model for authentication and encryption of data regardless of what service protocols are used. 
Another important aspect of SCA is the standards-based approach for consolidating all of the metadata and deployment descriptors associated with the various integration components of a composite application such as an ESB, business process engine, etc. This vastly simplifies management activities such as project versioning, dependency analysis, and end-to-end monitoring of composite applications. We expect this will drive a natural consolidation and rationalization of design-time tools."

Enterprise SOA the Apache Way
Kyle Gabhart, XML.com

Apache.org has been buzzing with Service-Oriented Architecture (SOA) and Web Services activity for the last several years. If you browse to the Apache Web Services Project, you'll find a list of more than twenty SOA and Web Services projects, with another handful currently housed in the Apache incubator. Within this broad range of frameworks and tools, a few enterprise SOA technology stacks have emerged. In this article, we explore one of the Apache SOA stacks, Synapse + Axis2 + Tomcat. Synapse is a service mediation framework capable of filtering, routing, and transforming messages based on a flexible XML configuration. Axis2 is a second-generation service engine that hosts SOAP 1.1/1.2 and REST services and also supports a host of WS-* standards. Tomcat is the standard Java enterprise web server. Service-Oriented Architecture promises agility and alignment of business and technical objectives. The combination of three Apache projects (Axis2, Synapse, and Tomcat) results in a pretty compelling Apache SOA stack. Within the larger context of a service-oriented defect tracking system, we focus on the specific use case of collecting defect data from the customer feedback system and defining some simple rules for escalating certain "high-priority" messages. In traditional software development, business logic changes require code updates, followed by compilation and a complete testing cycle. SOA offers agility by allowing businesses to configure services as needed to meet changing business requirements. The Synapse mediation framework delivers on SOA's promise of agility by defining a robust configuration capability backed by the Synapse Configuration Language, allowing developers to configure message processing via one or more mediators. In our use case, we explore how this SOA stack would enable a business to configure message routing and escalation business rules to prioritize customer feedback originating from key accounts.
It is easy to see how these rules could be reconfigured and even expanded to adapt to changing business goals. Changes in business strategy might easily lead to a change in how customer feedback is processed and incorporated into bug fixes, R&D, and even Customer Relationship Management (CRM). This same agility and flexibility can then be realized throughout the enterprise as more and more of the business's Information Systems technology is assimilated into an overarching service-oriented enterprise.
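
The idea of keeping routing and escalation rules in configuration rather than in compiled code can be sketched in a few lines. The example below is a minimal Python illustration only: it does not use Synapse or the Synapse Configuration Language, and every name in it (Message, RULES, mediate, the account and queue names) is hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    account: str
    body: str
    headers: dict = field(default_factory=dict)


# Hypothetical rule table. It is plain data, so changing the escalation
# policy means editing this configuration, not the mediation code below.
RULES = [
    {"when": {"account_in": ["acme", "globex"]},
     "then": {"priority": "high", "route": "escalation-queue"}},
    {"when": {"body_contains": "crash"},
     "then": {"priority": "high", "route": "escalation-queue"}},
]
DEFAULT = {"priority": "normal", "route": "triage-queue"}


def matches(cond, msg):
    """Return True if the message satisfies every condition in cond."""
    if "account_in" in cond and msg.account not in cond["account_in"]:
        return False
    if "body_contains" in cond and cond["body_contains"] not in msg.body.lower():
        return False
    return True


def mediate(msg, rules=RULES, default=DEFAULT):
    """Tag the message with a priority header and return its destination."""
    action = default
    for rule in rules:
        if matches(rule["when"], msg):
            action = rule["then"]
            break  # first matching rule wins
    msg.headers["priority"] = action["priority"]
    return action["route"]
```

For instance, mediate(Message("acme", "minor typo")) returns "escalation-queue" because the account is listed as a key account, while feedback from an unlisted account that mentions no crash goes to "triage-queue".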

See also: Apache Synapse

Liberty Alliance Offers Guide for Identity Roll-Out
Jeremy Kirk, InfoWorld

The Liberty Alliance Project has published a document outlining the legal issues enterprises should consider when networking their Web applications and identity systems with those of other businesses. The primer on creating circles of trust raises awareness around issues such as privacy, technical terminology, indemnification and dispute resolution. From the Liberty document, "Liberty Alliance Contractual Framework Outline for Circles of Trust": "This document provides guidance on suggested business structures and terminology for a Liberty enabled technology deployment necessary to create a legally binding Circle of Trust (CoT). Its purpose is to facilitate a Liberty enabled deployment of identity management specifications and technology by assisting stakeholders and their legal and executive management teams in the identification of the legal structure best suited for their deployment... This document describes the rationale for using a contractual framework for the Circle of Trust, offers practical guidance for developing those contractual frameworks, discusses considerations that should be taken into account in selecting and structuring the contractual framework, and describes other Liberty guidance documents that may be useful as references or starting points for terminology and certain other aspects of the contractual framework. Liberty has developed and continues to develop identity federation based specifications, guidelines and educational materials to help businesses, governments, and individuals establish and operate solutions for identity federated and identity-based web services applications. Liberty anticipates that participants to a Liberty enabled deployment will enter into contractual relationship(s) that delineate their rights, obligations, remedies, and allocation of risk with respect to the deployment. 
This document provides guidance on addressing contractual relationship structures that will likely be part of every Liberty enabled deployment, and provides practical outlines and checklists for developing those contractual relationships."

See also: the text

PHP5's XML Parsing Techniques for Large or Complex XML Documents
Cliff Morgan, IBM developerWorks

This second article in a three-part series discusses XML parsing techniques of PHP5, focusing on parsing large or complex XML documents. PHP5 offers an improved variety of XML parsing techniques. The SAX parser, formerly built on James Clark's Expat and now based on libxml2, is no longer the only fully functional game in town. Parsing with the DOM, fully compliant with the W3C standard, is a familiar option. Both SimpleXML and XMLReader, which is easier and faster than SAX, offer additional parsing approaches. All the XML extensions are now based on the libxml2 library by the GNOME project. This unified library allows for interoperability between the different extensions. PHP5 has new and completely rewritten extensions for parsing XML. Those that load the entire XML document into memory include SimpleXML, the DOM, and the XSLT processor. Those parsers that provide you with one piece of the XML document at a time include the Simple API for XML (SAX) and XMLReader. SAX functions the same way it did in PHP4, but it is now based on the libxml2 library rather than on expat. If you are familiar with the DOM from other languages, you will have an easier time coding with the DOM in PHP5 than in previous versions. SimpleXML shares many of the advantages of the DOM and is more easily coded. It allows easy access to an XML tree, has built-in validation and XPath support, and is interoperable with the DOM, giving it read and write support for XML documents. You can code documents parsed with SimpleXML simply and quickly. Remember, however, that like the DOM, SimpleXML comes with a price for its ease and flexibility if you load a large XML document into memory. The XMLReader extension is a stream-based parser of the type often referred to as a cursor type or pull parser. XMLReader pulls information from the XML document on request. Its API is derived from the C# XmlTextReader. It is included and enabled in PHP 5.1 by default and is based on libxml2.
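
The trade-off between whole-document and one-piece-at-a-time parsing is easy to see in a small sketch. The example below is a Python analog, not PHP: it uses the standard library's xml.etree.ElementTree, with fromstring standing in for the load-everything style (DOM, SimpleXML) and iterparse standing in for the incremental style (SAX, XMLReader). The catalog/book/title document and the function names are invented for illustration.

```python
import io
import xml.etree.ElementTree as ET

# Made-up sample document with 1000 small records.
SAMPLE = (
    "<catalog>"
    + "".join(f"<book id='{i}'><title>Book {i}</title></book>" for i in range(1000))
    + "</catalog>"
)


def titles_tree(xml_text):
    """Tree style: parse the whole document into memory, then query it."""
    root = ET.fromstring(xml_text)
    return [book.findtext("title") for book in root.iter("book")]


def titles_stream(source):
    """Incremental style: handle each record as soon as its end tag is seen."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "book":
            yield elem.findtext("title")
            # Drop the record's children and text once it has been handled.
            # The emptied element itself stays attached to the root.
            elem.clear()


streamed = list(titles_stream(io.BytesIO(SAMPLE.encode("utf-8"))))
```

Both functions yield the same titles; the difference is that the streaming version never needs the full content of every record in memory at once, which is what matters for very large inputs.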

Swift for Grid Workflow Management
Greg Nawrocki, InfoWorld Blog

'Grid' doesn't necessarily mean a single program but could be a series of programs or even simple scripts that tie together to form what is often called a workflow: essentially any set of processes for getting real work done. In the case of workflows, the complexity of task management is compounded by the complexities of data management needed to share data sets effectively among the tasks of a workflow. The folks over at the Computation Institute at the University of Chicago, a close relative of those that brought us the Globus Toolkit, understand these complexities and have introduced Swift. Swift is an open source (naturally) software component that can be used to manage access to Grid services including those associated with the Globus Toolkit. From the web site: "Swift is a system for the rapid and reliable specification, execution, and management of large-scale science and engineering workflows. It supports applications that execute many tasks coupled by disk-resident datasets - as is common, for example, when analyzing large quantities of data or performing parameter studies or ensemble simulations. The open source Swift software combines: (1) A simple scripting language to enable the concise, high-level specifications of complex parallel computations, and mappers for accessing diverse data formats in a convenient manner. (2) An execution engine that can manage the dispatch of many (100,000+) tasks to many (1000+) processors, whether on parallel computers, campus grids, or multi-site grids." According to the SwiftScript Language Reference Manual, "VDL is a language for workflow specification in Data Grid environments which defines a language for describing operations on typed data items and mechanisms for binding data items defined in this language to datasets stored on persistent storage.
The binding between data item and dataset is based on the XDTM (XML dataset typing and mapping) model, which separates the declaration of the logical structure of datasets from their physical representation. The logical structure is specified via a subset of XML Schema, where a physical representation is defined by a mapping descriptor, which describes how each element in the dataset's VDL representation can be mapped to a corresponding physical structure such as a directory, file, or database table..."

See also: the language reference

The Tools of the Master Forecaster
Carlos A. Soto, Government Computer News

Business intelligence software can guide managers through difficult decisions, report anomalies or issues in an organization, and help managers check on the condition of their agency. BI software can examine the present state of affairs and analyze past performance trends. If used effectively, the right business intelligence tools can even predict the future... Two broad types of programs make up business intelligence software tools. The first type is the database, or the software-and-server combo that holds the data. Most often, organizations use transactional databases like an enterprise resource planning (ERP) database. Different flavors of ERP software include products from Oracle Corp. and SAP AG. Relational databases, like Microsoft SQL Server, are another common form of database, particularly in the federal sector. Despite the importance of these databases, the following review focuses on products that make up the second part of the business intelligence topology: reporting, querying and analyzing tools that extract information from the aggregated data sources, like SQL, and allow the user to find, manipulate and demonstrate the data.... Many business intelligence software companies use the analogy that business intelligence tools help diagnose the condition of a company or agency in a manner similar to how blood tests and X-ray scans are used to map the condition of a patient. ProfitMetrics takes that concept one step further by stating that the problem isn't in the analysis or numbers, but in the way the analysis and numbers are displayed. ProfitMetrics relies exclusively on Extensible Markup Language in configuring the charts and converting the numbers into visual artifacts. So an XML editor configures the way in which the data will be displayed by creating an XML Dashboard Description file and merging that file with the input file that contains the metrics or raw data of your organization. 
Finally a quick rendering engine produces a high-density dashboard that can be printed out in PDF format or viewed dynamically on the Web in a Scalable Vector Graphics (SVG) format... iDashboards complies with a multitude of Web and database standards such as Java 2 Enterprise Edition, XML, Open Database Connectivity (Oracle) and Microsoft Windows .Net to get your high-density reports up and running within a week, typically two to three days.

