Cover Pages: XML Daily Newslink: Friday, 09 March 2007

A Cover Pages Publication http://xml.coverpages.org/
Provided by OASIS and Sponsor Members
Edited by Robin Cover

This issue of XML Daily Newslink is sponsored by:
IBM Corporation http://www.ibm.com

Headlines

CURIE Syntax 1.0: A Syntax for Expressing Compact URIs
Globus Announces First Escalation, Welcomes Six New Incubator Projects
Usage Schemas to Tame ODF and OpenXML Down-Conversions
Reevaluating XSLT 2.0
Mike Milinkovich on What Eclipse Sees in OSGi
Open-Source Architected Model-Driven Development in the Real World
Desktop Matters Conference: Java Swing Technologies Highlighted
CSS Text Level 3
DOA with SOA

CURIE Syntax 1.0: A Syntax for Expressing Compact URIs
Mark Birbeck and Shane McCarron (eds), W3C Technical Report

The W3C HTML Working Group and the W3C Semantic Web Best Practices and Deployment Working Group jointly have published the First Public Working Draft for "CURIE Syntax 1.0: A Syntax for Expressing Compact URIs." The aim of this document is to outline a syntax for expressing URIs in a generic, abbreviated syntax. While it has been produced in conjunction with the HTML Working Group, it is not specifically targeted at use by XHTML Family Markup Languages. Note that the target audience for this document is Markup Language designers, not the users of those Markup Languages. More and more grammars are expressing URIs in XML using QNames. Since QNames are invariably shorter than the URI that they express, this is obviously a very useful device. However, a major problem is that the origin of the notion of a QName is such that it does not allow all possible URIs to be expressed. A specific example of the problem this causes comes from the IPTC. They would like to be able to use attributes in their mark-up to carry metadata in their documents, and as a consequence sought to make extensive use of QNames to keep the amount of data being transferred as small as possible. In other words, instead of sending lots of long URIs, QNames were to be used to abbreviate them. However, the purpose of QNames in XML is to provide a way for XML elements that contain a colon to be interpreted as an element with a different name. For example, 'iptc:10112244' is not a valid QName simply because '10112244' is not a valid element name. Yet, in the IPTC example given, the whole reason for using a QName was to abbreviate the URI, and not to create a namespace qualified element name. This gives rise to an interesting problem; the definition of a QName insists on the use of valid XML element names, but an increasingly common use of QNames is as a means to abbreviate URIs, and unfortunately the two are in conflict with each other. 
This specification addresses the problem by creating a new data type whose purpose is specifically to allow for the abbreviation of URIs in exactly this way. This type is called a "CURIE" or a "Compact URI", and QNames are a subset of this. A CURIE is comprised of two components, a prefix and a suffix. The prefix is optional. When the prefix is supplied, it is separated from the suffix by a colon (:). To disambiguate a CURIE when it appears in a context where a normal URI may also be used, the entire CURIE is permitted to be enclosed in brackets ([, ]). The document is considered not yet stable, but has had extensive review over the last eight months. It is being released in a separate, stand-alone specification in order to speed its adoption and facilitate its use in various specifications.
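
The prefix/suffix rule described above can be illustrated with a short Python sketch. It is not taken from the Working Draft: the function names, the `PREFIXES` table, the `http://example.org/iptc/` base URI, and the `default_base` handling of an omitted prefix are all assumptions made for illustration, and the name check is only a rough approximation of the XML name rule.

```python
# Hypothetical prefix table; the example.org URI is a placeholder, not a real IPTC URI.
PREFIXES = {"iptc": "http://example.org/iptc/"}


def looks_like_ncname(text):
    """Rough approximation of the XML rule: a name may not start with a digit."""
    if not text or not (text[0].isalpha() or text[0] == "_"):
        return False
    return all(ch.isalnum() or ch in "_-." for ch in text)


def expand_curie(value, prefixes, default_base=None):
    """Expand 'prefix:suffix' (optionally wrapped in [ ]) into a full URI string."""
    # Brackets mark the value as a CURIE where a plain URI could also appear.
    if value.startswith("[") and value.endswith("]"):
        value = value[1:-1]
    prefix, colon, suffix = value.partition(":")
    if not colon:
        # No colon at all: the prefix was omitted and the whole value is the suffix.
        prefix, suffix = "", value
    if prefix == "":
        # Assumption: an omitted prefix resolves against a caller-supplied base.
        if default_base is None:
            raise ValueError(f"no default base for unprefixed CURIE {value!r}")
        return default_base + suffix
    if prefix not in prefixes:
        raise ValueError(f"unknown CURIE prefix {prefix!r}")
    return prefixes[prefix] + suffix
```

Under these assumptions, `expand_curie("[iptc:10112244]", PREFIXES)` returns `http://example.org/iptc/10112244`, while `looks_like_ncname("10112244")` returns `False`, which is the reason 'iptc:10112244' cannot be a QName.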

See also: W3C Semantic Web Activity

Globus Announces First Escalation, Welcomes Six New Incubator Projects
Staff, Globus Alliance

Globus recently announced that the GridWay Metascheduler project has completed incubation and is now a full Globus project. GridWay gives end users, application developers, and managers of Globus infrastructures a scheduling functionality similar to that found on local resource management systems. It uses GRAM for local job submission, MDS to discover resources, and GridFTP and RFT for job staging. Functionality includes advanced scheduling capabilities, detection and recovery from remote and local failure situations, the ability to submit, monitor, synchronize and control single, array, and interdependent jobs, the capability to monitor Globus resources and users, and functionality to extract Grid accounting information. GridWay has full support for the C and Java DRMAA GGF standard for the development of distributed applications on Globus services. In addition, six new projects have joined the dev.globus incubation process, bringing the total number of projects to twenty-two. The six new projects include: (1) Workflow Enactment Engine Project (WEEP) - aims to implement an easy-to-use, easy-to-manage workflow enactment service for WS-I/WSRF services and to orchestrate the services as described by a Web Services Business Process Execution Language (WSBPEL) compliant document; the workflow engine is built up of several components that collectively represent the core architecture and follows the recommendations of the Workflow Reference Model. (2) OGRO (Open Grid Ocsp) is an Incubator project that adds support for the Online Certificate Status Protocol (RFC 2560) to the Globus Toolkit; OGRO is 100% Java and can be easily configured through the Grid Validation Policy (GVP), a set of XML rules that mandates its behavior. (3) Data Distribution Manager (DDM) is an Incubator project that provides an efficient data distribution service for tracking, transporting and synchronizing large-scale, distributed data sets. 
(4) The Gavia Metascheduler (Gavia-MS) is an Incubator project that implements a metascheduler exposed as a web service. This is achieved by using the Globus Toolkit 4 as the grid middleware and Condor for its matchmaking capabilities. (5) The Gavia Job Submission Client (Gavia-JSC) is an Incubator project that implements a generic graphical user interface for job submission, monitoring and management that is tailored to work with a Globus 4 grid running the Gavia Metascheduler (Gavia-MS). (6) SJTU GridFTP GUI Client (SGGC) is an interactive GUI client for GridFTP.

See also: WEEP Version 1.0

Usage Schemas to Tame ODF and OpenXML Down-Conversions
Rick Jelliffe, O'Reilly Articles

Kitchen-sink standards are developed by committees and have to cope with a wide variety of different applications. If someone's software does something, there has to be some element or attribute or value stuck in. Sometimes the backdoor of properties (open-ended value lists) is used, so that the schema can be simplified at the expense of enumerating possible values. But schemas like DocBook, TEI, ODF, and OpenXML are classic kitchen sinks. There is an objective way to detect them: check their 'Structured Document Complexity Metric' (online) and if it is over 300, you probably have a kitchen sink. I gave some metrics earlier in Comparing Office Document Formats. Now the trouble with kitchen-sink schemas is that any particular set of documents will only use a subset of the total possible features. So writing a complete converter that accepts any possible input from a kitchen-sink schema and outputs it to some more targeted document type is a completely wasteful process. YAGNI. But, and here's the rub, every so often, someone will in fact use one of the elements you didn't expect. One way to cope with this is the usage schema. This is a schema derived from sampling representative documents. When new documents come in, you first validate them against the usage schema, and if there is a problem, escalate it to the project management to discuss how to handle it. It is a sign that the data is not what they expected. There are some tools to generate XSD usage schemas, but you can also generate them using Schematron. The tool I use first generates all three-level XPaths found in the document, then makes a Schematron schema that reports if any node was found that was not caught by these XPaths. Very straightforward, but effective. Another use for usage schemas is for software development.
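
The usage-schema idea can be sketched in a few lines of Python. This is not the author's tool and it does not generate Schematron: it swaps in a plain set comparison, treats a "three-level XPath" as the last three element names on the path to each element (an assumption), and uses invented function names and sample documents.

```python
import xml.etree.ElementTree as ET


def three_level_paths(xml_text):
    """Collect every element path, truncated to its last three steps."""
    paths = set()

    def walk(element, ancestors):
        steps = ancestors + [element.tag]
        paths.add("/".join(steps[-3:]))
        for child in element:
            walk(child, steps)

    walk(ET.fromstring(xml_text), [])
    return paths


def build_usage_schema(sample_documents):
    """The 'usage schema' here is just the union of paths seen in the samples."""
    usage = set()
    for document in sample_documents:
        usage |= three_level_paths(document)
    return usage


def unexpected_paths(xml_text, usage):
    """Report paths in a new document that no sample document contained."""
    return sorted(three_level_paths(xml_text) - usage)


# Invented sample documents standing in for a representative corpus.
SAMPLES = [
    "<doc><sec><p>a</p></sec></doc>",
    "<doc><sec><p>b<em>c</em></p></sec></doc>",
]
USAGE = build_usage_schema(SAMPLES)
```

With these samples, `unexpected_paths("<doc><sec><table><p>x</p></table></sec></doc>", USAGE)` reports `doc/sec/table` and `sec/table/p`, the kind of result that would be escalated rather than silently converted.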

See also: Complexity Metrics

Reevaluating XSLT 2.0
Kurt Cagle, O'Reilly Articles

About XSLT 2.0 increasingly being used as a 'router' language, replacing such applications as Microsoft's BizTalk Server: "This is not a disparagement of BizTalk -- it's actually one of the Microsoft technologies that I have actually endorsed on a regular basis, because it solves one of the thornier issues involved in creating complex data systems -- how do you handle the intermediation of data coming from different data sources, and while I have some quibbles about the interface, I think BizTalk does its job admirably. It also served as a bridge technology for quite some time between the SQL and XML worlds, and it will continue to serve in that role for quite some time to come... In an AJAX oriented system, such invocations could (and generally should) be done asynchronously -- the first XSLT passes the initially processed XML to a second asynchronous transformation using result-document, which would then be retrieved as a message from a set of queued messages. In this particular case, the effective routing could be done solely within the first XSLT, with little need to create multiple synchronous chains of transformations, though it's likely that the resulting transformation would need to include a link to the message queue to be queried for responses, possibly with a transaction identifier -- likely using the new XPath current-dateTime(). Such a system is a routing system -- you are using the XSLT as a router for XML messages to be sent to the appropriate internal services, where each of those systems in turn exists either as URLs or as named extensions. That it also can serve as a validation system is not accidental -- one of the powers of XML is that you can check for the validity of XML without the danger of instantiating the object in live form... XSLT 2.0 is able to assume a much more extensive work-horse mode than it has previously. 
Most of these modes have already been explored with older extended XSLT 1.0 processors, but because such implementations tended to differ in critical areas developers and IT managers tended to shy away from them for all but very specialized applications."
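
The routing pattern in this excerpt can be mimicked outside XSLT. The Python sketch below is only an analogue, not XSLT 2.0 and not code from the article: the `route_message` function, the dictionary used as a message queue, and the transaction identifier format are all assumptions. It shows a message being checked for well-formedness, dispatched by root element name, and its response stored under a timestamp-based identifier.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone


def route_message(xml_text, services, queue, now=None):
    """Dispatch an XML message by root element name and queue the response."""
    # Parsing raises ParseError on ill-formed input before any service runs.
    root = ET.fromstring(xml_text)
    service = services.get(root.tag)
    if service is None:
        raise LookupError(f"no service registered for <{root.tag}>")
    # Timestamp-based identifier, playing the role the excerpt gives current-dateTime().
    moment = now if now is not None else datetime.now(timezone.utc)
    transaction_id = f"{root.tag}@{moment.isoformat()}"
    # A plain dict stands in for the message queue that callers query later.
    queue[transaction_id] = service(root)
    return transaction_id
```

A caller would register one callable per message type in `services`, keep the returned identifier, and look up `queue[transaction_id]` when it wants the response.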

See also: XSLT Version 2.0

Mike Milinkovich on What Eclipse Sees in OSGi
Rich Seeley, SearchWebServices.com

At EclipseCon 2007, Mike Milinkovich, executive director of the Eclipse Foundation, headlined his press conference with a technical talk titled "The Importance of OSGi." The Open Services Gateway initiative has been the foundation of the Eclipse platform since the early part of this decade, before there was an Eclipse Foundation or an EclipseCon. As evidence of this close relationship, the OSGi Developer Conference is being co-located with EclipseCon in Santa Clara this week. In an interview [here excerpted], Milinkovich explained why OSGi, with its roots in embedded systems and computer games, is something that enterprise developers working with Eclipse frameworks and tools need to pay attention to. His talk opened with a slide showing the OSGi implementation in Eclipse Equinox underlying not only enterprise applications, but also service-oriented architecture and Rich Internet Applications (RIA) including Ajax. Milinkovich: "OSGi is a standards organization that started in 1999 around Java and initially set-top boxes. It's additionally evolved into mobile automotive applications and it recently started an enterprise expert group. We're seeing additional uptake of OSGi in the middleware space as well... Quite a few middleware companies are starting to build middleware stacks on top of OSGi. IBM WebSphere 6.1 is built on top of Equinox, which is our implementation of OSGi. BEA announced their micro-server architecture (MSA), which is also based on OSGi. You are starting to see greater adoption in the server stacks from these various vendors... The Eclipse plug-in is an implementation of the OSGi bundle specification. That's where the synergy exists between the two organizations. OSGi is the standards organization. Eclipse is an open source organization that provides an implementation of the standard. Obviously, the synergy between open source and open standards is something that is one of the major change agents in software today."

See also: the press release

Open-Source Architected Model-Driven Development in the Real World
Steve Andrews and Stosh Misiaszek, IBM developerWorks

This article narrates how Number Six Software used Model Driven Development techniques to provide the United States' Veterans Administration with a health services portal that realized significant cost and quality improvements. The OMG MDA concept provides for the transformation of a platform-independent model (PIM) that describes the business concerns of the application into a platform-specific model (PSM) that describes how the application will be implemented. The target platform is typically described in terms related to the implementation language and a technical framework on top of that. J2EE is an example of a technical framework that consists of a series of interfaces in the Java programming language. Architected Model-Driven Development (AMDD) is a strategy that leverages the strengths of both application frameworks and MDA techniques. The general premise is that a model defines the entities and services of the application, and that model gets transformed to a PSM that is based on a combination of a technical framework and an application framework. Atlas provides a common way to model the primary concerns of an application: the maintained state (entities) and the exposed behavior (services). This model is also known as the application metadata. It is defined using XML, and it is based on the OMG Meta-Object Facility (MOF) and the Unified Modeling Language (UML). For the initial versions of Atlas, the decision was made to use the XML metadata-based approach in order to force more rigor in adhering to the modeling conventions required by the Atlas transformation engine. Model validations at this point are primarily syntactic in nature and revolve around validation against a series of XML Document Type Definitions (DTD). In the future, UML profiles and associated model validations may emerge in order to support visual modeling. The primary difference between Atlas models and those of other MDA solutions is one of abstraction. 
One of the goals of Atlas is to unburden the application developer from as much of the underlying architectural mechanisms as possible.

Desktop Matters Conference: Java Swing Technologies Highlighted
Paul Krill, InfoWorld

A potpourri of Java client application technologies is on the agenda at the Desktop Matters Conference. Again dominating the discussion are technologies pertaining to the Swing desktop client platform for Java. Presentations are featured on such projects as the Spring Rich Client Project, jMatter, and SwiXml. The desktop can be a better option than the Web in some circumstances, one attendee said. "For certain types of applications, desktop's the only way to go," said Rob Abbe, chief software architect at Captovation, which develops document capture software. He cited applications that need to interface with peripherals like high-speed scanners as an example. One technology featured, the open-source Spring Rich Client Project, shares code with the popular Spring application framework for Java but is geared to rich client applications. The mission of the project is to provide an elegant way to build rich client applications that leverage the Spring framework, he said. The initial focus is support for Swing applications, according to the project's Web site. Another open-source project, jMatter, is built for developing applications for small businesses. It leverages the Naked Objects architectural pattern, which presents new ideas on how to build applications, said Eitan Suez, author of jMatter and president of Uptodata. Swing and the Hibernate object-relational persistence software are featured in jMatter. SwiXml is a GUI generating engine for Swing applications. GUIs are described in XML documents that are parsed at runtime and then rendered into javax.swing objects, according to the SwiXml Web site. SwiXml is used mostly for desktop applications. With SwiXml, code is separated from the layout as with HTML, making for easier maintenance, said Wolf Paulus, developer of SwiXml and a software architect at Cardiff. Another technology discussed, Canoo UltraLightClient, leverages Swing and a Web architecture. Application logic resides on the server rather than the client. 
A component-oriented programming model is leveraged. The UI is rendered in Swing rather than in HTML.

CSS Text Level 3
Michel Suignard (ed), W3C Technical Report

W3C announced that the CSS Working Group has released a Working Draft specification for "CSS Text Level 3". Previously released in June 2005 under the title "CSS3 Text Effects Module," this draft is part of the Cascading Style Sheets (CSS) language Level 3. CSS level 3 includes all of level 2 and extends it with new selectors, fancy borders and backgrounds, vertical text, user interaction, speech and much more. This CSS3 module "CSS Text Level 3" defines properties for text manipulation and specifies their processing model. It covers line breaking, justification and alignment, white space handling, text decoration, and text transformation. White space processing in CSS interprets white space characters for rendering: it has no effect on the underlying document data. In the context of CSS, the document white space set is defined to be any space characters (Unicode value U+0020), tab characters (U+0009), or line break characters (defined by the document format: typically line feed, U+000A). Control characters besides the white space characters and the bidi formatting characters (U+202x) are treated as normal characters and rendered according to the same rules... In many writing systems, words are always separated by spaces or punctuation. In the absence of a hyphenation dictionary, a line break can occur only at these explicit word boundaries. In Chinese and Japanese typography, however, no spaces nor any other word separating characters are used. In these systems a line can break anywhere except between certain character combinations. Additionally the level of strictness in these restrictions can vary with the typesetting style... Text wrapping is controlled by the 'text-wrap' and 'word-wrap' properties... When restricted text-wrapping is enabled, UAs that allow breaks at punctuation other than spaces should prioritize breakpoints. 
For example, if breaks after slashes have a lower priority than spaces, the sequence "check /etc" will never break between the '/' and the 'e'. The UA may use the width of the containing block, the document language, and other factors in assigning priorities... the 'word-spacing' property specifies spacing behavior between words: the first 'word-spacing' value specifies the desired (optimum) spacing; the second value specifies the desired minimum spacing limit, and the third specifies the desired maximum spacing limit...
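
The "check /etc" example can be worked through with a small Python sketch. The draft leaves prioritization to the user agent, so this is only one possible reading: the priority numbers, the `choose_break` name, and the rule that the latest break wins among equal priorities are assumptions, and width is counted in characters for simplicity.

```python
# Hypothetical priorities: a higher number marks a preferred break point.
BREAK_PRIORITY = {" ": 2, "/": 1}


def choose_break(text, max_width, priorities=BREAK_PRIORITY):
    """Return the index at which to break text, or None if no break is needed or possible."""
    if len(text) <= max_width:
        return None
    best = None  # (priority, index), compared as a tuple
    for i, ch in enumerate(text[:max_width]):
        if ch in priorities:
            # The break falls after the character, so the first line is text[:i + 1].
            candidate = (priorities[ch], i + 1)
            # Higher priority wins; among equal priorities the later break wins.
            if best is None or candidate >= best:
                best = candidate
    return best[1] if best else None
```

For `"check /etc"` with a width of 7 or 9 characters, `choose_break` returns 6, so the line breaks after the space and never between the '/' and the 'e'. Only when no space fits, as in `choose_break("/usr/etc", 5)`, is a slash chosen.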

See also: W3C CSS references

DOA with SOA
Alex Bell, ACM Queue

"Many people have different ideas about what SOA is and is not. Thankfully, I have the benefit of a 13-year-old daughter in the household, so there is no shortage of expert opinion on any topic. I asked her what she thought was meant by service-oriented architecture. She told me that this was an approach used for constructing the buildings where she buys, among other things, her Hollister and American Eagle clothing. There are certainly different opinions about what SOA might be, but this one might be a bit extreme. Some projects might say they are dancing the SOA tango merely by using XML, WSDL, SOAP, and UDDI technologies. Others may believe they are saluting the SOA flagpole if they are using OOD and their classes are stateless. In actuality, SOA describes an architectural style that is independent of using a particular technology. This architectural style involves advertisement of services in some form of a registry that clients can use to introspect, discover, hook up to, and invoke services of their choosing. The properties associated with these services are described by SLAs (service-level agreements), which might be measured in terms of processing time, number of messages per minute, and number of rejected transactions. SOA is enabled by technologies such as those mentioned earlier, as well as others such as CORBA and DCOM, which have been around much longer... How is your SOA health? Do you presume that SOA can be enabled only by Web services? Do you believe that the benefit of properties such as encapsulation and abstraction are important only in "old-fashioned" architectural approaches? Has implementing your definition of SOA resulted in elimination of any major engineering activities? Pay close attention to how you answer. You will not want to miss the warning signs of potentially being DOA with SOA."

