Cover Pages: XML Daily Newslink: Friday, 22 August 2008

A Cover Pages Publication http://xml.coverpages.org/
Provided by OASIS and Sponsor Members
Edited by Robin Cover

This issue of XML Daily Newslink is sponsored by:
IBM Corporation http://www.ibm.com

Headlines

Making XQuery Control Structures Work for You
Freebase Parallax a Promising Semantic Web Search Tool
SOMA-ME: A Platform for the Model-Driven Design of SOA Solutions
A Runtime-Adaptable Service Bus Design for Telecom Operations Support Systems
This Week in HTML 5: Episode 3
Automatic Deployment Toolkit for an SOA Project Environment, Part 1
e-Forestry Industry Data Standards Schema Working Draft
Space Junkies Ask 'Who Owns The Moon?'

Cover Pages

W3C Member Submission for Creative Commons Rights Expression Language (ccREL)

Making XQuery Control Structures Work for You
Kurt Cagle, DevX.com

The XQuery language is the XML analogue of SQL, designed to augment XPath 2.0 by working with sets of values, not just with single scalar values. A colleague of mine, a database programmer who had spent some time working with an early XQuery implementation, once referred to the language as the smiley language. His comment served to highlight the fact that while XQuery has a structure similar to what most languages have, the differences can trip you up dramatically. XQuery is not a hard language to learn, but it can be a language that makes you try to understand why it's not working. XQuery's control structures have been given the rather quaint acronym of FLWOR, shorthand for the most critical (but not the only) XQuery structures used by the language. FLWOR itself stands for five operations: For, Let, Where, Order by, Return, four of which have analogs in SQL: SELECT, SET, WHERE, ORDER BY. These terms are used to either assign or retrieve items from a set. XQuery is a set manipulation language. Its whole purpose is to work with sets of information, not just single scalar values. Additionally, as a set manipulation language, it is meant to augment, rather than replace, the XPath 2.0 language that's built into the specification. Indeed, in essence, most of XQuery is just a means to wrap a control language around XPath, in a manner somewhat similar to the way that XSLT combines a templating language with XPath. For this reason, when you are working with XQuery, the most effective use of the language is to do as much as possible within XPath 2 first, then go to the XQuery command structures when you've reached a point where XPath 2 can't quite get you all the way. The XQuery 1.1 working draft hints at other control structures, including the GROUP BY operator, which makes it possible to aggregate results by group selectors, and WINDOW clauses, which provide the ability to easily do set operations on subsequences of a given sequence. 
Additionally, the XQuery Scripting Extensions (or XQueryScript) provide other control structures that are designed to make XQuery easier to use within a more formal scripting role. However, both these drafts are still very much in development, and currently no commercial or open source implementations of XQuery support these capabilities. Control structures are not necessarily glamorous; indeed, they are about as exciting as rebar scaffolding, but like such scaffolding they are a critical part of building any XQuery application. Understanding how to work with these structures can make the difference between a useful, flexible application, and a one-off piece of code that will have to be written over and over again...
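
For readers more at home in a general-purpose language, the five clauses (For, Let, Where, Order by, Return) can be loosely mirrored in Python. The sketch below is only an analogy, not XQuery: the sample records, the field names, and the tax rate are all invented for illustration, and a real query would run against XML nodes selected with XPath.

```python
# Hypothetical sample data standing in for a sequence of XML <book> elements.
books = [
    {"title": "XQuery Basics", "price": 30.0, "year": 2007},
    {"title": "XPath in Depth", "price": 45.0, "year": 2005},
    {"title": "XSLT Patterns", "price": 25.0, "year": 2008},
]


def flwor(items, tax_rate=0.1):
    """Mimic the For / Let / Where / Order by / Return clauses over a list."""
    selected = []
    for book in items:                                    # for $book in ...
        gross = round(book["price"] * (1 + tax_rate), 2)  # let $gross := ...
        if book["year"] >= 2007:                          # where ...
            selected.append((book["title"], gross))
    selected.sort(key=lambda pair: pair[1])               # order by $gross
    return [f"{title}: {gross:.2f}" for title, gross in selected]  # return ...


result = flwor(books)
```

Each clause maps to one line of the loop, which is roughly why the abstract describes XQuery as a control language wrapped around a selection language.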

See also: W3C XML Query (XQuery)

Freebase Parallax a Promising Semantic Web Search Tool
Martin Heller, InfoWorld

"Freebase is a semantic Web site from Metaweb and one of the technologies used by Powerset, which was recently acquired by Microsoft. The Freebase data set has been growing by leaps and bounds, and its user interface has improved on a monthly basis. Recently, David Huynh joined Freebase from the SIMILE project at MIT, and he's already brought insight and technology from SIMILE to bear on Freebase: witness his Parallax prototype. A normal search like Google returns a set of results and allows you to look at them one at a time. A normal information site like Wikipedia has individual articles on subjects, loosely joined by hyperlinks. Freebase adds a layer of ontology and semantic relations to information gleaned from sites like Wikipedia. So, for instance, an article about Jon Udell at Wikipedia says that he's a U.S. journalist who used to work for InfoWorld and now works for Microsoft. A similar article about Jon at Freebase adds structured ontological classification information for Jon as a person and an author. Thus, you can find Jon in Freebase not only by name, but also by searching for authors or people born in Philadelphia in 1956, and you can also follow the Publishing relation to his book. Parallax takes advantage of that additional information to search Freebase with sets, examining many-to-many relationships. So, to use the example in David's video, you can ask Parallax to find the collection of U.S. Presidents, filter that by Republicans, find all of their children, find all the places their children went to school, and create a map of those schools, using one search per step. Even better, David has open-sourced Parallax on Google Code. I pulled down the trunk source code from Subversion on Friday, and had a good look at it. It's one honking sophisticated JavaScript application, which (if I read it right) uses SIMILE technology and jQuery to call the Freebase Metaweb API." 
From the web site description: "Freebase is an open database of the world's information. It is built by the community and for the community—free for anyone to query, contribute to, build applications on top of, or integrate into their websites. Already, Freebase covers millions of topics in hundreds of categories. Drawing from large open data sets like Wikipedia, MusicBrainz, and the SEC, it contains structured information on many popular topics, like movies, music, people and locations—all reconciled and freely available via an open API. This information is supplemented by the efforts of a passionate global community of users, who are working together to add structured information on everything from philosophy to European railway stations to the chemical properties of common food ingredients..."
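
The set-at-a-time browsing described in the review (filter a collection, then pivot along a relation to a new collection) can be sketched in a few lines of Python. This is a toy model under stated assumptions: the in-memory graph, the placeholder topic names, and the `filter_by` and `pivot` helpers are all hypothetical, and nothing here calls the real Metaweb API or reflects real Freebase data.

```python
# Hypothetical toy graph; every name is a placeholder, not real Freebase data.
GRAPH = {
    "President A": {"type": "president", "party": "Republican",
                    "children": ["Child A1", "Child A2"]},
    "President B": {"type": "president", "party": "Democratic",
                    "children": ["Child B1"]},
    "President C": {"type": "president", "party": "Republican",
                    "children": ["Child C1"]},
    "Child A1": {"type": "person", "schools": ["School X"]},
    "Child A2": {"type": "person", "schools": ["School X", "School Y"]},
    "Child B1": {"type": "person", "schools": ["School Z"]},
    "Child C1": {"type": "person", "schools": ["School Y"]},
}


def filter_by(topics, prop, value):
    """Keep only the topics whose property equals the given value."""
    return {t for t in topics if GRAPH.get(t, {}).get(prop) == value}


def pivot(topics, relation):
    """Follow a many-to-many relation from one set of topics to another set."""
    related = set()
    for topic in topics:
        related.update(GRAPH.get(topic, {}).get(relation, []))
    return related


# One step per operation, mirroring the example in the review.
presidents = filter_by(GRAPH.keys(), "type", "president")
republicans = filter_by(presidents, "party", "Republican")
children = pivot(republicans, "children")
schools = pivot(children, "schools")
```

Because every step takes a set and returns a set, the steps compose freely, which is the property that distinguishes this style from following one hyperlink at a time.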

See also: the Freebase web site

SOMA-ME: A Platform for the Model-Driven Design of SOA Solutions
L.-J. Zhang, N. Zhou, Y.-M. Chee (et al.), IBM Systems Journal

Service-oriented architecture (SOA) is an information technology (IT) architectural approach that supports the creation of business processes from functional units defined as services. It has become a major focus in the emerging services computing discipline, which explores the ways in which IT can be used to develop and manage business processes efficiently. Helping customers implement SOA solutions, however, involves some major challenges. How do we develop an SOA solution in a way that ensures the reusability of the software artifacts developed? How do we ensure that the solution we develop is extensible? How do we develop software tools that validate an SOA solution so that we reduce the cost of maintaining it over its lifetime? In this paper, the service-oriented modeling and architecture modeling environment (SOMA-ME) is presented as a platform for addressing the above challenges. SOMA-ME is first a framework for the model-driven design of service-oriented architecture (SOA) solutions using service-oriented modeling and architecture (SOMA) or similar methods... A considerable amount of work has been performed in federated discovery of Web services and dynamic Web services composition, and related products and standards are being developed. IBM Web Services Outsourcing Manager (WSOM) enables the dynamic composition of service-oriented business process flow based on customer requirements. An XML-based business process outsourcing language was proposed for capturing customer requirements. The requirements are analyzed and a search script for automatically finding services is generated; services are then composed for performing the required task. WSOM makes use of Business Explorer for Web Services, a federated Web services discovery engine. In this paper, SOMA-ME has extended WSOM by using model-driven architecture and best practices of designing SOA solutions. 
This object model-based analysis for Web services discovery could be leveraged in the proposed SOMA-ME to support its services identification phase. A new Web Services Description Language (WSDL) metamodel that is QoS-enabled (quality of service-enabled) was introduced [with] model-driven process for Web service development that includes the transformation of WSDL to Unified Modeling Language. A Service Component Architecture-based (SCA-based) UML profile was used to model services and their extra-functional properties. A novel, model-driven approach, based on UML 2.0, was developed, which takes existing Web service interfaces as input and generates an executable Web service composition. As we know, the WSDL file is just one type of solution artifact in an SOA solution... Any SOA solution design process, and SOMA in particular, can be customized and enhanced for different industry domains or solution scenarios. In any such design process, the SOMA-ME model is populated with information specific to the SOA solution. Once solution modeling is complete, the model can be used to generate design documents, code artifacts, and other work products in an automated fashion, all of which can be captured in Word format. The SOMA-ME framework we describe in this paper and its associated tool can be viewed as a manufacturing model of SOA solutions. This manufacturing model relies on a governed environment comprising tools, processes, architectures, and repositories, which are used to build reusable assets in a systematic way...

See also: 'SOMA

A Runtime-Adaptable Service Bus Design for Telecom Operations Support Systems
I.-Y. Chen, G.-K. Ni, and C.-Y. Lin, IBM Systems Journal

"Although vendors such as IBM, BEA, and Oracle provide ESB products for integrating service-based enterprise applications, these solutions sometimes incur high implementation costs and present complex migration and management problems. For changing requirements, commercially available ESB products only allow such changes to be implemented at design time, which means that the ESB server has to be shut down for recompiling the application and rebooting the system. This problem has been studied and a number of solutions have been proposed to support a service bus design in which changes to the application can be carried out at runtime. Penta et al. presented WS-Binder, a framework that enables dynamic binding of service composition. It supports three binding types: pre-execution binding, runtime binding, and runtime re-binding. The two binding types at runtime enable the dynamic binding of alternative services when the primary services are unavailable. The decision policy used in this approach is based on quality of service (QoS) requirements... This paper proposes the adaptable service bus (ASB), a service bus that also enables dynamic composition of services. Whereas the ASB shares many of the characteristics of a conventional ESB, it places the service endpoint, service operation name, and parameter values related to the service operation in external storage. In order to modify the behavior of the application at runtime, developers can adjust the parameter values maintained in the external storage... the ASB is composed of five components. The first, a component user interface, enables developers to interact with the system so that they can set or modify rules and parameters. The second, a component service bus registry, performs two key functions: it registers the available services and it records performance measures over time. The third, a component service router, is a rule-based routing engine that determines which services will be invoked. 
One of the major challenges faced by an SOA-based OSS is the need to make use of legacy services, which are not NGOSS-compliant. Ensuring the support for legacy services is the responsibility of the fourth component, the data transformer. The transformer employs Extensible Stylesheet Language (XSL) to translate the older Billing Service Data (BSD), used by legacy services, into the NGOSS-mandated Shared Information Data (SID) format, and vice versa. The last component is the service invocator, which invokes a service on behalf of the process that originated the request... One of the major difficulties faced by telecommunications companies striving to implement NGOSS is that NGOSS is not always compatible with the existing systems. Specifically, if a service is invoked using the NGOSS-compliant SID format embedded in WSDL, then a non-NGOSS-compliant service will be unable to respond adequately. This is because the noncompliant services of the legacy systems rely upon the BSD data structure. To address this problem, the ASB incorporates a data transformer component. The transformer acts as a translator between the service consumer and the service provider. Translation is accomplished by means of Extensible Stylesheet Language Transformations (XSLT) technology, which provides a program interface for developers to implement mapping between two different data models... Because the billing process is just one of the target areas for NGOSS compliance, future work will be directed toward the re-engineering of other business processes, such as operations support and readiness, fulfillment, and assurance."
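
The central idea of the abstract, keeping routing rules in external storage so that behavior can change without a restart, can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the class names, the latency-based selection policy, the service names, and the `example.invalid` endpoints are hypothetical, and a plain dictionary stands in for the external store. It is not the design from the paper.

```python
class ServiceRegistry:
    """Registers services and records a simple performance measure for each."""

    def __init__(self):
        self._services = {}

    def register(self, name, endpoint, operation):
        self._services[name] = {
            "endpoint": endpoint,
            "operation": operation,
            "latencies_ms": [],
        }

    def record_latency(self, name, latency_ms):
        self._services[name]["latencies_ms"].append(latency_ms)

    def average_latency(self, name):
        samples = self._services[name]["latencies_ms"]
        # Services with no measurements yet are treated as the slowest.
        return sum(samples) / len(samples) if samples else float("inf")

    def lookup(self, name):
        return self._services[name]


class ServiceRouter:
    """Rule-based router that consults the rule store on every request."""

    def __init__(self, registry, rule_store):
        self.registry = registry
        self.rule_store = rule_store  # stand-in for external storage; never cached

    def route(self, request_type):
        candidates = self.rule_store[request_type]
        best = min(candidates, key=self.registry.average_latency)
        return best, self.registry.lookup(best)["endpoint"]


registry = ServiceRegistry()
registry.register("billing-legacy", "http://example.invalid/legacy", "getBill")
registry.register("billing-sid", "http://example.invalid/sid", "getBill")
registry.record_latency("billing-legacy", 120)
registry.record_latency("billing-sid", 80)

rule_store = {"billing": ["billing-legacy"]}
router = ServiceRouter(registry, rule_store)
first_choice = router.route("billing")

# A developer edits the stored rule while the router keeps running.
rule_store["billing"] = ["billing-legacy", "billing-sid"]
second_choice = router.route("billing")
```

Because the router reads the rule store on each call rather than compiling the rules in, the second request is routed differently with no redeployment.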

This Week in HTML 5: Episode 3
Mark Pilgrim, The WHATWG Blog

"Welcome back to 'This Week in HTML 5,' where I'll try to summarize the major activity in the ongoing standards process in the WHATWG and W3C HTML Working Group. The biggest news this week is the birth of the event loop. [As presented in the August 23, 2008 version of "HTML 5 Draft Recommendation"], To coordinate events, user interaction, scripts, rendering, networking, and so forth, user agents must use event loops as described in this section. There must be at least one event loop per user agent, and at most one event loop per unit of related similar-origin browsing contexts. An event loop always has at least one browsing context. If an event loop's browsing contexts all go away, then the event loop goes away as well. A browsing context always has an event loop coordinating its activities. An event loop has one or more task queues. A task queue is an ordered list of tasks, which can be Events, Parsing, Callbacks, Using a resource, Reacting to DOM manipulation..." The purpose of defining an event loop is to unify the definition of things that happen asychronously. I want to avoid saying "events" since that term is already overloaded. For example, if an image defines an onload callback function, exactly when does it get called? Questions like this are now answered in terms of adding tasks to a queue and processing them in an event loop... The other major news this week is the addition of the hashchange event, which occurs when the user clicks an in-page link that goes somewhere else on the same page, or when a script programmatically sets the location.hash property. This is primarily useful for AJAX applications that wish to maintain a history of user actions while remaining on the same page. 
As a concrete example, executing a search of your messages in GMail takes you to a list of search results, but does not change the base URL, just the hash; clicking the Back button takes you back to the previous view within GMail (such as your inbox), again without changing the base URL (just the hash)..."
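
The event loop model quoted above (one loop, one or more task queues, each queue an ordered list of tasks) can be sketched in Python. This is a simplified illustration, not browser behavior: the `EventLoop` class, the queue names, and the round-robin choice between queues are assumptions made here, since the quoted draft text does not say how a user agent picks the next queue.

```python
from collections import deque


class EventLoop:
    """Toy event loop with several named task queues."""

    def __init__(self):
        self.task_queues = {}  # task source name -> ordered queue of callables

    def queue_task(self, source, task):
        self.task_queues.setdefault(source, deque()).append(task)

    def run(self):
        # Keep going until every queue is empty. Tasks from one source always
        # run in the order they were queued; sources take turns (an assumption).
        while any(self.task_queues.values()):
            # Snapshot the queues so a task may safely add a new source.
            for source, queue in list(self.task_queues.items()):
                if queue:
                    task = queue.popleft()
                    task()


log = []
loop = EventLoop()
loop.queue_task("parsing", lambda: log.append("parse chunk 1"))
loop.queue_task("parsing", lambda: log.append("parse chunk 2"))
loop.queue_task("events", lambda: log.append("image onload"))
loop.run()
```

In this model the question "when does the onload callback run?" has a definite answer: when its task reaches the front of its queue and the loop next services that queue.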

Automatic Deployment Toolkit for an SOA Project Environment, Part 1
HeQing (Hawking) Guan, Qiang Bai, (et al.), IBM developerWorks

Before starting development of a Service-Oriented Architecture (SOA) project, you need to make the development environment ready. There are a variety of environments that you may need to prepare in a project development life cycle, including developer, integration, test, solution demo, and customer production environments. In each environment, you need to properly install and configure a variety of software. Assume that there's an XYZ project, which requires twelve engineers (nine developers and three testers) and sixteen machines (each engineer has one machine, two machines are integration servers, and the other two machines are test servers). In this example, the XYZ project is an integrated case management solution for social service, which is built on the IBM SOA technology stack... In this scenario, almost all of the five applications should be installed and configured properly on almost all of the sixteen machines. Such repeatable tasks are time-consuming and error-prone, which has been a major challenge in most large engagements. An automatic deployment toolkit, named Automatic-DT, can handle this problem. Automatic-DT is written mostly in Python. It helps you install and configure deployment nodes with several automatically installed and configured IBM software products. It also helps testers and developers refresh a build in their daily tests or integration life cycle. Plus, after proper packing, you can use it in a customer environment for solution deployment. This article series introduces an automatic deployment toolkit (Automatic-DT), which helps infrastructure architects install and configure deployment nodes with IBM software installed and configured automatically. This article (Part 1) provides an overview of Automatic-DT. 
The Automatic-DT is divided into several components: (1) Repository server: Stores software installation images; it can be an HTTP/FTP server or a local file folder; (2) Controller: The script's execution entry point; in the controller, a list stores all of the software to be installed and uninstalled, in sequence, and the list can be modified, for example by adding or removing elements or changing their order; (3) Specific components for software installation and uninstallation. Automatic-DT is useful if you're installing the same software on a large number of machines. For example, in the XYZ project you only need to prepare two different Automatic-DT configuration files: one for testers and one for developers. Then run the Automatic-DT scripts on each machine in turn until all machines are prepared.
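
The three-part layout described above (repository, controller with an ordered list, per-product components) might look roughly like the following Python sketch. It is not Automatic-DT's actual code: the class names, the placeholder product names, the repository path, and the choice to uninstall in reverse order are all assumptions, and the methods only report what they would do rather than running any installer.

```python
from dataclasses import dataclass, field


@dataclass
class Component:
    """One installable product; names and image files here are placeholders."""
    name: str
    image: str  # file name of the installation image in the repository

    def install(self, repository):
        # A real component would fetch the image and run its installer.
        return f"install {self.name} from {repository}/{self.image}"

    def uninstall(self):
        return f"uninstall {self.name}"


@dataclass
class Controller:
    """Execution entry point holding the ordered list of components."""
    repository: str
    components: list = field(default_factory=list)

    def install_all(self):
        return [c.install(self.repository) for c in self.components]

    def uninstall_all(self):
        # Assumption: removal runs in the reverse of installation order.
        return [c.uninstall() for c in reversed(self.components)]


developer = Controller(
    repository="/srv/images",
    components=[
        Component("database", "database.img"),
        Component("app-server", "app-server.img"),
    ],
)
# The list can be edited before a run, for example to add a product.
developer.components.append(Component("dev-tools", "dev-tools.img"))

install_plan = developer.install_all()
uninstall_plan = developer.uninstall_all()
```

A second `Controller` with a shorter list would play the role of the tester configuration, so the same scripts serve every machine of a given kind.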

e-Forestry Industry Data Standards Schema Working Draft
Roger Coppock and Ian Logan, OASIS Forest Industries TC Contribution

Roger Coppock (UK Forestry Commission) has posted a working draft for the "e-Forestry Industry Data Standards Schema" to the OASIS TC's discussion list. "The Electronic Forestry Industry Data Draft (eFIDS) is designed to facilitate electronic trading within the forestry industry. The schema provides an XML framework that allows a variety of different trading documents to be used between various parties, for example: Delivery Notes, Invoices, Self-Bill Invoices, Advice Notes. The schema does not define the documents themselves but simply provides the framework for the documents and their corresponding data. Hence there is no XML element called "Delivery Note" but there is an element whose function is to contain the descriptor for a Delivery Note. The forestry industry in the UK has been working on electronic trading since the 1990s. It was recognised in the early stages that without an agreed data standard, that suited the particular trading conditions, it would be difficult to achieve widespread adoption of electronic trading across the industry. It was therefore agreed that a number of leading organisations in both the Public Sector (Forestry Commission, Scottish Enterprise) and Private Sector (BSW Timber, Norbord, UPM) should collaborate to identify the main aspects of the supply chain through an eBusiness Forum (EBF). It was concluded that the main transactional areas were in the timber despatch and invoicing areas. In order to speed up the development of a standard it was decided to use an existing specification which was originally based upon United Nations-EDIFACT standards and this suited the future development of eFIDS. The eFIDS standard is designed for use across the complete supply chain from forest to processor to re-seller. It provides the basis for implementation of a range of e-business applications and a number of developments were soon implemented. 
In order to increase the global appeal of eFIDS it was decided to host them under the OASIS banner and this process was completed in 2005. Within the OASIS Forest Industries Technical Committee the development of the standard has continued. At the same time there has been an increasing number of collaborations using eFIDS in a wider range of circumstances. Although eFIDS has been developed within the UK, it has always been the intention from the beginning that the standard should be capable of being used across international trading partners."

Space Junkies Ask 'Who Owns The Moon?'
Stefanie Olsen, CNET News.com

Within the next 10 years, the U.S., China, Israel, and a host of private companies plan to set up camp on the moon. So if and when they plant a flag, does that give them property rights? A NASA working group hosted a discussion this week to ask: who owns the moon? The answer, of course, is no one. The Outer Space Treaty, the international law signed by more than 100 countries, states that the moon and other celestial bodies are the province of all mankind. No doubt that would irk all of the people throughout the ages, like monks from the Middle Ages, who have tried to claim the moon was theirs. But ownership is different from property rights. People who rent apartments, for example, don't own where they live, but they still hold rights. So with all of the upcoming missions to visit the moon and beyond, space industry thought leaders are seriously asking themselves how to deal with a potential land rush, cowboy-style. According to William Marshal, a scientist in the small spacecraft office at NASA, "it comes down to assigning rights in the best interest of humanity, including ensuring no monopolies and no military installations." Entities can apply for space in geostationary orbit and receive a slot on a first-come, first-served basis, according to Marshal. That's an interesting model, he said, because it does that without granting ownership and allows access by less prosperous nations. "In conclusion: Who owns the moon: no one. Who should own the moon: no one. Does this stop property rights? No. The best way forward is probably some sort of property licensing body like how it works in geo."

Selected from the Cover Pages, by Robin Cover

W3C Member Submission for Creative Commons Rights Expression Language (ccREL)

On August 20, 2008 W3C published the text of a Member Submission from Creative Commons: ccREL: The Creative Commons Rights Expression Language. The paper introduces a standard recommended by Creative Commons (CC) for machine-readable expression of copyright licensing terms and related information. ccREL is a major update of the earlier work of Creative Commons, proposing an annotation mechanism for expressing licenses of resources on the Web. The Creative Commons Rights Expression Language (ccREL) builds upon the astronomical success of Creative Commons licenses. CC licenses are embeddable machine-readable legal instruments allowing authors to express permissions for others to share, remix, and reuse content. Melissa Reeder (Creative Commons Development Manager) recently wrote in a newsletter that "by current count, there are more than 77 million Flickr photos under CC licenses." Supported by free tools, the Creative Commons licenses let authors, scientists, artists, and educators easily mark their creative work with the freedoms they want it to carry. Creative Commons licenses are expressed in three different formats: the Commons Deed (human-readable code), the Legal Code (lawyer-readable code); and the metadata (machine-readable code). Each license helps the creator retain copyright and announce that other people's fair use, first sale, and free expression rights are not affected by the license. The Creative Commons Rights Expression Language (ccREL) Member Submission "provides a comprehensive approach, covering an abstract model using RDF, a definition of basic properties and classes that can be extended and reused by third parties, and recommended practices to serialize this abstract model. The abstract model separates the concept of License, which can be characterized by a number of predefined properties with possible values, and so called 'work properties', i.e., properties that relate a specific work to a specific instance of a License. 
ccREL is firmly rooted in RDF, meaning that the various syntax possibilities for ccREL are also bound to possible RDF serializations. For (X)HTML documents, RDFa is the preferred serialization format (discontinuing the previous practice of adding RDF/XML code as an HTML comment in the HTML source), and the document gives several examples of how to do that in practice. For other document formats, usage of GRDDL, direct embedding of RDF data, XMP, etc, are also described. The separation of the abstract RDF-based model from the specific syntax is rewarded insofar as many different syntaxes become possible depending on the underlying Web resource format. This is also a major step forward compared to the earlier Creative Commons license recommendations."

See also: Creative Commons references

