Cover Pages: XML Daily Newslink: Thursday, 26 July 2007

A Cover Pages Publication http://xml.coverpages.org/
Provided by OASIS and Sponsor Members
Edited by Robin Cover

This issue of XML Daily Newslink is sponsored by:
Primeton http://www.primeton.com

Headlines

XML Processing and Data Integration with XQuery
W3C Call for Implementations: Content Selection for Device Independence
A First Graphical Topic Maps (GTM) Level 1 Proposal
Developing Web Services Using PHP
Whose Name Is It Anyway? Decentralized Identity Systems on the Web
Ten Reasons to Model XML with RELAX NG, Not W3C XML Schema
Microsoft Inches Closer to Open Source
U.S. Health Info Technology Lags

XML Processing and Data Integration with XQuery
Jonathan Robie, IEEE Internet Computing

This article shows how to use XQuery for native XML processing and data integration, briefly explores other technologies used in the same space, and discusses some XQuery extensions for scripting and updates that are under way. Many Internet applications devote a great deal of their code to XML processing and data integration. XML is generally processed with conventional programming languages or scripting languages with XML APIs, and a great deal of the code involves parsing the XML, navigating, and casting XML data into constructs understood by the language being used. In middle-tier Web applications, much of the code is devoted to negotiating the differences among a wide variety of systems and sources, each with its own API, data model, and perhaps query language. Thus, programmers must learn how to program each data source and write code that integrates results across multiple data sources. XQuery simplifies XML processing because it's a native XML language that works with XML as naturally as an object-oriented language works with objects. When used with middleware that represents non-XML data sources as XML, XQuery simplifies data integration by freeing programmers from the details of each data source, while still allowing implementations to provide efficient access to the underlying data. In the same way that Structured Query Language (SQL) queries relational tables and creates tables as a result, XQuery queries XML and produces XML. The leading relational databases now support an XML datatype and provide XQuery implementations for querying this data. The SQL 2006 standard supports XQuery calls from within SQL, and most native XML databases and repositories also support it... Web services are an extremely important data source on the Web, and several XQuery implementations provide ways to make Web service calls from within queries. Because SOAP messages are XML documents, an XQuery can directly query either the envelope or the payload.
There's currently no standard for calling a Web service in XQuery, but several vendors have provided proprietary ways of doing so... Given that XML has become the most popular language for data exchange, there's serious interest in the use of XQuery for efficiently processing in-flight XML. Instead of directing one query to a potentially large number of documents, each incoming document is directed to a potentially large number of queries—something done in streaming event processors but not traditionally with structures as rich as those of XML.
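The in-flight pattern described above, where each incoming document is matched against many standing queries, can be sketched in Python. This is only an illustration under stated assumptions: it uses the limited XPath subset of the standard library's `xml.etree.ElementTree` rather than XQuery, and the query names, paths, and sample documents are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical standing queries: name -> ElementTree XPath-subset expression.
STANDING_QUERIES = {
    "large_orders": ".//order[@size='large']",
    "all_customers": ".//customer",
    "rush_items": ".//item[@rush='yes']",
}


def route_document(xml_text, queries=STANDING_QUERIES):
    """Run every standing query against one incoming document.

    Returns a dict mapping each query name that matched to the
    number of elements it matched in this document.
    """
    root = ET.fromstring(xml_text)
    hits = {}
    for name, path in queries.items():
        matches = root.findall(path)
        if matches:
            hits[name] = len(matches)
    return hits


# Hypothetical stream of incoming documents.
incoming = [
    "<feed><order size='large'><item rush='yes'/></order></feed>",
    "<feed><customer id='c1'/><customer id='c2'/></feed>",
]
for doc in incoming:
    print(route_document(doc))
```

A real streaming processor would likely index the queries so that it does not evaluate each one separately per document; the loop here only shows the inversion of roles between documents and queries.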

See also: W3C XQuery

W3C Call for Implementations: Content Selection for Device Independence
Rhys Lewis, Roland Merrick (eds), W3C Technical Report

W3C has announced a Call for Implementations in connection with two specifications from the Ubiquitous Web Applications Working Group: "Content Selection for Device Independence (DISelect) 1.0" and "Delivery Context: XPath Access Functions 1.0." Both specifications have been advanced to Candidate Recommendation status. Implementation feedback is welcome through 31-October-2007. The DISelect document specifies a syntax and processing model for general purpose content selection or filtering. Selection involves conditional processing of various parts of an XML information set according to the results of the evaluation of expressions. Using this mechanism, some parts of the information set can be selected for further processing and others can be suppressed. The parts of the infoset affected and the expressions that govern processing are specified by means of an XML-friendly syntax. This includes elements, attributes and XPath expressions. This document specifies how these components work together to provide general purpose selection. The "Delivery Context" document specifies a set of XPath functions that can be used to manipulate the Delivery Context associated with a request for an item of content. These functions have been designed to satisfy the requirements to adapt content based on the Delivery Context. While designed to work with Device Independent Content Selection (DISelect), they can be used in any XPath processor. The W3C Ubiquitous Web Applications Working Group seeks to simplify the creation of distributed Web applications involving a wide diversity of devices, including desktop computers, office equipment, home media appliances, mobile devices (phones), physical sensors and effectors (including RFID and barcodes). This will be achieved by building upon existing work on device independent authoring and delivery contexts by the former Device Independence WG, together with new work on remote eventing, device coordination and intent-based events.
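The general idea of conditional selection, keeping some parts of an XML tree and suppressing others according to evaluated conditions, can be sketched in Python. This is a deliberately simplified illustration and not the DISelect syntax or processing model: the un-namespaced `expr` attribute, the use of a plain dictionary lookup in place of XPath expressions, and the delivery context keys are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET


def select_content(element, context):
    """Suppress, in place, every descendant whose condition is false.

    Assumption for this sketch: a condition is an 'expr' attribute
    naming a key in the delivery context dict. Elements without the
    attribute are always kept.
    """
    for child in list(element):
        key = child.attrib.pop("expr", None)  # strip the marker from output
        if key is not None and not context.get(key, False):
            element.remove(child)  # condition false: suppress this part
        else:
            select_content(child, context)  # kept: process its children
    return element


# Hypothetical source document and delivery context.
source = (
    "<page>"
    "<p>Welcome</p>"
    "<img expr='supports_images' src='logo.png'/>"
    "<p expr='small_screen'>Short text</p>"
    "</page>"
)
delivery_context = {"supports_images": True, "small_screen": False}

page = select_content(ET.fromstring(source), delivery_context)
print(ET.tostring(page, encoding="unicode"))
```

With this context the image is kept and the small-screen paragraph is suppressed; swapping the two values reverses the selection.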

A First Graphical Topic Maps (GTM) Level 1 Proposal
Lars Marius Garshol, Blog

ISO has been working on creating a graphical notation for Topic Maps for some time now, and now the first proposal for level 1 has been published. Graphical Topic Maps (GTM) Level 1 is the ontology part; there is also a level 0, which is the instance part. This is not a formal draft, just a simple slide show explaining the formalism, so it should be easy to read for anyone who's interested. This is the first step towards creating a standard, just coming up with something so people can say whether they like it or not. If the community approves it can be taken forward and a proper draft made; if the community does not approve a different proposal will have to be put together. The purpose of GTM level 1 is to provide people with a standard way to draw Topic Maps ontologies, which we hope will make it easier to create and exchange ontologies. We also hope people will create tools for drawing GTM diagrams. Ideally, these tools would also be able to export GTM models as TMCL schemas, and perhaps even be able to import TMCL schemas and create models from them. This proposal is based on UML, and tries to keep everything as simple and visually compact as possible. It's based on UML because most people seem to use something similar anyway (boxes with names in for topics, lines for associations), and this is in any case the most obvious route. It aims at compactness because my experience is that once you create some non-trivial models visual real-estate quickly becomes scarce... Graphical Topic Maps Notation will eventually become ISO 13250-7. GTM will consist of two sub-parts, for the time being informally known as GTM level 0 and GTM level 1.

See also: the text of the proposal

Developing Web Services Using PHP
Deepak Vohra, O'Reilly ONLamp.com

A web service consists of a server to serve requests to the web service and a client to invoke methods on the web service. The PHP class library provides the SOAP extension to develop SOAP servers and clients and the XML-RPC extension to create XML-RPC servers and clients. A web service is a software system designed for interoperable interaction over a network. A web service is defined with a WSDL (Web Services Description Language) document, and other systems interact with the web service using SOAP messages, transferred using HTTP with an XML serialization. A web service is an abstract resource that provides a set of functions and is implemented by an agent, which sends and receives messages. A provider entity provides the functionality of a web service with a provider agent and a requester entity uses the web service functionality with a requester agent. Web services implement various technologies, some of which are XML, SOAP, and WSDL. XML is a standard format for data exchange. Web service requests and responses are sent as XML messages. The elements and attributes that may be specified in an XML document are specified in an XML Schema. SOAP provides a standard framework for packaging and exchanging XML messages. A WSDL document specifies the operations (methods) provided by a web service and the format of the XML messages. The SOAP and XML-RPC extensions are packaged with the PHP 5 installation. The SOAP extension and the XML-RPC extension are not enabled by default in a PHP installation. To enable the SOAP and XML-RPC extensions, we add extension directives in the 'php.ini' configuration file... The SOAP extension supports subsets of the SOAP 1.1, SOAP 1.2, and WSDL 1.1 specifications. After activating the SOAP extension in the PHP configuration file, a SOAP server and a SOAP client may be created using the SOAP PHP class library. A SOAP server serves web service requests and a SOAP client invokes methods on the SOAP web service...
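To make the "SOAP packages XML messages" point concrete, here is a minimal sketch in Python (not the PHP SOAP extension discussed in the article) that builds a SOAP 1.1 envelope and reads it back with the standard library. The operation name, parameter, and application namespace are hypothetical, and no HTTP transport or WSDL handling is shown.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
APP_NS = "urn:example:catalog"  # hypothetical application namespace


def build_request(operation, params):
    """Wrap an operation call and its parameters in a SOAP envelope."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{APP_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{APP_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="unicode")


def read_request(xml_text):
    """Extract the operation name and parameters from a SOAP envelope."""
    envelope = ET.fromstring(xml_text)
    body = envelope.find(f"{{{SOAP_ENV}}}Body")
    call = body[0]  # first child of Body is the operation element
    operation = call.tag.split("}", 1)[1]  # drop the '{namespace}' prefix
    params = {child.tag.split("}", 1)[1]: child.text for child in call}
    return operation, params


# Hypothetical request: the client side builds it, the server side reads it.
message = build_request("getCatalog", {"catalogId": "catalog1"})
print(message)
print(read_request(message))
```

In a deployed service the server would dispatch on the extracted operation name and return its result in a second envelope of the same shape.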

Whose Name Is It Anyway? Decentralized Identity Systems on the Web
Daniel J. Weitzner, IEEE Internet Computing

A new form of personal identity is emerging on the Web. Decentralized identification protocols are a departure from traditional distributed authentication approaches developed for the Internet. From a technical perspective, they're quite similar to distributed systems based on public-key infrastructures or federated identity systems, such as that proposed by the Liberty Alliance or Microsoft's Passport. What distinguishes the new decentralized approach is its use of URIs as the underlying identifier... When it comes to representing our identity on the Web, we long for a unified approach to authentication that can simplify the tangle of different usernames and passwords that we try to maintain in our heads or on actual or virtual sticky notes. Efforts to solve this so-called "single-sign-on" problem have yet to achieve any noticeable deployment. Into this breach steps OpenID, an emerging set of specifications from a very creative group of Web developers. From a functional standpoint, the OpenID protocol lets users securely identify themselves to any Web site using a URI that the user proves he or she can control. If we want to develop decentralized, Web-scale identity systems, we can learn some simple social and technical lessons from Web architecture. First, keep it simple: HTML and HTTP were widely (and correctly) regarded as among the least sophisticated hypertext technologies available when the Web was designed. Second, stick to non-proprietary standards with royalty-free access to all necessary patent rights: the technology must not only be easy to implement, it must be free of patent and other intellectual property barriers. Third, avoid centralized registries: the Web's hallmark is that anyone can create a Web page, without payment, permission, or registration with any centralized entity.
[Note: a more recent version of this article was published as a 'Technology and Society Column' in IEEE Internet Computing, Volume 11, Number 4 (July/August 2007), pages 72-76.]

Ten Reasons to Model XML with RELAX NG, Not W3C XML Schema
Alex Brown, Griffin Brown Weblog

"I have recently recommended to a large publishing client that they adopt RELAX NG as the basis of the formal definitions of their content, in preference to W3C XML Schema Definition Language (WXS). There are lots of individual bits of information on why RELAX NG should be preferred all over the web. Here is an attempt to condense some of the key information into ten points..." [Adapted summary:] (1) A better spec means better interoperability: in common with many people working with WXS schemas, we have been tripped up by interoperability problems caused by different tools having a different take on how WXS should be implemented. (2) Availability of a compact syntax: Unlike WXS, RELAX NG has a compact syntax. (3) The specification is a stable ISO standard. (4) No PSVI: James Clark and Elliotte Rusty Harold have said all that needs to be said about the perils of the PSVI. (5) No content defaulting. (6) Better datatyping support: RELAX NG has the option for pluggable type libraries which may be implemented through an API. (7) More sophisticated modelling: RELAX NG introduces useful new feature for modelling interdependent attribute and element content. (8) More sophisticated grammatical validation: WXS grammars have to be deterministic, but enforcing UPA [Unique Particle Attribution] constraint breaks idiomatic uses of XML. (9) Instances have no dependency: WXS schemas (like DTDs) provide a mechanism for associating an instance with a schema: the xsi:schemaLocation attribute; RELAX NG schemas, on the other hand, have no formal association with instances. (10) Growing consensus: A growing number of key XML languages are being normatively defined using RELAX NG, such as XHTML 2.0, the Atom Syndication Format, OpenDocument Format and DocBook 5. It's clear (if there is a shift) which direction that shift is in, particularly for document-like modelling.

See also: XML schema languages

Microsoft Inches Closer to Open Source
Sean Michael Kerner, InternetNews.com

Microsoft wants more open source software to run on Windows. Microsoft also wants its own Open Source Initiative (OSI) approved license. Perhaps they really can get along. On the software side, Microsoft today announced a partnership with open source solution vendor SpikeSource to eventually certify all of SpikeSource's SpikeIgnited solutions on the Microsoft Windows platform. Microsoft has also launched a new site intended to provide even more information about Microsoft's open source plans. Among those plans, according to the spokesperson, is the declared intention by Microsoft to submit the Microsoft Shared Source licenses to the OSI for approval. Such an approval would mean that the Microsoft licenses would be considered to be bona fide open source. Microsoft's Shared Source Licenses were simplified in 2005 down to three core licenses: The Permissive License (Ms-PL), The Community License (Ms-CL) and the Reference License (Ms-RL). At the time they were originally released, Microsoft earned rare praise from the open source community. Tim O'Reilly had reported: "In his keynote at OSCON, Microsoft General Manager of Platform Strategy Bill Hilf announced that Microsoft is submitting its shared source licenses to the Open Source Initiative. This is a huge, long-awaited move. It will be earthshaking for both Microsoft and for the open source community if the licenses are in fact certified as open source licenses. Microsoft has been releasing a lot of software as shared source (nearly 650 projects, according to Bill). If this is suddenly certified as true open source software, it will be a lot harder to draw a bright line between Microsoft and the open source community. Bill also announced that Microsoft has created a new top level link at microsoft.com, microsoft.com/opensource to bring together in one place all Microsoft's open source efforts. Bill sees this as the culmination of a long process of making open source a legitimate part of Microsoft's strategy..."

See also: Tim O'Reilly's Blog

U.S. Health Info Technology Lags
Staff, Reuters News and CNET News.com

The American healthcare system is largely paper-based, but privacy concerns, high costs, and doctor reluctance mean that may not change anytime soon. Dr. David Agus runs a hospital laboratory with the technological sophistication to find tiny markers in human blood that may one day tell doctors which treatment will best cure a patient's cancer, but he has hit a low-tech speed bump. That's because the ultimate success of such personalized medicine projects depends on having thousands of people contribute health information to be digitally stored according to a standard format that makes it easy to share. And that practice is not yet commonplace in the United States, or in many other industrialized nations. Patients in the United States, where healthcare is fragmented and Census figures indicate that nearly 45 million residents lacked health insurance in 2005, already pay the price. Many avoidable costs are the result of a lack of information, and run the gamut from bills for unnecessarily repeated tests to potentially life-threatening care delays and medical errors, according to reports from the likes of research company Rand as well as physicians and patients on the ground. Meanwhile, patient privacy issues, complaints about costs, competition among technology providers and doctors' apparent reluctance to embrace the system have left many medical records in the informational Stone Age. Denmark leads the pack among European and English-speaking countries when it comes to using digital information to deliver health care, according to the Commonwealth Fund. The Danish government provides health care for its citizens and most of their health information is kept in a single system that can be accessed and updated by an individual's primary care doctor and other medical professionals.

