Cover Pages: XML Daily Newslink: Thursday, 05 June 2008

A Cover Pages Publication http://xml.coverpages.org/
Provided by OASIS and Sponsor Members
Edited by Robin Cover

This issue of XML Daily Newslink is sponsored by:
IBM Corporation http://www.ibm.com

Headlines

FDA Issues Structured Product Labeling Guidance: No Delays
A Prototype Knowledge Base for the Life Sciences
Archival Information Package: Cornerstone of the OAIS Implementations
A DSRL Script for Mapping from Schematron 1.n to ISO Schematron
Toward Integration: Multilanguage Programming
Describe REST Web Services with WSDL 2.0: A How-To Guide

FDA Issues Structured Product Labeling Guidance: No Delays
Angie Drakulich, Pharmaceutical Technology

The US Food and Drug Administration issued a new guidance on indexing structured product labeling (SPL). The Center for Biologics Evaluation and Research (CBER) and the Center for Drug Evaluation and Research (CDER) will begin indexing SPL in the product labeling for human drug and biologic products. SPL became a requirement in October 2005 when FDA stated that SPL in Extensible Markup Language (XML) was the only electronic format for content of labeling that CDER could process, review, and archive, according to the guidance. The agency is now recommending that content be submitted in SPL. SPL enables the electronic exchange of the content of labeling and other regulated product information. It also enables the inclusion of indexing elements with product labeling. Indexing is made possible by machine-readable tags that are inserted into the label but do not appear on the actual printed label (consumers cannot see them). With these tags, individuals using clinical decision-support tools and electronic prescribing systems can more easily and rapidly search and sort product information in product labeling... The change also will help to decrease prescribing errors and enhance the safe use of medical products. For example, says the guidance, a full-text search of the content of labeling for 'hepatotoxicity' will miss labelings that use the term 'liver toxicity'. New indexing elements based on standards adopted for use in the healthcare setting will address this problem. The new guidance comes upon the completion of a six-month FDA pilot project that evaluated how best to add indexing elements to products. It also comes approximately two months after FDA issued a draft guidance on structured product labeling. The new guidance addresses some of the suggestions received from industry based on the draft, including more concrete advice on how applicants can recommend indexing terms to the agency and how indexed terms will be identified and shared...
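
Editor's illustration (not from the guidance): the search problem described above can be sketched in a few lines of Python. The label texts and the index code 'C-LIVER-TOX' are invented for this example; real SPL indexing uses coded terms from healthcare terminologies and a much richer XML structure.

```python
# Hypothetical labels: free text plus a set of machine-readable index codes.
# None of this data comes from FDA; it only illustrates the idea.
LABELS = {
    "label-a": {"text": "May cause hepatotoxicity in some patients.",
                "index": {"C-LIVER-TOX"}},
    "label-b": {"text": "Reports of liver toxicity have been received.",
                "index": {"C-LIVER-TOX"}},
    "label-c": {"text": "May cause drowsiness.",
                "index": {"C-SEDATION"}},
}


def full_text_search(term):
    """Return ids of labels whose printed text contains the term."""
    term = term.lower()
    return sorted(k for k, v in LABELS.items() if term in v["text"].lower())


def index_search(code):
    """Return ids of labels tagged with the given index code."""
    return sorted(k for k, v in LABELS.items() if code in v["index"])


print(full_text_search("hepatotoxicity"))  # finds only the label using that word
print(index_search("C-LIVER-TOX"))         # finds both wordings
```

The full-text query depends on the wording each author chose, while the indexed query depends only on the shared code.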

A Prototype Knowledge Base for the Life Sciences
M. Scott Marshall and Eric Prud'hommeaux (eds), W3C Interest Group Note

Members of the W3C Semantic Web in Health Care and Life Sciences Interest Group (HCLS) have published a Note describing "A Prototype Knowledge Base for the Life Sciences." The document explains how one can use the Semantic Web to express and integrate scientific data. These techniques can be used for modeling any data, and the benefits of integration and model consistency apply to other diverse, distributed data domains. It is hoped that this document will inspire further contributions to the ongoing work at Neurocommons and the Health Care and Life Sciences Interest Group, as well as inspire those in other domains to exploit the Semantic Web. The prototype is a biomedical knowledge base, constructed for a demonstration at the Banff WWW-2007 Conference. It integrates fifteen (15) distinct data sources using currently available Semantic Web technologies such as the W3C standard Web Ontology Language (OWL) and Resource Description Framework (RDF). This report outlines which resources were integrated, how the knowledge base was constructed using free and open source triple store technology, how it can be queried using the W3C Recommended RDF query language SPARQL, and what resources and inferences are involved in answering complex queries. While the utility of the knowledge base is illustrated by identifying a set of genes involved in Alzheimer's Disease, the approach described here can be applied to any use case that integrates data from multiple domains... Many health care and life sciences organizations are interested in the data integration abilities promised by the Semantic Web. More specifically, the benefits include the aggregation of heterogeneous data using explicit semantics, and the expression of rich and well-defined models for data aggregation and search. Semantic Web technologies enable one to more flexibly add additional data sets into the data model, and more easily reuse data in unanticipated ways. 
Once data has been aggregated, a Semantic Web reasoner computes implied relationships among the aggregated data resulting in tighter integration and the possibility of additional insights.
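
Editor's illustration (not from the W3C Note): a minimal sketch of the two ideas above, aggregating statements as triples and letting a simple rule compute implied relationships before querying. It uses plain Python sets rather than an RDF store, OWL reasoner, or SPARQL, and every identifier in it is invented.

```python
# Invented triples standing in for statements merged from several sources.
TRIPLES = {
    ("gene:G1", "involvedIn", "process:P1"),
    ("process:P1", "subClassOf", "process:Signaling"),
    ("gene:G2", "involvedIn", "process:Signaling"),
    ("process:Signaling", "associatedWith", "disease:D1"),
}


def match(triples, s=None, p=None, o=None):
    """Return triples matching a pattern; None acts as a wildcard."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]


def infer(triples):
    """Apply one rule until nothing new appears:
    X involvedIn A and A subClassOf B implies X involvedIn B."""
    closed = set(triples)
    while True:
        new = {(s, "involvedIn", sup)
               for (s, p, o) in closed if p == "involvedIn"
               for (sub, q, sup) in closed if q == "subClassOf" and sub == o}
        if new <= closed:
            return closed
        closed |= new


def genes_for_disease(triples, disease):
    """Genes involved in any process associated with the disease."""
    processes = {s for (s, _, _) in match(triples, p="associatedWith", o=disease)}
    return sorted({s for (s, _, o) in match(triples, p="involvedIn")
                   if o in processes})


print(genes_for_disease(TRIPLES, "disease:D1"))         # asserted facts only
print(genes_for_disease(infer(TRIPLES), "disease:D1"))  # with implied facts
```

The second query returns an extra gene because the rule has made an implicit relationship explicit, which is the kind of additional insight the Note attributes to reasoning over aggregated data.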

See also: the W3C news item

Archival Information Package: Cornerstone of the OAIS Implementations
Guy Marechal, Conference Presentation

This presentation was given at the Sun Preservation and Archiving Special Interest Group (Sun PASIG) Conference held May 27-29, 2008. The event focused upon repositories, preservation, digital asset management, tiered storage architectures, and long-term data management. Summary: "In the Open Archival Information System (OAIS) standard, the Archival Information Package (AIP) is embedded in the system and not an interface of the system like the Submission Information Package (SIP) or the Dissemination Information Package (DIP). Many important projects sponsored by the European Union came to the conclusion that the key for concrete implementations of OAIS is to focus on the AIP as external interface and to merge the AIP, SIP and Persistent-DIP concepts. With that approach, the AIP are exchanged between systems and so the persistence and the interoperability are solved by the same constructs: indeed, the persistence is in fact the temporal interoperability of an evolving 'Enterprise repository' and 'Federated Archiving' requires space interoperability. It means that the approach offers a high degree of flexibility in the practical implementation of the full chain of the construction to the exploitation of the information assets. Each of the functional elements of the OAIS model is simply assumed existing in one or more occurrences. This allows of exploding the 'System' (the 'S' of the OAIS acronym) into independent functional blocks that could be delivered by independent suppliers and managed by independent parties; it allows also operating as a federation of organisations. The AXIS architecture proposes an open and customised specification of constructs for implementing the AIP in the form of 'Autonomous eXchange Entities'. 
The AXE's are structured through a three levels wrapping system fitting perfectly with the Object oriented and Honeycomb repositories like the ST5800 but also allows directly the adoption of powerful retrieval through constructs (like XAM or OAI) and of powerful Package Wrappers (like ZIP, METS and MXF). The AXIS architecture is planned to be made available in open source on the Web site of the UNESCO..." Note: the PASIG collection of online presentations contains many papers of interest; see, for example, Raymond Clarke's "ZFS: The Last Word in File Systems."

A DSRL Script for Mapping from Schematron 1.n to ISO Schematron
Rick Jelliffe, O'Reilly Articles

ISO Document Schema Renaming Language (DSRL) is one of Martin Bryan's contributions to the ISO Document Schema Description Languages project at JTC1 SC34 WG1. This brings together various technologies by Murata Makoto, James Clark, Martin Duerst, Jeni Tennison, and others (including me) to try to build a layered solution to validation using a variety of 'little languages'... DSRL is now at a very late draft stage, and I expect it will be finalized over this year. DSRL is declarative: it provides mappings, and even though it could be used to rename items in schemas, Martin Bryan's open source XSLT implementation of it takes the more direct route of renaming the document... My vision is that in the near term, with DSRL completing the base DSDL quartet of RELAX NG, NVDL, DSRL and Schematron, standards developers will start to take them on board as a package: (1) ISO NVDL selecting the particular schemas for different namespaces and culling foreign elements as desired; (2) ISO DSRL renaming, localization and providing default values to handle common evolution cases; (3) ISO RELAX NG performing grammar-based validation, extended with its XSD data types; (4) ISO Schematron performing more complex and detailed validation. A couple of years ago we finally arrived at the point where people had come to pretty realistic apprehensions about the proper limits of XSD functionality, and I think we are now arriving at the same kind of level of maturity with RELAX NG. As these limits become commonplace, I think the need for NVDL and DSRL (for XSD and for RELAX NG) will similarly become better known. My prediction is that it will increasingly occur to community standards bodies that their standards have quite a number of constraints or gotchas which are poorly expressed in English but much clearer (and machine verifiable) when expressed using DSRL (and NVDL and Schematron).
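
Editor's illustration (not from the article): the idea of a declarative renaming map applied to the document, rather than to the schema, can be sketched in Python with the standard xml.etree.ElementTree module. This is not DSRL syntax and not Martin Bryan's XSLT implementation; the element names, the mapping, and the default attribute are all invented.

```python
import xml.etree.ElementTree as ET

# Hypothetical declarative mappings: localized names -> names the schema
# expects, plus default attribute values for elements that omit them.
RENAMES = {"titre": "title", "auteur": "author"}
DEFAULTS = {"book": {"lang": "en"}}


def rename_document(xml_text):
    """Return the document with elements renamed and defaults filled in."""
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        # Rename the element if the map mentions it; otherwise leave it alone.
        elem.tag = RENAMES.get(elem.tag, elem.tag)
        # Supply default attribute values without overriding explicit ones.
        for name, value in DEFAULTS.get(elem.tag, {}).items():
            elem.attrib.setdefault(name, value)
    return ET.tostring(root, encoding="unicode")


source = "<book><titre>DSDL</titre><auteur>A. Writer</auteur></book>"
print(rename_document(source))
```

After this step the document can be handed to an ordinary grammar-based validator written against the canonical names, which is the layering the article describes.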

Toward Integration: Multilanguage Programming
Steve Vinoski, IEEE Internet Computing

Multilingual programmers embrace the diversity of programming languages, enabling them to apply different languages to different integration problems to produce... Many developers simply put up with the mismatch and slog their way through, eventually reaching what is, at best, a mediocre solution. To help with productivity issues, they often resort to code generation, mapping XML constructs to statically typed programming language constructs to try to ease the impedance mismatch. Unfortunately, that approach can be extremely brittle as a result of converting highly flexible XML constructs into rigid static data types that are difficult to version adequately. Any changes to the XML document then require new code generation to reflect those changes, even if the application doesn't use the specific modified XML entities. The newly generated code can, in turn, require changes to the application code that uses it, so that any application using the generated code must undergo full build, test, and redeployment cycles. Any minor productivity gains achieved through code generation are quickly lost in the noise when compared to ongoing maintenance costs. Contrast this story of XML development (unfortunately, repeated quite often in enterprise-integration scenarios) with simply using a programming language that's better suited to the task. For example, the Python language's 'xml.etree' module makes XML handling almost trivial (even with versioning), and Perl has XML packages that are equally easy to use. Erlang's xmerl module is quite good as well. Better still, though, are languages that support literal XML, such as ECMAScript for XML (E4X) and Scala, which both let developers write XML directly within the language's syntax. 
Literal XML effectively eliminates the impedance mismatch between XML and the programming language, letting the developer write just a few lines of code versus what might require hundreds or thousands of lines in a combination of generated and manually written brittle Java or C++ code...
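
Editor's illustration (not from the article): a short sketch of the versioning point, using Python's standard xml.etree.ElementTree module. The order documents and the default currency are invented. The reader looks up only the elements it needs, so a later version of the document that adds elements requires no regeneration or rebuild.

```python
import xml.etree.ElementTree as ET

# Two hypothetical versions of the same message; V2 adds elements.
V1 = "<order><id>17</id><total>9.50</total></order>"
V2 = ("<order><id>17</id><total>9.50</total>"
      "<currency>EUR</currency><note>gift</note></order>")


def read_order(xml_text):
    """Extract only the fields this application cares about."""
    root = ET.fromstring(xml_text)
    return {
        "id": int(root.findtext("id")),
        "total": float(root.findtext("total")),
        # Missing in V1: fall back to an assumed default instead of failing.
        "currency": root.findtext("currency", default="USD"),
    }


print(read_order(V1))
print(read_order(V2))  # the unknown <note> element is simply ignored
```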

Describe REST Web Services with WSDL 2.0: A How-To Guide
Lawrence Mandel, IBM developerWorks

This article provides an introduction to REST and WSDL 2.0, and walks you through creating a WSDL 2.0 description of a REST Web service. The term Web services is typically associated with operation- or action-based services using SOAP and the WS-* standards, such as WS-Addressing and WS-Security. The term REST Web services generally refers to a resource-based Web services architecture that uses HTTP and XML. Each of these architectural Web service styles has its place, but until recently, the WSDL standard didn't equally support both styles. The WSDL 1.1 HTTP binding was inadequate to describe communications with HTTP and XML, so there was no way to formally describe REST Web services with WSDL. The publication of WSDL 2.0, which was designed with REST Web services in mind, as a World Wide Web Consortium (W3C) Recommendation means there is now a language to describe REST Web services... REST is an architectural style that treats the Web as a resource-centric application. Practically, this means each URL in a RESTful application represents a resource. The URLs are also easy to understand and remember... Like (X)HTML, REST Web services make use of hyperlinks in XML. Traditional Web applications access resources using HTTP GET or POST operations. In contrast, RESTful applications access resources following the create, read, update, and delete (CRUD) style using the full range of HTTP verbs (POST, GET, PUT, and DELETE). There's one more key component of a REST application: RESTful applications should be stateless. This means in a REST application no session state is stored on the server. All of the information needed to satisfy the request is carried in the request message itself. Where a service explicitly allows it, a client can therefore cache a representation of a resource, which can significantly improve the application's performance... 
One significant reason why REST Web services have to this point not made use of WSDL is that the WSDL 1.1 HTTP binding was inadequate to describe them. WSDL 2.0 was declared a W3C recommendation in June 2007. This second version of WSDL was created to address issues with WSDL 1.1, many of which had been identified by the Web Services Interoperability (WS-I) organization. In addition, WSDL 2.0 has good support for HTTP bindings...
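
Editor's illustration (not from the article, and not a WSDL 2.0 description): a minimal in-memory sketch of the CRUD mapping onto HTTP verbs described above. The handle function, the URL layout, and the status codes chosen are assumptions made for this example; no real HTTP server or library API is involved.

```python
import itertools

# Resource state keyed by URL. This is the data being served, not
# per-client session state: every call carries all it needs.
RESOURCES = {}
_ids = itertools.count(1)  # source of ids for newly created resources


def handle(method, url, body=None):
    """Map an HTTP verb onto a CRUD operation; return (status, payload)."""
    if method == "POST":                      # create under a collection URL
        new_url = f"{url.rstrip('/')}/{next(_ids)}"
        RESOURCES[new_url] = body
        return 201, new_url
    if url not in RESOURCES:                  # remaining verbs need a resource
        return (404, None) if method in ("GET", "PUT", "DELETE") else (405, None)
    if method == "GET":                       # read
        return 200, RESOURCES[url]
    if method == "PUT":                       # update
        RESOURCES[url] = body
        return 200, body
    if method == "DELETE":                    # delete
        del RESOURCES[url]
        return 204, None
    return 405, None


status, book_url = handle("POST", "/books", "<book>REST</book>")
print(status, book_url)
print(handle("GET", book_url))
print(handle("PUT", book_url, "<book>REST and WSDL 2.0</book>"))
print(handle("DELETE", book_url))
print(handle("GET", book_url))
```

Each URL names one resource, and each request is self-contained, which is what makes the responses safe for a client to cache when the service permits it.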

