Cover Pages: XML Daily Newslink: Monday, 28 July 2008

A Cover Pages Publication http://xml.coverpages.org/
Provided by OASIS and Sponsor Members
Edited by Robin Cover

This issue of XML Daily Newslink is sponsored by:
Oracle Corporation http://www.oracle.com

Headlines

New Working Draft: Efficient XML Interchange Evaluation
OASIS Identity Metasystem Interoperability (IMI) Technical Committee
New Draft of ISO Character Entities from W3C
Introduction to Hibernate Search
OASIS Search Web Services Technical Committee Releases Committee Drafts
Introduction to Web Services Creation Using CXF and Spring

New Working Draft: Efficient XML Interchange Evaluation
Carine Bournez (ed), W3C Technical Report

W3C announced the publication of two Working Draft specifications from the Efficient XML Interchange (EXI) Working Group. This W3C WG was chartered to establish and optimize the performance of an alternate, binary encoding of XML, specifically of the XML Information Set. At the same time, W3C recognized that disruption to existing processors, and impact on the complex real-world uses of XML, must be minimized. The Working Group started by considering existing solutions and has evaluated each in terms of implementability and performance against the requirements produced by the XML Binary Characterization (XBC) Working Group. The WG gathered a test data set of more than 10,000 documents in 30 or so XML vocabularies, from a broad range of use case groups, such as Scientific, Financial, Electronic (documents intended for human consumption), Storage (documents intended as data stores), etc. The existing solutions, and candidate base technologies for a potential EXI format, were then measured over a number of merit criteria, within a benchmark framework based on Japex. (1) "Efficient XML Interchange (EXI) Format 1.0" defines a format intended to simultaneously optimize performance and the utilization of computational resources. The EXI format uses a hybrid approach drawn from the information and formal language theories, plus practical techniques verified by measurements, for entropy encoding XML information. Using a relatively simple algorithm, which is amenable to fast and compact implementation, and a small set of data types, it reliably produces efficient encodings of XML event streams. The event production system and format definition of EXI are presented. (2) "Efficient XML Interchange Evaluation" provides an evaluation of the "Efficient XML Interchange (EXI) Format 1.0" document with reference to the Properties identified by the XBC Working Group, relative to XML, gzipped XML, and ASN.1 PER. The evaluation is conducted using the XBC measurement methodology. For the "compactness" and "processing efficiency" Properties, performance is measured with the EXI measurement framework, over the test data collected for the EXI measurements, representing XBC Use Cases.
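
The "compactness" Property compares the size of an encoded document with the size of the same document as plain XML. The following is only a rough sketch of the gzipped-XML baseline mentioned above; it is not EXI, and it is not the Working Group's Japex-based framework. The function name and the sample sensor document are hypothetical, chosen for illustration.

```python
import gzip


def compactness(xml_text: str) -> float:
    """Return the gzipped size as a fraction of the plain UTF-8 size."""
    raw = xml_text.encode("utf-8")
    return len(gzip.compress(raw)) / len(raw)


# Hypothetical sample: a small, highly repetitive XML vocabulary.
records = "".join(
    f"<reading sensor='s{i % 4}' unit='C'>{20 + i % 7}</reading>"
    for i in range(200)
)
sample = f"<readings>{records}</readings>"

ratio = compactness(sample)
print(f"gzipped XML is {ratio:.1%} of the plain XML size")
```

A ratio below 1 means the encoding is smaller than the plain XML. Very small documents can come out larger than the original because of the fixed gzip header, which is one reason a broad test set matters.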

OASIS Identity Metasystem Interoperability (IMI) Technical Committee
Staff, OASIS Announcement

Representatives of IBM, Microsoft, Nortel, and the US DoD have submitted a draft charter for a proposed OASIS Identity Metasystem Interoperability (IMI) Technical Committee. The purpose of the Identity Metasystem Interoperability (IMI) Technical Committee (TC) is to increase the quality and number of interoperable implementations of Information Cards and associated identity system components to enable the Identity Metasystem. In the Identity Metasystem, identities are represented to users as "Information Cards." Information Cards enable users to manage their digital identities from different identity providers and employ them in various contexts to access online services. Information Cards have a number of characteristics that help to improve user privacy and security when accessing online services. Broad interoperability across platforms and services is needed so that Information Card support becomes ubiquitous and the goals of the Identity Metasystem can be realized. The TC will accept as input the July 2008 Identity Selector Interoperability Profile specification and its associated guides as published by Microsoft, the July 2008 Web Services Addressing Endpoint References and Identity specification published by Microsoft and IBM, and the OSIS (Open Source Identity Systems) Feature Tests published by Identity Commons. The scope of the TC's work is to continue further refinement and finalization of the Input Documents to produce specifications that standardize the concepts and XML Schema renderings of the areas described below in a form that is backward compatible with the input documents. [See now the Cover Pages news story "OASIS Identity Metasystem Interoperability TC Advances Information Card Use."]

New Draft of ISO Character Entities from W3C
Rick Jelliffe, O'Reilly

"I am pleased to see that the W3C MathML WG has produced a new draft of XML Entity Definitions for Characters. These are the latest and greatest mappings from the characters to Unicode. These are the characters you get in HTML or XHTML: when you type '&lceil;' it should give you the appropriate character. There are three ways to use these entities: (1) Use a DTD and call in the sets as parameter entities; (2) Use XSLT2 and the character map function; (3) Use (draft) ISO DSRL: Martin Bryan's implementation in the Zip file at dsdl.org is a front end for XSLT2. ISO/IEC JTC1 SC34 (the Document Processing and Description Languages committee) originally defined and owned these sets, which SGML-ers and DTD users may be familiar with through entity sets such as isopub. SGML was designed for the publishing industry, and mathematical typesetting has always had a need for many special characters. The American Mathematical Society was strongly involved and I am glad to see it is continuing its involvement. SC34 handed over maintenance of the entity sets to W3C MathML a year or so ago. I advise people who are still using my PEN entity set to move over to the W3C mappings for new documents. You can tell if you are using the PEN entities because the files will have a .pen extension: my set was I think the first XML mapping from entities to Unicode, though Unicode had already had the ISO sets as an input, and has not been kept up to date... I hope this set takes us one step closer to having all these entity sets built into XML. This is the kind of thing that computers are good at: turning names into numbers."
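
"Turning names into numbers" can be illustrated with a small sketch. It uses the HTML entity table that ships with the Python standard library, which is an assumption made for convenience: that table is not the W3C draft entity sets discussed above, though it does include 'lceil'. The helper name is hypothetical.

```python
import html
from html.entities import name2codepoint


def entity_to_codepoint(name: str) -> str:
    """Map an entity name such as 'lceil' to its Unicode code point."""
    return f"U+{name2codepoint[name]:04X}"


print(entity_to_codepoint("lceil"))        # U+2308 (LEFT CEILING)
print(html.unescape("&lceil;x&rceil;"))    # the same names resolved in text
```

An unknown name raises KeyError here, which mirrors what happens when a document references an entity that no loaded set defines.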

Introduction to Hibernate Search
Xinyu Liu, JavaWorld Magazine

Many Web applications exist to provide access to copious amounts of data stored in a relational database, but what's the easiest way to enable users to search through that data and find what they need? In this article, Dr. Xinyu Liu introduces Hibernate Search, which integrates the sophisticated search capabilities of Lucene with the familiar object-relational mapping framework of Hibernate. Apache Lucene is a high-performance, extensible full-text search-engine library written in Java. At first, it may not be obvious why you'd need such a thing; after all, your data is nicely filed away in a decent relational database. While an RDBMS can do a great job of providing transactional CRUD operations on data stored in a relational model, search functions defined in SQL are not always capable of meeting both the functional and non-functional requirements of your projects. There are a number of query types that RDBMSs in general do not support without vendor extensions: (1) fuzzy queries, in which "fuzzy" and "wuzzy" are considered matches; (2) word stemming queries, which consider "take," "took," and "taken" to be identical; (3) sound-like queries, which consider "cat" and "kat" to be identical; (4) synonym queries, which consider "jump," "hop," and "leap" to be identical; (5) queries on binary BLOB data types, such as PDF documents, Microsoft Word or Excel documents, or HTML and XML documents. More disappointingly, SQL search results are not ranked by match-relevance scores. The SQL standard is simply not intended for full-text querying. Lucene search capabilities, on the other hand, are unlimited. Lucene handles all the queries just mentioned, and more; it also allows you to find text documents similar to other documents through its advanced term-vector query... Hibernate is a high-performance, mature object-relational mapping (ORM) library. As a non-intrusive ORM solution, Hibernate provides object query APIs for plain old Java object (POJO) persistence model classes and automatic data bindings between the object and relational representations of persistence data. In essence, it lets you focus on domain model-oriented programming.
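
One common way to define a "fuzzy" match like the one above is edit distance: two terms match if a small number of single-character edits turns one into the other. The sketch below is a plain Levenshtein implementation for illustration only; it is not Lucene's or Hibernate Search's code, and the function names and the one-edit threshold are assumptions.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum single-character edits from a to b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


def fuzzy_match(a: str, b: str, max_edits: int = 1) -> bool:
    """Treat two terms as a match if they are within max_edits edits."""
    return edit_distance(a.lower(), b.lower()) <= max_edits


print(fuzzy_match("fuzzy", "wuzzy"))  # True: one substitution
print(fuzzy_match("jump", "leap"))    # False: synonyms need a thesaurus
```

The second call shows why synonym queries are listed separately: "jump" and "leap" are related in meaning but far apart in spelling, so edit distance alone does not connect them.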

See also: the Hibernate documentation

OASIS Search Web Services Technical Committee Releases Committee Drafts
Ray Denenberg, OASIS TC Announcement

On behalf of the OASIS Search Web Services Technical Committee, TC Chair Ray Denenberg announced the publication of five approved Committee Draft specifications. This TC was chartered to produce a web service interface specification including Search/Retrieve, Query, Sorting, Record Retrieval, and Index Browsing. (1) The "Abstract Protocol Definition (APD)" document presents the model for the SearchRetrieve operation and is also intended to serve as a guideline for the development of application protocol bindings. (2) The "Binding for SRU 1.2 Version 1.0" defines a binding for the OASIS SWS (Search Web Services) searchRetrieve operation. (3) "Binding for SRU 1.2: Auxiliary Binding for HTTP GET - Version 1.0" describes the construction of an HTTP URL to encode parameter values of the form 'key=value', with support for Unicode characters. (4) "CQL 1.2 (The Contextual Query Language)" is a formal language for representing queries to information retrieval systems. The design objective is that queries be human readable and writable, and that the language be intuitive while maintaining the expressiveness of more complex languages. (5) "Binding for OpenSearch Version 1.0" is intended to be fully compatible with "OpenSearch 1.1, Draft 3." A binding may be "static" or "dynamic". A static binding is a human-readable document, essentially a profile. A dynamic binding is a machine-readable description of a server, written in a description language that the Committee is also developing (see Annex B of APD). The premise behind dynamic bindings is that any search engine, even one that existed prior to development of the standard, need only provide a dynamic binding, that is, a self-description. It need make no other changes in order to be accessible. A client will be able to access any search engine that provides a description, if only it implements the capability to read and interpret the description and use it to formulate a request (including a query) and interpret the response.
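
The 'key=value' URL construction described for the HTTP GET binding can be sketched as follows. This is an illustration under stated assumptions, not text from the Committee Draft: the base URL and the CQL query are hypothetical, the parameter names follow common SRU 1.2 usage (operation, version, query, maximumRecords), and non-ASCII characters are percent-encoded as UTF-8.

```python
from urllib.parse import parse_qs, quote, urlencode, urlsplit


def build_sru_url(base_url: str, query: str, maximum_records: int = 10) -> str:
    """Encode searchRetrieve parameters as key=value pairs in a GET URL."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.2",
        "query": query,  # a CQL query string
        "maximumRecords": str(maximum_records),
    }
    # quote_via=quote encodes spaces as %20 and non-ASCII as UTF-8 bytes.
    return f"{base_url}?{urlencode(params, quote_via=quote)}"


# Hypothetical endpoint and query containing a non-ASCII character.
url = build_sru_url("http://example.org/sru", 'dc.title = "café"')
print(url)

# Decoding the URL recovers the original query string.
print(parse_qs(urlsplit(url).query)["query"][0])
```

Encoding the query value as a whole keeps the CQL text opaque to the URL layer, so the same function works for any query the server's CQL support accepts.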

Introduction to Web Services Creation Using CXF and Spring
Rajeev Hathi and Naveen Balani, IBM developerWorks

Apache CXF is an open source framework that provides a robust infrastructure for conveniently building and developing Web services. It lets you create high-performance and extensible services, which you can deploy in the Tomcat and Spring-based lightweight containers as well as on a more advanced server infrastructure, such as JBoss, IBM WebSphere, or BEA WebLogic. This article is Part 1 of a series; it shows you how to expose POJOs as Web services using Spring and CXF. It also illustrates CXF integration with the Spring Framework. In this article, you build and develop an order-processing Web service using CXF and Spring. This Web service processes or validates the order placed by a customer and returns the unique order ID. After reading this article, you can apply the concepts and features of CXF to build and develop a Web service... CXF supports the following Web services standards: Java API for XML Web Services (JAX-WS); SOAP; Web Services Description Language (WSDL); Message Transmission Optimization Mechanism (MTOM); WS-Basic Profile; WS-Addressing; WS-Policy; WS-ReliableMessaging; WS-Security... Part 2 of this series will demonstrate how to expose POJOs as RESTful services using CXF and Spring.

