Cover Pages: XML Daily Newslink: Thursday, 17 May 2007

A Cover Pages Publication http://xml.coverpages.org/
Provided by OASIS and Sponsor Members
Edited by Robin Cover

This issue of XML Daily Newslink is sponsored by:
Primeton http://www.primeton.com

Headlines

Code List Representation (Genericode) Version 1.0
W3C First Public Working Draft: SKOS Use Cases and Requirements
Extending and Versioning Languages: XML Languages
XML Parser Benchmarks: Part 2
Microsoft Wants ODF Added to ANSI Standards

Code List Representation (Genericode) Version 1.0
Anthony B. Coates (ed), OASIS Public Review Draft

OASIS announced that a Committee Draft of "Code List Representation (Genericode) Version 1.0" has been approved for public review. The TC was chartered to produce a neutral XML format for codifying and transmitting information about code lists. A code list in its simplest form is just a set of strings that each represent an item or idea. The OASIS code list representation is not just a simple list of strings. It is a complete description of a code list, including not only the codes, but also alternate codes, descriptions of the codes, and any other data that is associated with the codes. The OASIS code list representation also describes how new code lists are derived from existing code lists, so that the derivation is repeatable, automatable and auditable. It is hoped that third parties will produce software which processes the code list representation XML to produce run-time validation formats such as W3C XML Schema enumerations, programming language enumerations, database code lookup tables, ebRIM repository code list structures, etc. The OASIS Code List Representation format has a tabular model for code lists. The 'rows' are individual entries in a code list, where an entry is a set of one or more codes, plus other metadata, that is associated with a single conceptual entry in the code list. The 'columns' are individual (typed) pieces of metadata that can be applied to each entry in a code list. So columns define what kind of data can be in the code list, while rows define what actual data is in the code list. The code list format also supports the concept of 'keys', where a key is a set of one or more columns which uniquely identifies each row in the code list. Where a key has more than one column, it is a compound key.
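
The tabular model described above (typed columns, rows as entries, and a key that uniquely identifies each row) can be sketched in a few lines of Python. This is only an illustration of the idea, not the genericode XML format itself: the class names `Column` and `CodeList`, the `required` flag, and the sample size codes are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Column:
    """One typed piece of metadata that an entry may carry (hypothetical name)."""
    column_id: str
    data_type: type = str
    required: bool = True


@dataclass
class CodeList:
    """Columns say what kind of data may appear; rows hold the actual data."""
    columns: list[Column]
    key: tuple[str, ...]              # one column id, or several for a compound key
    rows: list[dict] = field(default_factory=list)

    def __post_init__(self) -> None:
        declared = {c.column_id for c in self.columns}
        missing = [k for k in self.key if k not in declared]
        if missing:
            raise ValueError(f"key refers to undeclared columns: {missing}")

    def key_of(self, row: dict) -> tuple:
        """Return the values of the key columns for one row."""
        return tuple(row[c] for c in self.key)

    def add_row(self, row: dict) -> None:
        known = {c.column_id: c for c in self.columns}
        for name, value in row.items():
            if name not in known:
                raise ValueError(f"unknown column: {name}")
            if not isinstance(value, known[name].data_type):
                raise TypeError(f"{name} must be {known[name].data_type.__name__}")
        for col in self.columns:
            # Key columns are always needed, otherwise the row cannot be identified.
            if (col.required or col.column_id in self.key) and col.column_id not in row:
                raise ValueError(f"missing required column: {col.column_id}")
        if self.key_of(row) in {self.key_of(r) for r in self.rows}:
            raise ValueError(f"duplicate key: {self.key_of(row)}")
        self.rows.append(row)


# Made-up sample data: a code list of garment sizes keyed on the "code" column.
sizes = CodeList(
    columns=[Column("code"), Column("label"), Column("rank", int, required=False)],
    key=("code",),
)
sizes.add_row({"code": "S", "label": "Small", "rank": 1})
sizes.add_row({"code": "L", "label": "Large"})
```

Adding a second row with the code "S" would be rejected, which is the uniqueness guarantee a key provides. A compound key would simply list more than one column id in `key`.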

See also: the Requirements

W3C First Public Working Draft: SKOS Use Cases and Requirements
Antoine Isaac, Jon Phipps, Daniel Rubin (eds), W3C Technical Report

W3C's Semantic Web Deployment Working Group has published the First Public Working Draft for "SKOS Use Cases and Requirements." Knowledge organisation systems play a fundamental role in information structuring and access, e.g. for asset description or web site organisation. Such vocabularies, coming in the form of thesauri, classification schemes, subject heading lists, taxonomies or even folksonomies, are developed and used worldwide, by institutions as well as individuals. However, these very important knowledge resources are still mostly isolated from the outside world, and not widely used in implementing systems. The development of new information technologies and infrastructures, such as the World Wide Web, calls for new ways to create, manage, publish and use these knowledge organisation systems. It is especially expected that conceptual schemes will benefit from greater shareability, e.g. by being published via web services. In the meantime, the documentary systems which use them will turn to advanced information retrieval techniques to construct most of their semantic structure and lexical content. SKOS (Simple Knowledge Organisation System) provides a model to represent and use vocabularies and ontologies in the framework of the Semantic Web. The document presents the preparatory work for a future version of SKOS. It lists representative use cases, which were obtained after a dedicated questionnaire was sent to a wide audience. It also features a set of fundamental or secondary requirements derived from these use cases, that will be used to guide the design of SKOS.

Extending and Versioning Languages: XML Languages
David Orchard (ed), W3C Draft TAG Finding

Significant revision has been made to W3C's draft TAG Finding, produced by the W3C Technical Architecture Group. The document is now published in three parts. "Extending and Versioning Languages: XML Languages" discusses the XML related aspects of versioning. It describes XML based terminology, technologies and versioning strategies. It provides XML Schema examples for each of the strategies and discussion about various schema design patterns. A number of XML languages, including XHTML and Atom, are used as case studies in different strategies. "Extending and Versioning Languages: Terminology" provides terminology for discussing language versioning. The evolution of languages by adding, deleting, or changing syntax or semantics is called versioning. Making versioning work in practice is one of the most difficult problems in computing. Arguably, the Web rose dramatically in popularity because evolution and versioning were built into HTML and HTTP. Both systems provide explicit extensibility points and rules for understanding extensions that enable their decentralized extension and versioning. The "Strategies" document provides motivation for versioning, presents a number of questions that language designers must answer, and discusses a variety of version identification strategies.
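
One familiar rule for understanding extensions is to ignore what is not recognized, as HTML user agents do with unknown tags. The Python sketch below shows a reader written for a first version of a small XML language that still accepts a later, extended instance. It is not taken from the TAG Finding; the namespaces, element names, and the `read_name` function are hypothetical.

```python
import xml.etree.ElementTree as ET

KNOWN_NS = "http://example.org/name/1"  # hypothetical version 1 namespace
KNOWN_TAGS = {f"{{{KNOWN_NS}}}given", f"{{{KNOWN_NS}}}family"}


def read_name(xml_text: str) -> dict:
    """Read the elements this version understands and skip everything else."""
    root = ET.fromstring(xml_text)
    result = {}
    for child in root:
        if child.tag in KNOWN_TAGS:
            local_name = child.tag.split("}", 1)[1]
            result[local_name] = child.text
        # Unknown elements are treated as extensions and ignored.
    return result


V1 = (
    '<name xmlns="http://example.org/name/1">'
    "<given>Ada</given><family>Lovelace</family></name>"
)
# A later instance adds an element from another (hypothetical) namespace.
V2 = (
    '<name xmlns="http://example.org/name/1" xmlns:x="http://example.org/ext">'
    "<given>Ada</given><family>Lovelace</family><x:middle>King</x:middle></name>"
)

assert read_name(V1) == read_name(V2)
```

The older reader returns the same result for both instances, so a producer can extend the language without breaking existing consumers.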

See also: Versioning Terminology

XML Parser Benchmarks: Part 2
Matthias Farwick and Michael Hafner, XML.com

The outcome of the Part 1 benchmarks showed that the LIBXML2 SAX-like parser in C is superior to the other tested parsers, followed in second place by the two Java pull-parser implementations, Javolution and Woodstox. In this part of the series we will show you how the object model parsers performed in our tests. Object model parsers read in the data by using the event parsers. The object model parser benchmarks were of special interest for our high performance web service security gateway, because most web services security operations require at least the header of a SOAP message to be read and altered. This in-memory altering can only be done by object model parsers like DOM implementations. The results for the AXIOM implementations are also very interesting in this context. They use a pull-parser to build up the in-memory representation of an XML document until the last node that needs to be read or altered. This has the advantage that the whole document need not be read into memory. LIBXML2 can be considered the overall performance winner for object model parsers. It not only performs much better than all other parsers on documents up to 500 KB in size, but it also beats the two AXIOM implementations for documents up to 5 KB, when only the first part of the documents is read. It also does especially well for very small documents of about 1 KB, where it is up to 10 times faster than the other implementations. For really big documents above 500 KB the default Java 1.5 DOM parser and the Oracle DOM parser in C are alternatives. But as the partial document parsing benchmarks show, it is advisable to evaluate which use cases of XML processing you will perform the most. If you find that in most cases you will only need to alter parts at the beginning of an XML document, you should consider using the Java AXIOM implementation.
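
The deferred-building idea can be approximated in Python with the standard library's `xml.etree.ElementTree.XMLPullParser`: feed the input in chunks and stop as soon as the SOAP Header element is complete. This is a sketch of the general technique only, not AXIOM or any of the benchmarked parsers. The function name `read_header`, the chunk size, and the sample message are assumptions, and the exact amount of input consumed before the header is reported depends on the chunk size and the underlying Expat version.

```python
import io
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
HEADER_TAG = f"{{{SOAP_NS}}}Header"


def read_header(stream, chunk_size=256):
    """Feed the parser chunk by chunk; return (header element, bytes consumed).

    Parsing stops once the Header has been built, so the rest of the
    stream is never read. Returns (None, consumed) if no Header is found.
    """
    parser = ET.XMLPullParser(events=("end",))
    consumed = 0
    while True:
        chunk = stream.read(chunk_size)
        if chunk:
            consumed += len(chunk)
            parser.feed(chunk)
        else:
            parser.close()  # end of input: let the parser finish pending data
        for _event, elem in parser.read_events():
            if elem.tag == HEADER_TAG:
                return elem, consumed
        if not chunk:
            return None, consumed


# Made-up message: a short header followed by a comparatively large body.
DOC = (
    f'<soap:Envelope xmlns:soap="{SOAP_NS}">'
    "<soap:Header><token>abc</token></soap:Header>"
    "<soap:Body>" + "<item>payload</item>" * 200 + "</soap:Body>"
    "</soap:Envelope>"
).encode("utf-8")

header, consumed = read_header(io.BytesIO(DOC))
print(header.find("token").text, consumed, len(DOC))
```

When only the header needs to be inspected or altered, the large body never has to be turned into an in-memory tree, which is the advantage the article attributes to the AXIOM approach.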

See also: part 1

Microsoft Wants ODF Added to ANSI Standards
Elizabeth Montalbano, ComputerWorld

Days after declaring its intention to aggressively collect patent royalties from open-source distributors, Microsoft Corp. backed the addition of ODF, the document file format used widely in open-source alternatives to Microsoft Office, to a list of business standards. Microsoft said it will also support adding Office 2007's default document file format, Open XML, to the list, which is maintained by the American National Standards Institute (ANSI). The company said it supports ODF (Open Document Format for XML) because businesses want choice and interoperability for software they deploy. ANSI recommends business best practices, standards and guidelines to a range of industries in the U.S. Andy Updegrove said that by supporting ODF as an ANSI standard, Microsoft is "making it appear it is rising above the squabble to do the right thing." Instead, he thinks the move serves as a challenge to vocal ODF supporters to support approval of Open XML as a global standard when a final vote for the draft specification comes before the ISO. To its credit, Microsoft voted for ODF when it came before the ISO (International Organization for Standardization), while IBM cast the only negative vote for Open XML when it was up for approval by standards organization Ecma International.

See also: InternetNews.com

