The OASIS Cover Pages: The Online Resource for Markup Language Technologies
Last modified: December 19, 2007
XML Daily Newslink. Wednesday, 19 December 2007

A Cover Pages Publication
Provided by OASIS and Sponsor Members
Edited by Robin Cover

This issue of XML Daily Newslink is sponsored by:
BEA Systems, Inc.

XML Entity Definitions for Characters
David Carlisle (ed), W3C Working Draft

W3C announced the release of a First Public Working Draft for the specification "XML Entity Definitions for Characters." The document has been produced by members of the W3C Math Working Group as part of the W3C Math Activity; it is one of three drafts relevant to MathML published on 2007-12-14. The document defines several sets of names which are assigned to Unicode characters; these names may be used for entity references in SGML/XML-based markup languages. Notation and symbols have proved very important for scientific documents, especially in mathematics. In the majority of cases it is preferable to store characters directly as Unicode character data or as XML numeric character references. However, in some environments it is more convenient to use the ASCII input mechanism provided by XML entity references. Many entity names are in common use, and this specification aims to provide standard mappings to Unicode for each of these names. In the Working Draft, two tables listing the combined sets are presented, first in Unicode order and then in alphabetic order; then tables documenting each of the entity sets are provided. Each set has a link to the DTD entity declaration for the corresponding entity set, and also a link to an XSLT2 stylesheet that will implement a reverse mapping from characters to entity names. In addition to the stylesheets and entity files corresponding to each individual entity set, a combined stylesheet is provided, as well as two combined sets of DTD entity declarations. The first is a small file which includes all the other entity files via parameter entity references; the second is a larger file that directly contains a definition of each entity, with all duplicates removed.

Example (sets) include: [1] C0 Controls and Basic Latin, C1 Controls and Latin-1 Supplement; [2] Latin Extended-A, Latin Extended-B; [3] IPA Extensions, Spacing Modifier Letters; [4] Combining Diacritical Marks, Greek and Coptic; [5] Cyrillic; [6] General Punctuation, Superscripts and Subscripts, Currency Symbols, Combining Diacritical Marks for Symbols; [7] Letterlike Symbols, Number Forms, Arrows... The editor notes: It is hoped that the entity sets defined by this specification may form the basis of an update to "ISO 9573-13-1991". However, pressure of other commitments has currently prevented this document being processed by the relevant ISO committee, thus the entity sets are being presented with Formal Public identifiers of the form "-//W3C//..." rather than "ISO...." It is hoped that an update to TR 9573-13 may be made later. The present version of TR 9573-13 defines the sets of names, but does not give mappings to Unicode. TR 9573-13 is maintained by ISO/IEC JTC 1/SC 34/WG 1 (Markup Languages). An Outgoing Liaison Statement from SC34 was recently communicated to the W3C MathML WG regarding cancellation of the project for TR 9573-13, Second Edition [Revision of TR 9573-13, SGML support facilities—Techniques for using SGML - Part 13: Public entity sets for SGML for mathematics and science], in accordance with Resolution 13 adopted at the SC 34 plenary meeting held in Kyoto, Japan, 2007-12-08/11.
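The name-to-character mappings described in this item, and the reverse mapping from characters back to entity names, can be sketched in a few lines of Python. This is a minimal illustration, not the specification's own files or stylesheets: the four entity names are a hand-picked sample, and the function names (`expand_entities`, `to_entities`, `entity_declaration`) are hypothetical.

```python
import re

# Hand-picked sample of entity names and the Unicode code points they denote.
# Illustrative subset only; the published entity files are authoritative.
ENTITY_TO_CODEPOINT = {
    "nbsp": 0x00A0,   # NO-BREAK SPACE
    "alpha": 0x03B1,  # GREEK SMALL LETTER ALPHA
    "rarr": 0x2192,   # RIGHTWARDS ARROW
    "infin": 0x221E,  # INFINITY
}

# Reverse table: code point -> entity name.
CODEPOINT_TO_ENTITY = {cp: name for name, cp in ENTITY_TO_CODEPOINT.items()}

_ENTITY_REF = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def expand_entities(text):
    """Replace known named entity references with the characters they denote.

    Unknown names are left untouched.
    """
    def _expand(match):
        cp = ENTITY_TO_CODEPOINT.get(match.group(1))
        return chr(cp) if cp is not None else match.group(0)

    return _ENTITY_REF.sub(_expand, text)


def to_entities(text):
    """Reverse mapping: emit ASCII-only text.

    Characters in the sample table become named references; any other
    non-ASCII character becomes a hexadecimal numeric character reference.
    A literal '&' in the input is not escaped in this sketch.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if cp in CODEPOINT_TO_ENTITY:
            out.append("&%s;" % CODEPOINT_TO_ENTITY[cp])
        elif cp > 0x7F:
            out.append("&#x%X;" % cp)
        else:
            out.append(ch)
    return "".join(out)


def entity_declaration(name):
    """Build a DTD entity declaration for one name in the sample table."""
    return '<!ENTITY %s "&#x%X;">' % (name, ENTITY_TO_CODEPOINT[name])
```

For example, `expand_entities("&alpha; &rarr; &infin;")` yields the three characters directly, and `to_entities` turns them back into named references, which is the direction the reverse-mapping stylesheets are described as covering.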

See also: the source files

Mathematical Markup Language (MathML) Version 3.0
David Carlisle, Patrick Ion, Robert Miner (eds), W3C Technical Report

Members of the W3C Math Working Group have released a third Public Working Draft which specifies a new version of the Mathematical Markup Language: MathML 3.0. MathML is an XML application for describing mathematical notation and capturing both its structure and content. The goal of MathML is to enable mathematics to be served, received, and processed on the World Wide Web, just as HTML has enabled this functionality for text. MathML can be used to encode both mathematical notation and mathematical content. About thirty-five of the MathML tags describe abstract notational structures, while about one hundred and seventy others provide a way of unambiguously specifying the intended meaning of an expression. Additional chapters discuss how the MathML content and presentation elements interact, and how MathML renderers might be implemented and should interact with browsers. Finally, this document addresses the issue of special characters used for mathematics, their handling in MathML, their presence in Unicode, and their relation to fonts. While MathML is human-readable, in all but the simplest cases, authors use equation editors, conversion programs, and other specialized software tools to generate MathML. Several versions of such MathML tools exist, and more, both freely available software and commercial products, are under development. Note: The W3C Working Group has also published "A MathML for CSS Profile"; this MathML 3.0 profile admits formatting with Cascading Style Sheets. This will facilitate adoption of MathML in web browsers and CSS formatters, allowing them to reuse the existing CSS visual formatting model, enhanced with a few mathematics-oriented extensions, for rendering of the layout schemata of presentational MathML.
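The difference between encoding notation and encoding content can be seen by generating both forms of one small expression. The sketch below uses Python's standard `xml.etree.ElementTree` to build presentation markup and content markup for x² + 1; the function names are hypothetical, and the element choices (`mrow`, `msup`, `mi`, `mn`, `mo` for presentation; `apply`, `plus`, `power`, `ci`, `cn` for content) are one plausible encoding rather than the only one.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"


def presentation_markup():
    """Presentation MathML: describes how x^2 + 1 is laid out."""
    math = ET.Element("math", {"xmlns": MATHML_NS})
    row = ET.SubElement(math, "mrow")
    sup = ET.SubElement(row, "msup")        # base with a superscript
    ET.SubElement(sup, "mi").text = "x"     # identifier
    ET.SubElement(sup, "mn").text = "2"     # number
    ET.SubElement(row, "mo").text = "+"     # operator
    ET.SubElement(row, "mn").text = "1"
    return ET.tostring(math, encoding="unicode")


def content_markup():
    """Content MathML: describes what x^2 + 1 means (plus applied to a power and 1)."""
    math = ET.Element("math", {"xmlns": MATHML_NS})
    total = ET.SubElement(math, "apply")
    ET.SubElement(total, "plus")
    power = ET.SubElement(total, "apply")
    ET.SubElement(power, "power")
    ET.SubElement(power, "ci").text = "x"   # content identifier
    ET.SubElement(power, "cn").text = "2"   # content number
    ET.SubElement(total, "cn").text = "1"
    return ET.tostring(math, encoding="unicode")


if __name__ == "__main__":
    print(presentation_markup())
    print(content_markup())
```

Even for an expression this small the markup is verbose, which is consistent with the observation that authors generally rely on tools rather than writing MathML by hand.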

See also: the Cascading Style Sheets Profile

ASCII Escaping of Unicode Characters
John Klensin (ed), IETF Best Current Practice

The Internet Engineering Steering Group has announced the publication of "ASCII Escaping of Unicode Characters" as an IETF Best Current Practice (BCP) specification. Abstract: "There are a number of circumstances in which an escape mechanism is needed in conjunction with a protocol to encode characters that cannot be represented or transmitted directly. With ASCII coding the traditional escape has been either the decimal or hexadecimal numeric value of the character, written in a variety of different ways. The move to Unicode, where characters occupy two or more octets and may be coded in several different forms, has further complicated the question of escapes. This document discusses some options now in use and discusses considerations for selecting one for use in new IETF protocols and protocols that are now being internationalized." In accordance with existing best-practices recommendations (RFC 2277), new protocols that are required to carry textual content for human use SHOULD be designed in such a way that the full repertoire of Unicode characters may be represented in that text. This document therefore proposes that existing protocols being internationalized, and that need an escape mechanism, SHOULD use some contextually-appropriate variation on references to code points unless other considerations outweigh those described here. This recommendation is not applicable to protocols that already accept native UTF-8 or some other encoding of Unicode. In general, when protocols are internationalized, it is preferable to accept those forms rather than using escapes. This recommendation applies to cases, including transition arrangements, in which that is not practical. This BCP document has been reviewed in the IETF but is not the product of an IETF Working Group; the IESG contact person is Chris Newman. The subject of escaping has been extensively reviewed and debated on relevant IETF mailing lists and by active participants of the Unicode community. The discussions were not able to achieve consensus to recommend one specific format, but rather to recommend two good formats and discourage use of some problematic formats. There was some debate over how much discussion of problematic formats was appropriate.
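A short Python sketch can illustrate what "references to code points" means in practice. The item above does not name the two recommended formats, so the `&#x…;` syntax used here (the XML hexadecimal numeric character reference) is an assumption chosen for illustration, and the function names are hypothetical. The point of the sketch is that the escape carries the Unicode code point itself, not the UTF-8 octets or UTF-16 code units of some particular encoding.

```python
import re


def escape_non_ascii(text):
    """Replace every non-ASCII character with a reference to its code point.

    The '&#x...;' syntax is an illustrative assumption. A literal '&#x...;'
    already present in the input is not protected in this sketch.
    """
    return "".join(
        ch if ord(ch) < 0x80 else "&#x%X;" % ord(ch)
        for ch in text
    )


_CODE_POINT_REF = re.compile(r"&#x([0-9A-Fa-f]+);")


def unescape_code_points(text):
    """Reverse escape_non_ascii; out-of-range references are left untouched."""
    def _decode(match):
        cp = int(match.group(1), 16)
        return chr(cp) if cp <= 0x10FFFF else match.group(0)

    return _CODE_POINT_REF.sub(_decode, text)


def utf16_code_units(ch):
    """Show the UTF-16 code units of a character, for contrast with its code point."""
    data = ch.encode("utf-16-be")
    return ["%04X" % int.from_bytes(data[i:i + 2], "big")
            for i in range(0, len(data), 2)]
```

A character outside the Basic Multilingual Plane shows the difference: U+1D504 is escaped as the single reference `&#x1D504;`, whereas an escape built on UTF-16 code units would need the surrogate pair D835 DD04.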

See also: XML and Unicode

Firefox 3 Beta 2 Arrives Early
Sean Michael Kerner,

In Mozilla's Firefox 3 Beta 2 release, Mozilla developers have improved security and performance as well as functionality. In total, Mozilla boasts in its release notes that some 900 improvements were made in Beta 2 over the Beta 1 release, which came out about a month ago. Many improvements are focused on how Firefox handles memory. Firefox developer Mike Beltzner claimed, in a mailing list posting, that over 330 memory leaks had been fixed. Memory handling and leakage issues have been a high priority item for Mozilla developers throughout the Firefox 3 process. Firefox 3 Beta 2 also fixes leaks in how the browser handles JSON (JavaScript Object Notation) cross site requests, making the browser more secure. JSON is often used in Ajax web development and is an alternative to XML over HTTP (XHR) Requests. Security is further enhanced with anti-virus integration in Firefox's download manager. Beta 2 also improves on the security of plugins by implementing a version check to identify plugins that are not secure. Mozilla has also taken steps to further improve its Places bookmarking and history system, which is a major new feature of the Firefox 3 browser. The Places system was originally intended to be part of the Firefox 2 release but wasn't ready in time. It has been part of the Firefox 3 development cycle since at least the Alpha 5 release in June. Fundamentally, Places makes it easy to create, manage and use bookmarks and history information.

See also: IE 8.0 and Acid2

Manage an HTTP Server Using RESTful Interfaces and Project Zero
Dan Jemiolo, IBM developerWorks

WS-* users and REST users have an ongoing debate over which technique is most appropriate for which problem sets, with WS-* users often claiming that more complex, enterprise-level problems cannot be solved RESTfully. This article puts that theory to the test by trying to create a RESTful solution for a problem area that is not often discussed by REST users: systems management. The article shows how to make a Zero-based RESTful interface for httpd that is functionally complete and comparable to an Apache Muse-based WS-* version. The combination of Groovy scripts and RESTdoc comments provides the same features and behavior as we had with Java classes and WSDL and demonstrates that REST can handle the tasks that are thought to be "too complicated" for HTTP alone. The REST and WS-* solutions each have their pros and cons, and which one you favor may change from project to project. The article is not about enumerating the pros and cons of WS-* technology versus REST-oriented technology, and it is not out to select a "winner." The goal of the article is to demonstrate whether or not REST and Web 2.0 development techniques provide a productive alternative for systems management projects and hopefully give developers some additional choices.

AirTran Becomes First U.S. Carrier to Use Sabre XML Interface
Jay Boehmer, Business Travel News Online

AirTran Airways last week began displaying seat maps through Sabre's Extensible Markup Language (XML) interface, and plans to add additional booking options through the global distribution system early next year. Sabre vice president of product marketing Kyle Moore said the XML interface allows the GDS to display travel content not generally enabled through traditional legacy systems. Through XML, Sabre can tap into airlines' Web-based reservations systems and display and sell air content in a manner closer to how airlines sell and distribute through their Web sites, Moore said. Though Sabre has been using XML for years now to link with other travel suppliers, including Expedia and hotel companies, Moore said AirTran is the first major airline to adopt the link: "XML is far more flexible than technologies that we and travel suppliers have used in the past. It allows us to do things that we previously were not able to do. Carriers can use an XML connection to sell ancillary services, unbundle fare options and (like AirTran) show seat maps and more detailed flight information through the global distribution system. As carriers introduce new things, they're generally not building them in legacy technologies. This is a platform that can support traditional types of transactions using new technology or nontraditional types of transactions in environments in which they may want them to work." AirTran early next year will launch additional booking features through Sabre's XML link, according to Moore, who said that other airlines also are in discussions to hook up through XML.

See also: the announcement

Implementing Healthcare Messaging with XML
Marc de Graauw, Random Notes Blog

At XML 2007, Marc de Graauw provided an overview of the national EHR being set up in the Netherlands. It uses XML, HL7v3 and Web Services. He takes a look at lessons learned and the pitfalls to be avoided: (1) Schemas serve multiple masters: design, validation, contract, code generation. And those purposes don't play together well. Write flat, simple Schemas. Those are understandable and generate understandable code. Don't design Schemas for reuse. Use a simple spreadsheet format instead as your baseline. And tweak your Schemas with XSLT before generating code. After all, they're just XML. (2) Use a layered approach. Anything beyond Celsius-to-Fahrenheit will not be a monolithic Web Service. So anonymize payloads with 'xs:any' to generate stubs and make Schemas which describe just one software layer. This ensures reuse, and stacks nicely on top of the Internet stack... (3) Make examples everywhere: hand-write XML messages, and use those to develop and test services. XML based message exchanges are hard, and documentation for them gets large. Example XML messages are required to keep everyone sane. And make your messages wrong: see how applications handle all kinds of common mistakes. (4) Do a lot of HTTP work: specify HTTP status codes, when to use which codes in combination with higher level (SOAP, HL7v3) error codes. (5) Profile the profiles! Don't simply use WS-I Basic Profile and Security Profile, but write your own lean profiles; skin them till only what's really needed is left. Plenty of options means plenty of interoperability problems... profiling possibilities on top of WS-ReliableMessaging and WS-Security.

See also: the paper abstract

Qualcomm Digs Deep in Face of Global Litigation Onslaught
Andrew Longstreth, The American Lawyer

Re-blogged, via citation from Susy Struble: "This is a great read about Qualcomm's history and tactics around using ICT standards to directly generate revenue. You've got to be big to play successfully in this game...." Excerpt: "Fighting to save its business, Qualcomm expects to spend more than $200 million on litigation in 2008... The standard makers require wireless companies to play by certain rules. Generally, when a new standard is under consideration, companies in the industry are required to report patents they own that might be necessary to the new technology. The goal is to avoid "patent holdups," in which companies that control crucial technology charge exorbitant and unfair royalties. Before deciding on a standard that uses a company's technology, the body will seek assurances that the company will license its intellectual property on "fair, reasonable and nondiscriminatory" terms... 'The fact that so many have the same view about Qualcomm's licensing practice is instructive,' says George Cary of Cleary, who represents Broadcom before the EC and in New Jersey. 'The rest of the industry has one view of what is not "fair, reasonable and nondiscriminatory." Qualcomm has its own vision'."

See also: Patents and Open Standards


XML Daily Newslink and Cover Pages are sponsored by:

BEA Systems, Inc.
IBM Corporation
Sun Microsystems, Inc.
