Cover Pages: XML Daily Newslink: Thursday, 04 March 2010

A Cover Pages Publication http://xml.coverpages.org/
Provided by OASIS and Sponsor Members
Edited by Robin Cover

This issue of XML Daily Newslink is sponsored by:
Microsoft Corporation http://www.microsoft.com

Headlines

First Public Working Draft: Additional Requirements for Bidi in HTML
IETF Last Call Public Review for Internet Key Exchange Protocol: IKEv2
Industry Coalition Plans Interoperability Program with Certification
OASIS Public Review: Context/Value Association Using Genericode 1.0
IETF Sponsors Broadband Home Gateway (Homegate) Workshop
W3C First Draft for HTML: The Markup Language
The Art of Narrative and the Semantic Web

First Public Working Draft: Additional Requirements for Bidi in HTML
Aharon Lanin, Adil Allawi, Matitiahu Allouche (et al), W3C Technical Report

Members of the W3C Internationalization Core Working Group have released a first Public Working Draft for the specification Additional Requirements for Bidi in HTML. This document contains proposals for features to be added to HTML to support bidirectional text in languages such as Arabic, Hebrew, Persian, Thaana, Urdu, etc.

Abstract: "Authoring a web app that needs to support both right-to-left and left-to-right interfaces, or to take as input and display both left-to-right and right-to-left data, usually presents a number of challenges that make it an especially laborious and bug-prone task. Some of these are due to browser bugs, but some can be traced to a gap in the specification of the bidirectional aspects of a given HTML feature. And some of these challenges could be greatly simplified by adding a few strategically placed new HTML features. This document proposes fixes for some of the most repetitive pain points."

According to W3C Internationalization Lead Richard Ishida: "The former wiki-based document by Aharon Lanin entitled 'A Proposal for HTML Improvements for Bidi' has just been published as a W3C First Public Working Draft, with the (new) title 'Additional Requirements for Bidi in HTML'. This version includes edits based on the comments received by Aharon from bidi experts up to mid-February, but this is still a draft document and has been published now to facilitate further review and comment. It also contains some explicitly identified open issues.

For those not familiar with this document, it arose out of the frustrations of people who have to actually work with bidirectional text on the Web in everyday practical situations. For example, it covers issues related to re-use of fragments of text in various new locations by web apps or scripts, or situations where users need to type in or send bidirectional form data. It proposes additions to the HTML5 specification for such situations, which are not covered by the current HTML specification. Many of the ideas in the document, however, are also relevant to markup formats in general, and there are some implications for CSS and XSL-FO (which we hope to address more directly in a subsequent document)... The plan is to obtain feedback as soon as possible on the new Working Draft from bidi experts and internationalization folks, then issue a new draft that incorporates the results of those discussions. Only after that do we plan to put the proposals to the HTML community and seek their comments and commitment..."
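
As an illustration of the bookkeeping a web app has to hand-roll today when it re-uses a text fragment of unknown direction, here is a minimal Python sketch of the common "first strong character" heuristic. It is not taken from the Working Draft. The function names `estimate_direction` and `wrap_fragment` are hypothetical, and the heuristic is only an approximation of how base direction might be guessed.

```python
import html
import unicodedata


def estimate_direction(text, default="ltr"):
    """Guess base direction from the first strongly directional character.

    Unicode bidi class "L" is strong left-to-right; "R" and "AL" are strong
    right-to-left. Digits, spaces and punctuation are skipped.
    """
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            return "ltr"
        if bidi_class in ("R", "AL"):
            return "rtl"
    return default  # no strong character found


def wrap_fragment(text):
    """Wrap a fragment in a span with an explicit dir attribute (illustrative only)."""
    direction = estimate_direction(text)
    return '<span dir="{}">{}</span>'.format(direction, html.escape(text))
```

A script inserting user-supplied text into a page could call `wrap_fragment` on each fragment so that its direction does not depend on the surrounding context.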

See also: Richard Ishida's note

IETF Last Call Public Review for Internet Key Exchange Protocol: IKEv2
Charlie Kaufman, Paul Hoffman, Yoav Nir (et al), IETF Internet Draft

The Internet Engineering Steering Group (IESG) has received a request from the IP Security Maintenance and Extensions Working Group (IPSECME) to consider Internet Key Exchange Protocol: IKEv2 for approval as an IETF Proposed Standard. The IESG plans to make a decision in the next few weeks, and solicits final comments on this action. Please send substantive comments to the IETF by 2010-03-18.

This document describes version 2 of the Internet Key Exchange (IKE) protocol. IKE is a component of IPsec used for performing mutual authentication and establishing and maintaining security associations (SAs). This document replaces and updates RFC 4306, and includes all of the clarifications from RFC 4718.

IP Security (IPsec) provides confidentiality, data integrity, access control, and data source authentication to IP datagrams. These services are provided by maintaining shared state between the source and the sink of an IP datagram. This state defines, among other things, the specific services provided to the datagram, which cryptographic algorithms will be used to provide the services, and the keys used as input to the cryptographic algorithms.

IKE performs mutual authentication between two parties and establishes an IKE security association (SA) that includes shared secret information that can be used to efficiently establish SAs for Encapsulating Security Payload (ESP) or Authentication Header (AH) and a set of cryptographic algorithms to be used by the SAs to protect the traffic that they carry. In this document, the term 'suite' or 'cryptographic suite' refers to a complete set of algorithms used to protect an SA. An initiator proposes one or more suites by listing supported algorithms that can be combined into suites in a mix-and-match fashion. IKE can also negotiate use of IP Compression (IPComp) in connection with an ESP or AH SA. The SAs for ESP or AH that get set up through that IKE SA we call 'Child SAs'..."
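
The mix-and-match suite negotiation described above can be sketched roughly as follows. This is an assumption-laden illustration in Python, not the IKEv2 wire format or the draft's selection rules: the function `select_suite`, the dictionary layout, and the algorithm labels are all hypothetical.

```python
def select_suite(proposal, responder_supported):
    """Pick one algorithm per transform type, honoring the initiator's order.

    `proposal` maps a transform type to the initiator's ordered list of
    algorithms; `responder_supported` maps a transform type to the set the
    responder accepts. Returns None if any type has no common algorithm.
    """
    chosen = {}
    for transform_type, offered in proposal.items():
        supported = responder_supported.get(transform_type, ())
        match = next((alg for alg in offered if alg in supported), None)
        if match is None:
            return None  # no complete suite can be formed
        chosen[transform_type] = match
    return chosen


# Illustrative labels only; not identifiers from any registry.
initiator = {
    "encryption": ["AES-256", "3DES"],
    "integrity": ["SHA1-96"],
    "dh_group": ["MODP-2048", "MODP-1024"],
}
responder = {
    "encryption": {"3DES", "AES-256"},
    "integrity": {"SHA1-96"},
    "dh_group": {"MODP-1024"},
}
suite = select_suite(initiator, responder)
```

Here the resulting suite combines the first mutually supported algorithm of each type.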

Industry Coalition Plans Interoperability Program with Certification
William Jackson, Government Computer News

The Initiative for Open Authentication, an industry coalition promoting the use of open standards for interoperable strong authentication, used its annual meeting at this week's RSA Security Conference to discuss plans for an interoperability certification program.

The OATH coalition has been working on the program for about six months and expects to launch it in another six; the program will be restricted to members of the institute. Common architectures and specifications do not necessarily mean that implementations by different vendors will work with each other, and the goal of the program is to ensure that any back-end authentication system will be able to work with any of the institute's schemes or algorithms.

The institute was organized five years ago to make the use of strong, two-factor authentication simpler and more widespread, increasing security and making it easier to conduct sensitive online transactions. It has produced a reference architecture based primarily on existing standards with a goal of making authentication schemes interoperable across networks and vendor platforms. One of the organization's guiding principles is that open architectures rather than proprietary solutions are required for the widespread adoption of a technology..."

According to the OATH announcement: "the organization has developed draft certification criteria for the first 2 profiles, the HOTP Standalone Client and the HOTP Validation Server. OATH will continue to develop the certification program during 2010 and publish additional profiles to address the whole breadth of OATH technologies and standards. The OATH certification profiles are intended to provide specific guidance and recommendations to providers who want to implement OATH specifications in their products. In addition to developing the profiles, OATH is also working on implementing a compliance testing program that will be announced in the latter half of 2010.

The OATH Certification Program is intended to provide assurance to customers that products implementing OATH standards and technologies will function as expected and interoperate with each other. This will enable customers to deploy 'best of breed' solutions consisting of various OATH 'certified' authentication devices such as tokens and servers from different providers... OATH has taken a modular approach and intends to develop the profiles to address the different OATH specifications as applied to the identified components of the OATH Reference Architecture..."
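
For readers unfamiliar with HOTP (specified in RFC 4226), the two roles named in the first profiles can be sketched in a few lines of Python. This is a minimal illustration, not OATH's certification criteria: the function names and the look-ahead `window` default are assumptions, and a production server would also need throttling and secure key storage.

```python
import hashlib
import hmac
import struct


def hotp(secret, counter, digits=6):
    """Client side: compute an HOTP value per RFC 4226 (HMAC-SHA-1)."""
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation: low 4 bits of the last byte
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def validate(secret, submitted, server_counter, window=3):
    """Server side: accept a code within a small look-ahead window.

    Returns the next counter value to store on success, or None on failure.
    """
    for counter in range(server_counter, server_counter + window + 1):
        if hmac.compare_digest(hotp(secret, counter), submitted):
            return counter + 1
    return None
```

With the RFC 4226 test secret `b"12345678901234567890"`, `hotp(secret, 0)` yields `"755224"`, matching the published test vector.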

See also: the OATH announcement

OASIS Public Review: Context/Value Association Using Genericode 1.0
G. Ken Holman (ed), Public Review Draft

Members of the OASIS Code List Representation Technical Committee have released an approved CD specification for Context/Value Association Using Genericode 1.0 as 'Public Review Draft 02' through March 18, 2010. This 15-day review is limited in scope to changes made from the previous review.

This Committee Draft 02/Public Review Draft 02 normatively describes the file format used in a 'context/value association' file (termed in short as 'a CVA file'). This file format is an XML vocabulary using address expressions to specify hierarchical document contexts and their associated constraints. A document context specifies one or more locations found in an XML document or other similarly structured hierarchy of information. A constraint is expressed as either an explicit expression evaluation or as a value inclusion in one or more controlled vocabularies of values. This file format specification assumes a controlled vocabulary of values is expressed in an external resource described by the OASIS genericode standard.

Context/value association is useful in many aspects of working with an XML document using controlled vocabularies and other constraints. Two examples are (1) for the direction of user data entry in the creation of an XML document, ensuring that only valid values are proffered in a user interface selection such as a drop-down menu; and (2) for the validation of the correct use of valid values found in an XML document..."
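
The validation use case can be sketched in Python as follows. This is not the CVA file format and does not read genericode files: a plain dictionary (the hypothetical `ASSOCIATIONS`) stands in for the association of document contexts with controlled vocabularies, and the element names are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical associations: context path -> allowed values.
ASSOCIATIONS = {
    ".//Price/Currency": {"USD", "EUR", "GBP"},
    ".//Address/Country": {"US", "CA", "GB"},
}


def validate(xml_text, associations):
    """Return (context, value) pairs whose value is outside the allowed set."""
    root = ET.fromstring(xml_text)
    errors = []
    for context, allowed in associations.items():
        for node in root.findall(context):
            value = (node.text or "").strip()
            if value not in allowed:
                errors.append((context, value))
    return errors


SAMPLE = (
    "<Order>"
    "<Price><Currency>USD</Currency></Price>"
    "<Address><Country>ZZ</Country></Address>"
    "</Order>"
)
problems = validate(SAMPLE, ASSOCIATIONS)
```

The same table could drive a data-entry form by offering only the allowed values for each context.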

See also: the OASIS announcement

IETF Sponsors Broadband Home Gateway (Homegate) Workshop
Staff, IESG Secretary

IETF announced a two-day workshop on HOMEGATE, to be held April 20-21, 2010 in London, England. This workshop pertains to a Working Group charter proposal for Broadband Home Gateway (possibly: 'HomeGate').

Background: "During the 76th IETF meeting, the Transport Area sponsored a Broadband Home Gateway BoF, called HOMEGATE. Since that time, interested IETF participants have been working to narrow the scope of the draft charter and to reach out to other Standards Development Organizations (SDOs) to ensure that the planned work is complementary and not overlapping with their respective work. To further that goal, the IETF's Transport and Internet Areas intend to co-sponsor this two-day workshop on HOMEGATE between IETF-77 and IETF-78."

From the IETF Proposed Charter: "Access to broadband Internet services uses networking technology of one form or another within the home, small office/home office (SOHO) or small to medium business (SMB) as the demarcation between the local network and the Internet. These technologies almost always involve a single entity, which is not purely a router, called a 'home gateway'. This entity connects a local user or users to various LAN services, providing some basic level of security. The majority of Internet users employ home gateways for this purpose. However, many serious, long-term problems face users of home gateways today. At the root of many of these problems is the fact that device manufacturers, and/or the organizations that specify requirements for such devices, are not certain which IETF standards and best current practices should be supported, and when/why that support is needed. As a result, millions of devices are being deployed every year that do not work with important IETF protocols, standards, and best practices that are central to the future of the Internet.

One of the problems in this area appears to be that home gateway vendors are unclear which RFCs are important, or current, and why they are important and in what context they matter. Thus, the primary objective of the group is to document a baseline of requirements derived from 'core' RFCs which must be supported. A secondary objective is to list desired-but-optional, or 'advanced', requirements from the same RFCs as well as other, non-core RFCs. The context and reasoning behind each document which is included should be summarized as well, in order to improve comprehension of why a given document has been included. These things will help improve compatibilities with and capabilities for use of the Internet of today. This will include a focus in areas such as DNS proxy behavior, congestion mechanisms support, and security. A secondary problem is compatibility with and capability for the use of the Internet of tomorrow. New security needs related to DNS are motivating a move to DNSSEC..."

W3C First Draft for HTML: The Markup Language
Michael Smith (ed), W3C Technical Report

W3C has released HTML: The Markup Language as a first draft. This document describes the HTML markup language and provides details necessary for producers of HTML content to create documents that conform to the language. By design, it does not define related APIs, nor attempt to specify how consumers of HTML content are meant to process documents, nor attempt to be a tutorial or 'how to' authoring guide.

The document was published by the W3C HTML Working Group, part of the HTML Activity in the W3C Interaction Domain. This non-normative document is intended to complement the normative conformance criteria defined in the specification "HTML5: A vocabulary and associated APIs for HTML and XHTML", and is similar in scope to the HTML5 (Author Edition) subset of that specification.

This specification provides the details necessary for producers of HTML content to create conformant documents, and for others to check the conformance of existing documents. It is designed to: (1) describe the syntax and structure of the HTML language; (2) describe the semantics of HTML elements and their attributes—that is, to describe what the elements and attributes represent; (3) be clear and unambiguous; (4) be as concise and readable as possible...

The section on 'Documents' covers HTML language and HTML and XML syntaxes, the HTML namespace and MIME types, Conformant documents, and case insensitivity in tag names and attribute names... The term document is used in the specification to mean an instance of the HTML language. The HTML language is the language described in this specification; it is an abstract language that applications can potentially represent in memory in any number of possible ways, and that can be transmitted using any number of possible concrete syntaxes. This specification makes reference to two particular concrete syntaxes for the HTML language: One syntax which is referred to throughout this specification as the HTML syntax, and another syntax, which is referred to throughout this specification as the XML syntax. Web browsers typically implement two separate parsers for processing documents: an HTML parser which is invoked when processing documents in the HTML syntax, and an XML parser which is invoked when processing documents in the XML syntax. The HTML syntax is the syntax described in the HTML syntax section of this specification. The XML syntax is defined by rules in the XML specification and in the Namespaces in XML 1.0 specification. Beyond the requirements defined in those specifications, this specification does not define any additional syntax-level requirements for documents in the XML syntax.
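
The practical difference between the two concrete syntaxes can be seen with Python's standard library, which, like browsers, ships separate HTML and XML parsers. This is only a sketch: `html.parser` is a lenient tokenizer rather than a conforming HTML5 parser, and the helper names below are hypothetical.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class TagCollector(HTMLParser):
    """Record start-tag names as the HTML parser reports them."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def html_start_tags(markup):
    """Parse markup with the HTML parser and return the start tags seen."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags


def is_well_formed_xml(markup):
    """Return True if the XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False
```

The HTML parser accepts `<P>Hello<BR>world</P>` and reports lowercased tag names, reflecting case insensitivity in the HTML syntax. The XML parser rejects the same string because `BR` is never closed, but accepts `<p>Hello<br/>world</p>`.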

The Art of Narrative and the Semantic Web
Kurt Cagle, DevX.com

"As the Internet continues to evolve, Semantic Web technologies are beginning to emerge, but widespread adoption is likely to still be two to three years out... The web as originally conceived was largely static: web content, once posted, usually didn't change significantly. However, by 2010, the vast majority of content that is developed on the web falls more properly into the realm of messages rather than documents: Facebook and Twitter notifications, resources generated from rapidly changing databases, documents in which narrative content is embedded within larger data structures, RSS and Atom feeds, KML (ironically, Google Earth and Google Maps) documents and so forth. Thus, a URL no longer contains a static narrative; it contains a constantly changing message.

Document enrichment by itself is of only moderate utility: you are simply adding attributes to HTML elements to identify the category of a given word. With CSS, for instance, you could highlight the matched terms by category, visually showing place names compared to personal names. However, such enrichment gains more power when these XML documents are processed afterwards: you can pull categories out and add them to a general list of categories for the resource in question, or you could create links to specific content such as Wikipedia or the online Physicians' Desk Reference... There are currently three types of formats used for document enrichment. The first is essentially a proprietary or ad hoc standard: the DE vendor provides a formal taxonomy system and method for embedding the formats within the initial text sample. The next approach (and one that is actually diminishing in use) is that of microformats: using an agreed-upon standard taxonomy for certain domains, such as document publishing (Dublin Core), friendship relationships (Friend of a Friend, or FOAF), address books (vCard), geocoding information (geo) and so forth. The problem with microformats is that they don't always work well in conjunction, and there's no way of encoding deeper relational information via most microformats.

This latter issue lies at the heart of the Resource Description Framework in Attributes, or RDFa, which makes it possible to encode relational information about different resources and resource links. RDFa is actually a variant of the much older W3C RDF language first formulated in 1999, then significantly updated in 2004. With RDFa, you can define multiple distinct categories (also known as namespaces) with terms in each category... You can also establish relationships between different parts of a document by using RDF terminology; for instance, indicating that a given introductory paragraph provides a good abstract summary "about" the document (or portion of a document) in question. There's even a specialized language called GRDDL that can take an RDFa-encoded document and generate a corresponding RDF document. While comparatively few document enrichment companies have RDFa products on the market, many are moving in that direction, with organizations such as the BBC, NBC News, Time Inc. and Huffington Post among many others now exploring RDFa as a means of encoding such categorization information in the stories that are posted online...
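
To make the idea of relational information carried in attributes concrete, the following Python sketch pulls simple (subject, property, value) statements out of an XHTML fragment that uses the RDFa `about`, `property` and `content` attributes. It is a deliberately simplified assumption-based illustration, not a conforming RDFa processor: it ignores prefix resolution, `rel`, `rev`, `typeof` and datatypes, and the sample URL and function name are hypothetical.

```python
import xml.etree.ElementTree as ET


def extract_statements(xhtml):
    """Collect (subject, property, value) tuples from about/property attributes."""
    root = ET.fromstring(xhtml)
    statements = []

    def walk(element, subject):
        # An "about" attribute changes the subject for this element and its children.
        subject = element.get("about", subject)
        prop = element.get("property")
        if prop is not None:
            value = element.get("content")
            if value is None:
                value = "".join(element.itertext()).strip()
            statements.append((subject, prop, value))
        for child in element:
            walk(child, subject)

    walk(root, "")
    return statements


SAMPLE = (
    '<div xmlns:dc="http://purl.org/dc/elements/1.1/" '
    'about="http://example.org/story/1">'
    '<h1 property="dc:title">The Art of Narrative</h1>'
    '<span property="dc:creator">Kurt Cagle</span>'
    "</div>"
)
statements = extract_statements(SAMPLE)
```

Each tuple pairs the resource named by `about` with a property and its text value, which is the kind of output that could be added to a general list of categories for a story.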

These are progressive technologies: XML technologies are now about a decade old, and XQuery and XML database tools are only now becoming mainstream. Semantic Web technologies are beginning to emerge, but widespread adoption is likely to still be two to three years out. However, publishing and journalism are definitely at the forefront of that curve, because these areas in particular are most sensitive to the need both to provide enjoyable news content and to make such stories manipulatable and discoverable within the ever increasing sophistication and scope of the web itself. The narrative thread has become a rich, interwoven tapestry, illuminated by brilliant strands of meaning, semantics and abstraction, turning our writings into conversations, and from there into dialogs..."

See also: the W3C Semantic Web

