Cover Pages: XML Daily Newslink: Monday, 20 October 2008

A Cover Pages Publication http://xml.coverpages.org/
Provided by OASIS and Sponsor Members
Edited by Robin Cover

This issue of XML Daily Newslink is sponsored by:
Microsoft Corporation http://www.microsoft.com

Headlines

Last Call Review for Cascading Style Sheets "Media Queries"
XSPA Profile of XACML v2.0 for Healthcare
IETF Updates 'Extensible Messaging and Presence Protocol (XMPP): Core'
Open XML SDK 2.0 Architecture: Components
Extracting Meaning from Text with OpenCalais R3
UOML (Unstructured Operation Markup Language) Part 1 Version 1.0
XML Gateway Leads The SOA Specialist Charge
Introduction to WEB4J: Web Development for Minimalists

Last Call Review for Cascading Style Sheets "Media Queries"
H.W. Lie, T. Çelik, D. Glazman, A. van Kesteren (eds), W3C TR

W3C announced that the Cascading Style Sheets (CSS) Working Group has published a Last Call Working Draft of Media Queries. Public feedback is welcome through November 21, 2008. Prior to advancement of the Media Queries Working Draft, the CSS WG solicits feedback especially on the parts that have changed since the last publication: (1) Revision of the syntax section; (2) Addition of the 'aspect-ratio' and 'orientation' media features. HTML4 and CSS2 currently support media-dependent style sheets tailored for different media types. For example, a document may use sans-serif fonts when displayed on a screen and serif fonts when printed. 'screen' and 'print' are two media types that have been defined. Media queries extend the functionality of media types by allowing more precise labeling of style sheets. A media query consists of a media type and zero or more expressions to limit the scope of style sheets. Among the media features that can be used in media queries are 'width', 'height', and 'color'. By using media queries, presentations can be tailored to a specific range of output devices without changing the content itself. A media query is a logical expression that is either true or false. A media query is true if the media type of the media query matches the media type of the device where the user agent is running (as defined in the "Applies to" line), and all expressions in the media query are true. Also, a media query which is otherwise false becomes true if the 'not' keyword is present. When a media query is true, the corresponding style sheet is applied as per the normal cascading rules. Several media queries can be combined in a comma-separated list. If one or more of the media queries in the comma-separated list is true, the associated style sheet is applied, otherwise the associated style sheet is ignored. If the comma-separated list is the empty list it is assumed to specify the media query 'all'...
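
The truth rules summarized above can be sketched in Python. This is a minimal illustration under stated assumptions, not a CSS parser: the `MediaQuery` class, the dictionary used to describe a device, and the representation of expressions as predicates are all hypothetical and do not come from the Working Draft.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical device description: a media type plus feature values.
Device = Dict[str, object]


@dataclass
class MediaQuery:
    media_type: str = "all"
    # Each expression is modeled as a predicate over the device (an assumption).
    expressions: List[Callable[[Device], bool]] = field(default_factory=list)
    negated: bool = False  # models the 'not' keyword

    def matches(self, device: Device) -> bool:
        # True when the media type matches and every expression is true.
        type_ok = self.media_type in ("all", device["type"])
        result = type_ok and all(expr(device) for expr in self.expressions)
        # 'not' inverts the result of the query.
        return not result if self.negated else result


def query_list_matches(queries: List[MediaQuery], device: Device) -> bool:
    # An empty comma-separated list is treated as the media query 'all'.
    if not queries:
        return MediaQuery("all").matches(device)
    # The style sheet applies if at least one query in the list is true.
    return any(q.matches(device) for q in queries)


screen = {"type": "screen", "width": 1024, "color": 8}
printer = {"type": "print", "width": 800, "color": 0}

# Roughly corresponds to: screen and (min-width: 800px), print and (color)
queries = [
    MediaQuery("screen", [lambda d: d["width"] >= 800]),
    MediaQuery("print", [lambda d: d["color"] > 0]),
]

applies_on_screen = query_list_matches(queries, screen)
applies_on_printer = query_list_matches(queries, printer)
```

Modeling the comma-separated list with `any` and the expressions within a single query with `all` mirrors the "one or more of the media queries" and "all expressions" wording of the summary.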

XSPA Profile of XACML v2.0 for Healthcare
Duane DeCouteau, Mike Davis, David Staggs, Brett Burley (eds), OASIS Working Draft

Members of the OASIS Cross-Enterprise Security and Privacy Authorization (XSPA) TC have published a working draft XSPA Profile of XACML v2.0 for Healthcare. The XSPA OASIS Technical Committee was chartered to "specify sets of stable open standards and profiles, and create other standards or profiles as needed, to fulfill the security and privacy functions identified by the functions and data practices identified by HITSP, or specified in its use cases, as all are mandated or specified from time to time." The "Cross-Enterprise Security and Privacy Authorization (XSPA) Profile of XACML v2.0 for Healthcare" describes several mechanisms to authenticate, administer, and enforce authorization policies controlling access to protected information residing within or across enterprise boundaries. The policies being administered and enforced relate to security, privacy, and consent directives. This profile may be used in coordination with additional standards including Web Services Trust Language (WS-Trust) and Security Assertion Markup Language (SAML). This profile specifies the use of XACML 2.0 to promote interoperability within the healthcare community by providing common semantics and vocabularies for interoperable policy request/response, policy lifecycle, and policy enforcement... XACML functions of the Policy Enforcement Point (PEP) are carried out by the Service Interface. The PEP interacts with the Policy Information Point (PIP) of the Attribute Service and the Policy Decision Point (PDP) functionality of the Access Control Service (ACS), in enforcing authorization decisions. The XSPA profile of XACML supports sending all Service User requests through an ACS. XACML functions of the PDP are carried out by the ACS.
Attributes necessary to make a local access control decision are determined and HL7 Permissions are granted to the Service User based on their role, purpose of use (POU), the service endpoint of the external resource, and any site-specific operational attributes. XACML functions of the Policy Information Point (PIP) are carried out by the Attribute Service. The Attribute Service has access to attribute information (e.g., location, purpose of use), object preferences, consent directives and other privacy conditions (object masking, object filtering, user, role, purpose, etc.) that constrain enforcement. XACML functions of the Policy Administration Point (PAP) are carried out by the Policy Authority. The Policy Authority has access to security policies that include rules regarding authorizations required to access a protected resource and additional security conditions (location, time of day, cardinality, separation of duty purpose, etc.) that constrain enforcement... Systematized Nomenclature of Medicine (SNOMED) Clinical Terms User Guide provides the core general terminology for the electronic health record (EHR). As used in this profile, SNOMED CT is used to designate clinically relevant protected information objects.
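
The division of roles described above (Service Interface as PEP, Attribute Service as PIP, Access Control Service as PDP) can be sketched in Python. This is only an illustration of the call flow: the class names, method names, attribute keys, user names, and the single role/purpose-of-use rule are assumptions made for the example, and no XACML request or response XML is produced.

```python
from typing import Dict, Set, Tuple


class AttributeService:
    """Plays the PIP role: supplies attributes about a Service User."""

    def __init__(self, records: Dict[str, Dict[str, str]]):
        self._records = records

    def attributes_for(self, user: str) -> Dict[str, str]:
        return dict(self._records.get(user, {}))


class AccessControlService:
    """Plays the PDP role: turns attributes into a decision."""

    def __init__(self, allowed: Set[Tuple[str, str]]):
        # Hypothetical policy: permitted (role, purpose_of_use) pairs.
        self._allowed = allowed

    def decide(self, attrs: Dict[str, str], resource: str) -> str:
        key = (attrs.get("role"), attrs.get("purpose_of_use"))
        return "Permit" if key in self._allowed else "Deny"


class ServiceInterface:
    """Plays the PEP role: asks for a decision and enforces it."""

    def __init__(self, pip: AttributeService, pdp: AccessControlService):
        self._pip = pip
        self._pdp = pdp

    def request(self, user: str, resource: str) -> str:
        attrs = self._pip.attributes_for(user)        # PEP -> PIP
        decision = self._pdp.decide(attrs, resource)  # PEP -> PDP
        if decision != "Permit":
            raise PermissionError(f"{user} denied access to {resource}")
        return f"{resource} released to {user}"


# Illustrative data only; none of these values come from the profile.
pip = AttributeService({
    "dr_lee": {"role": "physician", "purpose_of_use": "treatment"},
    "clerk_kim": {"role": "clerk", "purpose_of_use": "billing"},
})
pdp = AccessControlService({("physician", "treatment")})
pep = ServiceInterface(pip, pdp)

released = pep.request("dr_lee", "lab-result-1")
```

Keeping the decision logic out of `ServiceInterface` reflects the profile's statement that all Service User requests go through an ACS, with the enforcement point only acting on the result.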

See also: XSPA TC references

IETF Updates 'Extensible Messaging and Presence Protocol (XMPP): Core'
Peter Saint-Andre (ed), IETF Internet Draft

IETF announced the release of an updated Internet Draft specification for "Extensible Messaging and Presence Protocol (XMPP): Core," edited by Peter Saint-Andre of the XMPP Standards Foundation. This 165-page document "defines the core features of the Extensible Messaging and Presence Protocol (XMPP), a technology for streaming Extensible Markup Language (XML) elements for the purpose of exchanging structured information in close to real time between any two or more network-aware entities. XMPP provides a generalized, extensible framework for incrementally exchanging XML data, upon which a variety of applications can be built. The framework includes methods for stream setup and teardown, channel encryption, authentication of a client to a server and of one server to another server, and primitives for push-style messages, publication of network availability information ('presence'), and request-response interactions. This document also specifies the format for XMPP addresses, which are fully internationalizable." This 'bis' specification, when complete, would obsolete RFC 3920. IETF RFC 3920 was published in October 2004. Summary from Daniel Rubio: "Originally conceived under the name Jabber and targeted at instant messaging applications, the approach set forth by Jabber has led to the creation of XMPP which is now an IETF standard— RFC-3920 —supported by various other XMPP extensions. At its core, XMPP is defined as a streaming protocol that makes it possible to exchange XML fragments between any two network endpoints. Now, exchanging XML fragments between two network end points is not a novel idea, in fact it is what Web services designed around SOAP and REST principles already do. So why is XMPP different? It's all in the XML fragment payload. 
Whereas XML fragments used in SOAP-type services are provisioned with many WS-*/SOAP specific payloads to guarantee the many features required in enterprise services (such as security and reliability), and REST type services can be considered open-ended since they don't require any specific XML provision, XMPP uses specific payloads à la SOAP to guarantee its primary feature as a real-time communication transport is achieved... If you're familiar with services based on SOAP, XMPP's principles are strikingly similar. Both use their own addressing scheme, they both use envelopes to indicate communication behavior, and they both rely on standardized XML elements to further specify the purpose of an exchange. Similarly though, XMPP also requires that both a server and client be capable of processing and implementing the behavior expressed in such payloads."
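
To make the "XML fragment" idea concrete, the following Python sketch builds a single message stanza with the standard library and splits an XMPP address into its parts. It is a simplified illustration, not a client: the helper names `build_message` and `parse_jid` are hypothetical, the addresses are examples, and stream setup, namespaces declared on the enclosing stream, encryption, and authentication are all omitted.

```python
import xml.etree.ElementTree as ET
from typing import Optional, Tuple


def build_message(sender: str, recipient: str, text: str) -> bytes:
    """Serialize one <message/> stanza as it might appear inside a stream."""
    message = ET.Element(
        "message", {"from": sender, "to": recipient, "type": "chat"}
    )
    body = ET.SubElement(message, "body")
    body.text = text
    return ET.tostring(message)


def parse_jid(jid: str) -> Tuple[Optional[str], str, Optional[str]]:
    """Split an address of the form [local@]domain[/resource]."""
    # Split at the first '/', since the resource itself may contain '/' or '@'.
    bare, slash, resource = jid.partition("/")
    if "@" in bare:
        local, _, domain = bare.partition("@")
    else:
        local, domain = None, bare
    return local, domain, (resource if slash else None)


stanza = build_message(
    "juliet@example.com/balcony", "romeo@example.net", "Art thou not Romeo?"
)
parsed = ET.fromstring(stanza)
address_parts = parse_jid("juliet@example.com/balcony")
```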

Open XML SDK 2.0 Architecture: Components
Zeyad Rajabi, Blog

A previous blog article presented the overall design of the Open XML SDK 2.0 with respect to goals and scenarios. This installment discusses the architecture of the SDK in terms of its different components. The Open XML SDK is designed and implemented in a layered approach, starting from a base layer and moving towards higher-level functionality, such as validation. The System Support layer contains the fundamental components that the SDK is built upon. The Open XML File Format Base Level layer is the core foundation of the SDK. This layer provides functionality to create Open XML packages, add or remove parts, and read/add/remove/manipulate XML elements and attributes. The Open XML File Format Higher Level layer is the last layer in our architecture. This layer provides functionality to make it easier for you to code against Open XML formats. For example, one idea is to have this layer contain schema and semantic level validation to help you generate proper and valid Open XML files. The System Support layer consists of the following components: (1) .NET Framework 3.5: The Open XML SDK leverages the advanced technology provided by .NET Framework 3.5, especially LINQ to XML, which makes manipulating XML much easier and more intuitive. (2) System.IO.Packaging: The Open XML SDK needs to be able to add/remove parts contained within Open XML Format packages. .NET Framework 3.0 included a set of generic packaging APIs capable of adding and removing parts of packages that conform to the Open Packaging Conventions (OPC). Given that Open XML Formats are based on OPC, the SDK uses System.IO.Packaging APIs to open, edit, create, and save Open XML packages. (3) Open XML Schemas: The Open XML SDK is based on Open XML Formats, which are represented and described as schemas. These schemas make up the foundation of the Open XML SDK. Currently the Open XML SDK is based on Ecma 376; we will add support for IS 29500 as soon as the standard is made public.
The Open XML File Format Base Level layer provides a platform for Open XML developers to create Open XML solutions and consists of the following components: (A) Open XML Packaging API: This component is built on top of the .NET Framework 3.0 System.IO.Packaging component. Instead of providing generic access to the parts contained in the Open XML Package, this component allows developers to manipulate Open XML parts with strongly typed classes and objects. (B) Open XML Low Level DOM: This component represents the XML wrapper of the Open XML schemas. (C) Stream Reading/Writing: This component includes stream reader and writer interfaces specifically targeting Open XML elements and attributes... [As to] the Open XML File Format Higher Level Layer, we might be able to provide [Schema Level Validation and Additional Semantic Validation]... We envision the Helper Function Layer as a way to provide helper functions or code snippets to make your life a bit easier in creating valid Open XML files. Certain operations within Open XML can be somewhat complex. For example, deleting a paragraph in a WordprocessingML document is not simply a matter of deleting the paragraph node. There are a variety of extra steps required to delete a paragraph and maintain the integrity of a valid Open XML document.
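
The package-and-parts model underlying the layers above can be illustrated without the SDK. OPC packages are ZIP archives, so the Python sketch below uses the standard `zipfile` module to create a package in memory and list its parts. It does not use or reproduce the Open XML SDK (a .NET library); the helper names `create_package` and `list_parts` are hypothetical, and the resulting archive is deliberately minimal and is not a complete, valid WordprocessingML document.

```python
import io
import zipfile
from typing import Dict, List

# Every OPC package carries a content-types stream at this fixed name.
CONTENT_TYPES_NAME = "[Content_Types].xml"
CONTENT_TYPES_XML = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    '<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">'
    '<Default Extension="xml" ContentType="application/xml"/>'
    "</Types>"
)


def create_package(parts: Dict[str, str]) -> bytes:
    """Write the given parts (name -> XML text) into an in-memory ZIP package."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        package.writestr(CONTENT_TYPES_NAME, CONTENT_TYPES_XML)
        for name, xml_text in parts.items():
            package.writestr(name, xml_text)
    return buffer.getvalue()


def list_parts(package_bytes: bytes) -> List[str]:
    """Return the part names in the package, excluding the content-types stream."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        return [n for n in package.namelist() if n != CONTENT_TYPES_NAME]


# Illustrative part only; real documents also need relationship parts.
package_bytes = create_package({"word/document.xml": "<document/>"})
part_names = list_parts(package_bytes)
```

This is the generic, untyped view of a package; the Open XML Packaging API described above is presented as improving on exactly this kind of access by exposing parts through strongly typed classes.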

See also: Part 1

Extracting Meaning from Text with OpenCalais R3
James Leigh, DevX.com

A big challenge companies face today is that most information, both online and archived, is only available as published text and does not contain any formal structure suitable for synthesizing. In a formal structure, information can be summarized, used to help locate meaningful text, and combined with other text to provide new insights. This article shows how to convert unstructured written text into structured data using OpenCalais, which is a public general-purpose text-extraction service that uses a combination of statistical and grammatical analysis to extract meaning. OpenCalais is not the only solution available for extracting meaning from text, but it is the only publicly available web service... The simplest way to categorize a document or paragraph is to use word associations. For example, if the words "earnings" and "acquired" are used in a document, it is likely a document about business finances. Furthermore, if the word "Reuters" is mostly used only in business finance documents, then other documents containing this word are likely to also be about business finances. This technique is called statistical analysis and is commonly used for document categorization. Statistical analysis is an OpenCalais technique to categorize documents and identify what the text is referring to... Heuristic rules in OpenCalais are further used not just to identify associations, but to extract meaning from the text as well. OpenCalais uses heuristic rules to identify facts and events to create new information derived from multiple documents. OpenCalais does this by identifying commonly used verbs to describe facts or events... Thomson Reuters offers a public OpenCalais web service with a no-cost license; applications can connect and use the service free of charge to extract meaning from any text. The web service is geared towards general-purpose use, and works well for commonly understood documents. 
Thomson Reuters also offers subscription licenses, for customization to particular vocabularies. These web services allow any text to be uploaded via an HTTP POST and respond with an RDF/XML file that describes the document. The response contains the original document (called DocInfo) with a category (called DocCat), instance information of referenced named entities with relevance score, and events and facts that are found in the document. OpenCalais R3 brings improvements to named entity extraction and categorization. To use the public web service, you post the URL-encoded license, content, and parameters to [the REST URI]. If successful, the response is an RDF/XML file. You can parse the file directly or import it into an RDF store. Sesame, a leading RDF framework, provides parsers and storage for RDF content. After you import a collection of document metadata into an RDF store, you can synthesize it to derive new assets of information based on extracted data. Aduna's Cluster Map technology can visualize the relationships between documents (through named entities) and between named entities (through facts and events). The Cluster Map download archive includes a simplistic web crawler and two interactive visualization tools that you can use to explore these relationships. Executing the Main class with a list of URLs that you can import into the local RDF store opens two windows: Document, and Named Entity Cluster Map. The relationships appear in the side pane, while the selected relationships are shown graphically using Aduna's Cluster Map technology, which displays whether and how sets overlap (similar to Venn diagrams and Euler diagrams). In the command line, you can prefix each URL by '1' to indicate that embedded links should be followed once, or '0' to include only the explicit URL.
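
The "URL-encoded license, content, and parameters" step can be sketched in Python with the standard library. The form field names used here (`licenseID`, `content`, `paramsXML`) and the helper name `build_request_body` are assumptions for illustration, the license key and parameter XML are placeholders, and the REST URI is not given in the summary above, so no request is actually sent.

```python
from urllib.parse import parse_qs, urlencode


def build_request_body(license_key: str, content: str, params_xml: str) -> bytes:
    """URL-encode the three values into an HTTP POST body."""
    fields = {
        "licenseID": license_key,  # assumed field name
        "content": content,        # assumed field name
        "paramsXML": params_xml,   # assumed field name
    }
    # urlencode percent-encodes reserved characters, so the result is ASCII.
    return urlencode(fields).encode("ascii")


body = build_request_body(
    "YOUR-LICENSE-KEY",                      # placeholder, not a real key
    "Reuters acquired & merged two units.",  # text to analyze
    "<params/>",                             # placeholder parameter document
)

# Decoding the body recovers the original values unchanged.
decoded = parse_qs(body.decode("ascii"))
```

The resulting bytes could be passed as the `data` argument of `urllib.request.Request` together with the service's REST URI; the RDF/XML response would then be parsed or loaded into an RDF store as described above.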

See also: the OpenCalais web site

UOML (Unstructured Operation Markup Language) Part 1 Version 1.0
Staff, OASIS Announcement

OASIS announced that the member ballot for approval of the UOML Committee Specification as an OASIS Standard has passed. "UOML (Unstructured Operation Markup Language) Part 1 Version 1.0" defines a markup language for unstructured document operation, including definitions of an abstract document model and of document operating instructions for that model. Negative votes on the CS were received; the TC gave consideration to the negative votes but has voted to request the specification be approved. The UOML-X TC was chartered to develop and maintain an XML-based operation interface standard for unstructured documents. The Unstructured Operation Markup Language specification defines an XML schema for universal document operations. The schema is suitable for operating on printable documents (creating, viewing, modifying, and querying information), that is, documents that can be printed on paper, such as books, magazines, newspapers, office documents, maps, drawings, and blueprints, but it is not restricted to these kinds of documents. Several commercial and free applications are already available based on the submitted draft of UOML. OASIS members Primeton Technologies Ltd., Redflag 2000, and Sursen have submitted Statements of Use indicating successful use of the UOML specification. UOML is an interface standard for processing unstructured documents; it plays a role similar to that of SQL (Structured Query Language) for structured data. UOML is expressed in standard XML, featuring compatibility and openness. UOML deals with layout-based documents and their related information (such as metadata, rights, etc.). A layout-based document is two-dimensional, static paging information, i.e., information that can be recorded on traditional paper. The software that implements the UOML-defined functions is called DCMS; applications can process a document by sending UOML instructions to the DCMS. UOML first defines an abstract document model, then operations on the model.
Those operations include read/write, edit, display/print, query, and security control; they cover the operations required by all different kinds of application software to process documents. UOML is based on XML description, and is platform-independent, application-independent, programming language-independent, and vendor neutral. The standard does not restrict how manufacturers implement DCMS. This specification is the first part of UOML, which defines the operations used to read/write, edit, and display/print layout-based documents.

XML Gateway Leads The SOA Specialist Charge
Erik Pieczkowski, InformationWeek Review

Vordel's XML Gateway is a pre-hardened network appliance that promises to offload processor-intensive tasks running on application servers. In addition to routing XML traffic based on content or sender, it performs XML content screening and transforms XML payloads on the fly. The XML Gateway protects services and mediates communications between service consumers and providers, while its software-based VXA engine provides XML acceleration for XPath expressions, XML schema validation, and XSL transformations on both hardware- and software-based appliances... Vordel's claim for ease of deployment and centralized policy management is well-founded. In our lab tests, we were able to quickly set up both software and hardware-based appliances and manage policies across them. This is a nice feature and obviates having to manage silos of policies across XML appliances... Vordel surrounds XML Gateway with a diverse and well-featured toolset. The Vordel Policy Director offers centralized policy creation and management, Vordel Reporter provides visibility and reporting on Web service metrics, and SOAPbox is a testing suite for XML applications. Each of these tools was easy to work with once set up and configured. Policies define rules for how an XML Gateway-protected service can be consumed. The Vordel XML Gateway enforces a vast number of policies. Once we defined the policies using Policy Studio, we could limit users' service access by HTTP basic authentication, XPath credentials, and service availability. Policy Studio is a powerful mechanism for policy creation and maintenance, and role-based access to policies is a nice feature. The Policy Director architecture eliminates the need to manage a group of isolated policies across individual XML Gateways. With XML Gateway, Vordel targets the enterprise, and the product's benefits are most fully realized when running multiple XML Gateways within the network. 
Implementing Vordel's architecture does take some planning, but overall, Vordel provides a well-thought-out system that centralizes policy management and performs ably under load. And, we found pricing competitive: $59,000 for the XML Gateway appliance hardware, or $35,000 for the XML Gateway software appliance.
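
The content-based routing mentioned in the review can be illustrated with a small Python sketch. It shows the general technique only and says nothing about how Vordel's product is implemented or configured: the route table, the element and attribute names, and the backend URLs are all hypothetical, and the XPath subset is the one supported by Python's standard `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET

# Hypothetical route table: the first matching path wins.
ROUTES = [
    (".//order[@priority='high']", "https://backend.example/express"),
    (".//order", "https://backend.example/standard"),
]
REJECT = "https://backend.example/reject"


def route(payload: str) -> str:
    """Pick a destination for an XML payload based on its content."""
    try:
        root = ET.fromstring(payload)
    except ET.ParseError:
        # Screening step: payloads that are not well-formed are not forwarded.
        return REJECT
    for path, target in ROUTES:
        # The paths search descendants of the root element.
        if root.find(path) is not None:
            return target
    return REJECT


express = route('<request><order priority="high"/></request>')
standard = route("<request><order/></request>")
rejected = route("<request><order")
```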

Introduction to WEB4J: Web Development for Minimalists
John O'Hanley, Java World Magazine

As Java Web application frameworks have become more powerful and flexible, they've also become more complex. The WEB4J framework in many ways flies in the face of this trend: it offers few customization options, but is easy to learn and work with. "My experience in building Java Web apps has been that the tools currently in wide use often cause more pain than they relieve. Most Web applications are too complex—and complex tools are, by definition, hard to learn and hard to use. Many popular Web frameworks include hundreds or thousands of classes published in their javadocs; in contrast, WEB4J has only 85 published classes. Building plain, boring Java Web apps that just pump data into and out of a database should be just that—boring. But it's often a long-distance phone call away from boring, because it's so complicated. The fundamental thesis of WEB4J is that currently building Java web apps is unnecessarily complex... Many tools make extensive use of XML files. But coding in XML is widely recognized as a particularly fruitful source of pain. XML files aren't part of compiled code, so errors are often found only at runtime. XML syntax is widely regarded as verbose and clunky, and many modern IDEs are wonderfully rich tools for editing Java but not much help with XML files. Unfortunately, many frameworks place XML at the very core of their design... Some tools force you to abandon things you already know how to do, and replace them with a parallel set of techniques specific to that tool. For instance, in Struts 1.x, you typically don't implement form controls using standard HTML; instead, the framework forces you to replace standard form controls with a set of custom tags, one for each type of control. Tools should build upon what you already know, and not force you to learn a different way of doing essentially the same thing... What's different about WEB4J is its central design goal. 
Simply put, WEB4J's central design goal is aggressive minimalism, a kind of minimalism not seen in other Java tools. This minimalism is manifest in two ways: the small size of its API, and in the concision of apps built using the framework... WEB4J has a philosophy of deep simplicity and minimalism. Its fundamental goal is rapid development, and its intent is to make the task of building plain vanilla HTML interfaces to a relational database as simple and as fast as possible. In addition, WEB4J was explicitly designed to avoid the kinds of pain endemic to many contemporary Web application frameworks..."

