Cover Pages: XML Daily Newslink: Tuesday, 15 April 2008

A Cover Pages Publication http://xml.coverpages.org/
Provided by OASIS and Sponsor Members
Edited by Robin Cover

This issue of XML Daily Newslink is sponsored by:
Primeton http://www.primeton.com

Headlines

Apache Abdera: Atom, AtomPub, and Java
Use HATS to Generate Atom Feeds for Mainframe Applications
W3C Invites Public Comment on Content Transformation Guidelines 1.0
Proposal for IETF NETCONF Data Modeling Language Working Group
Don't Be Surprised By E-Discovery
The Spirit of Schematron in Test Driven Development (TDD)
Eclipse-Compatible User Interface Included On Mule 2.0 ESB

Apache Abdera: Atom, AtomPub, and Java
Stefan Tilkov, James Snell, Dan Diephouse; InfoQueue

The Apache Abdera project, an open source Atom Syndication and Atom Publishing Protocol implementation currently still in its incubation phase, has recently reached its 0.40 milestone, an important step towards graduation [as an Apache project]. Snell: "While Atom and AtomPub certainly began life as a way of syndicating and publishing Weblog content, it has proven useful for a much broader range of applications. I've seen Atom being used for contacts, calendaring, file management, discussion forums, profiles, bookmarks, wikis, photo sharing, podcasting, distribution of Common Alerting Protocol alerts, and many other cases. Atom is relevant to any application that involves publishing and managing collections of content of any type... Abdera is an open source implementation of the Atom Syndication Format and Atom Publishing Protocol. It began life as a project within IBM's WebAhead group and was donated to the Apache Incubator in June 2006. Since then, it has evolved into the most comprehensive open-source, Java-based implementation of the Atom standards... Abdera has been part of the Apache Incubator for long enough. While there are still some details to work out, I would very much like to see Abdera graduate to its own Top Level Project at Apache, and become host to a broad range of Atom-based applications." Diephouse: "Look to some of the public services out there: most of the APIs for Google are based on AtomPub. Microsoft is moving toward it for web APIs too. These services are all going beyond just blogs. AtomPub goes beyond public web APIs as well; I've noticed that many enterprises are starting to use AtomPub for some of their internal services as well. Both AtomPub and SOAP/WSDL give you a way to build a service for others to use. But AtomPub takes a fundamentally different approach to helping users implement services. It implements constraints which give new types of freedom. Because the data format is constrained (every entry has a title, entry, id, and content/summary), I can use an Atom feed from any type of application and get some useful information out of it... Abdera includes support for developing/consuming AtomPub services, an IRI library, a URI template library, Unicode normalization, extensions for things like XML signature/encryption, GData, GeoRSS, OAuth, JSON and more. One of the cool new things in the latest release is a set of 'adapters' which allow you to have an AtomPub service without any coding by storing entries in JDBC, JCR or the filesystem..."
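
Diephouse's point about the constrained entry format can be illustrated with a few lines of code. The following is a minimal sketch in Python using only the standard library; it does not use Abdera (which is Java), and the sample feed, its URNs, and the function name list_entries are invented for illustration. It reads title, id, and summary or content from every entry without knowing what kind of application produced the feed.

```python
import xml.etree.ElementTree as ET

ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}

# Hypothetical sample feed mixing two kinds of content (invented data).
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example Collection</title>
  <id>urn:example:feed:mixed</id>
  <updated>2008-04-15T12:00:00Z</updated>
  <entry>
    <title>Alice Example</title>
    <id>urn:example:contact:1</id>
    <updated>2008-04-15T12:00:00Z</updated>
    <summary>Contact card for Alice</summary>
  </entry>
  <entry>
    <title>Team calendar</title>
    <id>urn:example:calendar:7</id>
    <updated>2008-04-14T09:30:00Z</updated>
    <content type="text">Weekly planning meeting</content>
  </entry>
</feed>
"""


def list_entries(feed_xml):
    """Return (id, title, text) for each atom:entry, whatever the feed is about."""
    root = ET.fromstring(feed_xml)
    entries = []
    for entry in root.findall("atom:entry", ATOM_NS):
        entry_id = entry.findtext("atom:id", default="", namespaces=ATOM_NS)
        title = entry.findtext("atom:title", default="", namespaces=ATOM_NS)
        # Prefer the summary; fall back to the content element if there is none.
        text = entry.findtext("atom:summary", namespaces=ATOM_NS)
        if text is None:
            text = entry.findtext("atom:content", default="", namespaces=ATOM_NS)
        entries.append((entry_id, title, text))
    return entries


if __name__ == "__main__":
    for entry_id, title, text in list_entries(SAMPLE_FEED):
        print(entry_id, "|", title, "|", text)
```

The same function handles the contact entry and the calendar entry alike, which is the kind of reuse the quote describes.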

See also: Atom references

Use HATS to Generate Atom Feeds for Mainframe Applications
Ramakrishnan Kannan, et al., IBM developerWorks

Nowadays, content distributors deliver all content, including news and site updates, as feeds. Most enterprise applications use feeds for various purposes, including to monitor an application and check the status of a project. Content providers publish a feed link on their site that users register with a feed reader. The feed reader checks for updates to the registered feeds at regular intervals. When it detects an update in the content, the feed reader requests the updated content from the content provider. The feeds contain only a summary of the content, but they provide a link to the detailed content. Atom Syndication Format and RSS are the most common specifications of feeds. We're using Atom feeds in this article, but you can change easily to RSS feeds with a little modification. This article leverages a product called IBM WebSphere Host Access Transformation Services (HATS), which converts any given green-screen, character-based 3270 or 5250 host application into a Web application (HTML) or rich-client application. HATS also allows programmatic interfaces to convert the identified content in these host applications into any other format. We take a step-by-step approach to show you how to write a HATS program that converts the host application content into Atom feeds... Delivering data as Atom feeds in mainframes opens a new world of possibilities for enterprise applications. Organizations can use mashup editors to extract data from companies with external or internal feeds and create new applications or information. For example, call centers can take advantage of mashups by passing a calling customer's ZIP code information to Google Maps to identify the location of the customer. This can help the call center employees personalize the conversation by enquiring about the weather from the customer's location, and so on. The delivery of data as Atom feeds in mainframe servers is one of the fundamental building blocks that enables an organization to embrace Web 2.0.
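
The target format described above (entries that carry only a summary plus a link to the detailed content) can be sketched independently of HATS. The example below is an illustration in Python, not the article's HATS program and not the HATS API; the record fields, URLs, and the names build_feed and SAMPLE_RECORDS are assumptions standing in for whatever content is recognized on the host screens.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"

# Hypothetical rows standing in for fields recognized on a host screen.
SAMPLE_RECORDS = [
    {"key": "1001", "title": "Order 1001", "summary": "Shipped",
     "url": "http://host.example/orders/1001", "updated": "2008-04-15T09:00:00Z"},
    {"key": "1002", "title": "Order 1002", "summary": "Awaiting payment",
     "url": "http://host.example/orders/1002", "updated": "2008-04-15T10:30:00Z"},
]


def _add(parent, tag, text=None, **attrs):
    """Append an Atom-namespaced child element and return it."""
    child = ET.SubElement(parent, f"{{{ATOM_NS}}}{tag}", attrs)
    if text is not None:
        child.text = text
    return child


def build_feed(feed_id, title, updated, author_name, records):
    """Serialize records as an Atom feed: a summary plus a link per entry."""
    ET.register_namespace("", ATOM_NS)  # serialize Atom as the default namespace
    feed = ET.Element(f"{{{ATOM_NS}}}feed")
    _add(feed, "id", feed_id)
    _add(feed, "title", title)
    _add(feed, "updated", updated)
    _add(_add(feed, "author"), "name", author_name)
    for record in records:
        entry = _add(feed, "entry")
        _add(entry, "id", f"{feed_id}:{record['key']}")
        _add(entry, "title", record["title"])
        _add(entry, "updated", record["updated"])
        _add(entry, "summary", record["summary"])
        # The link points the feed reader at the detailed content.
        _add(entry, "link", rel="alternate", href=record["url"])
    return ET.tostring(feed, encoding="unicode")


if __name__ == "__main__":
    print(build_feed("urn:example:host:orders", "Host order status",
                     "2008-04-15T12:00:00Z", "Host bridge", SAMPLE_RECORDS))
```

A feed reader polling this document at intervals would see a new or changed entry and follow its link for the full record.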

W3C Invites Public Comment on Content Transformation Guidelines 1.0
Jo Rabin (ed), W3C Technical Report

W3C announced that the Mobile Web Best Practices Working Group has published the First Public Working Draft for "Content Transformation Guidelines 1.0." This document provides guidance to managers of content transformation proxies and to content providers for how to coordinate when delivering Web content. Content transformation techniques diverge widely on the web, with many non-standard HTTP implications, and no well-understood means either of identifying the presence of such transforming proxies, nor of controlling their actions. From the point of view of this document, Content Transformation is the manipulation in various ways, by proxies, of requests made to and content delivered by an origin server with a view to making it more suitable for mobile presentation. The W3C MWI BPWG neither approves nor disapproves of Content Transformation, but recognizes that it is being deployed widely across mobile data access networks. The deployments are widely divergent from each other, with many non-standard HTTP implications, and no well-understood means either of identifying the presence of such transforming proxies, nor of controlling their actions. This document establishes a framework to allow that to happen.

Proposal for IETF NETCONF Data Modeling Language Working Group
Staff, Internet Engineering Steering Group Announcement

The IESG Secretary announced that a new IETF working group has been proposed in the Operations and Management Area, described in a draft NETMOD Charter. The NETCONF Working Group has completed a base protocol to be used for configuration management. However, the NETCONF protocol does not include a standard content layer. The specifications do not include a modeling language or accompanying rules that can be used to model the management information that is to be configured using NETCONF. This has resulted in inconsistent syntax and interoperability problems. The purpose of NETMOD is to support the ongoing development of IETF and vendor-defined data models for NETCONF. The WG will define a "human-friendly" modeling language defining the semantics of operational data, configuration data, notifications, and operations. This language will focus on readability and ease of use. This language must be able to serve as the normative description of NETCONF data models. The WG will use YANG as its starting point for this language. Language abstractions that facilitate model extensibility and reuse have been identified as a work area and will be considered as a work item or may be integrated into the YANG document based on WG consensus. The WG will define a canonical mapping of this language to NETCONF XML instance documents, the on-the-wire format of YANG-defined XML content. Only data models defined in YANG will have to adhere to this on-the-wire format. In order to leverage existing XML tools for validating NETCONF data in various contexts and also facilitate exchange of data models, the WG will define mapping rules from YANG to the DSDL data modeling framework (ISO/IEC 19757) with additional annotations to preserve semantics. The initial YANG mapping rules specifications are expressly defined for NETCONF modeling. However, there may be future areas of applicability beyond NETCONF, and the WG must provide suitable language extensibility mechanisms to allow for such future work. The NETMOD WG will only address modeling NETCONF devices and the language extensibility mechanisms... Initial deliverables: (1) An architecture document explaining the relationship between YANG and its inputs and outputs; (2) The YANG data modeling language and semantics; (3) Mapping rules of YANG to XML instance data in NETCONF; (4) YIN, a semantically equivalent fully reversible mapping to an XML-based syntax for YANG. YIN is simply the data model in an XML syntax that can be manipulated using existing XML tools (e.g., XSLT); (5) Mapping rules of YANG to DSDL data modeling framework (ISO/IEC 19757), including annotations for DSDL to preserve top-level semantics during translation; (6) A standard type library for use by YANG. The IESG has not made any determination as yet; please send your comments to the IESG mailing list by April 22, 2008.
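
To make the idea of a canonical mapping from a data model to an XML instance document more concrete, here is a toy sketch in Python. It is not YANG and does not reproduce the WG's mapping rules, which the charter only proposes to define; the model format (nested dicts with "container" and "leaf" nodes), the namespace URI, and the function name to_instance are all invented for illustration. The one idea it borrows from the text is that the model, not the input data, decides which elements may appear and in what order.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace and model, invented for illustration only.
EXAMPLE_NS = "urn:example:params:xml:ns:toy-interfaces"

TOY_MODEL = {
    "kind": "container",
    "name": "interface",
    "children": [
        {"kind": "leaf", "name": "name", "type": str},
        {"kind": "leaf", "name": "mtu", "type": int},
        {"kind": "leaf", "name": "enabled", "type": bool},
    ],
}


def to_instance(node, data, ns=EXAMPLE_NS):
    """Map data onto an XML element tree, following the model's node order."""
    element = ET.Element(f"{{{ns}}}{node['name']}")
    if node["kind"] == "leaf":
        # Exact type match, so that a bool is not accepted where an int is expected.
        if type(data) is not node["type"]:
            raise TypeError(f"leaf {node['name']!r} expects {node['type'].__name__}")
        element.text = str(data).lower() if isinstance(data, bool) else str(data)
        return element
    unknown = set(data) - {child["name"] for child in node["children"]}
    if unknown:
        raise KeyError(f"not defined in the model: {sorted(unknown)}")
    for child in node["children"]:  # model order, not dict order
        if child["name"] in data:
            element.append(to_instance(child, data[child["name"]], ns))
    return element


if __name__ == "__main__":
    tree = to_instance(TOY_MODEL, {"mtu": 1500, "enabled": True, "name": "eth0"})
    print(ET.tostring(tree, encoding="unicode"))
```

Because the output order comes from the model, two callers supplying the same values in different orders produce the same document, which is what makes a mapping of this kind canonical.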

Don't Be Surprised By E-Discovery
John Moore, Federal Computer Week

E-discovery requires government agencies to know what electronic documents they have and be able to find them quickly if someone requests them for a court case. That's no small task considering the enormous volume of electronic documents created by the typical organization. Email messages and attachments represent a good chunk of the problem, but word-processing documents, PDFs and other digital information also contribute to the management challenge. The amended Federal Rules of Civil Procedure, which have heightened awareness of e-discovery, cover a wide range of data types under the umbrella of electronically stored information... E-discovery experts recommend establishing a taxonomy and creating metadata tags for electronic information. The taxonomy provides a general way to classify information, and metadata provides detail on information to make searches more fruitful. The Electronic Discovery Reference Model project devised an Extensible Markup Language (XML) schema to consistently describe electronic information. [Penny] Quirk said EDRM created the XML e-discovery standard to ensure that consistent and common nomenclature is used for business records during the e-discovery process; the project is scheduled for completion in this year's second quarter... Electronic documents culled in e-discovery and used in litigation demand special treatment: documents compiled in significant cases at the Justice Department are kept as permanent records of the government. Records in garden-variety cases in federal court are considered temporary, but they might still be housed for a number of years at one of the National Archives' Federal Records Centers. The National Archives tapped Lockheed Martin in 2005 to build an Electronic Records Archives system that will help the agency ingest electronic records flagged for permanent storage; the aim now is to accept government records in any format, encapsulating each electronic document in an XML metadata wrapper.

See also: the EDRM XML Project

The Spirit of Schematron in Test Driven Development (TDD)
Eric Larson, O'Reilly Articles

Test Driven Development is a relatively popular methodology nowadays and I think XML tools can play a crucial role in better testing. Testing frameworks are more than capable of using and testing XML based applications, but just in case you have ever had trouble, here are a few tips. XSLT makes for an excellent transformation tool for massaging XML data. This means it also can be a helpful tool to reduce large XML data sets to something manageable, whether it is XML or not. For example [see the] simple XSLT stylesheet that will return content on errors checking an Atom Feed, which is exceptionally simple, but hopefully it makes the point. In the example, you'll also notice that the output was not contained in an XML element. Sometimes it is easier to just parse a simple text file line by line, so this might be that situation. Likewise, having a designated set of test elements could be helpful (think reports transformed to HTML). That said, the goal is not to create some enormous test framework in XML and XSLT. The real goal is to use a great tool for transforming XML to something you can use easily. I wouldn't necessarily suggest trying to validate the content of an element or do complex string parsing. XSLT 1.0 isn't really the easiest language for string parsing or complex math without a little help. You can always add your own extension functions to help out, but hopefully keeping things simple by massaging the data gets you 80% of the way. The idea here is to make things palatable to your own tastes... I like XML, but I hate XML Schema and DTDs. RELAX NG is a slightly better option, but when you just want to make sure some value is present, the above methods can be a simpler solution. The essence of the above suggestions comes from Schematron, an excellent validation tool that is as simple as knowing XPath. Schematron in fact has been implemented using XSLT, so adding it to your existing test framework should be relatively simple. There are times when XML seems to present a subtle problem within the world of object oriented languages. It's not a hard problem on a technical level. Working with XML is relatively simple with many examples and resources. Things get hard when you don't have good tools to help you along the way. [There is no need to confine] the XML landscape to your programming language of choice when XML has more than enough tools to seamlessly integrate testing your XML alongside your models, views, controllers and integrations.
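
The presence checks described above can be sketched in a few lines. Note the substitution: the article uses an XSLT stylesheet, while this sketch uses Python's standard ElementTree module and its limited XPath subset, so it is an illustration of the Schematron-style idea (a context, a test, a message) and not the author's stylesheet. The rules, the sample feed, and the name check_feed are assumptions made for the example. As in the article, the output is plain text lines that a test can read one by one, and only the presence of a value is checked, not its content.

```python
import xml.etree.ElementTree as ET

NS = {"atom": "http://www.w3.org/2005/Atom"}

# (context, test, message): for every element matching `context`,
# the path `test` must select at least one node. Rules are illustrative.
RULES = [
    (".", "atom:title", "feed has no title"),
    (".", "atom:id", "feed has no id"),
    ("atom:entry", "atom:id", "entry has no id"),
    ("atom:entry", "atom:title", "entry has no title"),
    ("atom:entry", "atom:updated", "entry has no updated timestamp"),
]

# Hypothetical feed whose second entry lacks an id and an updated element.
BROKEN_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example</title>
  <id>urn:example:feed</id>
  <entry>
    <title>First</title>
    <id>urn:example:1</id>
    <updated>2008-04-15T12:00:00Z</updated>
  </entry>
  <entry>
    <title>Second</title>
  </entry>
</feed>
"""


def check_feed(feed_xml):
    """Return one plain-text line per failed presence check."""
    root = ET.fromstring(feed_xml)
    errors = []
    for context, test, message in RULES:
        for position, node in enumerate(root.findall(context, NS), start=1):
            if node.find(test, NS) is None:
                errors.append(f"{context}[{position}]: {message}")
    return errors


if __name__ == "__main__":
    for line in check_feed(BROKEN_FEED):
        print(line)
```

In a test suite, asserting that check_feed returns an empty list for generated output keeps the XML checks alongside the other unit tests.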

See also: Schematron references

Eclipse-Compatible User Interface Included On Mule 2.0 ESB
Charles Babcock, InformationWeek

MuleSource, the company behind the open source enterprise service bus Mule, has expanded Mule's capabilities in Version 2.0, giving it an integrated development environment that works inside Eclipse. The integrated development environment in Version 2.0 allows a Java developer to create connectivity between his application and existing applications in his target environment. The IDE helps generate a Mule configuration file for any project being built inside the Eclipse programmer's workbench. Developers can drag and drop the attributes, the type of messaging and the origin and destination of the messages, in the configuration file using the options available in the IDE's editor. The messages are typically XML and developers may view the XML source code as well as work with visual elements... Web services registries typically capture Web Services Description Language (WSDL) information on a service so it can be discovered by someone on the network who wants the service. In addition to WSDL service names, Galaxy can capture and index additional information for quick reference, such as annotations that describe Java interfaces, telling an outside user how to connect to a Java service; summary information on Windows Communication Foundation, a set of .Net technologies for building Web services; and implementations of the Internet Engineering Task Force's Atom content syndication protocol. Mule Galaxy also draws on the work of the Apache incubator project, CXF, aimed at connecting services to a variety of protocols, such as SOAP, XML/HTTP, and RESTful HTTP. MuleSource now hosts a new Web site, MuleForge.org, to encourage additional open source participation in Galaxy, Apache CXF Transport, Jersey Transport for building RESTful Java services, and other projects.

