Cover Pages: XML Daily Newslink: Thursday, 03 September 2009

A Cover Pages Publication http://xml.coverpages.org/
Provided by OASIS and Sponsor Members
Edited by Robin Cover

This issue of XML Daily Newslink is sponsored by:
Microsoft Corporation http://www.microsoft.com

Headlines

XML Threatens Big Data?
NIST/NSA Survey of Access Control Models
Extensible Resource Descriptor (XRD) Version 1.0
Generate PDFs with XStream and XSL-FO
Updated IETF Internet Draft: vCard Extensions to WebDAV (CardDAV)
International RuleML Symposium on Rule Interchange and Applications
Units Ontology with Support for SPIN Framework
AJAX Widget Security Enabled
Instant Notifications Using Google's PubSubHubbub Protocol

XML Threatens Big Data?
Kurt Cagle, XML Today Blog

"A Dataspora Blog article entitled 'How XML Threatens Big Data' argues that XML is a poor format for working with big data formats (financial, governmental, business, what have you). Specifically, it sets out three major problems with XML: (1) XML Spawns Data Bureaucracy; (2) Size Matters for Data; (3) Complexity Carries a Cost... Evidence: an attempt to take a several hundred megabyte XML datafile coming from the Genome project and load it into an Oracle relational database with an XML front end, which ultimately required parsing the file first with Perl and then building up specialized subparsers. Not surprisingly, this process was slow and unworkable, and as such, XML must be seen as the culprit... But I don't buy it; let's address each of the perceived problems...

XML is ultimately about data interchange. If you are the only one using your XML, then sure, go ahead and build ad hoc data models. However, the moment that you have to share your data with other consumers of that data, lazy data modeling is going to bite you hard. A much better philosophy, especially as the world moves to a distributed processing environment, is to use resource oriented programming (aka RESTful services, XRX, MODS and so on). This is becoming especially significant in the XML space because of the rise of both XQuery and XML Databases, but it can be applied to other environments as well...

XML has evolved dramatically since 2000. Related technologies, such as RDF/OWL, have adapted even more. While there are problems with XML, ones even known to practitioners of long standing in the field, these problems are intrinsic to most computer languages. There are places where XML shouldn't be used, is overkill, and where other solutions are better. However, overall, most of the problems with people working with XML come down to laziness, ignorance, and parochialism. The problems outlined in the original article were legitimate in 2000. Today, there are a large number of both commercial vendors and open source projects providing very sophisticated tools, from XML Databases and XQuery to XSLT 2.0, firmware SAX processors and any number of XML renderers. There's a far better understanding of data modeling and data architecture, and new views about how we effectively describe reality via computational linguistics..."

See also: the XML-DEV list thread

NIST/NSA Survey of Access Control Models
Gunnar Peterson, Blog

"NIST and the NSA are holding a Privilege (Access) Management Workshop, [and] one of the docs contains a very short, sharply focused survey of Access Control Models. The authors have clearly been there, done that. It begins with ACLs, which are resource focused, then looks at the next evolution, RBAC (followed by ABAC, PBAC, and RAdAC). Now I have seen several large companies that look at RBAC as a silver bullet. It's a useful model but it's no silver bullet. Then getting more modern, the survey examines Attribute Based Access Control (ABAC) and Policy Based Access Control...

A key advantage to the ABAC model is that there is no need for the requester to be known in advance to the system or resource to which access is sought. As long as the attributes that the requestor supplies meet the criteria for gaining entry, access will be granted. Thus, ABAC is particularly useful for situations in which organizations or resource owners want unanticipated users to be able to gain access as long as they have attributes that meet certain criteria. This ability to determine access without the need for a predefined list of individuals that are approved for access is critical in large enterprises where the people may join or leave the organization arbitrarily...

Many access control models look great on the whiteboard and then fall down when they meet reality simply because the model assumes way too much a priori knowledge. Of course, ABAC introduces many wormholes that attacks can flow through. On PBAC, there's a good summary of why we need it (improve governance and reduce wormholes in ABAC), what standards are there to help (XACML), and where work is still needed..."
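[Editor's illustration: the ABAC idea described in the survey excerpt can be sketched in a few lines of Python. This is a minimal sketch, not anything taken from the NIST/NSA document: the `Rule` class, the `is_access_granted` function, the attribute names, and the sample policy are all hypothetical. The point it shows is that the decision depends only on the attributes the requester supplies, with no predefined list of approved individuals.]

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Attributes = Dict[str, Any]


@dataclass(frozen=True)
class Rule:
    """One named condition that the requester's attributes must satisfy."""
    name: str
    condition: Callable[[Attributes], bool]


def is_access_granted(attributes: Attributes, rules: List[Rule]) -> bool:
    """Grant access only if every rule holds for the supplied attributes.

    No list of known users is consulted. An empty rule list denies access,
    so a resource with no policy is closed by default.
    """
    return bool(rules) and all(rule.condition(attributes) for rule in rules)


# Hypothetical policy for a hypothetical resource.
REPORT_RULES = [
    Rule("is_clinician", lambda a: a.get("role") == "clinician"),
    Rule("has_clearance", lambda a: a.get("clearance", 0) >= 2),
]

# A requester the system has never seen before, described only by attributes.
visitor = {"role": "clinician", "clearance": 3}
print(is_access_granted(visitor, REPORT_RULES))
```

[The same structure also hints at the "wormhole" concern raised in the excerpt: the decision is only as trustworthy as the attributes presented, so a real deployment would need to verify who asserted them.]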

Extensible Resource Descriptor (XRD) Version 1.0
Eran Hammer-Lahav and Will Norris (eds), OASIS TC Working Draft

Members of the OASIS Extensible Resource Identifier (XRI) Technical Committee have produced an interim (unapproved) Working Draft 05 for the "Extensible Resource Descriptor (XRD) Version 1.0" specification. XRD is a simple generic format for describing resources. This draft is intended to supersede the "Extensible Resource Identifier (XRI) Resolution Version 2.0" Committee Draft 03, published in February 2008. The draft reflects all changes discussed in the TC by September 01, 2009, and "as far as the editors are concerned, this is the final working draft..."

The XRD specification defines a "generic format for describing resources. Resource descriptor documents provide machine-readable information about resources (resource metadata) for the purpose of promoting interoperability and assisting in interacting with unknown resources that support known interfaces. For example, a web page about an upcoming meeting can provide in its descriptor document the location of the meeting organizer's free/busy information to potentially negotiate a different time. The descriptor for a social network profile page can identify the location of the user's address book as well as accounts on other sites. A web service implementing an API with optional components can advertise which of these are supported...

An XRD document may describe the properties of the resource itself, as well as the relationship the resource has with other resources. XRD builds directly on the typed link relations framework defined by IETF Internet Draft "Link Relations and HTTP Header Linking" (aka "Web Linking"), and used by HTML, Atom, and other protocols. The XRD schema defines only the basic elements necessary to support the most common use cases, with the explicit intention that applications will extend XRD as defined in Section 2.5, 'XRD Extensibility' to include any other metadata about the resources they describe..."
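[Editor's illustration: a minimal Python sketch of how a consumer might read typed links from a descriptor, using the meeting example above. The namespace URI and the `Subject` and `Link` element shapes (with `rel` and `href` attributes) are assumptions modeled on the later published XRD 1.0 schema and may not match Working Draft 05. The relation URIs, the sample document, and the helper names `subject_of` and `links_by_relation` are hypothetical.]

```python
import xml.etree.ElementTree as ET

# Assumption: namespace as in the later published XRD 1.0 schema.
XRD_NS = "http://docs.oasis-open.org/ns/xri/xrd-1.0"

# Hypothetical descriptor for a web page about an upcoming meeting.
SAMPLE_XRD = """\
<XRD xmlns="http://docs.oasis-open.org/ns/xri/xrd-1.0">
  <Subject>http://example.org/meetings/42</Subject>
  <Link rel="http://example.org/rel/freebusy"
        href="http://example.org/organizer/freebusy"/>
  <Link rel="http://example.org/rel/agenda"
        href="http://example.org/meetings/42/agenda"/>
</XRD>
"""


def subject_of(xrd_xml: str):
    """Return the identifier of the resource the descriptor describes."""
    root = ET.fromstring(xrd_xml)
    return root.findtext(f"{{{XRD_NS}}}Subject")


def links_by_relation(xrd_xml: str) -> dict:
    """Group link targets by their relation type."""
    root = ET.fromstring(xrd_xml)
    links = {}
    for link in root.findall(f"{{{XRD_NS}}}Link"):
        rel, href = link.get("rel"), link.get("href")
        if rel and href:
            links.setdefault(rel, []).append(href)
    return links


print(subject_of(SAMPLE_XRD))
print(links_by_relation(SAMPLE_XRD))
```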

[Update: Working Draft 06 (04-September-2009).]

Generate PDFs with XStream and XSL-FO
Brian J. Stewart, IBM developerWorks

"Many business applications require creation of a PDF document consisting of data stored in Java business objects. You can best think of these PDF documents as a view of the business data: This view, including the layout and structure, should be easily changeable and loosely tied to the business objects. This article provides a solution to this common business problem using XML, XStream, and Extensible Stylesheet Language Formatting Objects (XSL-FO)... The separation of concerns allows you to isolate the view from the business objects, thus you can change the view (PDF document) without having to modify the Java code.

XStream is a simple but powerful library that enables you to serialize and de-serialize objects to and from XML. XStream's power comes from its flexibility, simplicity, speed, low memory usage, low overhead, and its control over the XML the library produces. Another key feature of the XStream library is its support for processing of deep object graphs, such as a catalog of CD objects housing tracks that contain track information. XStream doesn't require any changes to existing business objects unless you want to take advantage of its Java annotation support.

The next building block is XSLT, which allows you to transform a structured XML document into various output formats, such as XML and HTML. XSLT is a complex and robust XML-based language that has several built-in functions, for example, string functions and formatting functions. It also uses XPath extensively to query and select XML nodes..."

Updated IETF Internet Draft: vCard Extensions to WebDAV (CardDAV)
Cyrus Daboo (ed), IETF Internet Draft

Members of the IETF vCard and CardDAV (VCARDDAV) Working Group have released an updated version of the specification "vCard Extensions to WebDAV (CardDAV)." As summarized in document Appendix A (Change History), several version -08 changes have been made in light of IETF Area Director Review by Alexey Melnikov.

The WebDAV standard ("HTTP Extensions for Web Distributed Authoring and Versioning") defines a set of methods, headers, and content-types ancillary to HTTP/1.1 for the management of resource properties, creation and management of resource collections, URL namespace manipulation, and resource locking (collision avoidance). WebDAV uses XML for property names and some values, and also uses XML to marshal complicated requests and responses. The RFC specification contains DTD (grammar) and text definitions of all properties and all other XML elements used in marshalling. WebDAV also includes a few special rules on extending WebDAV XML marshalling in backwards-compatible ways...

WebDAV offers a number of advantages as a framework or basis for address book access and management. Most of these advantages boil down to a significant reduction in design costs, implementation costs, interoperability test costs and deployment costs.

The "vCard Extensions to WebDAV (CardDAV)" specification defines extensions to the WebDAV protocol to specify a standard way of accessing, managing, and sharing contact information based on the vCard format. Address books containing contact information are a key component of personal information management tools, such as email, calendaring and scheduling, and instant messaging clients. To date several protocols have been used for remote access to contact data, including Lightweight Directory Access Protocol (LDAP), Internet Message Support Protocol (IMSP) and Application Configuration Access Protocol (ACAP), together with SyncML used for synchronization of such data... a CardDAV address book is modeled as a WebDAV collection with a well defined structure; each of these address book collections contains a number of resources representing address objects as its direct child resources. Each resource representing an address object is called an "address object resource". Each address object resource and each address book collection can be individually locked and have individual WebDAV properties..."
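[Editor's illustration: because an address book is modeled as a WebDAV collection whose direct children are the address object resources, a client could list them with an ordinary WebDAV PROPFIND at depth 1. The Python sketch below only builds the XML request body and headers; it sends nothing. The helper names `build_propfind` and `propfind_headers` are hypothetical. Only generic WebDAV properties in the `DAV:` namespace are requested, and no CardDAV-specific element names are assumed.]

```python
import xml.etree.ElementTree as ET

DAV_NS = "DAV:"
ET.register_namespace("D", DAV_NS)


def build_propfind(property_names):
    """Build a PROPFIND body asking for the named DAV: properties."""
    root = ET.Element(f"{{{DAV_NS}}}propfind")
    prop = ET.SubElement(root, f"{{{DAV_NS}}}prop")
    for name in property_names:
        ET.SubElement(prop, f"{{{DAV_NS}}}{name}")
    return ET.tostring(root, encoding="unicode")


def propfind_headers(depth="1"):
    """Headers for the request; Depth 1 covers the collection and its direct children."""
    return {
        "Depth": depth,
        "Content-Type": 'application/xml; charset="utf-8"',
    }


body = build_propfind(["getetag", "resourcetype"])
print(body)
print(propfind_headers())
```

[A client would send this body with the PROPFIND method to the address book collection URL; the multistatus response would then carry one entry per address object resource.]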

International RuleML Symposium on Rule Interchange and Applications
Organizers, RuleML-2009 Symposium Update

A revised call for participation has been issued for RuleML 2009, the International RuleML Symposium on Rule Interchange and Applications, to be held November 5-7, 2009 in Las Vegas, Nevada, USA. "The International Symposium on Rules, Applications and Interoperability has evolved from an annual series of international workshops since 2002, international conferences in 2005 and 2006, and international symposia since 2007. RuleML-2009 is devoted to practical distributed rule technologies and rule-based applications which need language standards for rules (inter)operating in, e.g., the Semantic Web, Multi-Agent Systems, Event-Driven Architectures, and Service-Oriented Applications. The submission deadline is Friday, September 11, 2009.

The Third International Rule Challenge at RuleML 2009 is one of the highlights at RuleML-2009 with prestigious prizes. Submissions of benchmarks/evaluations, demos, case studies / use cases, experience reports, best practice solutions (e.g. design patterns, reference architectures, models), rule-based implementations/ tools/ applications, demonstrations engineering methods, implementations of rule standards (e.g. RuleML, RIF, SBVR, PRR, rule-based Event Processing languages, BPMN+rules, BPEL+rules, ...), rules + industrial standards (e.g. XBRL, MISMO, Accord, ...), and industrial problem statements are particularly encouraged..."

Units Ontology with Support for SPIN Framework
Holger Knublauch, Blog

"My co-workers at TopQuadrant have just published a new OWL ontology about Quantities, Units, Dimensions and Datatypes (QUDT). This is a result of a long-term, ongoing project with NASA Ames, and our friends at NASA have permitted us to publish those ontologies to encourage wider use outside of NASA. The QUDT ontology is very carefully designed and provides comprehensive coverage of almost every unit of measurement...

Each unit has a stable URI, making it possible to link to it from your own domain models in a reliable way. For each unit, the ontology defines some useful metadata including abbreviation, a link to DBpedia and a categorization of units into groups, such as length units... I think this units ontology can fill an important gap in the current Semantic Web and Linked Data efforts. Numeric data without any formalized units is pretty useless for machines, and sometimes even for humans...

There are two main ways of using the units ontology: you can use the unit resources to 'annotate' your properties with a dedicated property such as qud:units. The values of your property would use built-in datatypes such as 'xsd:double'. The other alternative is to embed the unit directly into the literals. For this use case, all units have also been declared to be 'rdfs:Datatypes'. This makes it possible to assign units as rdfs:ranges of a property..."

AJAX Widget Security Enabled
Paul Krill, InfoWorld

"In an upgrade to one of its core technologies, the OpenAjax Alliance, an industry group formed to boost interoperability in the AJAX space, on Monday is offering OpenAjax Hub 2.0, featuring capabilities for secure interaction between JavaScript widgets. The Hub 2.0 specification defines standardized JavaScript APIs for secure mashups and offers cross-vendor interoperability among mashup tools and components. It isolates third-party widgets in secure sandboxes and mediates messages between widgets using a security manager...

A Web site, for example, could house a third-party calendar widget that might be malicious or have vulnerabilities to site hijacking. Hub 2.0 prevents attacks by isolating untrusted widgets from the main application and other widgets. User credentials access is prevented..."

Instant Notifications Using Google's PubSubHubbub Protocol
Abel Avram, InfoQ

Google 'pubsubhubbub' "is a simple, open, web-hook-based pubsub protocol and open source reference implementation.

This protocol allows interested parties to get instant notifications when a feed is updated. The protocol was developed by Google and it can be found under the Google Code project with the same name. Instead of a client constantly polling a server at regular time intervals in order to find out if the feed has been updated, the PubSubHubbub protocol turns the pulling approach into a pushing one. The client subscribes to a Hub and it is almost instantly notified when the feed is updated.

Google has implemented the protocol for several of their services including FeedBurner, Reader (shared items), Blogger and, lately, Alerts. The protocol is open, licensed under Apache License 2.0, so anyone could use it. Furthermore, the hubs can be run on any server, they don't need to run on Google's App Engine. Google has created a reference implementation of a Hub that can be used to test the publishing/subscribing process to see how it works..."

According to the published documentation: "The protocol in a nutshell is as follows: (1) A feed URL (a 'topic') declares its Hub server(s) in its Atom or RSS XML file... The hub(s) can be run by the publisher of the feed, or can be a community hub that anybody can use (Atom and RSS feeds are supported); (2) A subscriber (i.e., a server that's interested in a topic), initially fetches the Atom URL as normal. If the Atom file declares its hubs, the subscriber can then avoid lame, repeated polling of the URL and can instead register with the feed's hub(s) and subscribe to updates; (3) The subscriber subscribes to the Topic URL from the Topic URL's declared Hub(s); (4) When the Publisher next updates the Topic URL, the publisher software pings the Hub(s) saying that there's an update; (5) The hub efficiently fetches the published feed and multicasts the new/changed content out to all registered subscribers..."
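[Editor's illustration: steps (1) through (3) above can be sketched in Python from the subscriber's side. This is a rough sketch under stated assumptions, not the reference implementation. The `hub.*` form parameter names follow the draft protocol specification as the editor understands it and should be checked against the current text. The sample feed, the URLs, and the helper names `find_hubs` and `subscription_body` are hypothetical. Nothing is sent over the network.]

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

ATOM_NS = "{http://www.w3.org/2005/Atom}"

# Hypothetical Atom feed that declares its hub.
SAMPLE_FEED = """\
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example feed</title>
  <link rel="hub" href="https://hub.example.org/"/>
  <link rel="self" href="https://example.org/feed.atom"/>
</feed>
"""


def find_hubs(atom_xml: str) -> list:
    """Steps 1-2: read the feed and collect the hub URLs it declares."""
    root = ET.fromstring(atom_xml)
    return [
        link.get("href")
        for link in root.findall(f"{ATOM_NS}link")
        if link.get("rel") == "hub" and link.get("href")
    ]


def subscription_body(topic_url: str, callback_url: str) -> str:
    """Step 3: form-encoded body a subscriber would POST to a hub.

    Assumption: parameter names as in the draft protocol specification.
    """
    return urlencode({
        "hub.mode": "subscribe",
        "hub.topic": topic_url,
        "hub.callback": callback_url,
        "hub.verify": "async",
    })


hubs = find_hubs(SAMPLE_FEED)
body = subscription_body("https://example.org/feed.atom",
                         "https://subscriber.example.net/callback")
print(hubs)
print(body)
```

[Steps (4) and (5) happen on the publisher and hub side: once subscribed, the subscriber waits for the hub to deliver new content to its callback URL instead of polling the feed.]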

See also: the Google Code project

