This issue of XML Daily Newslink is sponsored by:
IBM Corporation http://www.ibm.com
- SML/SML-IF Service Modeling Standards Extend Reach of XML Family
- Trusted Identity for All: Toward Interoperable Trusted Identity Management Systems
- Live Distributed Objects: Edge Mashups for Service-Oriented Collaboration
- Keys Don't Grow in Threes
- Report: Improving Access to Government through Better Use of the Web
- Distribution Format Exchange Profile (DFXP) for Timed Text Authoring
- VC-Filter: Using the Same XSD Schema with Different Schema Processors
- JavaServer Faces Version 2: Streamline Web Application Development
- The Bold and the Beautiful: Two New Drafts for HTML 5
- Trustworthy Voting: From Machine to System
SML/SML-IF Service Modeling Standards Extend Reach of XML Family
Staff, W3C Announcement
W3C has announced the publication of new standards that make it possible to use XML tools to improve the quality of increasingly sophisticated systems and services built from the XML family of standards. Now developers can validate sets of XML documents, either in place, using Service Modeling Language 1.1 (SML), or as a package, using SML Interchange Format 1.1 (SML-IF). Validity constraints are expressed using a flexible combination of XML Schema and ISO Schematron, extended for cross-document use. SML "defines extensions to the W3C XML Schema language by adding support for inter-document references and user-defined constraints. This combination of features is very useful in building complex multi-document models that capture structure, constraints, and relationships. In the management domain, these models are typically used to automate configuration, deployment, monitoring, capacity planning, change verification, desired configuration management, root-cause analysis for faults, etc. The facilities defined by this Working Group are expected to be of general use with arbitrary XML vocabularies, but the first major use of SML will be to model the structure, relationships, and constraints for complex information technology services and systems."
To illustrate what SML adds to the XML ecosystem, consider what happens when someone purchases an airline ticket. Suppose the reservation information is stored as an XML document that includes passenger information. The reservation also refers to a second XML document that stores departure time and other information about the flight. One department manages customer information, another manages flight information. Before any transaction with the customer, the airline wants to ensure that the system as a whole is valid. SML allows the airline to verify that certain constraints are satisfied across the reservation and flight data. This makes it easier to manage inconsistencies, and to do so without writing custom code. As a result, the airline lowers the cost of managing tasks such as informing passengers when flight times change. An organization may also find that it needs to apply additional constraints when using data in a particular context, for example because of local laws. Developers can use SML to layer on context-specific constraints without duplicating content.
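The kind of cross-document constraint described in the airline example can be sketched in a few lines of Python. This is not SML or Schematron; it is hand-written code of the sort that SML is meant to replace with declarative rules. The element names (reservation, flightRef, checkInBy, flight, departure) and the sample data are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical documents: all element and attribute names are assumptions.
RESERVATION_XML = """
<reservation id="R-1001">
  <passenger>Ada Example</passenger>
  <flightRef>F-200</flightRef>
  <checkInBy>2009-05-20T08:00</checkInBy>
</reservation>
"""

FLIGHTS_XML = """
<flights>
  <flight id="F-200"><departure>2009-05-20T09:30</departure></flight>
  <flight id="F-201"><departure>2009-05-21T14:00</departure></flight>
</flights>
"""


def check_reservation(reservation_xml, flights_xml):
    """Return a list of violated cross-document constraints (empty if valid)."""
    reservation = ET.fromstring(reservation_xml)
    flights = {f.get("id"): f for f in ET.fromstring(flights_xml).iter("flight")}
    problems = []

    # Constraint 1: the inter-document reference must resolve.
    ref = reservation.findtext("flightRef")
    flight = flights.get(ref)
    if flight is None:
        problems.append(f"flightRef {ref!r} does not resolve to a flight")
        return problems

    # Constraint 2: a rule spanning both documents.
    # Timestamps in this fixed ISO 8601 layout compare correctly as strings.
    if reservation.findtext("checkInBy") >= flight.findtext("departure"):
        problems.append("check-in deadline is not before departure")
    return problems


print(check_reservation(RESERVATION_XML, FLIGHTS_XML))  # []
```

With SML, both checks would be stated as constraints attached to the model rather than coded by hand, so a change in the flight document could be revalidated against every reservation that refers to it.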
See also: the W3C announcement
Trusted Identity for All: Toward Interoperable Trusted Identity Management Systems
Piotr Pacyna, Anthony Rutkowski, Amardeo Sarma, Kenji Takahashi; IEEE Computer Guest Editorial
The May 2009 issue of IEEE Computer Magazine focuses on identity management, and investigates prospective options for the reliable and secure creation, storage, sharing, and use of personally identifiable information pertaining to digital identity across administrative domains, focusing specifically on the interoperability of IdM systems. It also looks at an electronic voting system that is both practical and resistant to tampering as well as two proposed star topologies developed to improve the robustness of communications systems.
From the Guest Editors' Introduction: "Cyberspace, as opposed to the real world, is characterized by the disappearance of the boundaries that served in the past as natural protection against accidental leakage of personal information. At minimum, these boundaries reduced the impact of privacy breaches by ill-intentioned individuals. The trend of moving everyday life and business activities to the Internet domain exposes people more than ever before. Today, the largest network infrastructure in the world that involves significant identity management (IdM) dimensions supports mobile phones. The availability of inexpensive and understandable technology for data interception and data mining makes it easy to correlate and exploit identity data for malicious purposes. In fact, the problem of identity theft has already elevated cybercrime to the top of the list of concerns for governments, organizations, and corporations, and now citizen awareness, and concern, is also growing. These concerns are well founded and supported by the evidence, which shows that ubiquitous deployment of systems for collecting, processing, and sharing personally identifiable information for service delivery makes every person vulnerable to online theft of identity data, thus undermining confidence in information technologies. This progression of concern has led to the expansion of research on several digital IdM topics, including improving trustworthiness and privacy.
The flip side of the privacy issue is that public policy requirements sometimes require the converse capability. Currently, identity management systems (IdMSs) are being built for specific purposes in the private space of corporations, in public-sector systems, and in federal and local government systems, with the objective of allowing for the controllable disclosure of identity data. These systems are often application-specific, centrally managed solutions designed with different assumptions and built with different goals in mind. As a result, they generally lack seamless interoperability. Many industry groups and standards organizations are working to develop IdMSs and standardize protocols for managing trusted attributes pertaining to identities that apply to various types of entities, such as service providers, customers, users, devices, and objects. Standardization bodies, industry, and forums such as the Liberty Alliance, OASIS, W3C, CA/Browser Forum, 3GPP, ISOC, ETSI, ITU-T, ISO, ATIS, and ANSI are developing IdM frameworks and protocols. They have developed IdM architectures, systems, and enabling technologies for managing identifiers and attributes, ranging from application-controlled systems, through service-provider-centric systems and network-operator-centric solutions to user self-managed attribute control. The results are represented by, for example, Liberty/SAML, WS-Trust, OpenID, Eclipse (Higgins project), and Shibboleth. At the heart of these solutions is the technology for attribute authentication, assertion processing, and trusted exchange of identity claims. The technology is augmented with nontechnical means, such as best practices and policies introduced by governments and enterprises, which depend on the application areas."
Live Distributed Objects: Edge Mashups for Service-Oriented Collaboration
Ken Birman, Jared Cantwell, Daniel Freedman (et al.), IEEE Computer
Live Distributed Objects is a new programming model and a platform, in which instances of distributed protocols are modeled as components with an object-oriented look and feel that can be composed in a type-safe manner to build complex distributed applications using a simple, intuitive drag-and-drop interface. The Live Distributed Objects platform makes it possible to combine hosted content with P2P protocols in a single object-oriented framework... Networked collaboration tools may be the key to slashing healthcare costs, improving productivity, facilitating disaster response, and enabling a more nimble information-aware military. Better applications could even make possible a world of professional dialog and collaboration without travel. We term such tools service-oriented collaboration (SOC) applications. SOC systems are more and more appealing because of the increasingly rich body of service-hosted content, such as electronic medical health records, data in various kinds of databases, image repositories, patient records, and weather prediction systems. They may also tap into sensors, medical devices, video cameras, microphones, and other real-world data sources. Many kinds of applications are constructed as mashups, in which data from various sources is combined in a single multilayered interactive GUI, and it may seem natural to use mashups to build SOC applications as well... Today's Web services standards are overly focused on the data-center side of the story. Not only are performance, scalability, and security all serious concerns, but the trend toward prebuilt minibrowsers with sophisticated but black-box behavior is making it increasingly difficult to combine information from multiple sources. SOC applications aren't at odds with Web services, but they do need something new...
The Live Objects platform solves these problems. Even a nonprogrammer can build a new SOC application, share it (perhaps via e-mail), and begin to collaborate instantly. Moreover, performance, scalability, and security can all be addressed... An end user creates a new SOC application by selecting components and combining them into a new mashup, using drag-and-drop. Our tools automatically combine references for individual objects into an XML mashup of references describing a graph of objects and type-check the graph to verify that the components compose correctly. For example, a 3D visualization of an airplane may need to be connected to a source of GPS and other orientation data, which in turn can only be used over a data replication protocol with specific reliability, ordering, or security properties... When activated on a user's machine, an XML mashup yields a graph of interconnected proxies. If needed, an object proxy can initialize itself by copying the state from some active proxy (our platform assists with this sort of state transfer). The object proxies then become active ('live'), for example, by relaying events from sensors into a replication channel or receiving events and reacting to them (such as redisplaying an aircraft). The Live Objects approach shares certain similarities with the existing Web development model, in the sense that it uses hierarchical XML documents to define the content. On the other hand, it departs from some of the de facto stylistic standards that have emerged...
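The idea of type-checking a graph of components before activating it can be illustrated with a toy model. The sketch below is not the Live Objects platform or its XML format; the component names, the "requires"/"provides" property sets, and the edge list are all hypothetical, chosen to mirror the airplane example.

```python
# Toy model: each component lists the properties it requires from the
# component it is plugged into, and the properties it provides to others.
# All names here are assumptions made for illustration.
COMPONENTS = {
    "airplane_view": {"requires": {"position_events"}, "provides": set()},
    "gps_feed": {"requires": {"reliable", "ordered"}, "provides": {"position_events"}},
    "replication_channel": {"requires": set(), "provides": {"reliable", "ordered"}},
}

# Each edge reads: (consumer, supplier).
EDGES = [
    ("airplane_view", "gps_feed"),
    ("gps_feed", "replication_channel"),
]


def type_check(components, edges):
    """Return (consumer, supplier, missing properties) for every bad edge."""
    errors = []
    for consumer, supplier in edges:
        missing = components[consumer]["requires"] - components[supplier]["provides"]
        if missing:
            errors.append((consumer, supplier, sorted(missing)))
    return errors


print(type_check(COMPONENTS, EDGES))  # []
```

If the replication channel offered only "reliable", the check would flag the edge from gps_feed to replication_channel as missing "ordered", which is the sort of mismatch a composition tool would want to catch before any proxy goes live.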
Live objects leverage Web services, but the examples we've given make it clear that the existing Web services standards don't go far enough. The main issue arises when components coexist in a single application. Just as services within a data center need to agree on their common language for interaction, and do so using Web services standards, components living within a SOC application running on a client platform will need to agree on the events and representation that the 'dialog' between them will employ. The decoupling of functionality into layers also suggests a need for a standardized layering: The examples above identify at least four: the visualization layer, the linkage layer that talks to the underlying data source, the update generating and interpreting layer, and the transport protocol. We propose using event-based interfaces to perform this decoupling — a natural way of thinking about components that dates back to Smalltalk and is common in modern platforms too, notably Jini...
See also: the Live Distributed Objects web site
Keys Don't Grow in Threes
Stephen Farrell, IEEE Internet Computing
Many Internet security mechanisms depend on the use of cryptographic algorithms for various forms of authentication and confidentiality. Even when well-known and standardized cryptographic algorithms are used in well-known protocols, some parameters must be specified, the most important of which are usually algorithm identifiers and key or hash output lengths. In this article I review some recent key length recommendations and compare those to current usage... Many Internet security mechanisms at some level use cryptography. Take, for example, accessing a secure Web site via an 'https://' URL, which involves running the HTTP protocol over the Transport Layer Security (TLS) protocol (also known as Secure Sockets Layer, SSL). This typically uses the Rivest, Shamir, and Adleman (RSA) asymmetric cryptographic algorithm to check the signature on the Web server certificate and to exchange a session key. In fact, that part of the protocol also requires a digest algorithm, such as the secure hash algorithm (SHA-1). After the key exchange, the browser and Web server use a symmetric algorithm such as the Advanced Encryption Standard (AES) to protect the information exchanged between the browser and the Web server. In this protocol as well as others, we often talk about using a suite of algorithms and sometimes call that ensemble a ciphersuite. For each of these cryptographic uses, we should be concerned about the cryptosystem's strength. Not only are there different algorithms that can be configured (for example, Diffie-Hellman key agreement is an alternative to RSA), many of the algorithms they use can support different key lengths, which have different strengths. By strength here, I mean something that indicates the amount of work that a knowledgeable and well-funded adversary would have to do to break the algorithm's security, as used in the protocol in question.
So, for example, if I could quickly factor any number that's roughly 2,100 bits long, then I could masquerade as almost any Internet bank using RSA. Thankfully, we don't think this is practical for any adversary at this time. These algorithms generally require a cryptographic key as input, or produce a certain size of output, and were those inputs or outputs very short, then the adversary probably wouldn't have much work to do at all.
For example, if an algorithm can only accept 40-bit keys, then the system has only 1,099,511,627,776 (2 to the power of 40) keys in total, which, although a large number, isn't large enough: an adversary could simply try all possible keys until one works—a so-called 'brute force' attack. Similarly, if a digest operation's output were small, say only 32 bits long, then an adversary could simply guess possible inputs until it found a matching output, probably after only roughly 65,000 attempts... The main point is that, as time goes on, what was once a reasonable ciphersuite could become insecure. And if we trade off performance or bandwidth for strength, as we usually do in any real system, it's quite likely that we'll have to increase the strengths used in a matter of years, not decades...
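The arithmetic behind these two figures is easy to reproduce. The sketch below computes the 40-bit key space and the square-root (birthday) estimate for a 32-bit digest, which is consistent with the "roughly 65,000" figure quoted above. The attacker speed of one billion guesses per second is an assumption chosen purely for illustration.

```python
# Brute force: a k-bit key is one of 2**k possible values.
key_bits = 40
key_space = 2 ** key_bits
print(f"{key_space:,}")  # 1,099,511,627,776

# Assumed attacker speed (illustrative only, not a measured figure).
guesses_per_second = 1_000_000_000
worst_case_minutes = key_space / guesses_per_second / 60
print(round(worst_case_minutes, 1))  # 18.3

# Digest output: by the birthday bound, a matching pair of outputs is
# expected after about 2**(n/2) attempts for an n-bit digest.
digest_bits = 32
birthday_bound = 2 ** (digest_bits // 2)
print(birthday_bound)  # 65536
```

Each extra key bit doubles the brute-force work, which is why moving from 40-bit to 128-bit keys changes the picture from minutes to an infeasible search under the same assumed speed.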
It looks like we have some work to do to update the set of ciphers used on the Internet, and, if we follow the recommendations [from NIST and other expert authorities], we should start in 2009 for deployment in 2010. Probably the first practical thing to do is to make an inventory of the use of cryptographic algorithms in your deployed environment. Although this sounds easy, in many real deployments, it could be time-consuming and messy. For example, various Web services might use load-sharing devices that actually terminate TLS connections, and it could be hard to inventory the set of services that use that TLS termination point. Similarly, virtual private network gateway devices might not easily list the set of algorithms they support (or getting that list might require work), and privileged access to the device in question could be difficult to obtain... The overall conclusion we can draw is that it's worthwhile for system and application administrators to take an inventory of their use of cryptography (or, more generally, of deployed security measures) and to maintain that inventory so that they can plan changes well in advance (or even in a hurry if the cryptographic sky does fall).
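As a very small starting point for such an inventory, the sketch below lists the TLS ciphersuites that the local Python installation would offer by default, using the standard library's ssl module. It covers only one client-side library on one machine, not load balancers, VPN gateways, or servers, and the output depends on the OpenSSL build in use.

```python
import ssl


def local_cipher_inventory():
    """List (name, protocol, strength_bits) for the default client context."""
    context = ssl.create_default_context()
    rows = []
    for cipher in context.get_ciphers():
        rows.append(
            (cipher["name"], cipher.get("protocol"), cipher.get("strength_bits"))
        )
    # Weakest suites first, so candidates for retirement are easy to spot.
    return sorted(rows, key=lambda row: (row[2] or 0, row[0]))


for name, protocol, bits in local_cipher_inventory():
    print(f"{str(bits):>4}  {str(protocol):<8} {name}")
```

A fuller inventory would repeat this kind of listing for every TLS termination point and record the results somewhere they can be compared against current key length recommendations.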
Report: Improving Access to Government through Better Use of the Web
Suzanne Acar, Jose M. Alonso, Kevin Novak (eds), W3C Interest Group Note
Members of the W3C eGovernment Interest Group have published an updated report Improving Access to Government through Better Use of the Web. Current Web technology allows governments to share with the public a variety of information in unlimited quantities on demand. Technology is also available to allow citizens to bring issues of concern to the attention of local, regional and national governments. However, exploiting these capabilities within government systems is a challenge that encompasses environmental, policy, legal, and cultural issues. Establishing effective eGovernment requires openness, transparency, collaboration and skill in taking advantage of the capabilities of the World Wide Web. The rich potential for two-way dialogue between citizens and government creates a need for global leadership. The W3C has an opportunity to provide guidance in support of eGovernment objectives by promoting existing open Web standards and noting the challenges external to the Web and technology. There is also a role for the W3C to facilitate the development and vetting of new open Web standards needed by governments in context. This document is an attempt to describe, but not yet solve, the variety of issues and challenges faced by governments in their efforts to apply 21st century capabilities to eGovernment initiatives. Details and useful examples of existing, applicable open Web standards are provided. Where government needs in the development of eGovernment services are not currently met by existing standards, those gaps are noted... The W3C eGov IG is currently working with, forming relationships with, or collaborating with governments and other organizations (The World Bank, EC, OECD, OAS, ICA, CEN and OASIS). Activities throughout the world on the issues, challenges, and work required to aid governments in achieving the eGovernment vision are consistently recognized by the eGov IG and its partners.
See also: the eGovernment Activity at W3C
Distribution Format Exchange Profile (DFXP) for Timed Text Authoring
Mike Dolan, Geoff Freed, Sean Hayes (et al., eds), W3C Technical Report
W3C announced that the Timed Text Working Group has published a Working Draft of Timed Text (TT) Authoring Format 1.0 - Distribution Format Exchange Profile (DFXP), updating an earlier version from 2006-11-16. Timed text is textual information that is intrinsically or extrinsically associated with timing information. The document specifies the distribution format exchange profile (DFXP) of the timed text authoring format (TT AF) in terms of a vocabulary and semantics thereof. The timed text authoring format is a content type that represents timed text media for the purpose of interchange among authoring systems. The Distribution Format Exchange Profile is intended to be used for the purpose of transcoding or exchanging timed text information among legacy distribution content formats presently in use for subtitling and captioning functions. In addition to being used for interchange among legacy distribution content formats, DFXP content may be used directly as a distribution format, for example, providing a standard content format to reference from a 'text' or 'textstream' media object element in a SMIL Version 2.1 document.
Use of DFXP is intended to function in a wider context of 'Timed Text Authoring and Distribution' mechanisms that are based upon a system model wherein the timed text authoring format serves as a bidirectional interchange format among a heterogeneous collection of authoring systems, and as a unidirectional interchange format to a heterogeneous collection of distribution formats after undergoing transcoding or compilation to the target distribution formats as required, and where one particular distribution format is DFXP.
A worked example in this document demonstrates the primary types of information that may be authored using DFXP: metadata, styling, layout, timing, and content. In typical cases, styling and layout information are separately specified in a document instance. Content information is expressed in a hierarchical fashion that embodies the organization of both spatial (flow) and timing information. Content makes direct or indirect references to styling and layout information and may specify inline overrides to styling. The first subtitle 'Subtitle 1' (Time Interval 0.76, 3.45) is presented during the time interval 0.76 to 3.45 seconds. This subtitle inherits its font family, font size, foreground color, and text alignment from the region into which it is presented. Since no region is explicitly specified on the paragraph, the nearest ancestor that specifies a region determines the region... During the next active time interval, two distinct subtitles are simultaneously active, with the paragraph expressing each subtitle using a different style that overrides color and paragraph text alignment of the default style.
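The timing behavior described here, one subtitle active at first and two overlapping later, comes down to interval membership. The sketch below models it without parsing DFXP markup. Only the interval for 'Subtitle 1' comes from the text above; the other two intervals are made-up values, and treating each interval as half-open is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    text: str
    begin: float  # seconds
    end: float    # seconds


def active_cues(cues, t):
    """Return the text of every cue whose interval [begin, end) contains t."""
    return [cue.text for cue in cues if cue.begin <= t < cue.end]


cues = [
    Cue("Subtitle 1", 0.76, 3.45),  # interval taken from the text
    Cue("Subtitle 2", 5.0, 10.0),   # assumed interval
    Cue("Subtitle 3", 7.5, 12.0),   # assumed interval, overlaps Subtitle 2
]

print(active_cues(cues, 1.0))  # ['Subtitle 1']
print(active_cues(cues, 4.0))  # []
print(active_cues(cues, 8.0))  # ['Subtitle 2', 'Subtitle 3']
```

A real DFXP processor would additionally resolve the region and inherited styles for each active paragraph; this sketch covers only the question of which paragraphs are active at a given time.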
See also: the W3C Activity Video in the Web
VC-Filter: Using the Same XSD Schema with Different Schema Processors
Michael Sperberg-McQueen, Black Mesa Technologies Blog
The most immediate and obvious benefit of the mechanism is to make it possible for XSD 1.1 processors to survive usefully in some future world where they may be confronted with schema documents from a later version of XSD. Or rather, to be more precise: the vc mechanism makes it possible for schema authors to write schema documents for (say) XSD 3.2 processors which will fall back to some graceful alternative when used by an XSD 1.1 processor. Or, if that is really not desirable, the mechanism makes it possible to make the 1.1 processor reject the schema document (just use some construct not legal in 1.1), so you can require a certain minimum level in the processors used for your schema document... Other specs I know of (XSLT 1.0 and 2.0, SVG Tiny, etc.) have similar mechanisms; I am glad that the XML Schema working group took the time to look at them and try to learn from them. In specifying the vc mechanism we tried to keep the mechanism simple (although the temptation to add features can be very strong) and to provide control for the schema author, who needs to be able to specify what fallback to use in different situations, including the rule 'give up, stop now, do not try to process this schema document'. A blanket rule saying a 1.1 processor should just ignore anything it doesn't understand — which some people repeatedly urged us to adopt—lacks the requisite property. It does not allow the user (here the schema author) to say, in effect, 'Look, if you don't support feature XYZ, there really is no point in trying to use this schema document.' XSD 1.0 processors don't support the mechanism out of the box, of course, since it wasn't specified as part of XSD 1.0. But the vc mechanism is designed to be compatible with 1.0, in the sense that an XSD 1.0 processor can be retrofitted with it, without ceasing to be 1.0-conformant. 
It is to be hoped that 1.0 processors which are now being actively maintained do add support for the vc mechanism: it's simple, and it will help the 1.0 processor remain relevant in a world with both 1.0 and 1.1 schema documents.
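A retrofit of the kind described can be pictured as a pre-processing pass over the schema document. The sketch below is not the VC-Filter tool itself. It assumes the XSD 1.1 conditional-inclusion attributes vc:minVersion and vc:maxVersion in the namespace http://www.w3.org/2007/XMLSchema-versioning, with an element kept when minVersion <= V < maxVersion; that reading of the rule, the sample schema, and the type name priceWithAssertion are stated here as assumptions, not taken from the blog post.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

VC_NS = "http://www.w3.org/2007/XMLSchema-versioning"
MIN_ATTR = f"{{{VC_NS}}}minVersion"
MAX_ATTR = f"{{{VC_NS}}}maxVersion"


def keep(element, version):
    """Assumed rule: keep when minVersion <= version < maxVersion."""
    lowest = element.get(MIN_ATTR)
    ceiling = element.get(MAX_ATTR)
    if lowest is not None and version < Decimal(lowest):
        return False
    if ceiling is not None and version >= Decimal(ceiling):
        return False
    return True


def vc_filter(element, version):
    """Remove descendants not meant for a processor of the given version."""
    for child in list(element):
        if keep(child, version):
            vc_filter(child, version)
        else:
            element.remove(child)
    return element


# Hypothetical schema document offering a 1.0 fallback and a 1.1 variant.
SCHEMA = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:vc="http://www.w3.org/2007/XMLSchema-versioning">
  <xs:element name="price" type="xs:decimal" vc:maxVersion="1.1"/>
  <xs:element name="price" type="priceWithAssertion" vc:minVersion="1.1"/>
</xs:schema>
"""


def surviving_types(version):
    root = vc_filter(ET.fromstring(SCHEMA), Decimal(version))
    return [child.get("type") for child in root]


print(surviving_types("1.0"))  # ['xs:decimal']
print(surviving_types("1.1"))  # ['priceWithAssertion']
```

Run this way, a 1.0 processor sees only the fallback declaration and a 1.1 processor sees only the newer one, which is the graceful-fallback behavior the post describes.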
See also: the VC-Filter online documentation
JavaServer Faces Version 2: Streamline Web Application Development
David Geary, IBM developerWorks
JavaServer Faces technology simplifies building user interfaces for JavaServer applications. Developers of various skill levels can quickly build web applications by: assembling reusable UI components in a page; connecting these components to an application data source; and wiring client-generated events to server-side event handlers. "With version 2.0, Java Server Faces (JSF) makes it easy to implement robust, Ajaxified Web applications. This article launches a three-part series by JSF 2.0 Expert Group member David Geary showing you how to take advantage of the new features in JSF 2. In this installment, you'll learn how to streamline development with JSF 2 by replacing XML configuration with annotations and convention, simplifying navigation, and easily accessing resources. And you'll see how to use Groovy in your JSF applications... JSF 1 was developed in an ivory tower, and the results, arguably, were less than spectacular. But JSF got one thing right — it let the marketplace come up with lots of real-world innovations. Early on, Facelets made its debut as a powerful replacement for JavaServer Pages (JSP). Then came Rich Faces, a slick JSF Ajax library; ICEFaces, a novel approach to Ajax with JSF; Seam; Spring Faces; Woodstock components; JSF Templating; and so on. All of those open source JSF projects were built by developers who needed the functionality they implemented. The JSF 2.0 Expert Group essentially standardized some of the best features from those open source projects. Although JSF 2 was indeed specified by a bunch of eggheads, it was also driven by real-world innovation. In retrospect, the expert group's job was relatively easy because we were standing on the shoulders of giants such as Gavin King (Seam), Alexandr Smirnov (Rich Faces), Ted Goddard (ICEFaces), and Ken Paulson (JSF Templating). In fact, all of those giants were on the JSF 2 Expert Group. So in many respects, JSF 2 combines the best aspects of ivory tower and real world. And it shows. 
JSF 2 is a vast improvement over its progenitor. This is the first article in a three-part series that has two goals: to show you exciting JSF 2 features, and to show you how to best use those features, so that you can take advantage of what JSF 2 offers. It covers: managed bean annotations, simplified navigation, and support for resources. In the remaining two articles in this series, I will explore Facelets, JSF 2's composite components, and built-in support for Ajax..."
See also: the JSF 2 web site
The Bold and the Beautiful: Two New Drafts for HTML 5
Rick Jelliffe, O'Reilly Technical
Two new drafts out at W3C from the HTML 5 effort: HTML 5: The Markup Language (hat-tip Micah) and HTML 5: A Vocabulary and Associated APIs for HTML and XHTML (hat-tip Jeni.) The first one is a model of the kinds of standards-writing we need: I'd recommend any standards editor look at it for a model of a good solution to the problem they are trying to solve. It uses standard notations or makes simple objective statements that can be trivially implemented. In particular, see how easy it would be to implement its Assertions statements in Schematron: they are singular and objective. I presume they spring from Henri Sivonen's validation work. The second one is much larger, and is where many of the fiddles of historical HTML applications go. So it is not surprising if it is a bit less crystalline than the markup language spec. Its contents are pretty good, though, which excuses a lot in a standard I suppose. If I would find fault with it, I think it has the XML Schemas Part 1 problem of laboriously spelling out every step in natural language text: this disguises patterns in the constraints that diagrams or schemas or tables expose, which increases the reading burden on the reader. Furthermore, artificial languages can be more readily automatically converted to code. These are engineering problems and engineering has evolved a large set of diagramming techniques that should be used. You can link back to plain language descriptions, but it is dangerous to use language where less ambiguous notations are possible...
From the standards perspective, I think this may be a good approach for other specifications to follow: for the documents, a rigorous "minimum manual" approach using standard schema languages (or statements which are clearly trivially implementable in such) in particular RELAX NG, Schematron, XSD datatypes and EBNF. Then a separate specification giving semantics for a class of applications. It is a continual tension in both the ODF and the OOXML standardization efforts, so I am glad to see the HTML 5 editorial approach. From his comments, I think Murata Makoto is even more strong on this than I am. If you look at how difficult it is to draft standard text using required status terms like "shall" or "should", and how using other terms opens the door for abuse and malarky, I often think that we should just ban natural language from standards. Of course that is too much: I think Schematron's approach, where you back up natural language assertions with executable tests, is a much more practical approach...
See also: the HTML 5 draft
Trustworthy Voting: From Machine to System
Nathanael Paul and Andrew S. Tanenbaum, IEEE Computer
In this article, the authors describe an electronic voting approach that takes a system view, incorporating a trustworthy process based on open source software, simplified procedures, and built-in redundant safeguards that prevent tampering. "Vrije Universiteit has devised an electronic voting system that is both practical and resistant to tampering. We are currently implementing the electronic voting machine software and intend to make the source code freely available this year. Electronic voting offers myriad benefits—from multilingual operation to the prevention of overvoting—but to be trustworthy, a voting system must satisfy three main goals: ensure the election's integrity, allow results to be audited, and be sufficiently understandable that voters and politicians will have confidence in using it. The system must allow audits because, if there is a dispute, a recount is mandatory. Requesting a machine to reread the result is pointless because it will merely read out the initial result. Finally, the voters and the politicians must have confidence in the system. A prerequisite to that confidence is the ability to understand how the system works. Many papers on voting systems describe cryptographic techniques, but cryptography alone does not build confidence in voters. Cryptography is only one method for achieving trustworthiness, and designers should view it as but one aspect of a larger system...
Our voting system adds transparent operational procedures and open-source code to standard, well-tested cryptography. To protect election keys and voter privacy, the system uses open source voting software and lets anyone verify that the published software is indeed running on the voting machine. Attestation, the process of verifying that the software now running is the published software, is the main technical challenge. There is also the challenge of getting states to use open source software, but that is a political and legal issue. In our scheme, the system performs attestation by computing the hash (a cryptographic checksum) of the published (executable) software and the running software. The user can compare the two hashes. For hash algorithms such as SHA-256, it would be extremely difficult to create malicious code whose hash matches the published voting software's hash. The hard part is computing the hash over the machine's software in a way that can be trusted. To do this, we use the Trusted Platform Module (TPM), a device that is already part of many modern PCs... The Trusted Platform Module (TPM) lets a poll worker or voter verify in real time that the voting machine is running the open source software that it is supposed to be running. Central to that verification ability is the 'skinit' instruction. Anyone can attest a voting machine's software by asking for a TPM-signed hash of the software (an X.509 certificate for the corresponding public key is also available, so anyone can verify the hash)...
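The comparison step of attestation, checking a hash of the running software against the published hash, can be sketched with Python's standard library. This illustrates only the hash comparison; it does not model the TPM, the 'skinit' instruction, or the signature on the hash, and the byte strings standing in for software images are placeholders.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def matches_published(running_image: bytes, published_hash_hex: str) -> bool:
    """Compare the hash of the running image with the published hash."""
    return hmac.compare_digest(sha256_hex(running_image), published_hash_hex)


# Placeholder bytes standing in for the published executable.
published_image = b"voting-software-build-1"
published_hash = sha256_hex(published_image)

print(matches_published(published_image, published_hash))         # True
print(matches_published(published_image + b"!", published_hash))  # False
```

The hard part named in the excerpt is outside this sketch: the hash of the running software has to be computed and reported by something the verifier can trust, which is the role the authors assign to the TPM.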
Parts of the system require cryptography to ensure secrecy and prevent tampering. Computational load is not an issue, since voting machines don't have throughput concerns, so our system uses public-key cryptography to simplify key management. All public-key systems involve the use of a public (encryption) key and a private (decryption) key. We use three types of key pairs to encrypt and sign voting data... The procedures and techniques we describe work together to yield a trustworthy voting system, one that is secure from the first key's generation to the publishing of results. In our nine-step process, election integrity stems from the voting machine software's open source character, the use of public and transparent procedures, and the voters' ability to personally verify their individual votes. We anticipate that our system will provide a simple and trustworthy alternative to existing systems."
XML Daily Newslink and Cover Pages sponsored by:
Sun Microsystems, Inc. http://sun.com
XML Daily Newslink: http://xml.coverpages.org/newsletter.html
Newsletter Archive: http://xml.coverpages.org/newsletterArchive.html
Cover Pages: http://xml.coverpages.org/