Cover Pages: XML Daily Newslink: Monday, 08 September 2008

A Cover Pages Publication http://xml.coverpages.org/
Provided by OASIS and Sponsor Members
Edited by Robin Cover

This issue of XML Daily Newslink is sponsored by:
Sun Microsystems, Inc. http://sun.com

Headlines

Free Open SAML 2.0 Toolkits for eGov Federations
XSpec: A Behavior Driven Development (BDD) Framework for XSLT
W3C Last Call for SKOS Simple Knowledge Organization System Reference
OASIS Members Propose New TC for Semantic Mapping of XML (MOX)
Caching XML Data at Install Time
What's Next for BizTalk Server?
Bringing History Online, One Newspaper at a Time
Five SOA Best Practices According to IBM

Free Open SAML 2.0 Toolkits for eGov Federations
Staff, Danish National IT and Telecom Agency

You can now download free toolkits and reference implementations for service providers which will integrate with Identity Providers supporting the open SAML 2.0 standard. The toolkits and associated reference implementations implement the Danish eGov OIOSAML 2.0 profile and can be downloaded from the open source repository. The purpose of these toolkits is twofold. For a service provider that does not use a standard product for integration, the toolkits make it quicker and easier to integrate with Danish public sector solutions like the citizen portal (Borger.dk) and the Business-to-Government portal (Virk.dk). Further, using the toolkits will also lower the development cost associated with this integration. With the toolkits, it is possible to have citizens and employees log in to a web solution and simultaneously achieve single sign-on to other solutions in the same federation. Instead of developing similar functionality from scratch, a public authority can save months of development work by using one of the released toolkits. On the Danish open source repository you can find toolkits for the Java and .NET development platforms called OIOSAML.JAVA and OIOSAML.NET. The toolkits contain documented libraries for the SAML 2.0 integration and a reference implementation which demonstrates how integration with Danish login solutions can be performed. Both toolkits are already being used for development of public sector solutions for Virk.dk and Borger.dk. The reference implementations and the .NET toolkit are released under the open source license Mozilla Public License 1.1. This license allows combining the open source code with code not released under an open source license. To facilitate testing of SAML 2.0 integration, a pre-packaged Identity Provider based on the open source solution SimpleSamlPHP is also available at the softwareborsen.dk site. 
While the reference implementations are driven by Danish requirements, the underlying toolkits are deemed generally applicable for service providers in SAML 2.0-based eGovernment federations. The toolkits, reference implementations and development tools are provided by the IT Infrastructure and Implementation Division at the Danish National IT and Telecom Agency in conjunction with public sector partners. OIOSAML.JAVA is based on OpenSAML 2.0. The development of OIOSAML.NET was jointly financed by the Danish Public Sector Federation, anchored at the Danish Agency for Governmental Management, together with the Danish Knowledge Center for Software. OIOSAML.JAVA was developed in collaboration with Virk.dk and this venture continues to develop additions to OIOSAML.NET regarding access management.

XSpec: A Behavior Driven Development (BDD) Framework for XSLT
Jeni Tennison, O'Reilly Technical

A while ago I put together a framework for unit testing XSLT. I've been using that for a couple of years and it's been OK, but then I started playing with Ruby on Rails, and testing with RSpec: a framework for Behavioural Driven Development (BDD). It got me wondering what BDD for XSLT would look like. And so I put together XSpec. To use it, you write scenarios in XSpec description documents and test your stylesheet against those scenarios using a script. I have mine integrated into Oxygen, so I just hit a button and get a test report on my stylesheet. What's the difference between this and the various other XSLT unit testing frameworks out there? Basically, it's the 'BDD-ness': the focus on creating a human-readable description of the desired behaviour of your stylesheet rather than simply a bunch of executable tests. I find XSpec fits better with how stylesheets are designed than the unit-testing framework I developed previously. Plus I've built in features that I've found useful in RSpec and through experience, such as the ability to nest scenarios and focus on particular tests, suppressing everything else. I also hope that it will (eventually) be executable via XProc, which will get around some of the problems with testing XSLT with XSLT whilst still being declarative and programming-language neutral... I've been using XSpec on my most recent XSLT project, and it's been a major asset. I currently have around 250 tests, which take just a few seconds to run, but make me feel extremely secure that the changes I make don't mess up anything that matters. It's taken great discipline, especially in the early stages, to make the development truly driven by the behaviour I desire (in other words, to create the tests before I write the code), but I'm reaping the rewards now.

W3C Last Call for SKOS Simple Knowledge Organization System Reference
Alistair Miles and Sean Bechhofer (eds), W3C Technical Report

Members of W3C's Semantic Web Deployment Working Group have published the Last Call Working Draft for the "SKOS Simple Knowledge Organization System Reference" specification. This document defines the Simple Knowledge Organization System (SKOS), a common data model for sharing and linking knowledge organization systems via the Web. The SKOS data model provides a standard, low-cost migration path for porting existing knowledge organization systems to the Semantic Web. SKOS also provides a lightweight, intuitive language for developing and sharing new knowledge organization systems. It may be used on its own, or in combination with formal knowledge representation languages such as the Web Ontology Language (OWL). Comments are welcome through 03-October-2008. The Working Group has also published an update of the companion SKOS Primer. Overview: The SKOS data model views a knowledge organization system as a concept scheme comprising a set of concepts. These SKOS concept schemes and SKOS concepts are identified by URIs, enabling anyone to refer to them unambiguously from any context, and making them a part of the World Wide Web. SKOS concepts can be labeled with any number of lexical (UNICODE) strings in any given natural language, such as English or Japanese Hiragana. One of these labels in any given language can be indicated as the "preferred" label for that language, and the others as "alternate" labels. Labels may also be "hidden", which is useful e.g. where a knowledge organization system is being queried via a text index. SKOS concepts can be assigned one or more notations, which are lexical codes used to uniquely identify the concept within the scope of a given concept scheme. While URIs are the preferred means of identifying SKOS concepts within computer systems, notations provide a bridge to other systems of identification already in use such as classification codes used in library catalogues. SKOS concepts can be documented with notes of various types. 
The SKOS data model provides a basic set of documentation properties, supporting scope notes, definitions and editorial notes, among others. This set is not meant to be exhaustive, but rather to provide a framework that can be extended by third parties to provide support for more specific types of note. SKOS concepts can be linked to other SKOS concepts via semantic relation properties. The SKOS data model provides support for hierarchical and associative links between SKOS concepts. SKOS concepts can be grouped into collections, which can be labeled and/or ordered. This feature of the SKOS data model is intended to provide support for node labels within thesauri, and for situations where the ordering of a set of concepts is meaningful or provides some useful information. SKOS concepts can be mapped to other SKOS concepts in different concept schemes. The SKOS data model provides support for four basic types of mapping link: hierarchical, associative, close equivalent and exact equivalent.
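
As a rough illustration of the labeling and linking rules summarized above, the following Python sketch models a concept with at most one preferred label per language, any number of alternate and hidden labels, notations, and hierarchical links. The class name, method names, example URIs, and the notation value are all hypothetical; nothing here comes from the SKOS specification's own vocabulary or from any SKOS library.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """A SKOS-like concept identified by a URI (illustrative names only)."""
    uri: str
    pref_labels: dict = field(default_factory=dict)    # language -> single preferred label
    alt_labels: dict = field(default_factory=dict)     # language -> set of alternate labels
    hidden_labels: dict = field(default_factory=dict)  # language -> set of hidden labels
    notations: set = field(default_factory=set)        # lexical codes, e.g. classification codes
    broader: set = field(default_factory=set)          # URIs of hierarchically broader concepts
    related: set = field(default_factory=set)          # URIs of associatively linked concepts

    def set_pref_label(self, lang, label):
        # Only one label per language may be marked "preferred".
        existing = self.pref_labels.get(lang)
        if existing is not None and existing != label:
            raise ValueError(f"{self.uri} already has a preferred label for {lang!r}")
        self.pref_labels[lang] = label

    def add_alt_label(self, lang, label):
        self.alt_labels.setdefault(lang, set()).add(label)

    def add_hidden_label(self, lang, label):
        self.hidden_labels.setdefault(lang, set()).add(label)

    def matches(self, lang, text):
        """True if text equals any label in lang, as a text index lookup might check."""
        return (
            self.pref_labels.get(lang) == text
            or text in self.alt_labels.get(lang, set())
            or text in self.hidden_labels.get(lang, set())
        )


# Hypothetical example data
animals = Concept("http://example.org/concepts/animals")
cats = Concept("http://example.org/concepts/cats")
cats.set_pref_label("en", "cats")
cats.add_alt_label("en", "felines")
cats.add_hidden_label("en", "catz")   # a misspelling kept only for search
cats.notations.add("A.1.2")           # made-up classification code
cats.broader.add(animals.uri)
```

Hidden labels are stored separately so that a search index can match them without a display layer ever showing them, which is the use case the abstract mentions.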

See also: the updated SKOS Primer

OASIS Members Propose New TC for Semantic Mapping of XML (MOX)
Staff, OASIS Announcement

OASIS members have submitted a proposed charter for the "Semantic Mapping of XML (MOX) Technical Committee." A comment period for the proposal is open through 22-September-2008. The MOX TC will closely coordinate with other OASIS TCs, particularly with those in the OASIS Telecom Member Section to ensure that related specifications are consistent and can be used with each other. The TC will operate under the RAND IPR Mode. Summary: "The ability to discover and compose services based on semantic analysis or mapping is needed in many industries such as Telecom and e-Health. In the Telecom industry, having the ability to discover and compose services based on those services' semantics is needed to enable Telecom providers to guarantee Service Level Agreements (SLAs). As such, there is a need to derive ontologies from the service WSDL file to describe the service functionality/behavior within domains. The composition of web services is achieved by creating a third web service (hereafter referred to as the integrating web service) and its Web Service Description Language (WSDL) file. This web service invokes the source legacy Web service to retrieve data, mediates to resolve any mismatches the data has with the destination web service, typically through invocation of common services, and then passes the converted data to this destination Web service. The mediation step can take various forms including an XSL transformation. Typically the XSL is generated at design time by either a programmer or by a data expert using an editor that permits the matching of entities in different schemas by drawing lines between schemas entities. Currently in data management tools, the matched entities correspond to logical constraints on the schemas (e.g. XQuery, XSL, SQL), making them difficult to exchange across systems. Semantic mediation and Semantic Web Service approaches require the use of aligned ontologies and XSL to achieve mediation in semantic composition of Web services. 
Thus, enabling the semantic composition of Web services requires the generation of XSL, mediating schema, aligned ontologies, and possibly other artifacts... In this work, the TC will investigate the use of mapping relations which result in the generation of the artifacts needed for service composition and reuse. That is, by capturing the mappings of XML Schemas entities in XML or HTML, a user agent can generate the mediating schema, the XSL, and the aligned ontologies needed to semantically compose web services. This way, a single process can be used to generate the mappings between schemas, which result in the generation of XSL, mediating schemas, and aligned ontologies. Furthermore, the mapped entities can be exchanged between different platforms by simply copying them. Consequently, mapping relations enable reuse in Service Oriented Architecture (SOA). As such, the objective of the working group is to specify a standard for mappings between XML schemas using mapping relations. We refer to a file that is compliant with the standard specification as a MOX file. The TC accepts as its starting point the mapping relations as described in the Web of Mashup and Metadata Scripting Language (WMSL)..."
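
To make the idea of generating XSL from captured mappings concrete, here is a minimal Python sketch that turns a table of element-name correspondences into an XSLT 1.0 stylesheet (an identity template plus one renaming template per mapping). It is not the WMSL or MOX format: the function name, the dictionary representation of mappings, and the sample element names are assumptions made for illustration, and real mapping relations would cover far more than one-to-one renames.

```python
import xml.etree.ElementTree as ET

XSL_NS = "http://www.w3.org/1999/XSL/Transform"


def build_rename_stylesheet(mappings):
    """Return XSLT 1.0 text that renames source elements to target elements.

    `mappings` is a hypothetical {source_element: target_element} table,
    standing in for mapping relations captured between two schemas.
    """
    ET.register_namespace("xsl", XSL_NS)
    root = ET.Element(f"{{{XSL_NS}}}stylesheet", {"version": "1.0"})

    # Identity template: copy everything that has no explicit mapping.
    identity = ET.SubElement(root, f"{{{XSL_NS}}}template", {"match": "@*|node()"})
    copy = ET.SubElement(identity, f"{{{XSL_NS}}}copy")
    ET.SubElement(copy, f"{{{XSL_NS}}}apply-templates", {"select": "@*|node()"})

    # One template per mapping: emit the target name, keep attributes and children.
    for source_name, target_name in mappings.items():
        template = ET.SubElement(root, f"{{{XSL_NS}}}template", {"match": source_name})
        element = ET.SubElement(template, f"{{{XSL_NS}}}element", {"name": target_name})
        ET.SubElement(element, f"{{{XSL_NS}}}apply-templates", {"select": "@*|node()"})

    return ET.tostring(root, encoding="unicode")


# Hypothetical mapping between a legacy schema and a destination schema
stylesheet_text = build_rename_stylesheet({"surname": "familyName", "zip": "postalCode"})
print(stylesheet_text)
```

Because the mappings live in a plain table rather than inside hand-written XSL, the same table could in principle feed other generators (a mediating schema, an ontology alignment), which is the reuse argument the charter makes.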

Caching XML Data at Install Time
Dan Connolly, W3C Blog

"The W3C web server is spending most of its time serving DTDs to various bits of XML processing software... Evidently there's software out there that makes a lot of use of the DTDs at W3C and they fetch a new copy over the Web for each use. As far as this software is concerned, these DTDs are just data files, much like the timezone database your operating system uses to convert between UTC and local times. The tz database is updated with respect to changes by various jurisdictions from time to time and the latest version is published on the Web, but your operating system doesn't go fetch it over the Web for each use. It uses a cached copy. A copy was included when your operating system was installed and your machine checks for updates once a week or so when it contacts the operating system vendor for security updates and such. So why doesn't XML software do likewise? It's pretty easy to put together an application out of components in such a way that you don't even realize that it's fetching DTDs all the time. [See the sample configuration for 'xsltproc'.] You can use xsltproc: the switch '--novalid' tells it to skip DTDs altogether. Or you can set up an XML catalog as a form of local cache... The point I'm making here isn't specific to DTDs; catalogs work for all sorts of XML data, and the general principle of caching at install time goes beyond XML altogether."

See also: on Excessive DTD Traffic
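
The install-time caching principle from the article above can be sketched in a few lines of Python. This is only an illustration of the idea, not an XML catalog implementation and not how xsltproc or any particular parser resolves DTDs: the class name, the hashing scheme for file names, and the stand-in fetch function are all assumptions. The point is simply that the network is touched once, at install time, and every later use is served from disk.

```python
import hashlib
import tempfile
from pathlib import Path


class LocalDtdCache:
    """Serve DTDs from a local directory, fetching each one at most once."""

    def __init__(self, cache_dir, fetch):
        self.cache_dir = Path(cache_dir)
        self.cache_dir.mkdir(parents=True, exist_ok=True)
        self.fetch = fetch        # callable: system identifier -> bytes
        self.fetch_count = 0      # number of times the fetch function was called

    def _path_for(self, system_id):
        # Derive a stable local file name from the system identifier.
        digest = hashlib.sha256(system_id.encode("utf-8")).hexdigest()
        return self.cache_dir / f"{digest}.dtd"

    def install(self, system_id):
        """Fetch the DTD and store it locally (meant to run at install time)."""
        data = self.fetch(system_id)
        self.fetch_count += 1
        path = self._path_for(system_id)
        path.write_bytes(data)
        return path

    def resolve(self, system_id):
        """Return the local copy, fetching only if it was never installed."""
        path = self._path_for(system_id)
        return path if path.exists() else self.install(system_id)


def fake_fetch(system_id):
    # Stand-in for a real HTTP download so the sketch runs offline.
    return b"<!ELEMENT note (#PCDATA)>"


XHTML_DTD = "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"
cache = LocalDtdCache(tempfile.mkdtemp(), fake_fetch)
cache.install(XHTML_DTD)          # once, at install time
for _ in range(1000):             # every later use hits the local copy
    cache.resolve(XHTML_DTD)
print(cache.fetch_count)          # prints 1
```

A periodic refresh (the "once a week or so" update check the article mentions) could call install again on a schedule; that part is left out here.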

What's Next for BizTalk Server?
Oliver Sharp, Microsoft Presspass Interview

Microsoft Corp. has provided an update on its plans for BizTalk Server 2009, which is on track for availability during the first half of 2009. With over 8,200 customers today, BizTalk Server remains one of Microsoft's most consistently-delivered server offerings. "BizTalk Server 2009 will be a full release of the product. It delivers a full upgrade to enable customers to take advantage of the latest platform wave (delivered through Windows Server 2008, Visual Studio 2008, SQL Server 2008, .NET Framework 3.5). In particular the platform updates enable greater scalability and reliability, new Hyper-V virtualization support, and many advances in the latest developer tools. BizTalk Server 2009 also delivers some of the top features that have been requested by our customers, including a new UDDI v3-compliant services registry, new and enhanced LOB adapters (Oracle E-Business Suite, SQL Server), enhanced host systems integration (updates to MQ, CICS, IMS, CICS), a new Mobile RFID platform and management tools, enhanced B2B capabilities (updates to EDI, AS2, SWIFT), enhanced developer and team productivity through ALM integration with Team Foundation System and Visual Studio, and a new release of ESB Guidance 2.0 patterns and practices... Our vision for BizTalk Server has remained pretty consistent since the product was introduced, and gained clarity through customer and partner feedback. If you think about BizTalk Server's original charter (back when it was launched in 2000), it was focused on enabling our customers to develop secure and reliable XML-based connectivity and bringing everything together in a manageable way, thereby leading to improvements in the way organizations conduct their day-to-day business. We've always been focused on making connections, and we still are, but over the years we've added in more systems and additional support for disparate and heterogeneous systems. 
Just in the past two releases we have added in over 30 adapters, four vertical industry accelerators and support for disparate protocols for mainframe, midrange systems and intelligent RFID devices... In response to customer feedback, we are committed to continuing support for BizTalk Server's XLANG orchestration technology. We will provide XLANG compatibility for existing applications, based upon current versions of BizTalk Server, and have no plans to stop support of the existing BizTalk orchestration and messaging engine..."

Bringing History Online, One Newspaper at a Time
Punit Soni, Blog

This blog article reports on the new 'Google News Archive' initiative. "For more than 200 years, matters of local and national significance have been conveyed in newsprint—from revolutions and politics to fashion to local weather or high school football scores. Around the globe, we estimate that there are billions of news pages containing every story ever written. And it's our goal to help readers find all of them, from the smallest local weekly paper up to the largest national daily. The problem is that most of these newspapers are not available online. We want to change that. Today, we're launching an initiative to make more old newspapers accessible and searchable online by partnering with newspaper publishers to digitize millions of pages of news archives. [You are now able] to search these newspapers, you'll also be able to browse through them exactly as they were printed—photographs, headlines, articles, advertisements and all... This effort expands on the contributions of others who've already begun digitizing historical newspapers. In 2006, we started working with publications like the 'New York Times' and the 'Washington Post' to index existing digital archives and make them searchable via the Google News Archive. Now, this effort will enable us to help you find an even greater range of material from newspapers large and small, in conjunction with partners such as ProQuest and Heritage, who've joined in this initiative. One of our partners, the Quebec Chronicle-Telegraph, is actually the oldest newspaper in North America—history buffs, take note: it has been publishing continuously for more than 244 years. You'll be able to explore this historical treasure trove by searching the Google News Archive or by using the timeline feature after searching Google News... 
Over time, as we scan more articles and our index grows, we'll also start blending these archives into our main search results so that when you search Google.com, you'll be searching the full text of these newspapers as well. This effort is just the beginning. As we work with more and more publishers, we'll move closer towards our goal of making those billions of pages of newsprint from around the world searchable, discoverable, and accessible online..."

See also: The Register

Five SOA Best Practices According to IBM
Boris Lublinsky, InfoQ

A recent white paper from IBM Global Services describes the lessons applied by IBM's Academy of Technology to achieve success in their SOA implementations. They did that by focusing on five priorities: (1) Develop architecture with a vision for the future—looking beyond simple connectivity and focusing more on architecture is the most common recurring need for SOA implementations. (2) Foresee linkages from IT to your business processes—implementation of an architecture that transitions IT into the role of a service provider for business functionality. (3) Create an organizational structure to support SOA including culture, skills, training, teaming, organization structure, decision making, reward systems, collaboration and governance. (4) Build a scalable infrastructure—create a baseline for your services performance and scalability using appropriate instruments and measurements. (5) Enable operational visibility—focus on governance and service management... Many SOA implementations are focused on service implementation and do not pay adequate attention to the data management aspect of SOA. This lack of attention can result in data mismanagement, unreliable data and threats to data integrity. Many practitioners are still living in the realm of traditional point-to-point data requirements... As you move toward implementing data as a service, effective information metadata management and use of Common Information Model (CIM) is a key critical success factor. Using CIMs can help speed development by enabling you to establish standards and descriptive metadata for information that can be applied to all interfaces, messages, data structures and data transformations to support reuse. A model driven approach to standardize best practices will also accelerate development and provide for further consistency across interfaces and informational structures. 
Using this model-driven approach helps reduce the need for transformation, and makes it much easier to design transformations when they are required... Component business modeling (CBM) as well as Service-Oriented Modeling and Architecture (SOMA) both support a best practices approach to modeling. CBM helps you analyze your enterprise by first partitioning it into relatively independent, non-overlapping business components to identify opportunities for innovation or improvement.

