Cover Pages: XML Daily Newslink: Tuesday, 03 February 2009

A Cover Pages Publication http://xml.coverpages.org/
Provided by OASIS and Sponsor Members
Edited by Robin Cover

This issue of XML Daily Newslink is sponsored by:
Microsoft Corporation http://www.microsoft.com

Headlines

Open Group Announces TOGAF Version 9 Enterprise Architecture Framework
Generate DITA Java API Reference Documentation Using DITADoclet and DITA API Specialization
W3C Report: Workshop on the Future of Social Networking
Business-Driven Architect: Open Group Cloud Computing Summit
Web Semantics in the Clouds
Towards Kerberizing Web Identity and Services
The Stimulus Package and EHRs: Spend First, Design Later?
"duri" and "tdb" URN Namespaces Based on Dated URIs
Semantic Email Addressing: The Semantic Web Killer App?

Open Group Announces TOGAF Version 9 Enterprise Architecture Framework
Staff, The Open Group Announcement

"The Open Group, a vendor- and technology-neutral consortium focused on open standards and global interoperability within and between enterprises, has announced the general availability of TOGAF version 9. Developed and endorsed by the members of The Open Group's Architecture Forum, TOGAF version 9 represents an industry consensus framework and method for Enterprise Architecture that's made available for use internally by any organization around the world, including both members and non-members of The Open Group... A truly evolutionary release, TOGAF 9 features significant enhancements to key capabilities of the framework introduced in previous versions. For example, TOGAF 9 includes new materials that show in detail how the Architecture Development Method (ADM) can be applied to specific situations, such as service-oriented architecture (SOA) and security architecture. In addition, the new Architecture Content Framework includes a detailed content meta-model that formalizes the definition of an enterprise architecture (EA) and also establishes clear links between business and IT objects. In all areas, the specification adds detail and clarity above and beyond previous TOGAF versions. A key enhancement in TOGAF 9 is the introduction of a seven-part structure and reorganization of the framework into modules with well defined purposes that will allow future modules to evolve at different speeds and with limited impact across the entire blueprint. 
The following is a summary of key enhancements to TOGAF 9: (1) Detailed overview on how to apply the Architecture Development Method (ADM) to specific infrastructure situations, such as SOA and security architecture; (2) Expanded content framework with a content meta-model that formalizes the definition of an enterprise architecture and establishes clear links between business and IT objects; (3) Revised modular structure for simplified consideration of the specific aspects of an architecture's core capabilities; (4) Extended set of concepts and guidelines to support the creation of integrated hierarchies of architectures within organizations that have design governance models..."

See also: Dana Gardner's blog

Generate DITA Java API Reference Documentation Using DITADoclet and DITA API Specialization
Mariana Alupului, IBM developerWorks

DITA (Darwin Information Typing Architecture) is an open, XML-based OASIS standard architecture for authoring, producing, and delivering technical information. This article shows how to use DITADoclet, DITA Java API specialization, and the Eclipse IDE to create Java API reference documentation for easy distribution in many formats. DITADoclet generates the DITA Java API files, automatically creates the DITAMAP and MAPLIST files (DITA Java API specialization) for the Java API reference documentation, extracts the developer comments from the Java source code, and migrates the information to the generated DITA API files. The DITADoclet and DITA Java API solution provides API writers with the tools to generate fully documented Java APIs. A fully documented API can serve several purposes, but the most important is to allow API users to fully understand, search, and browse the API functions that are available to them. To use the full functionality of the API, software users require accurate and complete API documentation. The advantages of using DITADoclet to generate DITA Java API documentation are: (1) Search: Provides an efficient way to retrieve methods and classes and/or interfaces. The Javadoc system does not provide such a search mechanism for the table of contents (TOC). (2) Navigation: Provides a TOC navigation that is generated automatically, directly from the Java source code. (3) Index: The indexes complement the keyword search. Indexes are created by the API writer and can therefore give valuable additional information. The index lists all packages, classes, interfaces, methods and fields, sets them into context (for example, gives the containing class for a method or field) and links to the document describing the entry. (4) Links: DITA shows the missing links that are hard to find without using DITA. For this demo project ('org.dita.dost'), you have 108 topics with 3567 local links that DITA automatically checks.
Motivation: "Typically, the Javadoc tool from Sun Microsystems is used to generate Java API reference documentation from Java source code. The Javadoc tool generates the basic structure for the Java API reference documentation, but the documentation is often incomplete and limited to developer comments. Changes to development teams appear to encourage removal of the API writers and editors from the Java API reference documentation process altogether. Developers have time to manage only Java source code files with incomplete comments..."

See also: DITA resources

W3C Report: Workshop on the Future of Social Networking
Staff, W3C Announcement

Participants in W3C's "Workshop on the Future of Social Networking" have announced a number of important observations in a Report issued today. By enabling users to share profiles and data across networks, social networking sites can grow further and open possibilities for a decentralized architecture for the Social Web. Contextual information, especially for mobile device users, can significantly enrich the social networking user experience. Many users remain unaware of the impact of social networking on their privacy. Though growing rapidly, social networking sites (especially their business models) are hampered by lack of interoperability and could benefit from micropayment solutions. Many social networking sites have yet to take into account the special requirements of users with disabilities and of users on mobile devices. The W3C Report, issued by the fifty-five organizations that participated in the two-day Workshop, also suggested, as a next step, that W3C create an Incubator Group for further discussion on this topic... The W3C Incubator Group will review and map the data interoperability technologies available, including technologies developed outside the W3C, to identify potential gaps and illustrate how to use these technologies together, supported by an open source implementation of a decentralized architecture. Work on privacy best practices, both for users and providers, and further interaction with the existing W3C Policy Languages Interest Group, will continue the discussions on the preservation of privacy.

See also: the W3C announcement

Business-Driven Architect: Open Group Cloud Computing Summit
Brenda Michelson, ebiz Blog

Per comments by David Bernstein and Russ Daniels ("Cisco on fundamentals of cloud interoperability and standards"): "What is Cisco doing? Cisco doesn't run a cloud. It does run a SaaS application, WebEx. Cisco has helped a lot of people build a lot of infrastructure. Cisco is an 'arms dealer' in cloud computing. Cisco's Cloud Strategy: Build right products -- unified fabric, unified compute, virtualization aware; Technology -- enhanced IP core with tight coupling to Software; Referenced software -- services-led cloud blueprints, reference software stacks; Open Standards; Multi-phased -- standalone to enterprise to intercloud. Cisco sees cloud as the 4th wave of application infrastructure: mainframe, microcomputer/client server, web/internet, cloud. This next wave of applications will be highly connected, highly collaborative, media intensive applications. Cloud adoption phases include standalone clouds (derivative pricing, external data centers; folks want better security, SLAs and control), enterprise class clouds (key challenges: federation, portability, market), and intercloud (dynamic workload migration, apps integrate across clouds, and more)... The intercloud is the grand vision... Consider an intercloud example of dynamically moving a workload (VM) from one cloud to another... Another dynamic workload example, federation, generalized service access across clouds: same steps 1-3, then Cloud 1 needs to query Cloud 2 (RDF/SPARQL, OWL), Cloud 1 selects, receives protocols, interface (web services, REST), Cloud 1 calls services in Cloud 2 (metering, SLAs). These examples illustrate the technology challenges related to cloud interconnection. How does this translate into what Cisco is doing?
Some specific Intercloud Projects that Cisco is involved in to bring the intercloud to life: (1) addressing, IETF LISP; (2) virtual machines - DMTF OVF; (3) conversations - XMPP.org; (4) Unified Cloud Interface (UCI) - W3C, Google Code; (5) Distributed Storage Acceleration - open cloud consortium, UDP based data transfer. David is making a call for participation for folks to help solve these technical challenges. Not a call to help Cisco, but to help the industry..."

Web Semantics in the Clouds
Peter Mika and Giovanni Tummarello, IEEE Intelligent Systems

In the last two years, the amount of structured data made available on the Web in semantic formats has grown by several orders of magnitude. On one side, the Linked Data effort has made available online hundreds of millions of entity descriptions based on the Resource Description Framework (RDF) in data sets such as DBPedia, Uniprot, and Geonames. On the other hand, the Web 2.0 community has increasingly embraced the idea of data portability, and the first efforts have already produced billions of RDF equivalent triples either embedded inside HTML pages using microformats or exposed directly using eRDF (embedded RDF) and RDFa (RDF attributes). Incentives for exposing such data are also finally becoming clearer. Yahoo!'s SearchMonkey, for example, makes Web sites containing structured data stand out from others by providing the most appropriate visualization for the end user in the search result page. It will not be long, we envision, before search engines will also directly use this information for ranking and relevance purposes -- returning, for example, qualitatively better results for queries that involve everyday entities such as events, locations, and people... Yahoo! is building on grid computing using Hadoop to enable the analysis, transformation, and querying of large amounts of RDF data in a batch-processing mode using clusters of hundreds of machines, without apparent bottlenecks in scalability. The Yahoo! crawler affectionately named Slurp began indexing microformat content in the spring of this year, and the company recently added eRDF and RDFa to its supported formats. Yahoo! has also innovated in the Semantic Web area by allowing site owners to expose metadata using the DataRSS format, an Atom-based format for delivering RDF data... With respect to the Semantic Web research community, we are very interested in continuing to develop Semantic Web algorithms cast into the MapReduce framework or its higher-level abstractions such as Pig and HBase. 
We believe we can successfully transform some of the research problems we've been facing into the well-understood MapReduce paradigm and then apply solutions based on open source implementations and commodity hardware. We call on the research community to explore the entire range of Semantic Web algorithms that could be successfully transformed into this increasingly popular solution space. Many of the Semantic Web's scalability problems will likely turn out to be less challenging after all...
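As a rough illustration of what casting an RDF task into the MapReduce paradigm can look like, here is a minimal single-process sketch that counts how often each predicate occurs in a set of triples. It is not taken from the article and does not use Hadoop, Pig, or HBase; the function names, the in-memory `shuffle` step, and the sample triples are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def map_phase(triples: Iterable[Triple]) -> Iterator[Tuple[str, int]]:
    """Map step: emit (predicate, 1) for every triple."""
    for _subject, predicate, _obj in triples:
        yield predicate, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group emitted values by key, standing in for the framework's shuffle."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)


def reduce_phase(groups: Dict[str, List[int]]) -> Dict[str, int]:
    """Reduce step: sum the counts collected for each predicate."""
    return {key: sum(values) for key, values in groups.items()}


def count_predicates(triples: Iterable[Triple]) -> Dict[str, int]:
    """Run map, shuffle, and reduce in sequence on one machine."""
    return reduce_phase(shuffle(map_phase(triples)))


# Hypothetical sample data, not from the article.
SAMPLE_TRIPLES: List[Triple] = [
    ("ex:alice", "foaf:knows", "ex:bob"),
    ("ex:alice", "foaf:name", "Alice"),
    ("ex:bob", "foaf:knows", "ex:carol"),
]

if __name__ == "__main__":
    print(count_predicates(SAMPLE_TRIPLES))
```

In a real cluster the map and reduce functions would run on many machines and the grouping would be done by the framework; the sketch only shows how the problem is split into the two phases.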

See also: the reference page

Towards Kerberizing Web Identity and Services
Thomas Hardjono, MIT Kerberos Consortium Announcement

The MIT-KC would appreciate your input and comments regarding a white paper, "Towards Kerberizing Web Identity and Services," and also your suggestions and recommendations more broadly regarding the Kerberos-on-the-Web project. From the paper abstract: "Today authentication and authorization are addressed in an incoherent, and often site-specific, fashion on the Internet and the Web specifically. This situation stems from many factors including the evolution, design, implementation, and deployment history of HTTP and HTTP-based systems in particular, and Internet protocols in general. Kerberos is a widely-implemented and widely-deployed authentication substrate with a long history in various communities and vendor products. Organizations that currently use Kerberos as a key element of their infrastructure wish to take advantage of its unique benefits while moving to Web-based systems, but have had limited success in doing so. The authors of this paper have drawn upon their combined experience with supporting large Kerberos deployments, writing and developing web-based identity protocols, and integrating heterogeneous authentication services in order to produce this paper..." One of the major goals of the MIT-KC is to establish Kerberos as a ubiquitous authentication mechanism on the Internet and also to make Kerberos appropriate for new environments. One of the key efforts within the MIT-KC directed at this goal is the Kerberos-on-the-Web (Kerb-Web) project. The Kerberos-on-the-Web project seeks initially to investigate the various aspects of the development and deployment of Kerberos within the Web space.
This includes, among others: (a) the use of the Kerberos authentication paradigm within the context of web-authentication and web-services security, (b) the possible architecture integration and interactions between the Kerberos infrastructure and web-services security infrastructure, (c) the possible enhancements of the Kerberos authentication protocol and Kerberos token in order to address the requirements for Single-Sign-On (SSO) on the Web and Web Identity Federation, and (d) the potential re-use of existing Kerberos infrastructure investments in enterprises and other organizations to support the deployment of Kerberos-on-the-Web solutions.

See also: Jeff Hodges' blog

The Stimulus Package and EHRs: Spend First, Design Later?
Andrew Updegrove, ConsortiumInfo.org Blog

A front page story in the New York Times recently highlighted the danger that billions of Stimulus Package dollars targeted at providing universal broadband Internet access could be wasted unless time-consuming research and analysis is first performed. The same danger applies to another key element of both the Stimulus Package and the Obama administration's own policy agenda: deploying a national health information technology network based upon "Electronic Health Records" (EHRs), at a cost of up to $200 billion over the next five years. EHRs can provide even greater savings, thereby helping the administration meet its promise of capping healthcare costs while extending universal healthcare to all Americans. But only if they are properly designed. The problem is that EHRs are essentially extremely complex frameworks of IT standards, and these frameworks have not yet been finalized. Until they are, it will be impossible to begin to deploy EHRs, or to spend the $5 billion included in the Stimulus Package for this purpose. The issues are summarized in the Editorial of my most recent issue of Standards Today, titled "Getting EHR Standards Right": The lesson to be learned, then, is that we had better get the standards right, both from a real world as well as a technical perspective. If the standard suites mandated do not solve real problems in ways that work for care givers, vendors and other stakeholders, then this ambitious and worthwhile endeavor will be doomed from the outset, and an enormous amount of money will have been squandered. The issue also includes an in-depth interview with Charles Jaffe, M.D., CEO of Health Level 7 (HL7), one of the oldest and most important developers of EHR standards. Jaffe highlights EHR challenges and gaps in his interview "View from the Trenches."

See also: XML and Healthcare

"duri" and "tdb" URN Namespaces Based on Dated URIs
Larry Masinter (ed), IETF Internet Draft

A (lightly) updated version of the IETF Internet Draft "'duri' and 'tdb' URN Namespaces Based on Dated URIs" has been published, following discussion on the W3C Technical Architecture Group (TAG) discussion list 'www-tag'. The document is not a product of any IETF working group, but many of the ideas have been discussed since 2001. Abstract: "This document defines two namespaces of URNs, based on using a timestamp with an (encoded) URI. The results are namespaces in which names are readily assigned, offer the persistence of reference that is required by URNs, but do not require a stable authority to assign the name. The first namespace ("duri") is used to refer to URI-identified resources as they appeared at a particular time. The second namespace ("tdb") is useful as a way of creating URNs that refer to physical objects or even abstractions that are not themselves networked resources. The definition of these namespaces may reduce the need to define new URN namespaces merely for the purpose of creating stable identifiers. In addition, they provide a ready means for identifying "non-information resources" by semantic indirection." Background: "The URN specification allows for many URN namespaces, and many have been registered. However, obtaining an appropriate URN in any of the currently defined URN namespaces may be difficult: a number of URN namespace registrations have been accompanied by comments that no other URN namespace was available for the class of documents for which identifiers were wanted... Many people have wondered how to create globally unique and persistent identifiers. There are a number of URI schemes and URN namespaces already registered. However, an absolute guarantee of both uniqueness and persistence is very difficult. In some cases, the guarantee of persistence comes through a promise of good management practice, such as is encouraged in [the document] "Cool URLs don't change". 
However, relying on promise of good management practice is not the same as having a design that guarantees reliability independent of actual administrative practice. The 'tdb' URN scheme allows ready assignment of URIs for abstractions that are distinguished from the media content that describes them... The goal of the 'tdb' URN scheme proposed below is to provide a mechanism which is, at the same time: (1) permanent: The identity of the resource identified is not subject to reinterpretation over time; (2) explicitly bound: The mechanism by which the identified resource can be determined is explicitly included in the URI; (3) useful for non-networked items: Allows identification of resources outside the network: people, organizations, abstract concepts..." [Editor's note: "The only substantial change I made since the 2004 draft was to change the interpretation of the date from 'first instant' to 'last instant', based on a comment by Al Gilman in 2004."]
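To make the idea of a timestamp combined with an (encoded) URI more concrete, the following is a minimal sketch of how such a name might be assembled. The layout `urn:<namespace>:<date>:<encoded-URI>`, the accepted date shapes, and the helper name `dated_urn` are assumptions for illustration only; the Internet Draft itself is the authority on the exact syntax, date granularity, and encoding rules, none of which are reproduced here.

```python
import re

# The two namespaces named in the draft's abstract.
ALLOWED_NAMESPACES = {"duri", "tdb"}

# Assumed date shapes for this sketch only: YYYY, YYYY-MM, or YYYY-MM-DD.
DATE_PATTERN = re.compile(r"\d{4}(-\d{2}(-\d{2})?)?")


def dated_urn(namespace: str, date: str, encoded_uri: str) -> str:
    """Join a namespace, a date, and an already-encoded URI into one name.

    Hypothetical helper: it performs no URI encoding of its own and only
    checks that the inputs have the shapes assumed above.
    """
    if namespace not in ALLOWED_NAMESPACES:
        raise ValueError(f"unknown namespace: {namespace!r}")
    if not DATE_PATTERN.fullmatch(date):
        raise ValueError(f"unsupported date form: {date!r}")
    if not encoded_uri:
        raise ValueError("encoded_uri must not be empty")
    return f"urn:{namespace}:{date}:{encoded_uri}"


if __name__ == "__main__":
    # "duri": the URI-identified resource as it appeared at the given time.
    print(dated_urn("duri", "2009-02-03", "http://xml.coverpages.org/"))
    # "tdb": the thing described by that resource at the given time.
    print(dated_urn("tdb", "2009-02-03", "http://xml.coverpages.org/"))
```

Because the date and the URI are both carried in the name, anyone can mint such an identifier without a registry, which is the "no stable authority" property the abstract describes.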

Semantic Email Addressing: The Semantic Web Killer App?
Michael Kassoff, Charles Petrie, Lee-Ming Zen, Michael Genesereth; IEEE Internet Computing

Email addresses, like telephone numbers, are opaque identifiers. They're often hard to remember, and, worse still, they change from time to time. Semantic email addressing (SEA) lets users send email to a semantically specified group of recipients. It provides all of the functionality of static email mailing lists, but because users can maintain their own profiles, they don't need to subscribe, unsubscribe, or change email addresses. Because of its targeted nature, SEA could help combat unintentional spam and preserve the privacy of email addresses and even individual identities. SEA is a simple but novel technology that lets you address email to a semantically defined set of entities. A SEA mail server computes the recipients of a semantically addressed email on the fly based on the address's semantic definition... SEA has application in both corporate intranets and the Internet; it raises several issues, including security and privacy issues, errors, user adoption, and standardization... Our prototype SEA module, the Infomaster Semantic Email Addresser (ISEA), runs on top of the Infomaster information integration engine. Infomaster lets you query multiple data sources on the Internet through a single mediated schema. The system can therefore pull information from many sources, not just about people but also useful supporting information about organizations, locations, and so forth. Researchers at three Semantic Technologies Institutes (Digital Enterprise Research Institute (DERI) Galway, STI Innsbruck, and the Stanford Logic Group) tested ISEA over a period of one year. The requirements for an industrial-strength version were jointly developed and development is planned. Because STI is large and distributed, with members frequently coming and going, it's difficult for members to keep track of other members' locations and activities. It's therefore a natural application for SEA.
The prototype lets members email people based on their site, group affiliations, name, interests, and other attributes. We obtain this information from private databases and publicly available FOAF files... Several other researchers have recognized the value in bringing semantics to email. For example, the Information Lens system lets users send semistructured email messages and filter those messages using production rules. Users can send to a special mailbox called 'anyone,' and anyone can choose to receive messages from this mailbox based on production rules. This flips the nature of widely broadcast emails on its head. Instead of starting with receiving all emails and whittling them down based on filtering rules, the user starts with an empty inbox and pulls in email of interest. This is similar to the RSS subscription model. As RSS feeds contain more semantic information, the semantic subscription model exemplified by Information Lens might become more commonplace. More recently, MailsMore lets users annotate an email's content with Resource Description Framework (RDF) triples and automatically includes RDF triples based on standard email headers such as the 'To,' 'From,' 'Subject,' and body fields. This can be used for semantic filtering and filing of emails. The Mangrove system takes this idea further; it allows not only structured email content but also semantic email processes...
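The core SEA idea described above, computing recipients on the fly from a semantic definition rather than from a static list, can be sketched in a few lines. This is not the ISEA prototype and does not model Infomaster or its query language; the profile fields, the conjunctive matching rule, the function names, and the example.org addresses are all hypothetical.

```python
from typing import Dict, Iterable, List, Set, Union

AttributeValue = Union[str, Set[str]]
Profile = Dict[str, AttributeValue]


def matches(profile: Profile, criteria: Dict[str, str]) -> bool:
    """True if the profile satisfies every criterion (a simple conjunction).

    Set-valued attributes (such as interests) match when they contain the
    wanted value; string attributes must equal it exactly.
    """
    for attribute, wanted in criteria.items():
        actual = profile.get(attribute)
        if isinstance(actual, set):
            if wanted not in actual:
                return False
        elif actual != wanted:
            return False
    return True


def resolve_recipients(profiles: Iterable[Profile],
                       criteria: Dict[str, str]) -> List[str]:
    """Compute the recipient addresses at send time from current profiles."""
    return sorted(str(p["email"]) for p in profiles if matches(p, criteria))


# Hypothetical member profiles; in the article these come from private
# databases and publicly available FOAF files.
PROFILES: List[Profile] = [
    {"email": "ann@example.org", "site": "Galway", "interests": {"rdf", "email"}},
    {"email": "ben@example.org", "site": "Innsbruck", "interests": {"rdf"}},
    {"email": "cy@example.org", "site": "Galway", "interests": {"logic"}},
]

if __name__ == "__main__":
    print(resolve_recipients(PROFILES, {"site": "Galway", "interests": "rdf"}))
```

Because the recipient list is recomputed for each message, a member who updates a profile (for example, by changing site) is included or excluded automatically, with no subscribe or unsubscribe step.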

See also: the reference page

