This issue of XML Daily Newslink is sponsored by:
Sun Microsystems, Inc. http://sun.com
- Microsoft Office 2007 SP2 to Support XPS, PDF v1.5, PDF/A, and ODF v1.1
- Processing Linked Web Data with XSLT
- State of the Semantic Web
- DITA, DocBook, and the Art of the Document
- W3C Call for Implementations: XQuery and XPath Full Text 1.0
- Web-based Spreadsheets with OpenOffice.org and Dojo
- OASIS Open Standards Forum 2008
- A Uniform Resource Identifier for Geographic Locations ('geo' URI)
Microsoft Office 2007 SP2 to Support XPS, PDF v1.5, PDF/A, and ODF v1.1
Staff, Microsoft Announcement
Microsoft announced that with the release of Microsoft Office 2007 Service Pack 2 (SP2) scheduled for the first half of 2009, the list of supported document formats will grow to include support for XML Paper Specification (XPS), Portable Document Format (PDF) 1.5, PDF/A, and Open Document Format (ODF) v1.1. "When using SP2, customers will be able to open, edit and save documents using ODF and save documents into the XPS and PDF fixed formats from directly within the application without having to install any other code. It will also allow customers to set ODF as the default file format for Office 2007. To also provide ODF support for users of earlier versions of Microsoft Office (Office XP and Office 2003), Microsoft will continue to collaborate with the open source community in the ongoing development of the Open XML-ODF translator project on SourceForge.net. In addition, Microsoft has defined a road map for its implementation of the newly ratified International Standard ISO/IEC 29500 (Office Open XML). IS29500, which was approved by the International Organization for Standardization (ISO) and International Electrotechnical Commission (IEC) in March, is already substantially supported in Office 2007, and the company plans to update that support in the next major version release of the Microsoft Office system, code-named 'Office 14'. Consistent with its interoperability principles, in which the company committed to work with others toward robust, consistent and interoperable implementations across a broad range of widely deployed products, the company has also announced it will be an active participant in the future evolution of ODF, Open XML, XPS, and PDF standards. Microsoft will join the OASIS technical committee working on the next version of ODF and will take part in the ISO/IEC working group being formed to work on ODF maintenance. 
Microsoft employees will also take part in the ISO/IEC working group that is being formed to maintain Open XML and the ISO/IEC working group that is being formed to improve interoperability between these and other ISO/IEC-recognized document formats. The company will also be an active participant in the ongoing standardization and maintenance activities for XPS and PDF. It will also continue to work with the IT community to promote interoperability between document file formats, including Open XML and ODF, as well as Digital Accessible Information System (DAISY XML), the foundation of the globally accepted DAISY standard for reading and publishing navigable multimedia content. Microsoft is also committed to providing Office customers with the ability to open, edit and save documents in the Chinese national document file format standard, Uniform Office Format (UOF)."
See also: the Interop Vendor Alliance Web site
Processing Linked Web Data with XSLT
Uche Ogbuji, DevX.com
State of the Semantic Web
Ivan Herman, Conference Presentation
This presentation was delivered by Ivan Herman, W3C Semantic Web Activity Lead, at the 2008 Semantic Technology Conference held in San Jose, California, on May 18, 2008. The history of the Semantic Web goes back several years now. The 55-slide presentation summarizes what has been achieved, where we are, and where we are going. Ivan Herman joined the Centre for Mathematics and Computer Sciences (CWI) in Amsterdam in 1988, where he holds a tenured position. He joined the W3C Team as Head of W3C Offices in 2001 while maintaining his position at CWI. Ivan served as Head of Offices until 2006, when he was asked to take the Semantic Web Activity Lead position, which is now his principal work at W3C. As summarized in Bruno Pinheiro's blog: "[Herman] gave a broad presentation of what the W3C is focusing on, the discussions that are burning in the community, and some of the technologies they are betting on. As far as I saw, Dublin Core and FOAF are common ground at the vocabulary level, as they appeared as good examples in both presentations and in every book about semantics. SPARQL is the query language that, together with RDF and OWL, seems to be under the spotlight now. Ivan talked a little about an interesting project called the 'Linking Open Data Project', whose goal is to 'expose open databases in RDF', setting RDF links among data items from different databases and setting up SPARQL endpoints to query the data. One of the first practical projects of this initiative is DBpedia: by extracting data from the 'infobox' on a city's Wikipedia page (the right column), for example, and integrating it with that city's information in the US Census database, they can build a stronger and richer body of knowledge about that city. At this early stage there are still lots of open issues, but these were the ones Ivan talked about: security, trust, and provenance; ontology merging, alignment, and term equivalences; and uncertainty.
The most important for me were ontology merging and uncertainty. The Web as we know it was built on sharing and linking documents. Now, in the Semantic Web wave, the same concept must be applied. There's no need to build a complete new ontology for geonames, for example. Just link to an existing one and build an ontology only for your own knowledge domain..."
See also: Bruno Pinheiro's blog
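The Linking Open Data idea described above -- combining facts about one entity from independent databases via RDF links -- can be sketched in a few lines of Python. Triples are plain tuples here rather than real RDF, the predicate names mimic the usual vocabularies, and all of the data values are invented for illustration:

```python
# Minimal sketch of the Linking Open Data idea: two independent triple
# sets describe the same city under different identifiers, and a
# sameAs-style link lets one query combine facts from both.
# All predicate names and data values below are invented examples.
dbpedia = [
    ("dbpedia:San_Jose", "rdfs:label", "San Jose"),
    ("dbpedia:San_Jose", "dbp:state", "California"),
]
census = [
    ("census:0668000", "census:population", 939899),  # invented figure
]
links = [
    ("dbpedia:San_Jose", "owl:sameAs", "census:0668000"),
]

def describe(subject, graphs, links):
    """Collect every (predicate, object) for `subject`, following
    owl:sameAs-style links into the other graphs -- a toy stand-in
    for a federated SPARQL query over linked datasets."""
    same = {subject}
    same |= {o for s, _, o in links if s == subject}
    same |= {s for s, _, o in links if o == subject}
    facts = []
    for graph in graphs:
        facts += [(p, o) for s, p, o in graph if s in same]
    return facts

facts = describe("dbpedia:San_Jose", [dbpedia, census], links)
```

A real deployment would use an RDF store and SPARQL endpoints, but the join-through-a-shared-identifier step is the same.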
DITA, DocBook, and the Art of the Document
Kurt Cagle, O'Reilly Reviews
Structured documentation provides a level of uniformity that can then serve for reusing content from a single document source. Today that is important because such structured source documents can in turn be transformed into HTML, PDF, PostScript, RTF, and Microsoft Word formats. Such source documents can also serve to power binary help files, to provide first-level semantics for text-to-speech and VoiceML applications, and so forth - all at the same time. A consistent document language makes it possible to build transformations to import partial content into output for labels on cans or boxes, and provides a single point of authority for translation into foreign languages... DocBook and DITA both provide XML markup for describing different facets of technical documentation. DocBook actually has its origins, ironically enough, with O'Reilly & Associates as a language used to lay out narrative technical books, based primarily upon the work of Norman Walsh and Robert Stayton. DocBook was originally an SGML specification, and was one of the first non-W3C specifications to be converted to XML, with the formal specification for DocBook then being assigned to OASIS as part of its documentation activity. It is used primarily for describing books, articles, research papers, and (with some additions) slides, but its structured layout also makes it attractive for storing technical articles within small to moderately sized organizations. Indeed, even today, many of the books that O'Reilly produces are laid out first in DocBook... DITA (the Darwin Information Typing Architecture), on the other hand, was developed by IBM in order to create individual 'topics' of content -- such as those that might be used for an online documentation system. The topics in turn are organized by topic maps that establish a hierarchical structure for the topics.
Topics in turn use a basic layout language which borrows somewhat from HTML, but extends it to include figures, examples, notes, screen displays, and so forth. DITA works especially well in those cases where narrative content is limited to the domain of a single topic (such as an individual entry within a help application), although efforts are underway to extend it to formal business documents, with mixed success. As a technology, DITA seems to work best in those situations where you're dealing with content that can be parsed into distinct chunks that have to be updated by a large number of authors.
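The single-source transformation idea at the heart of both formats can be illustrated with a toy converter. The element names below are a tiny DocBook-like subset and the tag mapping is our own sketch; real pipelines use the DocBook XSL stylesheets (or DITA Open Toolkit) driven by XSLT:

```python
import xml.etree.ElementTree as ET

# Toy illustration of single-source publishing: one structured source
# vocabulary (a DocBook-like subset) mapped to HTML output tags.
# The mapping and element subset are illustrative, not the real schema.
HTML_MAP = {"article": "html", "title": "h1", "para": "p",
            "emphasis": "em", "section": "div"}

def to_html(el):
    """Recursively render a source element as HTML, preserving mixed
    content (text interleaved with child elements)."""
    tag = HTML_MAP.get(el.tag, "span")
    inner = (el.text or "") + "".join(to_html(c) + (c.tail or "") for c in el)
    return "<%s>%s</%s>" % (tag, inner, tag)

src = ("<article><title>Geo URIs</title>"
       "<para>A <emphasis>compact</emphasis> location format.</para></article>")
html = to_html(ET.fromstring(src))
```

Swapping in a different mapping table (or a different renderer entirely) is how the same source yields HTML, help files, or print output.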
W3C Call for Implementations: XQuery and XPath Full Text 1.0
S. Amer-Yahia, C. Botev, S. Buxton (et al., eds), W3C Technical Report
W3C has issued a call for implementations in connection with the publication of "XQuery and XPath Full Text 1.0" as a Candidate Recommendation. This document has been jointly developed by the W3C XML Query Working Group and the W3C XSL Working Group, each of which is part of the XML Activity. It will remain a Candidate Recommendation until at least 15-September-2008, and will not be submitted for consideration as a W3C Proposed Recommendation until its four key exit criteria are met. A Test Suite for this document is under development, and implementors are encouraged to run this test suite and report their results. The editorial teams have also released Working Drafts for "XQuery and XPath Full Text 1.0 Requirements" and "XQuery and XPath Full Text 1.0 Use Cases." The CR document defines the language and the formal semantics of XQuery and XPath Full Text 1.0. Additionally, the document defines an XML syntax for XQuery and XPath Full Text 1.0. XQuery and XPath Full Text 1.0 extends the syntax and semantics of XQuery 1.0 and XPath 2.0... As XML becomes mainstream, users expect to be able to search their XML documents. This requires a standard way to do full-text search, as well as structured searches, against XML documents. A similar requirement for full-text search led ISO to define the SQL/MM-FT standard. SQL/MM-FT defines extensions to SQL to express full-text searches providing functionality similar to that defined in this full-text language extension to XQuery 1.0 and XPath 2.0. XML documents may contain highly structured data (fixed schemas, known types such as numbers, dates), semi-structured data (flexible schemas and types), markup data (text with embedded tags), and unstructured data (untagged free-flowing text). Where a document contains unstructured or semi-structured data, it is important to be able to search using Information Retrieval techniques such as scoring and weighting...
As XQuery and XPath evolve, they may apply the notion of score to querying structured data. For example, when making travel plans or shopping for cameras, it is sometimes useful to get an ordered list of near matches in addition to exact matches. If XQuery and XPath define a generalized inexact match, we expect XQuery and XPath to utilize the scoring framework provided by XQuery and XPath Full Text.
See also: the Requirements document
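The scoring and weighting techniques the document refers to can be sketched outside XQuery as well. The following Python sketch ranks XML elements by a naive weighted term-frequency score; the function names, the sample document, and the scoring formula are all our own illustration of the general Information Retrieval idea, not the W3C-defined syntax or semantics:

```python
import xml.etree.ElementTree as ET

def score(text, weighted_terms):
    """Naive relevance score: sum of weight * term frequency per query
    term -- a stand-in for the score variables that XQuery Full Text
    binds to full-text matches."""
    words = text.lower().split()
    n = len(words) or 1
    return sum(w * words.count(t.lower()) / n for t, w in weighted_terms)

def full_text_query(xml_doc, path, weighted_terms):
    """Return matching elements at `path` ordered by descending score,
    loosely analogous to ordering results of an 'ftcontains' match."""
    root = ET.fromstring(xml_doc)
    hits = [("".join(el.itertext()), el) for el in root.iterfind(path)]
    scored = [(score(text, weighted_terms), el) for text, el in hits]
    scored = [(s, el) for s, el in scored if s > 0]
    scored.sort(key=lambda pair: -pair[0])
    return scored

doc = """<reviews>
  <review id="1">great camera, sharp lens, great battery</review>
  <review id="2">the battery drains quickly</review>
  <review id="3">no complaints about shipping</review>
</reviews>"""

# Weight 'great' twice as heavily as 'battery'; review 1 ranks first.
results = full_text_query(doc, "review", [("great", 2.0), ("battery", 1.0)])
```

A conforming implementation would of course expose this through the standardized FTSelection syntax rather than a Python API; the point is only the score-then-order pattern.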
Web-based Spreadsheets with OpenOffice.org and Dojo
Oleg Mikheev and Doan Nguyen Van, Java World Magazine
OASIS Open Standards Forum 2008
Staff, OASIS Announcement
OASIS announced that the annual OASIS European Forum will be held October 1-3, 2008 near London. The theme will focus on "Security Challenges for the Information Society." OASIS invites proposals for presentations, panel sessions, and interoperability demonstrations related to this theme. Funding for the Forum is provided by OASIS Foundational Sponsor members, BEA, IBM, Primeton, and Sun Microsystems, and by IDtrust. "Open exchange of information and access to online services also pose challenges and threats. Service providers want to authenticate the identity of individuals requesting access, and determine the resources and services they are entitled to access. Users want their identity and personal data and privacy to be protected adequately, and the confidentiality of sensitive data they are submitting to be respected. In today's Internet and in many large private network infrastructures, heterogeneity and diversity are the rule rather than the exception. Security infrastructures need open standards and interoperability to scale to the huge deployments that are being rolled out today. Some of these security standards from OASIS and other organizations support a model where identity authentication, access control, digital signature processing, encryption and key management are provided as services that can be distributed and shared. The Open Standards Forum 2008 will provide users who are evaluating or looking to deploy such security infrastructures with an opportunity to explore the state of the art in security services, standards and products. It will also provide users with an opportunity to present and share their use cases, requirements and (initial) experience with other users and with some of the leading experts in this field."
A Uniform Resource Identifier for Geographic Locations ('geo' URI)
Alexander Mayrhofer and Christian Spanring (eds), IETF Internet Draft
Members of the IETF Geographic Location/Privacy (GEOPRIV) Working Group have published an initial -00 version of the draft "Uniform Resource Identifier for Geographic Locations ('geo' URI)." The document specifies a Uniform Resource Identifier (URI) for geographic locations using the 'geo' scheme name. A 'geo' URI provides the latitude, longitude, and optionally the altitude of a physical location in a compact, simple, human-readable, and protocol-independent way... An increasing number of Internet protocols and data formats are being enriched with specifications on how to add information about geographic location to them. In most cases, latitude as well as longitude are added as attributes to existing data structures. However, all those methods are specific to a certain data format or protocol, and don't provide a generic, protocol-independent way to identify a location. The 'geo' URI scheme is a further step in that direction and aims to facilitate, support, and standardize location identification in geospatial services and applications. 'Geo' URIs identify a geographic location using a textual representation of the location's spatial coordinates in either two or three dimensions (latitude, longitude, and optionally altitude). Such URIs are independent of any specific protocol, application, or data format in which they might be contained... Because the 'geo' URI is not tied to any specific protocol, and identifies a physical location rather than a network resource, most of the general security considerations on URIs do not apply. The URI syntax does make it possible to construct syntactically valid 'geo' URIs which don't identify a valid location on earth. Applications must not use URIs with such invalid values, and should warn the user when such URIs are encountered...
The IETF Geographic Location/Privacy (GEOPRIV) Working Group, part of the Real-time Applications and Infrastructure Area activity, was chartered to assess the authorization, integrity, and privacy requirements that must be met in order to transfer location information, or authorize the release or representation of such information through an agent. A goal of this working group is to deliver a specification that has broad applicability and will become mandatory to implement for IETF protocols that are location-aware. The group has produced several final RFCs.
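The 'geo' URI shape described above (latitude and longitude, with an optional altitude, in a single compact string) is simple enough to sketch a parser for. The regex and the coordinate-range checks below follow the draft's textual description; the helper names, and the decision to reject out-of-range values with an exception, are our own assumptions:

```python
import re

# Sketch of a 'geo' URI parser: geo:<lat>,<lon>[,<alt>]
# Pattern follows the draft's prose description; parameter details of
# later revisions of the scheme are not modeled here.
GEO_URI = re.compile(
    r"^geo:(?P<lat>-?\d+(?:\.\d+)?),(?P<lon>-?\d+(?:\.\d+)?)"
    r"(?:,(?P<alt>-?\d+(?:\.\d+)?))?$"
)

def parse_geo_uri(uri):
    """Return (lat, lon, alt) from a 'geo' URI; alt is None if absent."""
    m = GEO_URI.match(uri)
    if not m:
        raise ValueError("not a 'geo' URI: %r" % uri)
    lat = float(m.group("lat"))
    lon = float(m.group("lon"))
    alt = float(m.group("alt")) if m.group("alt") else None
    # The draft notes that syntactically valid URIs may still name no
    # real location on earth; reject out-of-range coordinates.
    if not (-90.0 <= lat <= 90.0) or not (-180.0 <= lon <= 180.0):
        raise ValueError("coordinates out of range: %r" % uri)
    return lat, lon, alt

def format_geo_uri(lat, lon, alt=None):
    """Build a 'geo' URI string from numeric coordinates."""
    parts = ["%g" % lat, "%g" % lon]
    if alt is not None:
        parts.append("%g" % alt)
    return "geo:" + ",".join(parts)
```

Because the URI carries the coordinates themselves rather than a reference to a network resource, any application can consume it without protocol-specific code, which is exactly the portability the draft is after.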
XML Daily Newslink and Cover Pages are sponsored by:
BEA Systems, Inc.       http://www.bea.com
Sun Microsystems, Inc.  http://sun.com
XML Daily Newslink: http://xml.coverpages.org/newsletter.html
Newsletter Archive: http://xml.coverpages.org/newsletterArchive.html
Newsletter subscribe: email@example.com
Newsletter unsubscribe: firstname.lastname@example.org
Newsletter help: email@example.com
Cover Pages: http://xml.coverpages.org/