The OASIS Cover Pages: The Online Resource for Markup Language Technologies
Last modified: June 23, 2001
XML Articles and Papers. January - March 2001.

XML General Articles and Papers: Surveys, Overviews, Presentations, Introductions, Announcements

References to general and technical publications on XML/XSL/XLink are also available in several other collections:

The following list of articles and papers on XML represents a mixed collection of references: articles in professional journals, slide sets from presentations, press releases, articles in trade magazines, Usenet News postings, etc. Some are from experts and some are not; some are refereed and others are not; some are semi-technical and others are popular; some contain errors and others don't. Discretion is strongly advised. The articles are listed approximately in the reverse chronological order of their appearance. Publications covering specific XML applications may be referenced in the dedicated sections rather than in the following listing.

March 2001

  • [March 31, 2001] DeltaXML XML Schema software. Posting from Robin LaFontaine describes online availability of 'schema comparator'. "Monsell's DeltaXML XML Schema software compares XML Schema files taking into account the fact that elements, attributes etc. can be in any order. So only significant changes are identified - even down to ignoring a change in the order of a 'choice' item, ... but a change in a 'sequence' is identified... The free trial version works for small schemas. If you want a full trial for larger schemas let me know and I will provide an evaluation license. If you have data files to compare, DeltaXML Markup will compare any XML files and identify changes for you, representing these changes in XML of course..." See the DTD changes in XML Schema from CR to PR, and XSU - Upgrade for XML Schema documents [20000922 to PR 20010316]. Also (1) "Revised Online Validator for XML Schema (XSV) and XML Schema Update Tool (XSU)" and (2) "XML Schemas."
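
The order-insensitive comparison described in the posting can be illustrated with a minimal Python sketch. This is not DeltaXML's algorithm; the function names `canonical` and `same_ignoring_order` are hypothetical, and the only rule modeled is the one quoted above: child order is ignored everywhere except inside a 'sequence' element.

```python
import xml.etree.ElementTree as ET


def canonical(elem, ordered=("sequence",)):
    """Return a comparable form of an element.

    Attribute order is always ignored. Child order is ignored unless the
    element's local name is listed in `ordered` (an assumption made for
    this sketch, mirroring the 'sequence' versus 'choice' distinction).
    """
    local_name = elem.tag.rsplit("}", 1)[-1]  # drop any {namespace} part
    children = [canonical(child, ordered) for child in elem]
    if local_name not in ordered:
        children.sort(key=repr)  # order is not significant here
    text = (elem.text or "").strip()
    return (elem.tag, tuple(sorted(elem.attrib.items())), text, tuple(children))


def same_ignoring_order(xml_a, xml_b):
    """True when two documents differ only in insignificant ordering."""
    return canonical(ET.fromstring(xml_a)) == canonical(ET.fromstring(xml_b))
```

With this sketch, swapping two children of a 'choice' element compares as equal, while the same swap inside a 'sequence' element is reported as a difference.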

  • [March 30, 2001] "A Framework for Implementing Business Transactions on the Web." Hewlett-Packard initial submission to OASIS BTP work. By Dr. Mark Little (Transactions Architect, HP Arjuna Labs, Newcastle upon Tyne, England), with Dave Ingham, Savas Parastatidis, Jim Webber, and Stuart Wheater. 20 pages (with 11 notes). [See the posting from Mark Little.] "An increasingly large number of distributed applications are constructed by being composed from existing applications. The resulting applications can be very complex in structure, with complex relationships between their constituent applications. Furthermore, the execution of such an application may take a long time to complete, and may contain long periods of inactivity, often due to the constituent applications requiring user interactions. In a loosely coupled environment like the Web, it is inevitable that long running applications will require support for fault-tolerance, because machines may fail or services may be moved or withdrawn. A common technique for fault-tolerance is through the use of atomic transactions, which have the well known ACID properties, operating on persistent (long-lived) objects. Transactions ensure that only consistent state changes take place despite concurrent access and failures... From the previous discussions it should be evident that there are a range of applications that require different levels of transactionality. Many types of business transaction do not have the simple commit or rollback semantics of an ACID transaction, and may complete in a number of different ways that may still be interpreted as successful but which do not imply everything that the business transaction did has occurred. We have shown that a flexible and extensible framework for extended transactions is necessary, then in addition to standardising on the interfaces to this framework, we also need to work on specific extended transaction models that suit the Web.
We would not expect applications to work at the level of Signals, Actions and SignalSets, as these are too low-level. Higher-level APIs are required to isolate programmers from these details. However, from experience we have found that this framework helps to clarify the requirements on specific extended transaction implementations. We have given examples of the types of Web applications that have different requirements on any transaction infrastructure, and from these we believe it should be possible to obtain suitable extended transaction models." "Other issues that will need to be considered when implementing many business transactions include: (1) Security and confidentiality... (2) Audit trail... (3) Protocol completeness guarantee... (4) Quality of service..." See "OASIS Business Transactions Technical Committee."

  • [March 30, 2001] "OASIS Security Services TC: Glossary." By the OASIS Security Services Technical Committee (SSTC). Edited by Jeff Hodges. "A New Oasis-SSTC-Draft is available from the on-line SSTC document repository. This draft is presently a work item of the Use Cases and Requirements subcommittee, and of the SSTC as a whole. This document comprises an overall glossary for the OASIS Security Services Technical Committee (SSTC) and its subgroups. Individual SSTC documents and/or subgroup documents may either reference this document and/or 'import' select subsets of terms." Background may be read in the mailing list archives (1) security-use and (2) security-services. Document also in PDF format. See the Technical Committee web pages.

  • [March 30, 2001] "Spinning Your XML for Screens of All Sizes. Using HTML as an Intermediate Markup Language." By Alan E. Booth (Software Engineer, IBM) and Kathryn Heninger Britton (Senior Technical Staff Member, IBM). From IBM developerWorks. March 2001. ['This article shows how to use HTML as an intermediate language so that you can write a single stylesheet to translate from XML to one or more versions of HTML and use the features of the WebSphere Transcoding Publisher server to translate the resulting HTML to the target markup language the requesting device requires.'] "Business applications expressed in vertical XML dialects must be translated into presentation formats, such as HTML, to be displayed to users. With the advent of Internet-capable cell phones and wireless PDAs came several new presentation languages, many of which are in common use today. You can write XSLT stylesheets to control the way the original business-oriented XML data is translated into a presentation format, but the process of writing stylesheets for each different presentation of a single application is onerous. This article addresses two major trends in Web-based business applications: (1) The use of XML to capture business information without the presentation specifics of HTML. This trend is based on the recognition that the generation of business data requires different skills than the effective presentation of information. Also, business data is often exchanged by programs that find the presentation tagging irrelevant at best. (2) The proliferation of presentation markup languages and device constraints, multiplying the effort required to generate effective presentations. In addition to traditional desktop browsers, there are Internet-capable cell phones, PDAs, and pagers. These new devices often require different markup languages, such as compact HTML (CHTML), Wireless Markup Language (WML), VoiceXML, and Handheld Device Markup Language (HDML). 
In contrast to the rich rendering capabilities of desktop browsers, many of these devices have very constrained presentation capabilities, including small screens and navigation restrictions... IBM WebSphere Transcoding Publisher can transcode or translate automatically from HTML to several other presentation markup languages, including WML, HDML, and compact HTML (i-mode). Transcoding Publisher can also exploit the capability of XSLT to produce different output based on the values of parameters. It does so by deriving parameter values from the current request, using data in the HTTP header and characteristics of the requesting device. Using both of these capabilities, the problem of deriving multiple presentations from one business application can be reduced to generating one stylesheet that can produce one or more versions of the application in HTML, perhaps one full-featured version for desktop browsers, one medium-featured version for larger screen PDAs, and one for the most screen-constrained devices. Transcoding Publisher can then translate the selected content for the specific markup language of the target device."

  • [March 30, 2001] "A Brief History of SOAP." By Don Box (DevelopMentor Inc.). March 30, 2001. "... For the most part, people have stopped arguing about SOAP. SOAP is what most people would consider a moderate success. The ideas of SOAP have been embraced by pretty much everyone at this point. The vendors are starting to support SOAP to one degree or another. There are even (unconfirmed) reports of interoperable implementations, but frankly, without interoperable metadata, I am not convinced wire-level interop is all that important. It looks like almost everyone will support WSDL until the W3C comes down with something better, so perhaps by the end of 3Q2001 we'll start to see really meaningful interop. SOAP's original intent was fairly modest: to codify how to send transient XML documents to invoke/trigger operations/responses on remote hosts. Because of our timing, we were forced to tackle issues that the schemas WG has since solved, which caused the S in SOAP to be somewhat lost. At this point in time, I firmly believe that only two things are needed for mid-term/long-term convergence: (1) The XML Schemas WG should address the issue of typed references and arrays. Adding support for these two 'synthetic' types would obviate the need for SOAP section 5. These constructs are broadly useful outside the scope of messaging/rpc applications, so it makes sense (to me at least) that the Schemas WG should address this. (2) Define the handful of additional constructs needed to tie the representational types from XML Schemas into operations and SUDS-style interfaces/WSDL-style portTypes. WSDL comes close enough to providing the necessary behavioral constructs to XML Schemas, and I am cautiously optimistic that something close to WSDL could subsume SOAP entirely. I strongly encourage you to study the WSDL spec and submit comments/improvements/errata so we can get convergence and interop in our lifetime..." 
See "Simple Object Access Protocol (SOAP)" and "Web Services Description Language (WSDL)."

  • [March 30, 2001] "A Busy Developer's Guide to SOAP 1.1." By Dave Winer and Jake Savin (UserLand Software). March 28, 2001. "This specification documents a subset of SOAP 1.1 that forms a basis for interoperation between different environments much as the XML-RPC spec does. When we refer to 'SOAP' in this document we're referring to this subset of SOAP, not the full SOAP 1.1 specification. What is SOAP? For the purposes of this document, SOAP is a Remote Procedure Calling protocol that works over the Internet. A SOAP message is an HTTP-POST request. The body of the request is in XML. A procedure executes on the server and the value it returns is also formatted in XML. Procedure parameters and returned values can be scalars, numbers, strings, dates, etc.; and can also be complex record and list structures..." See also the political background [Dave's SOAP Journal, part 2] and the compatible validator running on SoapWare.Org. See "Simple Object Access Protocol (SOAP)."
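
The description above (an HTTP POST whose body is XML) can be made concrete with a small Python sketch that builds, but does not send, a SOAP 1.1 request. The envelope namespace and the SOAPAction header come from SOAP 1.1; the function name `build_soap_request`, the method name, its namespace, and the parameter are hypothetical placeholders, and the sketch does not claim to follow every rule of the subset the guide defines.

```python
import xml.etree.ElementTree as ET

# Envelope namespace defined by SOAP 1.1.
SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"


def build_soap_request(method, namespace, params, soap_action=""):
    """Return (body, headers) for an HTTP POST carrying one procedure call.

    `method`, `namespace`, and `params` are supplied by the caller; the
    values used in the usage note are invented for illustration.
    """
    ET.register_namespace("SOAP-ENV", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{namespace}}}{method}")
    for name, value in params.items():
        # Each procedure parameter becomes a child element of the call.
        ET.SubElement(call, name).text = str(value)
    headers = {
        "Content-Type": "text/xml; charset=utf-8",
        "SOAPAction": f'"{soap_action}"',  # SOAP 1.1 requires this header
    }
    return ET.tostring(envelope, encoding="unicode"), headers
```

A call such as `build_soap_request("getStateName", "urn:example:states", {"statenum": 41})` yields an envelope whose Body holds a single `getStateName` element with one `statenum` child; the returned string and headers could then be handed to any HTTP client.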

  • [March 30, 2001] "Expressing Qualified Dublin Core in RDF." Draft Version-2001-3-29. By Dublin Core Architecture Working Group. Authors: Stefan Kokkelink and Roland Schwänzl. Supersedes Guidance on expressing the Dublin Core within the Resource Description Framework (RDF). "In this draft Qualified Dublin Core is encoded in terms of RDF, the Resource Description Framework as defined by the RDF Model & Syntax Specification (XML namespace for RDF). RDF is a W3C recommendation. Also RDFS the RDF Schema specification 1.0 is used (XML namespace for RDFS). RDFS is a W3C candidate recommendation. Quite often the notion of URI (Uniform Resource Identifier) is used. The notion of URI is defined by RFC 2396. The notion of URI embraces URL and URN. We also discuss collaboration of qualified DC with other vocabularies and DumbDown. In this paper explicit encodings are provided for classical classification systems and thesauri. Additionally a procedure is discussed to create encodings for more general schemes. One of the major changes with respect to the data model draft is the more systematic use of RDF Schema. It is understood that all DC related namespace references are currently in final call at the DC Architecture Working Group. They will be fixed in a forthcoming version of the current draft..." For related work, see CARMEN (Content Analysis, Retrieval and MetaData: Effective Networking) and especially CARMEN AP 6: MetaData based Indexing of Scientific Resources. See: "Dublin Core Metadata Initiative (DCMI)."
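
For readers unfamiliar with the basic shape of Dublin Core in RDF/XML, the following Python sketch emits a plain `rdf:Description` carrying simple (unqualified) Dublin Core elements. It does not reproduce the draft's encodings for qualifiers or classification schemes. The RDF namespace is the one from the RDF Model & Syntax Specification; the Dublin Core namespace shown is the DC 1.1 element-set URI, which is an assumption here since the draft notes its namespace references were still in final call. The function name `dc_description` and the sample values are hypothetical.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Assumed namespace: the draft says its DC namespace references were not yet fixed.
DC = "http://purl.org/dc/elements/1.1/"


def dc_description(resource_uri, properties):
    """Serialize (element name, value) pairs as simple Dublin Core in RDF/XML."""
    ET.register_namespace("rdf", RDF)
    ET.register_namespace("dc", DC)
    root = ET.Element(f"{{{RDF}}}RDF")
    description = ET.SubElement(
        root, f"{{{RDF}}}Description", {f"{{{RDF}}}about": resource_uri}
    )
    for name, value in properties:
        # One child element per Dublin Core property, e.g. dc:title.
        ET.SubElement(description, f"{{{DC}}}{name}").text = value
    return ET.tostring(root, encoding="unicode")
```

Calling `dc_description("http://example.org/doc", [("title", "A Title"), ("creator", "A. Author")])` produces one description of the resource with a `dc:title` and a `dc:creator` child.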

  • [March 29, 2001] "XSLT Processor Benchmarks." By Eugene Kuznetsov and Cyrus Dolph. From XML.com. March 28, 2001. ['The latest benchmark figures for XSLT processors show Microsoft's processor riding high, with strong performance from open source processors... XML.com is pleased to bring you the results of performance testing on XSLT processors. XSLT is now a vital part of many XML systems in production, and choosing the right processor can have a big impact. Microsoft's XSLT processor, shipped with their MSXML 3 library, comes top of the pile by a significant margin. After Microsoft, there's a strong showing from the Java processors, with James Clark's XT--considered by many an "old faithful" among XSLT engines--coming ahead of the rest. Still, speed isn't everything, and most XSLT processors are incomplete with their implementation of the XSLT 1.0 Recommendation. On this score, Michael Kay's Saxon processor offers good spec implementation as well as respectable performance.'] "XSLTMark is a benchmark for the comprehensive measurement of XSLT processor performance. It consists of forty test cases designed to assess important functional areas of an XSLT processor. The latest release, version 2.0, has been used to assess ten different processors. This article describes the benchmark methodology and provides a brief overview of the results... The performance of XML processing in general is of considerable concern to both customers and engineers alike. With more and more XML-encoded data being transmitted and processed, the ability to both predict and improve XML performance is critical to delivering scalable and reliable solutions. While XSLT is a big part of delivering on the overall value proposition of XML (by allowing XML-XML data interchange and XML-HTML content presentation), it also presents the greatest performance challenge.
Early anecdotal evidence showed wide disparities in real-life results, and no comprehensive benchmark tools were available to obtain more systematic assessments and comparisons... Of the processors included in this release of the benchmark, MSXML, Microsoft's C/C++ implementation, is the fastest overall. The three leading Java processors, XT, Oracle and Saxon, have surpassed the other C/C++ implementations to take 2nd through 4th place respectively. This suggests that high-level optimizations are more important than the implementation language in determining overall performance. The C/C++ processors tend to show more variation in their performance from test case to test case, scoring some very high marks alongside some disappointing performance. XSLTC aside, the C/C++ processors won first place in 33 of the 40 test cases, in some cases scoring two to three times as well as their Java competitors (attsets, dbonerow). This suggests that there is a lot of potential to be gained from using C/C++, but that consistent results might be harder to obtain..." Tool: XSLTMark; see also Kevin Jones' XSLBench test suite. For related resources, see "Extensible Stylesheet Language (XSL/XSLT)."

  • [March 29, 2001] "XSLT Benchmark Results." By Eugene Kuznetsov and Cyrus Dolph. From XML.com. March 28, 2001. ['The full results from the DataPower XSLT processor benchmarks.'] "XSLTMark gauges the capabilities of XSLT processing engines by testing them on a common platform with a variety of stylesheets and inputs that sample the gamut of possible applications. See the XSLTMark overview for more information about the benchmark itself and how to download it. These results were obtained by DataPower on a Pentium III/500 machine running Linux. We encourage XSLT engine authors and users to submit benchmark results on their platforms, as well as drivers for new processors. Test results for the following XSLT processors are available: Overall Chart; 4Suite 0.10.2 (Fourthought); Gnome XSLT 0.5.0 (Gnome Project); MSXML 3.0 (Microsoft); Oracle XSLT 2.0 (Oracle); Sablotron 0.51 (Ginger Alliance); Saxon 6.2.1 (Michael Kay); TransforMiiX 0.8 (Mozilla Project); Xalan-C++ 1.1 (Apache Project); Xalan-Java 2.0.0 (Apache Project); XSLTC alpha 4 (Sun); XT 19991105 (James Clark); Key." See previous article. For related resources, see "Extensible Stylesheet Language (XSL/XSLT)."

  • [March 29, 2001] "XML Q&A: DTDs, Industry Markup Languages, XSLT and Special Characters." By John E. Simpson. From XML.com. March 28, 2001. ['John Simpson solves hairy problems with DTDs and "special characters." John also provides some pointers on where to start with using industry markup languages.']

  • [March 29, 2001] "XML-Deviant: Schemas by Example." By Leigh Dodds. From XML.com. March 28, 2001. ['There has been a lot of activity in the area of XML schema languages recently: with several key W3C publications and another community proposed schema language. Another alternative schema language has emerged from the XML community, relying entirely on example instance documents.'] (1) "W3C XML Schema: The finish line is now in sight for the members of the W3C XML Schemas Working Group. The XML Schema specifications are an important step closer to completion with their promotion to Proposed Recommendation status. All that remains now is for Tim Berners-Lee, as Director of the W3C, to approve the specifications before they become full Recommendations. The road has been long and hard, and it's had a number of difficult sections along the way." (2) Examplotron: "Eric van der Vlist has been helping to realize Rick Jelliffe's vision of a plurality of schema languages by publishing Examplotron, a schema language without any elements. Examplotron's innovation lies in its 'schema by example' approach to schema generation. Rather than define a dedicated schema language with which a document can be described, Examplotron uses sample instance documents, annotated with several attributes that carry schema specific information such as occurrence of elements, and assertions about element and attribute content. Like Schematron before it, Examplotron is implemented using XSLT. An Examplotron instance document can be converted into a validating stylesheet by applying a simple transformation..." For schema description and references, see "XML Schemas."
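
The 'schema by example' idea can be conveyed, very loosely, in a few lines of Python. This sketch is not Examplotron: it does not compile anything to XSLT and it ignores the occurrence and assertion annotations the article mentions. It only treats a sample document as the definition of which element paths are allowed; the function names `element_paths` and `conforms` are hypothetical.

```python
import xml.etree.ElementTree as ET


def element_paths(root):
    """Collect every root-to-element path (e.g. '/order/item') in a document."""
    paths = set()

    def walk(elem, prefix):
        path = f"{prefix}/{elem.tag}"
        paths.add(path)
        for child in elem:
            walk(child, path)

    walk(root, "")
    return paths


def conforms(example_xml, candidate_xml):
    """True when the candidate uses only element paths seen in the example."""
    allowed = element_paths(ET.fromstring(example_xml))
    return element_paths(ET.fromstring(candidate_xml)) <= allowed
```

Given the example `<order><item><name/></item></order>`, a candidate with two `item` elements is accepted, while a candidate containing an unexpected `<price/>` child of `order` is rejected.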

  • [March 28, 2001] "No More Speaking In Code." By L. Scott Tillett. In InternetWeek (March 12, 2001). "An IT industry group has released specifications aimed at allowing business process-specific code in applications to be removed, shared and analyzed in much the same way data can be isolated from application logic. The language, called Business Process Modeling Language, is intended to let enterprises easily share business process details with suppliers and partners, diminishing the need to customize code when two businesses use the Internet for core processes such as monitoring inventory or manufacturing a product. If it's widely embraced, the standard would be used by software makers to break out business process code from their apps. The industry group behind the BPML specification calls itself the Business Process Management Initiative and includes more than 75 heavyweights including Computer Sciences Corp., Intalio, Nortel Networks, Sybase, Sun Microsystems, Blaze Software and Hewlett-Packard. Business Process Management Initiative members envision a day when business processes, like data, can reside in their own management systems -- where they can be analyzed to determine the best way of conducting business, or from which they might be passed along to business partners in a common language describing how a particular process should be performed. BPML, an 'object-oriented description of a process,' according to BPMI members, can be expressed in XML, making it easy for businesses to pass business-process specifications back and forth. 'BPML is a language for modeling processes both within and between businesses,' said Howard Smith, chief technology officer for BPMI member CSC." See (1) the announcement and (2) "Business Process Modeling Language (BPML)."

  • [March 28, 2001] "[XML Transformations] Part 2: Transforming XML into SVG." By Doug Tidwell (Cyber Evangelist, developerWorks XML Team). From IBM developerWorks, XML Education. Updated: March 2001. ['The first section of our tutorial showed you how to transform XML documents into HTML. We used a variety of XML source documents (technical manuals, spreadsheet data, a business letter, etc.) and converted them into HTML. Along the way, we demonstrated the various things you can do with the XSLT and XPath standards. In this section, we'll use the World Wide Web Consortium's emerging Scalable Vector Graphics format (SVG) to convert a couple of our original documents into graphics.'] "For our transformations, we'll use two of our original six source documents: some spreadsheet data and a Shakespearean sonnet. The other documents from our original set aren't easily converted to SVG; we'll discuss why later... SVG is a language for describing two-dimensional graphics in XML. You use SVG elements to describe text, paths (sets of lines and curves), and images. Once you've defined those images, you can clip them, transform them, and manipulate them in a variety of interesting ways. In addition, you can define interactive and dynamic features by assigning event handlers, and you can use the Document Object Model (DOM) to modify the elements, attributes, and properties of the document. Finally, because SVG describes graphics in terms of lines, curves, text, and other primitives, SVG images can be scaled to any arbitrary degree of precision... We've taken a couple of our documents and transformed them into SVG. The column and pie charts are really useful examples that demonstrate what SVG can do, and our transformed sonnet displays the sonnet and its rhyme scheme clearly. These transformations used several important concepts in stylesheets. 
We used parameters and variables, we added extension functions when we needed them, and we used the mode attribute to control how templates were invoked. All of these were necessary because of the kind of documents we were creating. Despite this, our approach to writing stylesheets remains the same: (1) Determine the kind of document you want to create. (2) Look at the contents of that target document, and determine what information you need to complete it. (3) Build a stylesheet that creates the elements of the target document, and either retrieve or calculate the information you need for each part of the target document. The more text-intensive documents demonstrate what SVG doesn't do very well. Anything that contains text that needs to be broken into lines and paragraphs is difficult to do with SVG. You have to calculate the line breaks yourself, and you have to figure out how tall each line of text should be. Furthermore, if you wanted to use rich text features in your SVG document (display certain words in other fonts, different type sizes, different colors, etc.), your job would be even more difficult." See also tutorial articles (1) "Transforming XML into HTML" and (2) "Transforming XML into PDF." See: "W3C Scalable Vector Graphics (SVG)."
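
The tutorial builds its charts with XSLT stylesheets. As a rough illustration of the same idea in a different technique, the Python sketch below turns a row of numbers into an SVG column chart, calculating each bar's position and height as step (3) of the approach above suggests. The function name `column_chart`, the layout constants, and the fill color are hypothetical choices, and the input is assumed to be a non-empty list of non-negative numbers.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"


def column_chart(values, bar_width=40, gap=10, height=200):
    """Return an SVG document (as a string) with one <rect> per value."""
    ET.register_namespace("", SVG_NS)  # serialize SVG as the default namespace
    width = len(values) * (bar_width + gap) + gap
    svg = ET.Element(
        f"{{{SVG_NS}}}svg", {"width": str(width), "height": str(height)}
    )
    peak = max(values) or 1  # avoid dividing by zero when all values are 0
    for index, value in enumerate(values):
        # Scale so the tallest bar leaves a 20-unit margin at the top.
        bar_height = round(value / peak * (height - 20))
        ET.SubElement(svg, f"{{{SVG_NS}}}rect", {
            "x": str(gap + index * (bar_width + gap)),
            "y": str(height - bar_height),  # SVG's y axis points downward
            "width": str(bar_width),
            "height": str(bar_height),
            "fill": "steelblue",
        })
    return ET.tostring(svg, encoding="unicode")
```

Because the bars are plain rectangles, no text layout is involved; adding labels would run into the line-breaking and sizing difficulties the tutorial describes.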

  • [March 28, 2001] "Scalable Vector Graphics. [Integrated Design.]" By Molly E. Holzschlag. In WebTechniques Volume 6, Issue 4 (April 2001), pages 30-34. ['Scalable Vector Graphics is Up For Candidate Recommendation before the W3C. "Will it be a Flash killer?" Wonders Molly E. Holzschlag.'] "Scalable Vector Graphics (SVG) is a perfect example of technology and design meeting on a level playing field. Via XML markup, you can create and implement graphic images, animations, and interactive graphic designs for Web viewing. Of course, browsers must support SVG technology, which is one reason that many developers haven't looked into it too seriously, or perhaps haven't heard of it. SVG is being developed under the auspices of the W3C. As a result, developers have worked to make it compatible with other standards including XML, XSLT, CSS2, Document Object Model (DOM), SMIL, HTML 4.0, XHTML 1.0, and sufficient accessibility options via the Web Accessibility Initiative (WAI). As of this writing, SVG's status is Candidate Recommendation. The working group responsible for SVG has declared it stable, and if it passes several more tests, it moves into the Recommendation phase. Perhaps the most important concept to grasp when first studying SVG is its scalability. Graphics aren't limited by fixed pixels. Like vector graphics, you can make scalable graphics larger or smaller without distorting them. This is very important for designing across resolutions. Scalable graphics adjust to the available screen resolution. This alone makes SVG attractive to Web designers, as it solves one of the most frustrating issues we face: creating designs that are as interoperable, yet as visually rich, as possible... While SVG support in browsers obviously isn't immediately available, it's a technology that's worth watching and using. The fact that major companies are investing time and money to create tools that support it is indicative of the hope SVG holds.
What's more, the fact that standards compliance is being written into these tools early on is very exciting -- an unprecedented event when it comes to client-side markup! So while SVG might not be something you'll actually use for awhile, it's absolutely worth taking out for a test drive, if only for the sheer fun of it." See: "W3C Scalable Vector Graphics (SVG)."

  • [March 28, 2001] "An SVG Tool Kit for Java: Batik SVG Toolkit. [Product Review.]" By Clayton Crooks. In WebTechniques Volume 6, Issue 4 (April 2001), pages 40-41. ['Pros: Offers Java developers an easy way to add SVG capabilities to their programs. Cons: Unless you're developing custom solutions, apps are limited.'] "Batik, an open-source project led by the Apache Software Foundation, is a Java-based tool kit for incorporating Scalable Vector Graphics (SVG) into applications. In addition to offering the developer tools that let you view, generate, or manipulate images, the Apache Software Foundation has released a set of applications with basic SVG functions that can be used with any standard application. The goal is to provide a complete set of core modules that can be used individually or together to develop SVG projects... Batik provides complete applications and modules, making it easy for Java-based applications to use SVG content. According to the Web site, using Batik's SVG Generator, you can develop a Java application to export any graphics format to the SVG format. Another application can be developed using Batik's SVG processor and Viewer to easily integrate SVG viewing capabilities. Still another application uses Batik's modules to convert SVG documents to various formats, such as popular raster formats like JPEG or PNG. Since its inception, Batik has been an open-source project. It was created when several groups working on independent SVG-related projects combined their efforts. The original teams included employees from industry giants like Eastman Kodak, Sun Microsystems, and IBM. The groups decided that their respective projects could benefit from the offerings of the others, and that combining the projects would result in a much more complete tool." See: "W3C Scalable Vector Graphics (SVG)."

  • [March 28, 2001] "Zope: An Open-Source Web Application Server. [Review.]" By Brian Wilson (Harbro Systems in Santa Rosa, CA). In WebTechniques Volume 6, Issue 4 (April 2001), pages 80-81. ['Zope has rich set of content-management and database features; fairly steep learning curve.'] "Many of the Web projects I work on are for nonprofit organizations, and I must lean heavily on volunteers who have little experience working on Web sites. As a result, I'm very interested in tools that help me set up and maintain a basic site layout, while letting beginners enter and maintain content. I heard that Zope could help me, so I decided to try it. Zope was developed by Digital Creations, which provides commercial support for it. The introduction to the online Zope Book says that Zope is a framework for building Web applications. It allows for powerful collaboration, simple content management, and Web component use. Sounds good so far. Because Zope is open source and runs on Red Hat Linux, I'll have access to updates and bug fixes. Zope is written in Python, making it portable across many platforms (www.python.org). Currently, it's available in binary format for Windows (9x/NT), Linux, and Solaris, plus it can be compiled on other Unix platforms. I used the pre-built Linux version for this article (Zope 2.2.4), which I tested on both versions 6.2 and 7.0 of Red Hat Linux... The heart of Zope is Document Template Markup Language (DTML). Yes, DTML requires that you learn yet another language, but it builds on HTML, so it should be familiar. It's also incredibly powerful. You can create pages through the Web interface, and use special Zope DTML tags to do things like iterate over the objects in a folder and insert them into a table. I began creating pages right away -- without knowing any DTML... Zope holds out the promise of being able to do everything I need for my Web sites.
As with many open-source projects, Zope suffers from having a fabulously rich feature set that I cannot (yet) access because the documentation isn't finished. I know that in time, I could read through mailing list archives and scattered online docs to learn what I need to know, but that route is definitely no picnic. Although I found Zope impressive, I'm still fond of Apache. Hence, my next step will be to look at Midgard, which is based on Apache, MySQL, and PHP. It's definitely harder to install than Zope, but Midgard builds on the base of three tools I'm already using." See also "Zope Parsed XML Project Releases ParsedXML Version 1.0."

  • [March 28, 2001] "Zope: Open Source Alternative for Content Management. Zope Proves Utility of Open-Source Web Tools." By Mark Walter and Aimee Beck. In The Seybold Report on Internet Publishing Volume 5, Number 7 (March 2001), pages 11-15. In depth review with case studies. ['SRIP looks at Zope, a free toolkit developed by Digital Creations that's gained favor among daily newspapers, corporations, government agencies and a host of Web startups. Included are details on Zope's new content-management framework, due out this spring.'] "With Net budgets plunging in parallel with the high-tech stock swoon, site managers are seeking lower-priced alternatives to premium content-management systems. That's good news for Digital Creations and Zope, its open-source Web publishing framework built on top of Python. This month Digital Creations is extending Zope even further, releasing a full-blown content-management system based on the Zope framework... Coming in the next release, due out later this spring, will be a simple syndication server that helps administrators set up automated polling for inbound feeds and lets authorized customers pull content for outgoing material. Also under development is an overhaul to the underlying presentation templates: Digital Creations plans to change its "document template markup language" and its reliance on custom tags to an XHTML-based scheme driven from custom attributes on standard tags. That change will make it much easier for template designers to get WYSIWYG feedback from within popular Web-design products, like Dreamweaver or GoLive... Every system has its limitations, and Zope, for all its power and flexibility, relies on Python, which at this point is not yet the language of the masses. The upside, of course, is that Zope is open source: If you're willing to roll up your sleeves, you can save considerable money on software. 
In following Linux, Digital Creations has confirmed the merits of the open source software model and garnered supporters from across the globe. With CMF, Digital Creations has taken a big step toward bringing Zope to an even wider audience. The downside to open-source products, compared to their commercial counterparts, is that users have to assume primary responsibility for support. In the Zope CMF, customers get a nice combination -- free code, and, in Digital Creations, a consultant with deep experience solving complex publishing problems. At a time when Web budgets are being trimmed, but the volume of content continues to rise, Zope could be poised for even faster growth. Fredericksburg.com's Muldrow concludes, 'I've honestly not seen a product that so completely improved the way we do things -- I built a product to post jobs online in less than a day. We haven't been able to do that with anything else'." See also "Zope Parsed XML Project Releases ParsedXML Version 1.0."

  • [March 28, 2001] "Trailblazing with XPath. [XML@Large.]" By Michael Floyd. In WebTechniques Volume 6, Issue 4 (April 2001), pages 66-69. ['XPath will keep you from getting lost in your document trees whether you're using XSLT or the DOM. Michael Floyd provides guidance.'] "As in desert enduro, finding your way through XML documents isn't always a straightforward task. Fortunately, the designers of XML have included a mechanism, called XPath, that helps you navigate through documents. XPath partly defines a syntax that lets you easily traverse a tree's structure and select one or more of its nodes. Once you've selected a node or nodes, you can manipulate, reorder, or transform them in any way you desire. The mechanism that lets you select tree nodes is called a pattern. A pattern is actually a limited form of what XPath calls location paths. (We'll get to location paths in a moment.) Much of XPath's expression language was originally described in the early XSL specification. Eventually, however, the W3C broke the XSL specification into three parts: XSL, which describes the formatting objects used to display XML elements; the XSL Transformation Language, which lets you transform XML into other formats; and XPath. So it's easy to associate XPath expressions with XSLT. It turns out, however, that these expressions are also useful in other tree-related models, including the Document Object Model (DOM) and XPointer. You can also use XPath expressions as arguments to DOM function calls... Of course, there's a great deal more to XPath than I've described here. In future months, I'll cover the other functions, including number, Boolean, and node-set functions. More importantly, I'll show you how to use them in DOM work and in creating style sheets."
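
A minimal sketch of the location-path idea described above, using the limited XPath subset in Python's standard `xml.etree.ElementTree` module. The sample document and its element names are assumptions made for illustration; they do not come from Floyd's article.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for this illustration.
DOC = """
<book>
  <chapter id="c1">
    <title>Getting Started</title>
    <section id="s1"><title>Install</title></section>
  </chapter>
  <chapter id="c2">
    <title>Location Paths</title>
    <section id="s2"><title>Axes</title></section>
    <section id="s3"><title>Predicates</title></section>
  </chapter>
</book>
"""

root = ET.fromstring(DOC)

# Child steps: titles that are direct children of a chapter.
chapter_titles = [t.text for t in root.findall("./chapter/title")]

# Descendant step: every section anywhere below the root.
section_ids = [s.get("id") for s in root.findall(".//section")]

# Attribute predicate: pick the chapter with id "c2", then its section titles.
c2_sections = [
    t.text for t in root.findall("./chapter[@id='c2']/section/title")
]

print(chapter_titles)
print(section_ids)
print(c2_sections)
```

ElementTree implements only part of XPath 1.0 (no functions such as `count()`), so a fuller XPath engine would be needed for the number, Boolean, and node-set functions the article mentions.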

  • [March 27, 2001] "ebXML Specification Released for Public Review." By Michael Meehan. In InfoWorld (March 27, 2001). "Starting this week, the public will be able to get a detailed look at what could be the key to unifying the fragmented world of business-to-business e-commerce, as the public review of electronic business XML (ebXML) gets under way. Included in the standard will be protocols to handle transport routing, trading partner agreements, security, document construction, naming conventions, and business process integration -- the soup-to-nuts menu for online commerce. More than 2,000 people from 30-plus countries have helped develop the ebXML specifications, which are set for final approval in Vienna in May. Behind the 18-month effort are a United Nations e-business trade bureau called UN/CEFACT and a consortium called the Organization for the Advancement of Structured Information Standards, or OASIS. The standards group was led by executives from IBM, Sun, and Microsoft, which contributed some late but important input. The ebXML organizing body last month agreed to incorporate the transport sequence for the Microsoft-backed Simple Object Access Protocol (SOAP), making it far easier for businesses to swap information. SOAP is Microsoft's sole contribution to date. The addition of SOAP is 'a tremendous plus for us,' said Neal Smith, an IT architect at Chevron in San Francisco. 'We have a lot of Microsoft technology, and we like anything that makes it easier for us to use the stuff we have.' He said he hopes ebXML will set basic standards that oil industry exchanges can then build upon. 'Ideally, you can just take the parts you need and leave out the ones you don't, without disrupting anything,' Smith said. T. Kyle Quinn, director of e-business information systems at Boeing in Seattle, has also been involved in the ebXML standard. He argued that users must steer the standard's development. 
'The Unix/Windows debate is still alive, and one of the things we want to do is drive the standards discussion to make it go away,' Quinn said. 'The point of e-commerce is we're all supposed to be working together, and it's crucial to keep the standards open.' Most of the work is now done. What remains to be seen is how the public will react..." See "Electronic Business XML Initiative (ebXML)."

  • [March 24, 2001] "A Web Odyssey: From Codd to XML. [Invited Presentation.]" By Victor Vianu (UC San Diego). With 100 references. (so!) Paper presented at PODS 2001. Twentieth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems (PODS). May 21 - 24, 2001. Santa Barbara, California, USA. "What does the age of the Web mean for database theory? It is a challenge and an opportunity, an exciting journey of rediscovery. These are some notes from the road... What makes the Web scenario different from classical databases? In short, everything. A classical database is a coherently designed system. The system imposes rigid structure, and provides queries, updates, as well as transactions, concurrency, integrity, and recovery, in a controlled environment. The Web escapes any such control. It is a free-evolving, ever-changing collection of data sources of various shapes and forms, interacting according to a flexible protocol. A database is a polished artifact. The Web is closer to a natural ecosystem. Why bother then? Because there is tremendous need for database-like functionality to efficiently provide and access data on the Web and for a wide range of applications. And, despite the differences, it turns out that database knowhow remains extremely valuable and effective. The design of XML query and schema languages has been heavily influenced by the database community. XML query processing techniques are based on underlying algebras, and use rewrite rules and execution plans much like their relational counterparts. The use of the database paradigm on the Web is a success story, a testament to the robustness of databases as a field. Much of the traditional framework of database theory needs to be reinvented in the Web scenario. Data no longer fits nicely into tables. Instead, it is self-describing and irregular, with little distinction between schema and data. This has been formalized by semi-structured data. 
Schemas, when available, are a far cry from tables, or even from more complex object-oriented schemas. They provide much richer mechanisms for specifying flexible, recursively nested structures, possibly ordered. A related problem is that of constraints, generalizing to the semi-structured and XML frameworks classical dependencies like functional and inclusion dependencies. Specifying them often requires recursive navigation through the nested data, using path expressions. Query languages also differ significantly from their relational brethren. The lack of schema leads to a more navigational approach, where data is explored from specific entry points. The nested structure of data leads to recursion in queries, in the form of path expressions. Other paradigms have also proven useful, such as structural recursion... One of the most elegant theoretical developments is the connection of XML schemas and queries to tree automata. Indeed, while the classical theory of query languages is intimately related to finite-model theory, automata theory has instead emerged as the natural formal companion to XML. Interestingly, research on XML is feeding back into tree automata theory and is re-energizing this somewhat arcane area of language theory. This connection is a recurring theme throughout the paper... In order to meaningfully contribute to the formal foundations of the Web, database theory has embarked upon a fascinating journey of rediscovery. In the process, some of the basic assumptions of the classical theory had to be revisited, while others were convincingly reaffirmed. There are several recurring technical themes. They include extended conjunctive queries, limited recursion in the form of path expressions, ordered data, views, incomplete information, active features. Automata theory has emerged as a powerful tool for understanding XML schema and query languages. 
The specific needs of the XML scenario have in turn provided feedback into automata theory, generating new lines of research. The Web scenario is raising an unprecedented wealth of challenging problems for database theory -- a new frontier to be explored."

  • [March 24, 2001] "On XML Integrity Constraints in the Presence of DTDs." By Wenfei Fan (Bell Labs and Temple University), and Leonid Libkin (University of Toronto). Paper presented at PODS 2001. Twentieth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems (PODS). May 21 - 24, 2001. Santa Barbara, California, USA. With 32 references. Abstract: "The paper investigates XML document specifications with DTDs and integrity constraints, such as keys and foreign keys. We study the consistency problem of checking whether a given specification is meaningful: that is, whether there exists an XML document that both conforms to the DTD and satisfies the constraints. We show that DTDs interact with constraints in a highly intricate way and as a result, the consistency problem in general is undecidable. When it comes to unary keys and foreign keys, the consistency problem is shown to be NP-complete. This is done by coding DTDs and integrity constraints with linear constraints on the integers. We consider the variations of the problem (by both restricting and enlarging the class of constraints), and identify a number of tractable cases, as well as a number of additional NP-complete ones. By incorporating negations of constraints, we establish complexity bounds on the implication problem, which is shown to be coNP-complete for unary keys and foreign keys." Detail: Although a number of dependency formalisms were developed for relational databases, functional and inclusion dependencies are the ones used most often. More precisely, only two subclasses of functional and inclusion dependencies, namely, keys and foreign keys, are commonly found in practice. Both are fundamental to conceptual database design, and are supported by the SQL standard. They provide a mechanism by which one can uniquely identify a tuple in a relation and refer to a tuple from another relation. They have proved useful in update anomaly prevention, query optimization and index design. 
XML (eXtensible Markup Language) has become the prime standard for data exchange on the Web. XML data typically originates in databases. If XML is to represent data currently residing in databases, it should support keys and foreign keys, which are an essential part of the semantics of the data. A number of key and foreign key specifications have been proposed for XML, e.g., the XML standard (DTD), XML Data, and XML Schema. Keys and foreign keys for XML are important in, among other things, query optimization, data integration, and in data exchange for converting databases to an XML encoding. XML data usually comes with a DTD that specifies how a document is organized. Thus, a specification of an XML document may consist of both a DTD and a set of integrity constraints, such as keys and foreign keys. A legitimate question then is whether such a specification is consistent, or meaningful: that is, whether there exists a (finite) XML document that both satisfies the constraints and conforms to the DTD. In the relational database setting, such a question would have a trivial answer: one can write arbitrary (primary) key and foreign key specifications in SQL, without worrying about consistency. However, DTDs (and other schema specifications for XML) are more complex than relational schemas: in fact, XML documents are typically modeled as node-labeled trees, e.g. in XSL, XQL, XML Schema, XPath, and DOM. Consequently, DTDs may interact with keys and foreign keys in a rather nontrivial way, as will be seen shortly. Thus, we shall study the following family of problems, where C ranges over classes of integrity constraints... We have studied the consistency problems associated with four classes of integrity constraints for XML. We have shown that in contrast to its trivial counterpart in relational databases, the consistency problem is undecidable for C[K,FK], the class of multi-attribute keys and foreign keys. 
This demonstrates that the interaction between DTDs and key/foreign key constraints is rather intricate. This negative result motivated us to study C{Unary}[K,FK], the class of unary keys and foreign keys, which are commonly used in practice. We have developed a characterization of DTDs and unary constraints in terms of linear integer constraints. This establishes a connection between DTDs, unary constraints and linear integer programming, and allows us to use techniques from combinatorial optimization in the study of XML constraints. We have shown that the consistency problem for C{Unary}[K,FK] is NP-complete. Furthermore, the problem remains in NP for C{Unary}[K-neg,IC-neg], the class of unary keys, unary inclusion constraints and their negations. We have also investigated the implication problems for XML keys and foreign keys. In particular, we have shown that the problem is undecidable for C[K,FK] and it is coNP-complete for C{Unary}[K,FK] constraints. Several PTIME decidable cases of the implication and consistency problems have also been identified. The main results of the paper are summarized in Figure 4. It is worth remarking that the undecidability and NP-hardness results also hold for other schema specifications beyond DTDs, such as XML Schema and the generalization of DTDs proposed in [Y. Papakonstantinou and V. Vianu. 'Type inference for views of semistructured data']. This work is a first step towards understanding the interaction between DTDs and integrity constraints. A number of questions remain open. First, we have only considered keys and foreign keys defined with XML attributes. We expect to expand techniques developed here for more general schema and constraint specifications, such as those proposed in XML Schema and in a recent proposal for XML keys. Second, other constraints commonly found in databases, e.g., inverse constraints, deserve further investigation. 
Third, a lot of work remains to be done on identifying tractable yet practical classes of constraints and on developing heuristics for consistency analysis. Finally, a related project is to use integrity constraints to distinguish good XML design (specification) from bad design, along the lines of normalization of relational schemas. Coding with linear integer constraints gives us decidability for some implication problems for XML constraints, which is a first step towards a design theory for XML specifications." Note the longer version of the paper referenced on Wenfei Fan's web site. [cache]
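
The idea of coding a DTD and unary constraints as linear constraints on integers can be illustrated with a toy case. The sketch below assumes a hypothetical specification, not one quoted from the abstract: the DTD requires at least one `teacher` element, and each `teacher` contains a fixed number of `subject` elements; a unary key on `subject.taught_by` plus a foreign key from `subject.taught_by` to `teacher.name` forces the number of subjects to be at most the number of teachers. The function searches small element counts for a solution; a bounded search like this is only an illustration, not the decision procedure of the paper.

```python
from itertools import product


def find_counts(subjects_per_teacher, max_count=20):
    """Search for element counts that satisfy the toy specification.

    Returns a (teachers, subjects) pair if one exists within the bound,
    otherwise None. All names and the bound are illustrative assumptions.
    """
    for teachers, subjects in product(range(max_count + 1), repeat=2):
        # DTD side: at least one teacher, fixed number of subjects each.
        dtd_ok = teachers >= 1 and subjects == subjects_per_teacher * teachers
        # Constraint side: taught_by is a key of subject and references
        # teacher.name, so there cannot be more subjects than teachers.
        constraints_ok = subjects <= teachers
        if dtd_ok and constraints_ok:
            return teachers, subjects
    return None


two_each = find_counts(2)  # requires 2t <= t with t >= 1
one_each = find_counts(1)  # requires t <= t with t >= 1
print(two_each, one_each)
```

With two subjects per teacher the inequalities 2t ≤ t and t ≥ 1 have no integer solution for any t, so no document can satisfy both the DTD and the constraints; with one subject per teacher the smallest solution is a single teacher with a single subject.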

  • [March 24, 2001] "XML with Data Values: Typechecking Revisited." By Noga Alon (Tel Aviv University), Tova Milo (Tel Aviv University), Frank Neven (Limburgs Universitair Centrum), Dan Suciu (University of Washington), and Victor Vianu (UC San Diego). Paper presented at PODS 2001. Twentieth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems (PODS). May 21 - 24, 2001. Santa Barbara, California, USA. Abstract: "We investigate the typechecking problem for XML queries: statically verifying that every answer to a query conforms to a given output DTD, for inputs satisfying a given input DTD. This problem had been studied by a subset of the authors in a simplified framework that captured the structure of XML documents but ignored data values. We revisit here the typechecking problem in the more realistic case when data values are present in documents and tested by queries. In this extended framework, typechecking quickly becomes undecidable. However, it remains decidable for large classes of queries and DTDs of practical interest. The main contribution of the present paper is to trace a fairly tight boundary of decidability for typechecking with data values. The complexity of typechecking in the decidable cases is also considered." Details: "Databases play a crucial role in new internet applications ranging from electronic commerce to Web site management to digital government. Such applications have redefined the technological boundaries of the area. The emergence of the Extended Markup Language (XML) as the likely standard for representing and exchanging data on the Web has confirmed the central role of semistructured data but has also redefined some of the ground rules. Perhaps the most important is that XML marks the 'return of the schema' (albeit loose and flexible) in semistructured data, in the form of its Data Type Definitions (DTDs), which constrain valid XML documents. The benefits of DTDs are numerous. 
Some are analogous to those derived from schema information in relational query processing. Perhaps most importantly to the context of the Web, DTDs can be used to validate data exchange. In a typical scenario, a user community would agree on a common DTD and on producing only XML documents which are valid with respect to the specified DTD. This raises the issue of (static) typechecking: verifying at compile time that every XML document which is the result of a specified query applied to a valid input document, satisfies the output DTD... On the decidability side, we show that typechecking is decidable for queries with non-recursive path expressions, arbitrary input DTD, and output DTD specifying conditions on the number of children of nodes with a given label. We are able to extend this to DTDs using star-free regular expressions, and then full regular expressions, by increasingly restricting the query language. We also establish lower and upper complexity bounds for our typechecking algorithms. The upper bounds range from pspace to non-elementary, but it is open if these are tight. The lower bounds range from co-np to pspace. On the undecidability side, we show that typechecking becomes undecidable as soon as the main decidable cases are extended even slightly. We mainly consider extensions with recursive path expressions in queries, or with types decoupled from tags in DTDs (also known as specialization). This traces a fairly tight boundary for the decidability of typechecking with data values... The main contribution of the present paper is to shed light on the feasibility of typechecking XML queries that make use of data values in XML documents. The results trace a fairly tight boundary of decidability of typechecking. In a nutshell, they show that typechecking is decidable for XML-QL-like queries without recursion in path expressions, and output DTDs without specialization. 
As soon as recursion or specialization are added, typechecking becomes undecidable..." [cache]

  • [March 24, 2001] "Representing and Querying XML with Incomplete Information." By Serge Abiteboul (INRIA), Luc Segoufin (INRIA), and Victor Vianu (UC San Diego). Paper presented at PODS 2001. Twentieth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems (PODS). May 21 - 24, 2001. Santa Barbara, California, USA. With 25 references. Abstract: "We study the representation and querying of XML with incomplete information. We consider a simple model for XML data and their DTDs, a very simple query language, and a representation system for incomplete information in the spirit of the representations systems developed by Imielinski and Lipski for relational databases. In the scenario we consider, the incomplete information about an XML document is continuously enriched by successive queries to the document. We show that our representation system can represent partial information about the source document acquired by successive queries, and that it can be used to intelligently answer new queries. We also consider the impact on complexity of enriching our representation system or query language with additional features. The results suggest that our approach achieves a practically appealing balance between expressiveness and tractability. The research presented here was motivated by the Xyleme project at INRIA, whose objective it is to develop a data warehouse for Web XML documents... The main contribution of this paper is a simple framework for acquiring, maintaining, and querying XML documents with incomplete information. The framework provides a model for XML documents and DTDs, a simple XML query language, and a representation system for XML with incomplete information. We show that the incomplete information acquired by consecutive queries and answers can be efficiently represented and incrementally refined using our representation system. Queries are handled efficiently and flexibly. 
They are answered as best possible using the available information, either completely, or by providing an incomplete answer using our representation system. Alternatively, full answers can be provided by completing the partial information using additional queries to the sources, guaranteed to be non-redundant. Our framework is limited in many ways. For example, we assume that sources provide persistent node ids. Order in documents and DTDs is ignored, and is not used by queries. The query language is very simple, and does not use recursive path expressions and data joins. In order to trace the boundary of tractability, we considered several extensions to our framework and showed that they have significant impact on handling incomplete information, ranging from cosmetic to high complexity or undecidability. This justifies the particular cocktail of features making up our framework, and suggests that it provides a practically appealing solution to handling incomplete information in XML." See: "Xyleme Project: Dynamic Data Warehouse for the XML Data of the Web." [cache]

  • [March 24, 2001] "Xyleme, une start-up de l'Inria pour structurer le Web en XML." [Xyleme, an INRIA start-up to structure the Web in XML.] From 01net.com. March 01, 2001. "Xyleme wants to structure the Web's semantic data in XML. The goal? To build a professional search engine that can be queried from within a company's information system." ["The Web is moving from HTML to XML, with all the major players, Microsoft, IBM, Oracle, content providers, B2B enablers, behind this revolution. Xyleme exploits this revolution to create a new service through an indexed XML repository that stores Web knowledge and that is capable of answering queries from applications and users. The outcome is a seamless integration between the web and corporate information systems... Xyleme is designed to store, classify, index and monitor XML data on the Web. The emphasis is on high level services that are difficult or impossible to support with the current Web technologies. In particular, we consider more complex query processing than the simple keyword search of actual search engines, semantic data integration and sophisticated monitoring of changes..."] See: "Xyleme Project: Dynamic Data Warehouse for the XML Data of the Web."

  • [March 24, 2001] "SCHUCS: A UML-Based Approach for Describing Data Representations Intended for XML Encoding." By Michael Hucka (Systems Biology Workbench Development Group ERATO Kitano Systems Biology Project). 'Version of 11 December 2000'. UML to XML Schema mappings. Note: this document supplements the SBML Level 1 final specification, which uses a simple UML-based notation to describe the data structures: "Systems Biology Markup Language (SBML) Level 1: Structures and Facilities for Basic Model Definitions." See the corresponding news item on SBML. "There are three main advantages to using UML class diagrams as a basis for defining data structures. First, compared to using other notations or a programming language, the UML visual representations are generally easier to read and understand by readers who are not computer scientists. Second, the visual notation is implementation-neutral -- the defined structures can be encoded in any concrete implementation language, not just XML but other formats as well, making the UML-based definitions more useful and flexible. Third, UML is a de facto industry standard, documented in many books and available in many software tools including mainstream development environments (such as Microsoft Visual Basic 5 Enterprise Edition). Readers are therefore more likely to be familiar with it than other notations. Readers do not need to know UML in advance; this document provides descriptions of all the constructs used. The notation presented here can be expressed not only in graphical diagram form (which is what UML is all about) but also in textual form, allowing descriptions to be easily written in a text editor and sent as plain-text email. The scope of the notation is limited to classes and their attributes, not class methods or operations. One of the goals of this effort has been to develop a consistent, systematic method for translating UML-based class diagrams into XML Schemas. 
Another goal has been to maintain a reasonably simple notation and UML-to-XML mapping. An important side-effect of this is that the vocabulary of the notation is purposefully limited to only a small number of constructs. It is explicitly not intended to cover the full power of UML or XML. This limited vocabulary has nevertheless been sufficient for the applications to which it has been applied so far in the Systems Biology workbench project... The notation proposed in this document is based on a subset of what could be used and what UML provides. It is not intended to cover the full scope of UML or XML. The subset was chosen to be as simple as possible yet allow the expression of the kinds of data structures that need to be encoded in XML for the ERATO Kitano Systems Biology workbench. The notation proposed here is not carved in stone, and will undoubtedly continue to evolve..." See: "Systems Biology Markup Language (SBML)." [cache]
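
As a rough illustration of what a class-to-schema translation can look like, the sketch below turns a class name and its attributes into an XML Schema `complexType` using Python's standard library. The mapping rule (one `xsd:attribute` per class attribute), the function name, and the example class are assumptions made for this illustration; they are not the rules defined in the SCHUCS document.

```python
import xml.etree.ElementTree as ET

# Namespace of the W3C XML Schema language.
XSD = "http://www.w3.org/2001/XMLSchema"


def class_to_complex_type(name, attributes):
    """Map a class description to an xsd:complexType element.

    `attributes` maps attribute name -> (XSD simple type, required flag).
    This one-attribute-per-field rule is a hypothetical mapping.
    """
    ET.register_namespace("xsd", XSD)
    ctype = ET.Element(f"{{{XSD}}}complexType", {"name": name})
    for attr_name, (xsd_type, required) in attributes.items():
        ET.SubElement(
            ctype,
            f"{{{XSD}}}attribute",
            {
                "name": attr_name,
                "type": f"xsd:{xsd_type}",
                "use": "required" if required else "optional",
            },
        )
    return ctype


# Hypothetical class with one required and one optional attribute.
compartment = class_to_complex_type(
    "Compartment",
    {"name": ("string", True), "volume": ("double", False)},
)
xml_text = ET.tostring(compartment, encoding="unicode")
print(xml_text)
```

Keeping the input to plain names and types mirrors the stated scope of the notation, which covers classes and their attributes but not methods or operations.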

  • [March 24, 2001] "RDF Protocol." By Ken MacLeod. March 24, 2001. "RDF Protocol is simple structured text alternative to standard ASCII line-oriented protocol (as used in FTP, NNTP, SMTP, et al.). RDF Protocol also subsumes the features of RFC-822-style headers as used in MIME, SMTP, and HTTP." Includes Core RDF Protocol; IRC in RDF Protocol; Replication in RDF Protocol. [From the posting: 'Toying With an Idea: RDF Protocol': "RDF Protocol really isn't a protocol so much as setting down some conventions for passing bits of RDF around. Well, ok, some of the bits work a lot like a protocol, so it's gotta look like that, but here goes... I'm playing with a Python implementation of the basic message read/write and using IRC as the example protocol to emulate, using Dave Beckett's IRC in RDF schema. In case anyone was wondering, there are no APIs and no RPCs at this layer, it's all XML instance passing, with RDF triples as the content..."] See "Resource Description Framework (RDF)."

  • [March 24, 2001] "DocBook TREX Schema V4.1.2.2." From Norman Walsh. 03-12-01. DocBook TREX Schema V4.1.2.2 "is the current experimental TREX Schema version of DocBook. This version was (mostly) generated automatically from the RELAX version. This version is available as a zip archive. Includes: docbook.trex (the DocBook TREX Schema); dbhier.trex (the DocBook TREX Schema 'hierarchy' module); dbpool.trex (the DocBook TREX Schema 'information pool' module); dbtables.trex (the DocBook TREX Schema tables module); text.xml (a test document)." See: "Tree Regular Expressions for XML (TREX)." Also: (1) RELAX DocBook schema; (2) W3C XML DocBook schema. [cache]

  • [March 24, 2001] "SOAP Toolkit 2.0: New Definition Languages Expose Your COM Objects to SOAP Clients." By Carlos C. Tapang. From MSDN Online. March 20, 2001, "April 2001" issue. ['This article describes a custom tool, IDL2SDL, which takes an IDL file and produces Web Services Description Language (WSDL) and Web Services Meta Language (WSML) files without waiting for a DLL or TLB file to be generated. This article assumes you're familiar with XML, SOAP, COM, and Visual C++.'] "In SOAP Toolkit 2.0, the Services Description Language (SDL) has been replaced with the Web Services Description Language (WSDL) and the Web Services Meta Language (WSML). WSDL and WSML files describe the interfaces to a service and expose COM objects to SOAP clients. This article describes a custom tool, IDL2SDL, which takes an IDL file and produces WSDL and WSML files without waiting for a DLL or TLB file to be generated. Also shown is a customized development environment in which WSDL and WSML files automatically reflect the changes to IDL files... When the November 2000 release of the Microsoft SOAP Toolkit 1.0 became widely available, I wrote an Interface Description Language (IDL) to Service Description Language (SDL) translator, which I named IDL2SDL. Since SDL has been replaced with Web Services Description Language (WSDL) and Web Services Meta Language (WSML) in version 2.0 of the SOAP Toolkit, I have rewritten the translator to support WSDL and WSML. In this article I will explain how to use the translator and introduce version 2.0 of the SOAP Toolkit. You will get to know IDL2SDL and learn how to incorporate it into your development environment. The tool is available at http://www.infotects.com/IDL2SDL, together with a very simple C++ sample COM object on the server side and a Visual Basic-based app on the client side. This tool is free, and I welcome questions and suggestions for improvement. 
The WSDL and WSML files describe the interfaces to your service and expose your COM object to SOAP clients. The SOAP Toolkit already provides the WSDLGenerator tool. The generator derives the service description from the object's TypeLib. (TypeLib is usually embedded in the DLL file in which a COM component resides.) Whereas the WSDLGenerator tool is very well-suited for situations in which you only want to reuse available components in a Web service, the IDL2SDL tool is more appropriate for situations in which you are designing your server components completely from the ground up. During development, interface specifications can change often, even during testing. The IDL2SDL utility allows you to change your IDL file and produce both WSDL and WSML files without having to wait for the DLL or TLB file to be generated. You can set up your development environment with IDL2SDL such that your WSDL and WSML files automatically reflect the changes to your IDL file. In a later section, I will describe the simple steps you need to take to make IDL2SDL part of the Visual Studio development environment. Since SOAP is designed to be universal, it is applicable to remote procedure call component architectures other than COM. Likewise, IDL can express interface contracts for component architectures other than COM. There is no standard for IDL, but IDL2SDL can be modified to easily accommodate inputs for the Microsoft MIDL and for the DCE IDL compiler... The sample Web service shown here demonstrates that version 2.0 of the SOAP Toolkit is a completely different implementation from version 1.0. However, it is just as easy to use. Like version 1.0, version 2.0 accommodates both users who just want to expose their COM object to SOAP and users who have a need to generate the SOAP messages. The IDL2SDL tool even makes it easier by automating the production of WSDL, WSML, and ASP files. The IDL2SDL tool is freely available, but it is not part of the SOAP Toolkit. 
This tool was built using the Flex lexical analyzer and the BISON parser generator, which are available from http://www.monmouth.com/~wstreett/lex-yacc/lex-yacc.html. The sample files and tools are also available from the Infotects Web site." [Note: the SOAP Toolkit 2.0 Beta 2 available for download has "several major enhancements, including a new ISAPI listener and support for simple arrays."] See: "Web Services Description Language (WSDL)."

  • [March 24, 2001] "XML Web Service-Enabled Office Documents." By Chris Lovett. In MSDN Column 'Extreme XML'. March 22, 2001. ['Chris Lovett explores Office XP and .NET Web Services, and how you can use them together to deliver powerful desktop solutions for your business.'] "Are you ready for a marriage of Microsoft Office XP and .NET Web Services? In a networked world of B2B e-commerce, why not deliver the power of Web Services to the end user by integrating business process workflow right into everything people do from their desktop? What am I talking about? Well, an Excel spreadsheet that looks something like [Figure 1]... This is not just an ordinary spreadsheet. It uses UDDI to find company addresses and it uses a Catalog Web Service to find product information. It also does an XML transform on the XML spreadsheet format to generate a RosettaNet PIP 3 A4 Purchase Order Request format when you click the Send button. When you type in the name of the company you are purchasing from, and then click on the Find button, some VBA code behind the spreadsheet makes a UDDI call and fills out the rest of the address section... When you type in a quantity of, say, 23, in the 'Purchase From' field and then the term Pear in the description field, then press the TAB key, some VBA code queries a SOAP Catalog Web Service to see if it can find a matching product, then it fills out the details... When you're done, you click the Send button and the RosettaNet PIP 3 A4 XML Purchase Order format is generated, and the order is sent..." With sample code. See also "UDDI: An XML Web Service." References: (1) UDDI; (2) SOAP; (3) RosettaNet.

  • [March 23, 2001] "Software Verification and Functional Testing with XML Documentation." By Ernest Friedman-Hill. In Proceedings of the 34th Annual Hawaii International Conference on System Sciences (HICSS-34), edited by R. H. Sprague. Los Alamitos, CA, USA: IEEE Computer Society. Meeting: January 3-6, 2001. Maui, Hawaii. Abstract: "Continuous testing is an important aspect of achieving quality during rapid software development. By making the user documentation for a software product into part of its testing machinery, we can leverage each to benefit the other. The documentation itself can be automatically tested and kept in a state of synchronization with the software. Conversely, if the documentation can be machine interpreted, evaluation of the software's adherence to this description simultaneously verifies the documentation and serves as a functional test of the software. This paper presents an application of these ideas to a real project, the manual for Jess, the Java Expert System Shell. The Jess manual is rich in machine-interpretable information and is used in several distinct modes within Jess' extensive functional and unit test suites. The effort to maintain the accuracy and completeness of Jess's documentation has dropped significantly since this method was put in place." [Note: "Jess is a rule engine and scripting environment written entirely in Sun's Java language by Ernest Friedman-Hill at Sandia National Laboratories in Livermore, CA. Jess was originally inspired by the CLIPS expert system shell, but has grown into a complete, distinct Java-influenced environment of its own. Using Jess, you can build Java applets and applications that have the capacity to 'reason' using knowledge you supply in the form of declarative rules."] Details: "The Jess project is primarily a research project. 
While the basic syntax of the Jess language stays relatively constant, features are added and removed on a regular basis as requirements evolve and new ideas are tried out. Nevertheless, Jess is a small project, supported by one person working part-time. Taken together, the small project size, the dynamic nature of the software itself, and the large user base make the problem of maintaining up-to-date documentation for Jess particularly acute... It is also very easy to extend the Jess language with new commands written in Java or in Jess itself, and so the Jess language can be customized for specific applications. Jess is therefore used in a range of different ways, meaning that its documentation must cover many topics. The software is in use at hundreds of sites around the world in industries including e-commerce, insurance sales, telecommunications, and R&D, so the documentation must be of sufficient quality and completeness to satisfy the broad user base. If documentation were interpretable by computer, then the behaviour described in the documentation could be verified by the test machinery. Writing documentation would no longer be a 'superfluous' activity, but instead it would be an integral part of the development process. Inaccurate documentation becomes as serious as any other bug detected during testing. We have applied this technique to a real project, the ongoing development of Jess, the Java Expert System Shell, using XML as the documentation format. This paper describes this effort and suggests some potential enhancements for future work... The validation system described here proved itself to be very useful in the development process from Jess 4.0 to 5.1. The effort required to maintain good user documentation was greatly reduced. Approximately ten alpha and beta releases of Jess over the space of a year were made, and each shipped with a completely up-to-date manual. 
All of the examples in each of the manuals were correct; conversely, the software always performed as described in the manual. Many extensions to this scheme are possible. The possibility for expanded use of <functiondef> has already been implied. If the argument and return-value descriptions were machine readable, then a series of simple tests for every documented function could be automatically generated to verify that the types and number of arguments, and the type and sometimes identity of the return value, adhered to the documentation. Another possibility would be the confirmation of the existence and signature of Java API functions mentioned in the manual. A special tag is already used to format such references in the printed documentation. Again, it should be possible to automatically generate some very simple unit tests for such functions."
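
The idea of treating documentation as test input can be illustrated with a minimal Python sketch. It is not the Jess test machinery: the `<example>`, `<input>`, and `<output>` element names are hypothetical, and the `evaluate` function is a toy stand-in for the software under test. Each documented example is run, and any disagreement between the manual and the software is reported as a failure.

```python
import xml.etree.ElementTree as ET

# Hypothetical manual fragment: each <example> pairs an input with the
# output the documentation claims. These tag names are assumptions.
MANUAL = """
<manual>
  <example><input>1 + 2</input><output>3</output></example>
  <example><input>2 * 5</input><output>10</output></example>
  <example><input>7 - 4</input><output>2</output></example>
</manual>
"""

def evaluate(expression):
    """Toy stand-in for the software under test: 'a op b' on integers."""
    left, op, right = expression.split()
    a, b = int(left), int(right)
    return {"+": a + b, "-": a - b, "*": a * b}[op]

def check_manual(xml_text):
    """Return (input, documented, actual) for every example that disagrees."""
    failures = []
    for example in ET.fromstring(xml_text).iter("example"):
        source = example.findtext("input").strip()
        documented = example.findtext("output").strip()
        actual = str(evaluate(source))
        if actual != documented:
            failures.append((source, documented, actual))
    return failures

failures = check_manual(MANUAL)
```

Here the third example is deliberately wrong in the manual, so `failures` holds one entry. In this arrangement an inaccurate manual fails the build in the same way as a software defect would.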

  • [March 23, 2001] "Using XML/XMI for Tool Supported Evolution of UML Models." By F. Keienburg and Andreas Rausch (Institut für Informatik, Technische Universität München). In Proceedings of the 34th Annual Hawaii International Conference on System Sciences (HICSS-34). Edited by: R. H. Sprague. With 19 references. Los Alamitos, CA, USA: IEEE Computer Society, 2001. Meeting: January 3-6, 2001. Maui, Hawaii. Abstract: "Software components developed with modern tools and middleware infrastructures undergo considerable reprogramming before they become reusable. Tools and methodologies are needed to cope with the evolution of software components. We present some basic concepts and architectures to handle the impacts of the evolution of UML models. With the proposed concepts, an infrastructure to support model evolution, data schema migration, and data instance migration based on UML models can be realized. To describe the evolution path we use XML/XMI files." Details: "One needed important thing for delivering transparent model changes is a neutral model specification format. For reasons of currently becoming a respected standard and being adopted by a lot of UML Case Tools vendors, XMI is chosen in this architecture as a neutral exchange format between different Case Tools. In addition there is an explosion of tools for handling XML documents very comfortably. The XMI standard specifies with a Document Type Definition (DTD), how UML models are mapped into an XML file. Besides this functionality XMI also specifies how model changes can be easily mapped into an XML document. Therefore XMI is a very good solution for solving some of the requested requirements for UML model evolution... XMI specifies a possibility for transmitting metadata differences. The goal is to provide a mechanism for specifying the differences between documents in a way that the entire document does not need to be transmitted each time. 
This is especially important in a distributed and concurrent environment where changes have to be transmitted to other users or applications very quickly. This design does not specify an algorithm for computing the differences, just a form of transmitting them. Only occurring model changes are transmitted. In this way different instances of a model can be maintained and synchronized more easily and economically. The idea is to transmit only the changes made to the model together with the necessary information to be able to apply the necessary changes to the old model. With this information you have the possibility for model merging. This means you can combine difference information plus a common reference model to construct the appropriate new model. An important remark on this topic is that model changes are time sensitive. This means changes must be handled in the exact chronological order for achieving the wanted result... In this paper we have shown that modern middleware infrastructures for the development of distributed applications provide rich support for model based development and code generation. But there is almost no support in case of model evolution. We have introduced some concepts and architectures to realize a tool supporting model evolution and data migration and to integrate this tool in modern infrastructures. To specify the model evolution the developer should use an XMI based difference description. Based on these concepts we have already implemented a first prototype. This is a very primitive version but it is already integrated in our framework AutoMate. Based on this experience we have realized the new version of the tool called ShapeShifter. ShapeShifter is now a stand alone tool supporting model evolution and data migration on top of Versant's object-oriented database. With ShapeShifter you specify the model difference in XMI and the model and the database are automatically migrated. ShapeShifter is now used in a first industrial project. 
The next step will be a complete integration in a CASE tool. Currently one can export and import XMI model files from some CASE tools. But for a full integration of ShapeShifter we need more sophisticated tools to generate the XMI difference file from two XMI based model versions. Moreover we plan to integrate ShapeShifter into several Enterprise Java Beans Container." Paper also available in Postscript format. See "XML Metadata Interchange (XMI)."
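
The model-merging idea described in this abstract, combining a reference model with a list of differences applied in order, can be sketched in Python as follows. This is an illustration only: the `<differences>`, `<add>`, `<replace>`, and `<delete>` elements and the `target` attribute are a simplified, hypothetical vocabulary, not the markup defined by the XMI specification, and the code is unrelated to ShapeShifter.

```python
import copy
import xml.etree.ElementTree as ET

OLD_MODEL = """
<model>
  <class id="c1" name="Customer"/>
  <class id="c2" name="Order"/>
</model>
"""

# Hypothetical difference document; element names are illustrative only.
DIFFERENCES = """
<differences>
  <add><class id="c3" name="Invoice"/></add>
  <replace target="c2"><class id="c2" name="PurchaseOrder"/></replace>
  <delete target="c1"/>
</differences>
"""

def apply_differences(model, differences):
    """Return a new model with each difference applied in document order."""
    result = copy.deepcopy(model)
    for change in differences:
        if change.tag not in ("add", "replace", "delete"):
            raise ValueError(f"unsupported change {change.tag!r}")
        if change.tag == "add":
            result.extend([copy.deepcopy(child) for child in change])
            continue
        # replace and delete both locate an existing element by its id
        target = result.find(f"./*[@id='{change.get('target')}']")
        if target is None:
            raise ValueError(f"unknown target {change.get('target')!r}")
        position = list(result).index(target)
        result.remove(target)
        if change.tag == "replace":
            result.insert(position, copy.deepcopy(change[0]))
    return result

new_model = apply_differences(ET.fromstring(OLD_MODEL), ET.fromstring(DIFFERENCES))
names = [element.get("name") for element in new_model]
```

Because each change refers to the state left by the previous one, reordering the list can fail or produce a different model, which is the time sensitivity the authors point out.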

  • [March 23, 2001] "Tip: Using JDOM and XSLT. How to find the right input for your processor." By Brett McLaughlin (Enhydra strategist, Lutris Technologies). From IBM developerWorks. March 2001. ['In this tip, Brett McLaughlin tells how to avoid a common pitfall when working with XSLT and the JDOM API for XML developers working in Java. You'll learn how to take a JDOM document representation, transform it using the Apache Xalan processor, and obtain the resulting XML as another JDOM document. Transforming a document using XSLT is a common task, and JDOM makes the transformation go quite easily once you know how to avoid the missteps. The code demonstrates how to use JDOM with the new Apache Xalan 2 processor (for Java).'] "Being one of the co-creators of JDOM, I simply couldn't pass up the chance to throw in a few JDOM tips in a series of XML tips and tricks. This tip provides the answer to one of the most common questions I get about JDOM: 'How do I use JDOM and XSLT together?' People aren't sure how to take a JDOM Document object and feed it into an XSLT processor. The confusion often arises because most XSLT processors take either DOM trees or SAX events as input streams. In other words, there is not one obvious way to provide a JDOM Document as input in all cases. So how do you interface JDOM with those processors? The key to solving this problem is understanding the input and output options. First determine the input formats that your XSLT processor accepts. As I mentioned above, you'll usually be able to feed a DOM tree or I/O stream into the processor. But which of those is the faster solution? You're going to have to do a little digging to answer that question. That's right, I'm not going to give you a specific answer, but a method for figuring it out..."

  • [March 23, 2001] "xADL: Enabling Architecture-Centric Tool Integration With XML." By Rohit Khare, Michael Guntersdorfer, Nenad Medvidovic, Peyman Oreizy, and Richard N. Taylor. In Proceedings of the 34th Annual Hawaii International Conference on System Sciences (HICSS-34), edited by R. H. Sprague. Los Alamitos, CA, USA: IEEE Computer Society. With 29 references. Meeting: January 3-6, 2001. Maui, Hawaii. Abstract: "In order to support architecture-centric tool integration within the ArchStudio 2.0 Integrated Development Environment (IDE), we adopted Extensible Markup Language (XML) to represent the shared architecture-in-progress. Since ArchStudio is an architectural style-based development environment that incorporates an extensive number of tools, including commercial off-the-shelf products, we developed a new, vendor-neutral, ADL-neutral interchange format called Extensible Architecture description Language (xADL), as well as a "vocabulary" specific to the C2 style (xC2). This paper outlines our vision for representing architectures as hypertext, the design rationale behind xADL and xC2, and summarizes our engineering experience with this strategy." Details: "A future Unified Modeling Language (UML) graphical editor could produce SVG documents which could be transparently annotated with xADL and xC2 descriptions of the components and connectors those boxes and lines represent. Second, the approach we have adopted in xADL can be easily extended to support multiple architecture description languages (ADLs), even within a single XML schema. Our extensive study of ADLs has indicated that most all mainstream ADLs agree on the existence of components, connectors, and their configurations. A small number of ADLs, including Rapide and Darwin, do not explicitly model connectors. However, even these ADLs support simple component interconnections; furthermore, Rapide employs specialized "connection components" to support more complex interactions. 
Additionally, all ADLs model component interfaces and do so in a relatively uniform fashion. Therefore, these shared aspects of ADLs would become part of the basic xADL schema. That basic schema could then be extended in a number of ways to represent the varying parts of architectural descriptions across ADLs, such as the manner in which ADLs model architectural semantics, support evolution (both at system design time and run time), constrain the architecture (and its evolution), and so forth. Thus, for example, an xADL schema could simultaneously describe architectures specified in C2SADEL and Wright. If a particular tool is interested in the static model of behavior, it would access C2SADEL's component invariants and pre-and postconditions; alternately, if the tool is interested in the system's dynamic semantics, it would access Wright's CSP-related items and ignore others. Another possibility that xADL affords us is the support for multiple configurations of the same set of components, where we access the part of the schema representing the specific configuration we are interested in, disregarding all other configurations... We adopted XML as a key technology for enabling architecture-centric tool integration in the ArchStudio 2.0 IDE. The C2 style eased the evolution from the previous version's custom text file format, C2SADEL, to a generic XML AST as the repository. This had immediate benefits for integrating several tools' data in the same file, for annotating existing data without interfering with its original use, and for hyperlinking to external data transparently. Furthermore, we developed a new ontology for describing entire families of Architecture Description Languages (ADLs). By extracting the five most common abstractions and their relations into a top-level xADL namespace, we were able to separately represent data specific to the C2 architectural style and C2SADEL in a subsidiary xC2 namespace. 
These technologies directly aided a strictly distributed team to integrate a substantial set of research and commercial tools within ArchStudio 2.0. Our eventual aim is even wider, to support Internet-scale development, with potentially large and varying developer communities composing systems over long times and distances. Representing architectures as hypertext affords us reach; extracting our ontology in XML promises depth, through integration with generic, non-ADL-aware XML applications." See also the xADL discussion and references.

  • [March 23, 2001] "Structured Data Exchange Format (SDXF)." By M. Wildgrube. Network Working Group, Request for Comments 3072. March 2001. "This document specifies a data exchange format and, partially, an API that can be used for creating and parsing such a format. The IESG notes that the same problem space can be addressed using formats that the IETF normally uses including ASN.1 and XML. The document reader is strongly encouraged to carefully read section 13 before choosing SDXF over ASN.1 or XML. Further, when storing text in SDXF, the user is encouraged to use the datatype for UTF-8, specified in section 2.5." Abstract: "This specification describes an all-purpose interchange format for use as a file format or for networking. Data is organized in chunks which can be ordered in hierarchical structures. This format is self-describing and CPU-independent." Compare ASN.1: "The idea behind ASN.1 is: On every platform on which a given application is to develop descriptions of the used data structures are available in ASN.1 notation. Out of these notations the real language dependent definitions are generated with the help of an ASN.1-compiler. This compiler generates also transform functions for these data structures to pack and unpack to and from the BER (or other) format. A direct comparison between ASN.1 and SDXF is somehow inappropriate: The data format of SDXF is related rather to BER (and relatives). The use of ASN.1 to define data structures is no contradiction to SDXF, but: SDXF does not require a complete data structure to build the message to send, nor a complete data structure will be generated out of the received message." SDXF vs. XML: "On the one hand SDXF and XML are similar as they can handle any recursive complex data stream. The main difference is the kind of data which are to be maintained: (1) XML works with pure text data (though it should be noted that the character representation is not standardized by XML). 
And: an XML document with all its tags is readable by humans. Binary data as graphic is not included directly but may be referenced by an external link as in HTML... (2) SDXF maintains machine-readable data, it is not designed to be readable by humans nor to edit SDXF data with a text editor (even more if compression and encryption is used). With the help of the SDXF functions you have a quick and easy access to every data element..."
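
To make the notion of self-describing binary chunks ordered in hierarchical structures more concrete, here is a minimal Python sketch of a nested tag-length-value encoding. The header layout used here (a 2-byte chunk id, a 1-byte flags field, and a 4-byte length) and the `STRUCTURED` flag value are assumptions chosen for illustration; they should not be read as the wire format defined in RFC 3072.

```python
import struct

# Assumed header layout for illustration: chunk id, flags, payload length.
HEADER = struct.Struct(">HBI")
STRUCTURED = 0x01  # assumed flag: payload is a sequence of nested chunks

def pack_chunk(chunk_id, value):
    """Serialize bytes as a leaf chunk, or a list of (id, value) pairs as a structured chunk."""
    if isinstance(value, bytes):
        flags, payload = 0, value
    else:
        flags = STRUCTURED
        payload = b"".join(pack_chunk(cid, v) for cid, v in value)
    return HEADER.pack(chunk_id, flags, len(payload)) + payload

def unpack_chunks(data):
    """Parse bytes into a list of (id, value) pairs, recursing into structured chunks."""
    chunks, offset = [], 0
    while offset < len(data):
        chunk_id, flags, length = HEADER.unpack_from(data, offset)
        offset += HEADER.size
        payload = data[offset:offset + length]
        offset += length
        value = unpack_chunks(payload) if flags & STRUCTURED else payload
        chunks.append((chunk_id, value))
    return chunks

record = [(1, b"Pear"), (2, struct.pack(">I", 23))]
encoded = pack_chunk(100, record)
decoded = unpack_chunks(encoded)
```

A reader can skip any chunk it does not understand by using the length field, which is what lets a receiver pick out single elements without building a complete data structure first.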

  • [March 23, 2001] "Examplotron 0.1." By Eric van der Vlist (Dyomedea). "The purpose of examplotron is to use instance documents as a lightweight schema language -- eventually adding the information needed to guide a validator in the sample documents. 'Classical' XML validation languages such as DTDs, W3C XML Schema, Relax, Trex or Schematron rely on a modeling of either the structure (and eventually the datatypes) that a document must follow to be considered as valid or on the rules that need to be checked. This modeling relies on specific XML serialization syntaxes that need to be understood before one can validate a document and is very different from the instance documents and the creation of a new XML vocabulary involves both creating a new syntax and mastering a syntax for the schema. Many tools (including popular XML editors) are able to generate various flavors of XML schemas from instance documents, but these schemas do not find enough information in the documents to be directly useable leaving the need for human tweaking and the need to fully understand the schema language. Examplotron may then be used either as a validation language by itself, or to improve the generation of schemas expressed using other XML schema languages by providing more information to the schema translators..." From the XML-DEV posting: "Beating Hook, Rick Jelliffe's single element schema language has been quite a challenge, but I am happy to announce examplotron a schema language without any element. Although examplotron does include an attribute, this attribute is optional and you can build quite a number of schemas without using it and I think it fair to say that examplotron is the most natural and easy to learn XML schema language defined up to now ;=) ... The idea behind examplotron -and the reason why it's so simple to use- is to define schemas giving sample documents. 
Although examplotron can be used as a standalone tool, it can also be used to generate schemas for more classical -and powerful- languages and I don't think it will compete with them but rather complement them. Thanks for your comments..." See also: (1) the XML-DEV posting, and (2) "XML Schema Element and Attribute Validator." For schema description and references, see "XML Schemas."
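
The notion of a sample document acting as its own schema can be sketched in a few lines of Python. This is a deliberately simplified reading, not the Examplotron 0.1 algorithm: it assumes a candidate is valid when it has the same element names in the same order as the sample, it ignores text, attributes, and the optional attribute mentioned in the posting, and the function names are made up for this illustration.

```python
import xml.etree.ElementTree as ET

# The sample instance document doubles as the "schema".
SAMPLE = "<order><customer>Ann</customer><item>Pear</item></order>"

def matches_sample(sample, candidate):
    """True if candidate has the same element names, in the same order, as sample."""
    if sample.tag != candidate.tag or len(sample) != len(candidate):
        return False
    return all(matches_sample(s, c) for s, c in zip(sample, candidate))

def validate(sample_xml, candidate_xml):
    """Parse both documents and compare their element structure."""
    return matches_sample(ET.fromstring(sample_xml), ET.fromstring(candidate_xml))

ok = validate(SAMPLE, "<order><customer>Bob</customer><item>Plum</item></order>")
missing = validate(SAMPLE, "<order><customer>Bob</customer></order>")
renamed = validate(SAMPLE, "<order><client>Bob</client><item>Plum</item></order>")
```

The first candidate differs from the sample only in its text and is accepted; the other two are rejected because an element is missing or renamed.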

  • [March 22, 2001] "When less is more: a compact toolkit for parsing and manipulating XML. Designing a fast and small XML toolkit by applying object-oriented techniques." By Graham Glass (CEO/Chief architect, The Mind Electric). From IBM developerWorks. March 2001. ['This article describes the design and implementation of an intuitive, fast and compact (40K) Java toolkit for parsing and manipulating XML -- Electric XML -- the XML engine of the author's company. It shows one way to apply object-oriented techniques to the creation of an XML parser, and it provides useful insight into API design. The source code for the non-validating parser described in this article may be downloaded and used freely for most commercial uses.'] "XML is finding its way into almost every aspect of software development. For example, SOAP, the rapidly emerging standard that is likely to replace CORBA and DCOM as the network protocol of choice, uses XML to convey messages between Web services. When my company decided to create a high performance SOAP engine, we started by examining the existing XML parsers to see which would best suit our needs. To our surprise, we found that the commercially available XML parsers were too slow to allow SOAP to perform as a practical replacement for technologies like CORBA and RMI. For example, parsing the SOAP message in Listing 1 took one popular XML parser about 2.7 milliseconds... Our initial experiments indicated that we could build a small, fast, intuitive toolkit for parsing and manipulating XML documents that would allow our distributed-computing platform to approach the performance of existing traditional systems. We decided to complete the parser and make it available to the developer community, partly to earn some good karma, and partly to demonstrate that powerful toolkits do not need to be large or complex. 
I personally yearn for the days of Turbo Pascal when companies shipped full-blown development and runtime environments that took up just 30K! The main design decisions were: (1) Selecting a hierarchy for the object model that fitted naturally with the tree structure of an XML document; (2) Pushing the knowledge of how to parse, print, and react to removal into each 'smart' node; (3) Using a Name object to represent a namespace-qualified name; (4) Allowing get and remove operations to accept an XPath expression; (5) Using selection nodes to keep track of XPath result sets. The resulting parser achieves the goal of processing a SOAP message about as quickly as RPC over RMI. Table 1 shows a comparison of parsing the sample SOAP message in Listing 1 with the production release of Electric XML and with a popular DOM parser 10,000 times and calculating the average time to parse the document... [Popular DOM-based parser: 2.7 milliseconds; Electric XML: 0.54 milliseconds]. I hope that the article provides useful examples of object-oriented design in action, as well as an instance of the adage "less is more." I hope also that Electric XML might prove useful for your XML development efforts." The source code for the Electric XML parser is available for download. Article also in PDF.

  • [March 22, 2001] "XOIP: XML Object Interface Protocol." By Morten Kvistgaard Nielsen and Allan Bo Jørgensen. Centre for Object Technology, COT/3-34-V1.0. 116 pages. [Master's Thesis, Department of Computer Science, Aarhus University, 2001.] "XOIP describes a way in which heterogeneous networked embedded systems can interface to a variety of distributed object architectures using XML. An implementation of XOIP is available for download. This document is a thesis for the Masters Degree in Computer Science at the University of Aarhus... In this thesis we shall present our solution to the problem of achieving interoperability between heterogeneous distributed object architectures and paradigms. What makes our solution special is that it is specifically designed to address the problems faced by embedded systems, where lack of system resources have hitherto prevented their participation in distributed object systems. Since embedded systems are more likely to be placed in heterogeneous object systems than their desktop counterparts, the two issues are naturally linked."

  • [March 22, 2001] "Gates Unveils Hailstorm." By Barbara Darrow. In Computer Reseller News (March 19, 2001). "Microsoft Chairman Bill Gates Monday unveiled Hailstorm, one more step in the company's attempt to transform itself into a provider of software-as-services. Hailstorm -- which the company positions as a set of user-centric services to ease e-commerce and Web applications -- is not slated for production until 2002. These services theoretically will enable users with any Web-connected devices, including handheld machines and cell phones, to easily and securely access applications and information on the Net... Similar to Novell's DigitalMe service unveiled two years ago, Hailstorm will let a user log on once to the system, which would then remember critical information, including passwords to diverse Web sites and services. Other services will include calendar, address book, notification and authentication. CRN first broke the story of the Hailstorm platform, called by one source as Microsoft Passport on steroids, in January. Microsoft made a design preview of the service available Monday and brought a number of potential partners -- including eBay, Groove Networks and American Express -- onstage for demonstrations. By integrating Hailstorm services with its own auction APIs, for example, eBay would enable its own users to get realtime notification when someone has overbid them on a planned purchase. Similarly, American Express Blue Card users trying to order an out-of-stock book would receive notification from the merchant when the title is back in stock, and then click on that message to initiate the transaction... Certain base-level functionality -- such as single log-in -- will continue to be offered for free, but users will be charged for value-added services and on usage, company executives say. Still, it remains to be seen whether Microsoft, whose relationships with partners have been problematic at times, will be the partner of choice here." 
See: "Microsoft Hailstorm."

  • [March 22, 2001] "Interview: Tim Berners-Lee on the W3C's Semantic Web Activity." By Edd Dumbill. From XML.com. March 21, 2001. ['The World Wide Web Consortium has recently embarked on a program of development on the Semantic Web. This interview outlines the vision behind the new Activity, and how it relates to XML in general.'] "Tim Berners-Lee: The W3C operates at the cutting edge, where relatively new results of research become the foundations for products. Therefore, when it comes to interoperability these results need to become standards faster than in other areas. The W3C made the decision to take the lead -- and leading-edge -- in web architecture development. We've had the Semantic Web roadmap for a long time. As the bottom layer becomes stronger, there's at the same time a large amount falling in from above. Projects from the areas of knowledge representation and ontologies are coming together. The time feels right for W3C to be the place where the lower levels meet with the higher levels: the research results meeting with the industrial needs... We always design the Activity to suit the needs of the community at the time. Examples of infrastructural work in which we did this are the HTTP, URI, and XML Signature work. We wanted the attention of the community experts, and things required wide review. More of our Activities and working groups are moving toward a more public model; XML Protocol is a perfect example. SW needs to be really open, as many resources for its growth are from the academic world. We need people who may at some point want to give the group the benefit of their experience, without having a permanent relationship with the consortium. It's not particularly novel. It's combining the RDF Interest Group with W3C internal development stuff. We need to find what the Knowledge Representation community have got that's ripe for standardization, and what it hasn't and so on. Coordination will be very important." 
See: "XML and 'The Semantic Web'."

  • [March 22, 2001] "Tutorial: An Introduction to Scalable Vector Graphics." By J. David Eisenberg. From XML.com. March 21, 2001. ['This introduction to SVG teaches you all you need to know about the W3C's vector graphics format in order to start putting it to use in your own web applications.'] "If you're a web designer who's worked with graphics, you may have heard of Scalable Vector Graphics (SVG). You may even have downloaded a plug-in to view SVG files in your browser. The first and most important thing to know about SVG is that it isn't a proprietary format. On the contrary, it's an XML language that describes two-dimensional graphics. SVG is an open standard, proposed by the W3C... This article gives you all the basic information you need to start putting SVG to use. You'll learn enough to be able to make a handbill for a digital camera that's on sale at the fictitious MegaMart..." [From the W3C SVG Web site: "SVG is a language for describing two-dimensional graphics in XML. SVG allows for three types of graphic objects: vector graphic shapes (e.g., paths consisting of straight lines and curves), images and text. Graphical objects can be grouped, styled, transformed and composited into previously rendered objects. Text can be in any XML namespace suitable to the application, which enhances searchability and accessibility of the SVG graphics. The feature set includes nested transformations, clipping paths, alpha masks, filter effects, template objects and extensibility. SVG drawings can be dynamic and interactive. The Document Object Model (DOM) for SVG, which includes the full XML DOM, allows for straightforward and efficient vector graphics animation via scripting. A rich set of event handlers such as onmouseover and onclick can be assigned to any SVG graphical object. 
Because of its compatibility and leveraging of other Web standards, features like scripting can be done on SVG elements and other XML elements from different namespaces simultaneously within the same Web page."] See: "W3C Scalable Vector Graphics (SVG)."
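
To make the "SVG is an XML language" point concrete, here is a minimal sketch in Python that builds a small SVG document from one vector shape and one text object. The function name, sizes, colors and label are illustrative assumptions, not taken from the tutorial; only the SVG namespace URI and the `rect` and `text` element names come from the SVG specification.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"  # namespace defined by the W3C SVG specification


def build_handbill(width=200, height=100, label="Camera on sale"):
    """Return a tiny SVG document as a string: one rectangle and one text label.

    All sizes, colors and the label text are hypothetical example values.
    """
    ET.register_namespace("", SVG_NS)  # serialize SVG as the default namespace
    svg = ET.Element(f"{{{SVG_NS}}}svg", {"width": str(width), "height": str(height)})
    # A vector graphic shape: a filled rectangle covering the canvas.
    ET.SubElement(
        svg,
        f"{{{SVG_NS}}}rect",
        {"x": "0", "y": "0", "width": str(width), "height": str(height), "fill": "lightyellow"},
    )
    # A text object positioned inside the rectangle.
    text = ET.SubElement(svg, f"{{{SVG_NS}}}text", {"x": "20", "y": "55"})
    text.text = label
    return ET.tostring(svg, encoding="unicode")


if __name__ == "__main__":
    print(build_handbill())
```

Because the result is ordinary XML, any XML parser can read it back, which is the property the W3C description relies on when it mentions searchability and scripting through the DOM.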

  • [March 22, 2001] "Perl & XML: Using XML::Twig." By Kip Hampton. From XML.com. March 21, 2001. ['XML::Twig provides a fast, memory-efficient way to handle large XML documents, which is useful when the needs of your application make using the SAX interface overly complex.'] "If you've been working with XML for a while it's often tempting to frame solutions to new problems in the context of the tools you've used successfully in the past. In other words, if you are most familiar with the DOM interface, you're likely to approach new challenges from a more-or-less DOMish perspective. While there's plenty to be said for doing what you know will work, experience shows that there is no one right way to process XML. With this in mind, Michel Rodriguez's XML::Twig embodies Perl's penchant for borrowing the best features of the tools that have come before. XML::Twig combines the efficiency and small footprint of SAX processing with the power of XPath's node selection syntax, and it adds a few clever tricks of its own..."
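
The general idea of handling one small subtree at a time and then discarding it can be sketched in Python with the standard library's `iterparse`. This is an analogy only, not XML::Twig's API; the `catalog`, `item` and `price` element names and the sample data are assumptions made up for the example.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample data; a real input would be a large file on disk.
SAMPLE = (
    b"<catalog>"
    b"<item><price>10.5</price></item>"
    b"<item><price>4.5</price></item>"
    b"</catalog>"
)


def sum_prices(xml_stream, tag="item"):
    """Stream through the document, handling each <item> subtree as it completes."""
    total = 0.0
    for _event, elem in ET.iterparse(xml_stream, events=("end",)):
        if elem.tag == tag:
            # The complete subtree for this element is available here.
            total += float(elem.findtext("price", default="0"))
            # Drop the element's children so memory use stays small.
            elem.clear()
    return total


if __name__ == "__main__":
    print(sum_prices(io.BytesIO(SAMPLE)))
```

The handler sees a complete, tree-shaped fragment (convenient like DOM) while the parser never holds the whole document in memory (economical like SAX), which is the trade-off the article attributes to XML::Twig.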

  • [March 22, 2001] "Overcoming Objections to XML-based Authoring Systems." By Brian Buehling. From XML.com. March 21, 2001. ['When deploying an XML-based content management system, common misconceptions must be corrected. This article helps IT professionals do just that.'] "During a recent development effort, one of our clients was alarmed at the conversion costs of the proposed XML-based content management system compared to the existing MS Word-based process. This was just one instance of an alarming trend of balking at XML-based systems in favor of using public web folders, indexed by some full-text search engine, as part of a local intranet. In the short run, these edit, drop, and index solutions have some appealing features, including low development and conversion costs. But they are short-lived systems that either wither from lack of functionality or rapidly outgrow their design. Fortunately, the initial objections to the cost of building an XML-based content repository have become fairly predictable. In most cases they are based on misconceptions about XML or on an overly optimistic view of alternative approaches. Even though implementing an XML-based content management system is not always the best approach for an organization, any architectural decision should be made only after thoroughly overcoming the common misconceptions of the technology involved. The list of questions below is intended to be a guide for IT professionals to discuss intelligently the pros and cons of developing an XML document repository..."

  • [March 22, 2001] "Building User-Centric Experiences. An Introduction to Microsoft HailStorm." A Microsoft White Paper. Published: March 2001. "... For users, HailStorm will be accessed through their applications, devices and services (also known as 'HailStorm end-points'). A HailStorm-enabled device or application will, with your consent, connect to the appropriate HailStorm services automatically. Because the myriad of applications and devices in your life will be connected to a common set of information that you control, you'll be able to securely share information between those different technologies, as well as with other people and services. Developers will build applications and services that take advantage of HailStorm to provide you with the best possible experience. The HailStorm platform uses an open access model, which means it can be used with any device, application or services, regardless of the underlying platform, operating system, object model, programming language or network provider. All HailStorm services are XML Web services, which are based on the open industry standards of XML and SOAP; no Microsoft runtime or tool is required to call them. Naturally, the .NET infrastructure provided by Visual Studio.NET, the .NET Framework, and the .NET Enterprise Servers will fully incorporate support for HailStorm to make it as simple as possible for developers to use HailStorm services in their applications. From a technical perspective, HailStorm is based on Microsoft Passport as the basic user credential. The HailStorm architecture defines identity, security, and data models that are common to all HailStorm services and ensure consistency of development and operation. HailStorm is a highly distributed system and can help orchestrate a wide variety of applications, devices and services. The core HailStorm services use this architecture to manage such basic elements of a user's digital experience as a calendar, location, and profile information. 
Any solution using HailStorm can take advantage of these elements, saving the user from having to re-enter and redundantly store this information and saving every developer from having to create a unique system for these basic capabilities. HailStorm is expressed and accessed as a set of industry standard XML Web services. HailStorm-enabled solutions interact with specific HailStorm facilities via XML message interfaces (XMIs), which are simply a set of XML SOAP messages. The initial set of HailStorm services will include: myAddress: electronic and geographic address for an identity; myProfile: name, nickname, special dates, picture; myContacts: electronic relationships/address book; myLocation: electronic and geographical location and rendez-vous; myNotifications: notification subscription, management and routing; myInbox: inbox items like e-mail and voice mail, including existing mail systems; myCalendar: time and task management; myDocuments: raw document storage; myApplicationSettings: application settings; myFavoriteWebSites: favorite URLs and other Web identifiers; myWallet: receipts, payment instruments, coupons and other transaction records; myDevices: device settings, capabilities; myServices: services provided for an identity; myUsage: usage report for above services. The HailStorm architecture is designed for consistency across services and seamless extensibility. It provides common identity, messaging, naming, navigation, security, role mapping, data modeling, metering, and error handling across all HailStorm services. HailStorm looks and feels like a dynamic, partitioned, schematized XML store. It is accessed via XML message interfaces (XMIs), where service interfaces are exposed as standard SOAP messages, arguments and return values are XML, and all services support HTTP Post as message transfer protocol..." See: "Hailstorm."

  • [March 22, 2001] [Transcript of] Remarks by Bill Gates. HailStorm Announcement. Redmond, Washington, March 19, 2001. "...schema is the technical term you're going to be hearing again and again in this XML world. It's through schemas that information can be exchanged, things like schemas for your appointments, schemas for your health records. The work we're announcing today is a rather large schema that relates to things of interest to an individual. And you'll recognize very quickly what those things are, things like your files, your schedule, your preferences, all are expressed in a standard form. And so, by having that standard form, different applications can fill in the information and benefit from reading out that information. And so it's about getting rid of these different islands. It's really a necessary step in this revolution that there be services like HailStorm. There's no way to achieve what users expect and really get into that multiple device, information any time, anywhere world without this advance. So you can envision the XML platform as having two pieces. The foundation pieces that are done in the standards committee, going back to that original XML work in 1996, but now complemented by a wide range of things, things like X-TOP, X-LINK, the schema standards that have come along. One of the really key standards is this thing called SOAP, that's the way that applications that were not designed together can communicate and share information across the Internet. You can think of it as a remote procedure call that works in that message-based, loosely coupled environment. Now, the XML movement has gained incredible momentum. I'd say the last year has really been phenomenal in terms of the momentum that this has developed. Part of that is we also have other large companies in the industry, besides Microsoft, really join into this. 
So if you look at two of the recent standards, SOAP and UDDI, we had many partners, including IBM, that were involved in a very deep way, helping to design that standard, and really standing up and saying that was critical to their whole strategy. And so you're seeing a real shift towards these XML Web services, a real shift away from people saying it's one computer language, or it's just about one kind of app server, to an approach now that is far more flexible around XML. The kind of dreams that people have had about interoperability in this industry will finally be fulfilled by the XML revolution. And so, although we're focusing on HailStorm today, it's important to understand that this XML approach allows data of all types, business application data, to move easily between different platforms, between different companies in a very simple way..." See: "Hailstorm." [cache]

  • [March 22, 2001] "Exclusive DevX Q&A with the HailStorm Team." From DevX. March 22, 2001. ['On March 19, and in a private design preview four days earlier, Microsoft unveiled what Bill Gates called "probably the most important .NET building block service." Codenamed HailStorm, this suite of user-centric XML Web services turns things inside out, said its architect and distinguished engineer Mark Lucovsky. "Instead of having an application be your gateway to the data, in HailStorm, the user is the gateway to the data." After the press conference, XML Magazine Editor-in-Chief Steve Gillmor sat down with Lucovsky and Microsoft director of business development Charles Fitzgerald to discuss what Gates calls the beginning of the XML revolution.'] "Gillmor: Can you give us an XML-focused view of HailStorm? Lucovsky: The key thing is that we take the individual and hang a bunch of services off that individual -- and those services are exposed as an XML document. Off of an ID or a person, we hang a calendar -- and the calendar has an XML schema and a set of access mechanisms to manipulate that XML data. We take our whole service space and wrap that around this identity-based navigation system, and expose those services as XML that you can process using any tool set that you like. If you do a query, you can specify your query string as either an XPath expression or an XQL query string. It will give you back a document fragment. Once it's in your control, you can process it with your own DOM or SAX parser -- whatever makes sense for the application. You can use an XSL transform and throw away half of what we gave back because you only cared about this element or that attribute; it's up to the application. The four basic verbs that we support are 'Add,' 'Query,' 'Update,' 'Delete' -- they all relate back to XPointer roots. We're not inventing any kind of new navigation model; we're just utilizing existing XML standards. 
There are additional domain-specific methods on some of the services. But the fundamental primitive is that you think of the service as if it were an XML document, and that document has a schema that includes types that are specific to that document. Gillmor: Where's the document stored? Lucovsky: The system is set up so that each service instance has its own address. It's very distributed -- or it can be. My 'MyAddress' service and your 'MyAddress' service can be at two different data centers on two different front-end clusters anywhere. That's all done dynamically -- we can partition with the granularity of 'an individual service instance can be located anywhere on the network' -- and we look up that address as part of the SOAP protocol to talk to it. The actual data for a given service, if it's a persistent service -- like 'MyAddress' or something like that -- is then shredded from its XML form into a relational database using our shredding technology. We map the XML into element IDs and attribute IDs, smash it into a database, query it out using our database tables, and then reconstitute the XML. It's like -- it is an XML database; that's how you do an XML database. We're not taking a blob and storing it and going crazy like that... In HailStorm, you're talking XML natively -- so that whole section disappears. Our type model is XML; our type model is XSD schema. Our type model isn't an object hierarchy that we then have to figure out how to factor into XML. And the bulk of the work in SOAP moving forward -- there's a lot of efforts in SOAP -- but one piece of work in SOAP is beefing up that section of the spec. Other activities in SOAP are working on routing headers and other headers that you would carry in that SOAP-header element. We're embracing all of SOAP, but there's not a lot there that's directly relevant to us in the serialization... Are we using XML signatures? 
We're working on that to see if it can do what we need it to do with respect to the body element. We think we can. Are we using Kerberos wrapped in XML? Yes. The SOAP processor -- that's a meaningless thing -- everybody has to write that themselves. But we've done a lot of very interesting innovation in the routing, and we're working with other industry players in that key piece of SOAP to ensure that that key 'how you address an endpoint, and how you route to the endpoint' becomes part of everybody's standard way of addressing endpoints. That's a key thing that I think is missing out of SOAP right now, is how you express an endpoint. Putting something in the SOAP action verb of an HTTP header doesn't cut it; you have to really put the endpoints in the SOAP envelope. We're working on that. The operation stuff is all HailStorm plumbing, so that wouldn't have anything to do with SOAP or XML, but we'll be firing XML events out the back end of the service. We look at the standards and the community of XML developers as an opportunity to say, hey, we're not going to invent a new format for time duration if there's a format for time duration already out there. You look at the base type model of XSD and a lot of the stuff that we need to do already has an XSD type, so we're not coming up with a new type for time duration -- it exists and we're going to use that. People know how to code against that..." See: "Hailstorm."
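
The "think of the service as if it were an XML document" model, with its four verbs, can be illustrated with a small in-memory sketch in Python. Everything here is hypothetical: the class name, the method signatures and the `calendar`/`event` vocabulary are assumptions for illustration and do not reflect the actual HailStorm interfaces, which are SOAP messages. The path expressions use the limited XPath subset supported by Python's `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET


class XmlDocumentService:
    """Toy stand-in for a service exposed as one XML document (hypothetical API)."""

    def __init__(self, xml_text):
        self.root = ET.fromstring(xml_text)

    def query(self, path):
        """Return the elements selected by a path expression."""
        return self.root.findall(path)

    def add(self, parent_path, tag, text):
        """Append a new child element under the first element matching parent_path."""
        parent = self.root.find(parent_path)
        child = ET.SubElement(parent, tag)
        child.text = text
        return child

    def update(self, path, text):
        """Replace the text of every selected element; return how many changed."""
        matched = self.root.findall(path)
        for elem in matched:
            elem.text = text
        return len(matched)

    def delete(self, parent_path, child_path):
        """Remove matching children of the parent; return how many were removed."""
        parent = self.root.find(parent_path)
        victims = parent.findall(child_path)
        for elem in victims:
            parent.remove(elem)
        return len(victims)


if __name__ == "__main__":
    svc = XmlDocumentService("<calendar><event id='1'>Lunch</event></calendar>")
    svc.add(".", "event", "Review")
    svc.update("event[@id='1']", "Brunch")
    print([e.text for e in svc.query("event")])
```

The caller addresses data by path rather than through an object model, so the result of a query is just a document fragment that can be handed to whatever XML tooling the application prefers.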

  • [March 20, 2001] "Microsoft's HailStorm Unleashed." By Joe Wilcox. In CNET News.com (March 19, 2001). "Microsoft on Monday launched a HailStorm aimed at upstaging rival America Online. The software giant unveiled a set of software building blocks, grouped under the code name HailStorm, for its .Net software-as-a-service strategy. Along with HailStorm, Microsoft marshaled out new versions of its Web-based Hotmail e-mail service, MSN Messenger Service, and Passport authentication service. The Redmond, Wash.-based software company is positioning HailStorm as a way of enticing developers to create XML (Extensible Markup Language)-based Web services deliverable to a variety of PC and non-PC devices such as handhelds and Web appliances. Microsoft said HailStorm is based on the company's Passport service and permits applications and services to cooperate on consumers' behalf. HailStorm also leans heavily on instant messaging services provided by MSN Messenger and on Microsoft's Hotmail e-mail service. Microsoft envisions HailStorm as a way for consumers and business customers to access their data -- calendars, phone books, address lists -- from any location and on any device. That model closely mirrors AOL's model by which members access AOL's service via a PC, handheld, or a set-top box to retrieve their personal information. Microsoft on Monday also disclosed five development partners for its .Net plan, including eBay, which announced its partnership last week. eBay and Microsoft entered into a strategic technology exchange that includes turning the eBay API (application programming interface) into a .Net service. HailStorm is based on Passport's user-authentication technology, which Microsoft uses for Hotmail, MSN Messenger, and some MSN Web services. The company describes the XML-based technology as user rather than device specific. 
Rather than keeping information on a single device such as a PC, Microsoft envisions people accessing content and personal information through a number of devices created using XML tools. Microsoft is looking to launch two types of .Net services: broad horizontal building-block services such as HailStorm and application-specific services. HailStorm initially will comprise 14 software services including MyAddress, an electronic and geographic address for an identity; MyProfile, which includes a name, nickname, special dates and pictures; MyContacts, an electronic address book; MyLocation for pinpointing locations; MyNotifications, which will pass along updates and other information; and MyInbox, which includes items such as e-mail and voicemail. Microsoft said HailStorm will enter beta testing later this year and will be released next year. Rather than solely relying on Microsoft technology to become the standard for these services, the company is using established Web development languages such as XML, SOAP (Simple Object Access Protocol) and UDDI (Universal Description Discovery and Integration). IBM also is pushing XML, the emerging choice du jour for creating Web pages, and UDDI, a sort of Web services Yellow Pages for developers. IBM last week used XML and UDDI to beef up its WebSphere Application Server and has been aggressively using the tools to woo developers to its middleware software. Technology Business Research analyst Bob Sutherland said that while he expects competition between Microsoft and IBM will be fierce over XML, 'they will woo customers not so much on the benefits of the XML platform but what their products have to offer'." See: "Hailstorm."

  • [March 20, 2001] "Microsoft Launches HailStorm Web-Services Strategy." By Tom Sullivan and Bob Trott. In InfoWorld (March 19, 2001). "Microsoft executives detailed a key piece of the company's strategy for delivering user-centric Web services here on Monday. The strategy, code-named HailStorm, is a new XML-based platform that lives on the Internet, and is designed to transform the user experience into one in which users have more control over their information. 'It's probably the most important .NET building block service,' said Microsoft Chairman Bill Gates. 'This is a revolution where the user's creativity and the power of all their devices can be used.' Currently, Gates said, users are faced with disconnected islands of data, such as PCs, cell phones, PDAs, and other devices. HailStorm is designed to combine the different islands and move the data behind the scenes so users don't have to move it themselves, thereby providing Microsoft's latest mantra of anytime, anywhere access to data from any device, according to Gates. To that end, Microsoft will provide a set of services under HailStorm, such as notifications, e-mail, calendaring, contacts, an electronic wallet, and favorite Web destination, designed for more effective communication. 'Stitching those islands together is about having a standard schema, in fact a rich schema, for tying all that info together,' he added. That schema will be constructed largely of XML, which Gates called the foundation of HailStorm. 'The kind of dreams people have had about interoperability in this industry will finally be fulfilled with the XML foundation,' he said. The first end point of HailStorm will be Microsoft's forthcoming Windows XP, the next generation of Windows 2000, due later this year. Gates said that XP makes it easier to get at HailStorm services. 'HailStorm is not exclusively tied to any particular OS,' he added. 
Although Microsoft said that HailStorm will work with platforms from other vendors, such as Linux, Unix, Apple Macintosh, and Palm, the company maintained that HailStorm services will work most effectively with Windows platforms... Microsoft plans to tap into the 160 million users of its Passport single-sign-on service as early users of HailStorm, and will offer them free services. Gates added that HailStorm will consist of a certain level of free services, but customers that want more will be charged for it..." See: "Hailstorm."

  • [March 20, 2001] "Legal Storm Brewing Over Microsoft's HailStorm." By Aaron Pressman and Keith Perine [The Industry Standard]. In InfoWorld (March 20, 2001). "Even before Microsoft announced its new online services plan -- dubbed HailStorm -- on Monday, some of the company's leading competitors were quietly registering complaints about the effort with government antitrust regulators. The competitors, including AOL Time Warner and Sun Microsystems, allege that HailStorm and other pieces of Microsoft's .NET initiative are designed to limit their access to customers and further leverage Microsoft's dominant Windows market share... Microsoft denies that anything in its .NET plan is improper. The company's new HailStorm product is not limited to Windows and can be accessed by consumers running Linux, Apple's Macintosh operating system, or even on a Palm handheld device, Microsoft notes. The company also said HailStorm is built on open standards and is available for use by any Web site, including AOL. However, Microsoft plans to charge consumers, developers, and participating Web sites... The next version of Windows, called XP, will integrate HailStorm services into the operating system, encouraging consumers to sign up when they start their computers for the first time. The operating system also features an integrated media player and a copyright-protection scheme to prevent users from distributing copies of music purchased online. Competitors complain that XP won't allow consumers to choose a competing media player as the default program for playing music on their PCs."

  • [March 20, 2001] "Shifting to Web Services." By Tom Sullivan, Ed Scannell, and Bob Trott. In InfoWorld Volume 23, Issue 12 (March 19, 2001), pages 1, 27. "Web services may be all the rage these days, but users, developers, and even vendors are only nibbling at the edges of what this still-unfolding shift in software architecture and delivery means to them. Microsoft on Monday will attempt to demystify Web services a bit more, when Chairman Bill Gates and other officials roll out a major technology component to their .NET strategy, dubbed Hailstorm, at an event in Redmond, Wash. Hailstorm, a Web-services development platform first unveiled last week at an exclusive conference for developers and partners, relies on industry standards XML, SOAP (Simple Object Access Protocol), and UDDI (Universal Description, Discovery, and Integration) and will include next-generation versions of Microsoft offerings such as Hotmail, MSN Messenger, and Passport, the software giant's Internet identification service. Developers can embed these and related services into their applications. One source, who requested anonymity, described Hailstorm as being a 'building block' approach to Web services that will open up new ways to communicate and transmit data in an instant message, peer-to-peer format. Microsoft rivals Sun Microsystems and IBM separately last week also tried to put some reality behind their own Web-services plays. Just how Web services will be used is shaping up to be the nascent market's million-dollar question. In the wake of the dot-com fadeout, brick-and-mortar companies are picking up the slack, hoping Web services will generate e-commerce revenue. But perhaps even more pertinent to enterprises is the potential to use the Web services model to tie together existing, in-house applications using XML standards. The coming Hailstorm: Microsoft's Hailstorm initiative will offer a platform for Web services. 
(1) Represents an expansion of instant-messaging-type p-to-p technology. (2) Allows developers to embed Web services, such as Passport, for identification in their apps. (3) Is based on XML, SOAP, and UDDI... Also, eBay, in San Jose, Calif., agreed to support .NET with its community-based commerce engine, and the two companies envision that Web sites supporting .NET will be able to list relevant items up for auction on eBay through an XML interface. Mani Chandy, co-founder and chief scientist at Oakland-based iSpheres and a computer science professor at Cal Tech, said that because of Web-services standards, large companies that have big IT staffs will start moving toward the architecture. 'A lot of brick-and-mortar companies offer Web services, but they don't even know it. They may not offer them in SOAP, but they might offer them in HTML,' Chandy added. A new generation of companies, some brick-and-mortars, others dot-com successes, are growing up with the notion of Web services. Denver-based Galileo, an early partner of the .NET program, is currently working to convert its Corporate Travel Point software into a Web service by adding support for standards, such as UDDI, XML, SOAP, and the WSDL (Web Services Description Language) specification for standardization..."

  • [March 19, 2001] "IBM Experiments With XML." By Charles Babcock. In Interactive Week (March 19, 2001). "IBM is experimenting with eXtensible Markup Language as a query language to get information from a much broader set of resources than rows and tables of data in relational databases. It has also built a working model of a 'dataless' database that assembles needed information from a variety of sources, after breaking down a user's query into parts that can be answered separately. The response sent back to the user offers the information as a unified, single presentation. The disclosures came as IBM pulled back the curtain on its database research at its Almaden Research Lab in San Jose where Project R was first fledged 25 years ago, leading to the DB2 database management system in the mid-1980s. At the briefing, it also disclosed that Don Chamberlin, IBM's primary author of the Structured Query Language (SQL), which became instrumental to the success of relational databases, was also behind XQuery, IBM's proposed XML query language before the World Wide Web Consortium. The W3C's XML Query Working Group released its first working draft of an XML query language on Feb. 15. IBM Fellow Hamid Pirahesh said 'XQuery has been taken as a base' by the W3C working group and would lead to a language that could be used more broadly than SQL. An XML-based query language could query repositories of documents, both structured and unstructured, such as e-mail, to find needed information... IBM, Microsoft and Software AG are all committed to bring out products based on an XML query language. Software AG, through its former American subsidiary, established Tamino as an XML-based database system over the last year. An IBM product will be launched before the end of June, Pirahesh said. 
Such future products may make it possible for sites rich in many forms of content, such as CNN, National Geographic or the New York Times, to find many additional ways to allow visitors to seek what they want or ask questions and obtain answers, said Jim Reimer, distinguished engineer at IBM.... Besides the proposed query language, IBM has built an experimental 'dataless' database system that gets the user the information needed from a variety of sources by breaking down a query into its parts. Each part is addressed to the database system or repository that can supply an answer, even though the data may reside in radically different systems and formats. When the results come back, they are assembled as one report or assembled view to the user. IBM plans to launch a product, Discovery Link, as an add-on to its DB2 Universal Server system in the second quarter. Discovery Link itself will contain no data but will have a database engine capable of parsing complex queries into simpler ones and mapping their route to the systems that can respond with results. The user will not need to know the name of the target database or repository or how to access it. Discovery Link will resolve those issues behind the scenes, said IBM Fellow Bruce Lindsay. The system will be a 'virtual database' or a federation of heterogeneous databases, and a pilot Discovery Link system has been in use for several months by pharmaceutical companies trying to research and manufacture new drugs..." See: "XML and Query Languages."

  • [March 19, 2001] "Untangling the Web. SOAP Uses XML as a Simple And Elegant Solution that Automates B2B Transactions." By Greg Barish. In Intelligent Enterprise Volume 4, Number 5 (March 27, 2001), pages 38-43. "What B2B really needs is an easy way to integrate the back-end systems of participating organizations. And we're not just talking about a solution that involves each business maintaining multiple interfaces to that data. That's the way things work today and, to a large extent, visual interfaces have often proved to be unwieldy solutions. IT managers want a way to consolidate their data and functionality in one system that can be accessed over the Web by real people or automatically by software agents. The Simple Object Access Protocol, better known as SOAP, is aimed squarely at this data consolidation problem. Recently approved by the World Wide Web Consortium (W3C), SOAP uses XML and HTTP to define a component interoperability standard on the Web. SOAP enables Web applications to communicate with each other in a flexible, descriptive manner while enjoying the built-in network optimization and security of an HTTP-based messaging protocol. SOAP's foundations come from attempts to establish an XML-based form of RPC as well as Microsoft's own efforts to push its DCOM technology beyond Windows. SOAP increases the utility of Web applications by defining a standard for how information should be requested by remote components and how it should be described upon delivery. The key to achieving both of these goals is the use of XML to provide names to not only the functions and parameters being requested, but to the data being returned... SOAP simply and elegantly solves the major problems with both the HTML-based and DCOM/CORBA approaches by using XML over existing HTTP technology. Use of XML yields three important benefits: (1) XML makes the data self-describing and easy to parse. 
(2) Because XML and XSL separate data from presentation, useful data is distinguished from the rendering metadata. Thus, pages used as data sources for software agents can be reused for human consumption, eliminating the need for redundant data views. (3) XML enables complicated data structures (such as lists or lists of lists) to be easily encoded using flexible serialization rules. Using XML for encoding data also represents an alternative to ANSI-based Electronic Data Interchange (EDI). While EDI has been successfully used for years, it does have its problems. For example, it is cryptic and difficult to debug. Also, it is more expensive and requires the server and client to have special software installed to handle the format. What's more, EDI over HTTP is problematic: It doesn't completely support important HTTP encryption and authentication standards, and thus secure transactions are limited or simply not possible. In contrast, SOAP keeps things simple. It's extensible, the data is self-describing, simple to debug, and it can enjoy the benefits of HTTP-based security methods. While a SOAP message requires more bandwidth than an EDI message, bandwidth has become less of a concern as the Internet itself becomes faster - particularly between businesses that can afford high-speed network access. Finally, you can deploy SOAP over a number of protocols, including HTTP. This capability is important because it allows the firewall issues to be avoided and retains the optimizations that have been built into HTTP... While SOAP messages consist of XML-compliant encoding, they can also be communicated via alternative transport mechanisms, such as RPC. Communication via RPC points back to the history of SOAP in its XML-RPC form. XML-based RPC cuts to the chase: It says, 'Let's forget all this stuff about Web servers and Web clients, we just want distributed objects to be interoperable between disparate systems.' 
SOAP over HTTP, in contrast, is a more general form of object-to-object (or agent-to-agent) communication over the Internet. It assumes what is minimally necessary: that objects are accessible via HTTP and that the data they return is self-describing." See "Simple Object Access Protocol (SOAP)."
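
[Editorial illustration, not from the article.] The point that XML gives names to the function, its parameters, and the returned data can be seen in a minimal sketch of a SOAP-style request. The envelope namespace is the SOAP 1.1 one; the application namespace `urn:example:quotes`, the method name `GetQuote`, and the parameter `symbol` are hypothetical and chosen only for the example. No HTTP transport is shown.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
APP_NS = "urn:example:quotes"  # hypothetical application namespace (assumption)


def build_request(method, params):
    """Wrap a named method call and its named parameters in an Envelope/Body."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{APP_NS}}}{method}")
    for name, value in params.items():
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


def parse_request(xml_text):
    """Recover the method name and parameters from the message text alone."""
    envelope = ET.fromstring(xml_text)
    body = envelope.find(f"{{{SOAP_ENV}}}Body")
    call = body[0]  # first child of Body is the method element
    method = call.tag.split("}", 1)[1]  # drop the "{namespace}" prefix
    params = {child.tag: child.text for child in call}
    return method, params


message = build_request("GetQuote", {"symbol": "IBM"})
method, params = parse_request(message)
print(method, params)
```

Because the message is self-describing, the receiver needs no out-of-band knowledge of field order or binary layout to recover the call; this is the property the article contrasts with EDI and DCOM/CORBA.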

  • [March 19, 2001] STEPml Product Identification and Classification Specification. "This STEPml specification addresses the requirements to identify and classify or categorize products, components, assemblies (ignoring their structure) and/or parts. Identification and classification are concepts assigned to a product by a particular organization. This specification describes the core identification capability upon which additional capabilities, such as product structure, are based. Those capabilities are described in other STEPml specifications and their use is dependent upon use of this specification... The structure of the STEPml markup for product identification and classification was designed based on the object model found in programming languages such as Java and on object serialization patterns. It is called the Object Serialization Early Binding (OSEB). An overview of the OSEB describes the design philosophy of this approach and the fundamental structure of the elements as well as a description of the header elements. The OSEB uses the ID/IDREF mechanism in XML to establish references between elements rather than using containment. UML object diagrams, with one extension, are used to depict the structure of the elements and attributes in these examples. Each element is represented by an instance of a class with same name as the element..." The following files supporting this STEPml specification are available. (1) the basic product identification and classification OSEB DTD; (2) a sample XML document containing the completed examples based on the simple DTD; (3) the full OSEB DTD for product identification and classification; (4) the ISO 10303-11 EXPRESS data modeling language schema upon which the DTD is based; (5) the STEP PDM Schema Usage Guide with which this STEPml specification is compatible; (6) an overview of the OSEB and the complete OSEB from the ISO Draft Technical Specification. 
Items 4-6 will be most useful to reviewers who are "literate in the EXPRESS language and the STEP ISO 10303 standard." See: (1) "STEPml XML Specifications", and (2) "STEP/EXPRESS and XML".
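
[Editorial illustration, not from the specification.] The OSEB choice of ID/IDREF references over containment can be sketched as follows. The element and attribute names (`Product`, `Category`, `Classification`, `product`, `category`) are hypothetical and are not the STEPml vocabulary. Since no DTD is loaded here, the `id` attributes are treated as identifiers by convention only.

```python
import xml.etree.ElementTree as ET

# Hypothetical flat document: the classification points at a product and a
# category by identifier instead of nesting them as child elements.
DOC = """
<data>
  <Product id="p1" name="Bracket"/>
  <Category id="c1" name="Fasteners"/>
  <Classification id="x1" product="p1" category="c1"/>
</data>
"""

root = ET.fromstring(DOC)

# Index every element that carries an id, so references resolve in one lookup.
by_id = {el.get("id"): el for el in root.iter() if el.get("id") is not None}


def resolve(element, attr):
    """Follow an IDREF-style attribute to the element it names."""
    return by_id[element.get(attr)]


link = root.find("Classification")
product = resolve(link, "product")
category = resolve(link, "category")
print(product.get("name"), "is classified as", category.get("name"))
```

One consequence of the reference style is that the same product can be the target of many classifications without being duplicated in the document.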

  • [March 19, 2001] "The eXtensible Rights Markup Language (XrML)." By Bradley L. Jones. From EarthWeb Developer.com, March 16, 2001. ['Is Digital Rights Management important? You know what the music industry will say! We asked Brad Gandee, XrML Standard Evangelist, about a standard that is here to help. XrML is the eXtensible Rights Markup Language that has been developed by Xerox Palo Alto Research Center and licensed to the industry royalty free in order to drive its adoption. Simply put, this is an XML-based language that is used to mark digital content such as electronic books and music. Brad Gandee, XrML Standard Evangelist, took the time to answer a few questions on XrML for Developer.com'] "Q: Who and how many have licensed XrML to date? A: More than 2000 companies and organizations, from multiple industries including DRM, publishing, e-media (audio & video), intellectual property, enterprise, etc., have licensed XrML since April 2000. The actual number of licensees as of 2/28/01 was 2031. Q: What does the adoption forecast look like for the next six months? A: There are approximately 30 new licensees every week. Over the next six months we forecast, according to the current rate of 30 licensees per week, additional licensees in the neighborhood of 2720. We anticipate that this figure may increase once an XrML SDK is released and as XrML becomes involved with more standards organizations. In addition, the rate of new licensees could rise due to the increased attention to rights languages within MPEG and with the restart of work on the EBX specification within OEBF. Q: Why hasn't XrML been handed over to a standards organization yet? A: We have not handed XrML over to a standards organization yet for a couple reasons. First there are many different standards bodies focused on different content types, each with their own perspective. We see the need for keeping XrML open and applicable to all of the content types. 
If we put the language under the control of one of these organizations too early, then it may end up perfect for one type content but become inflexible for many others. With the potential that the digital content market holds, we see a world where many different types of content come together dynamically in new recombinant forms, marketed and "published" across new channels and in new ways. The rights language that is used to express all of the new business models needs to remain content neutral. Another reason we have not handed XrML over to a standards body is that there has not yet been one that is prepared to take on the role of overseeing a rights language. For example, the W3C, which might be considered a good candidate for a home for XrML, just held a DRM Workshop in the third week of January in order to determine if they have a role to play in the DRM space. As a result of that workshop they may be considering the formation of a working group to look into rights languages, which will take time. In the meantime there is a DRM market out there that is moving forward..." See: "Extensible Rights Markup Language (XrML)."

  • [March 19, 2001] "Extended Path Expressions for XML." By Murata Makoto (IBM Tokyo Research Lab/IUJ Research Institute, 1623-14, Shimotsuruma, Yamato-shi, Kanagawa-ken 242-8502, Japan; Email: mmurata@trl.ibm.co.jp). [Extended abstract] prepared for presentation at PODS (Principles of Database Systems) 2001. With 35 references. ZIP format. Abstract: "Query languages for XML often use path expressions to locate elements in XML documents. Path expressions are regular expressions such that underlying alphabets represent conditions on nodes. Path expressions represent conditions on paths from the root, but do not represent conditions on siblings, siblings of ancestors, and descendants of such siblings. In order to capture such conditions, we propose to extend underlying alphabets. Each symbol in an extended alphabet is a triplet (e1; a; e2), where 'a' is a condition on nodes, and 'e1 (e2)' is a condition on elder (resp. younger) siblings and their descendants; 'e1' and 'e2' are represented by hedge regular expressions, which are as expressive as hedge automata (hedges are ordered sequences of trees). Nodes matching such an extended path expression can be located by traversing the XML document twice. Furthermore, given an input schema and a query operation controlled by an extended path expression, it is possible to construct an output schema. This is done by identifying where in the input schema the given extended path expression is satisfied." Details: "XML has been widely recognized as one of the most important formats on the WWW. XML documents are ordered trees containing text, and thus have structures more flexible than relations of relational databases. Query languages for XML have been actively studied. Typically, operations of such query languages can be controlled by path expressions. A path expression is a regular expression such that underlying alphabets represent conditions on nodes. 
For example, by specifying a path expression, we can extract figures in sections, figures in sections in sections, figures in sections in sections in sections, and so forth, where section and figure are conditions on nodes. Based on well-established theories of regular languages, a number of useful techniques (e.g., optimization) for path expressions have been developed. However, when applied to XML, path expressions do not take advantage of orderedness of XML documents. For example, path expressions cannot locate all <figure> elements whose immediately following siblings are <table> elements. On the other hand, industrial specifications such as XPath have been developed. Such specifications address orderedness of XML documents. In fact, XPath can capture the above example. However, these specifications are not driven by any formal models, but rather designed in an ad hoc manner. Lack of formal models prevents generalization of useful techniques originally developed for path expressions. As a formal framework for addressing orderedness, this paper shows a natural extension of path expressions. First, we introduce hedge regular expressions, which generate hedges (ordered sequences of ordered trees). Hedge regular expressions can be converted to hedge automata (variations of tree automata for hedges) and vice versa. Given a hedge and a hedge regular expression, we can determine which node in the hedge matches the given hedge regular expression by executing the hedge automaton. The computation time is linear to the number of nodes in hedges. Second, we introduce pointed hedge representations. They are regular expressions such that each 'symbol' is a triplet (e1, a, e2), where e1 and e2 are hedge regular expressions and a is a condition on nodes. Intuitively, e1 represents conditions on elder siblings and their descendants, while e2 represents conditions on younger siblings and their descendants. 
As a special case, if every hedge regular expression in a pointed hedge representation generates all hedges, this pointed hedge representation is a path expression. Given a hedge and a pointed hedge representation, we can determine which node in the hedge matches the given pointed hedge representation. For each node, (1) we determine which of the hedge regular expressions matches the elder siblings and younger siblings, respectively, (2) we then determine which of the triplets the node matches, and (3) we finally evaluate the pointed hedge representation. Again, the computation time is linear to the number of nodes in hedges. Another goal of this work is schema transformation. Recall that query operations of relational databases construct not only relations but also schemas. For example, given input schemas (A; B) and (B; C), the join operation creates an output schema (A; B; C). Such output schemas allow further processing of output relations. It would be desirable for query languages for XML to provide such schema transformations. That is, we would like to construct output schemas from input schemas and query operations (e.g., select, delete), which utilize hedge regular expressions and pointed hedge representations. To facilitate such schema transformation, we construct match-identifying hedge automata from hedge regular expressions and pointed hedge representations. The computation of such automata assigns marked states to those nodes which match the hedge regular expressions and pointed hedge representations. Schema transformation is effected by first creating intersection hedge automata which simulate the match-identifying hedge automata and the input schemata, and then transforming the intersection hedge automata as appropriate to the query operation... In Section 2, we consider related works. We introduce hedges and hedge automata in Section 3, and then introduce hedge regular expressions in Section 4. 
In Section 5, we introduce pointed hedges and pointed hedge representations. In Section 6, we define selection queries as pairs of hedge regular expressions and pointed hedge representations. In Section 7, we study how to locate nodes in hedges by evaluating pointed hedge representations. In Section 8, we construct match-identifying hedge automata, and then construct output schemas. In Section 9, we conclude and consider future works... We have assumed XML documents as hedges and have presented a formal framework for XML queries. Our selection queries are combinations of hedge regular expressions and pointed hedge representations. A hedge regular expression captures conditions on descendant nodes. To locate nodes, a hedge regular expression is first converted to a deterministic hedge automaton and then it is executed by a single depth-first traversal. Meanwhile, a pointed hedge representation captures conditions on non-descendant nodes (e.g., ancestors, siblings, siblings of ancestors, and descendants of such siblings). To locate nodes, a pointed hedge representation is first converted to triplets: (1) a deterministic hedge automaton, (2) a finite-index right-invariant equivalence of states, and (3) a string automaton over the equivalence classes. Then, this triplet is executed by two depth-first traversals. Schema transformation is effected by identifying where in an input schema the given hedge regular expression and pointed hedge representation is satisfied. Interestingly enough, as it turns out our framework exactly captures the selection queries definable by MSO, as do boolean attribute grammars and query automata. On the other hand, our framework has two advantages over MSO-driven approaches. First, conversion of MSO formulas to query automata or boolean attribute grammars requires non-elementary space, thus discouraging implementations. On the other hand, our framework employs determinization of hedge automaton, which requires exponential time. 
However, we conjecture that such determinization usually works, as does determinization of string automata. Second, (string) regular expressions have been so widely and successfully used by many users because they are very easy to understand. We hope that hedge regular expressions and pointed hedge representations will become commodities for XML in the near future. There are some interesting open issues. First, is it possible to generalize useful techniques (e.g., optimization) developed for path expressions to hedge regular expressions and pointed hedge representations? Second, we would like to introduce variables to hedge regular expressions so that query operations can use the values assigned to such variables. For this purpose, we have to study unambiguity of hedge regular expressions. An ambiguous expression may have more than one way to match a given hedge, while an unambiguous expression has at most only one such way. Variables can be safely introduced to unambiguous expressions." See "SGML/XML and Forest/Hedge Automata Theory." [cache]
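
[Editorial illustration, not from the paper.] The abstract's motivating example, locating every <figure> whose immediately following sibling is a <table>, can be sketched by direct traversal. This is not the hedge-automaton construction the paper describes; it only shows the difference between a condition on the path from the root and a condition on a younger sibling. The sample document and its `id` values are made up. In XPath 1.0 the sibling query can be written `//figure[following-sibling::*[1][self::table]]`, but the path subset supported by Python's ElementTree lacks that axis, so siblings are inspected by hand.

```python
import xml.etree.ElementTree as ET

# Made-up sample document for the illustration.
DOC = """
<doc>
  <section>
    <figure id="f1"/>
    <table id="t1"/>
    <figure id="f2"/>
    <para/>
    <section>
      <figure id="f3"/>
      <table id="t2"/>
    </section>
  </section>
</doc>
"""


def figures_in_sections(root):
    """Path-only condition: figures that are children of a section at any depth."""
    return [fig.get("id") for fig in root.findall(".//section/figure")]


def figures_followed_by_table(root):
    """Sibling condition: figures whose immediately following sibling is a table."""
    matches = []
    for parent in root.iter():
        children = list(parent)
        # Pair each child with its next (younger) sibling.
        for node, younger in zip(children, children[1:]):
            if node.tag == "figure" and younger.tag == "table":
                matches.append(node.get("id"))
    return matches


root = ET.fromstring(DOC)
print(figures_in_sections(root))
print(figures_followed_by_table(root))
```

The first query selects all three figures; the second keeps only those with a table directly after them, which a pure root-to-node path condition cannot distinguish.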

  • [March 16, 2001] "Introduction to the Darwin Information Typing Architecture. Toward portable technical information." By Don R. Day, Michael Priestley, and Dave A. Schell. From IBM developerWorks. March 2001. "The Darwin Information Typing Architecture (DITA) is an XML-based architecture for authoring, producing, and delivering technical information. This article introduces the architecture, which sets forth a set of design principles for creating information-typed modules at a topic level, and for using that content in delivery modes such as online help and product support portals on the Web. This article serves as a roadmap to the Darwin Information Typing Architecture: what it is and how it applies to technical documentation. The article links to representative source code." See overview/discussion.

  • [March 16, 2001] "Specialization in the Darwin Information Typing Architecture. Preparing topic-based DITA documents." By Michael Priestley (IBM Toronto Software Development Laboratory). From IBM developerWorks. March 2001. Adjunct to a general article on DITA "Introduction to the Darwin Information Typing Architecture." Priestley's article "shows how the 'Darwin Information Typing Architecture' is also a set of principles for extending the architecture to cover new information types as required, without breaking common processes. In other words, DITA provides the base for a hierarchy of information types that anyone can add to. New types will work with existing DITA transforms, and are defined as "deltas" relative to the existing types - reusing most of the existing design by reference." From the introduction: "This in-depth look at the XML-based Darwin Information Typing Architecture (DITA) for the production of modular documentation tells how to prepare topic-based DITA documents. The instructions cover creating new topic types and transforming between types. An appendix outlines the rules for specialization. The point of the XML-based Darwin Information Typing Architecture (DITA) is to create modular technical documents that are easy to reuse with varied display and delivery mechanisms, such as helpsets, manuals, hierarchical summaries for small-screen devices, and so on. This article explains how to put the DITA principles into practice. Specialization is the process by which authors and architects define new topic types, while maintaining compatibility with existing style sheets, transforms, and processes. The new topic types are defined as an extension, or delta, relative to an existing topic type, thereby reducing the work necessary to define and maintain the new type..." See the main bibliographic item.

  • [March 16, 2001] "Towards an Open Hyperdocument System (OHS)." By Jack Park. Version 20010316 or later. "In the big picture, this paper discusses one individual's (my) view of an implementation of an Open Hyperdocument System (OHS) as first proposed by Douglas Engelbart. Persistence: This project begins with persistent XTM, my implementation of an XTM engine that drives a relational database engine. It will expand to include flat-file storage of some topic occurrences. These occurrences are saved in an XML dialect specified by a DTD in the eNotebook project discussed below, and can be rendered to web pages using XSLT as desired. Collaboration: It is intended that the OHS engine, rendered as a Linda-like server as discussed below under the project jLinda, will be capable of allowing many users to log into the server and participate in IBIS discussions in the first trials. This assumes multicasting capabilities in the Content layer, which are not yet implemented. Topic Map capability: This project takes the view that navigation of a large hyperlinked document space is of critical importance; Topic Maps, particularly, those constructed to the XTM 1.0 standard are applied to the Knowledge Organization and Navigation issues. Perhaps unique to this specific project is the proposal that the XTM technology shall serve, at once, as a kind of interlingua between Context and Content by serving as the indexing scheme into a Grove-like architecture, and as the primary navigation tool for the Context layer..." [From the posting: "Recently, I have combined jTME [topic map engine] into a much larger project, a version of an Open Hyperdocument System as proposed by Douglas Engelbart http://www.bootstrap.org (as interpreted by me). An ongoing 'weblog' on that project can be found at http://www.thinkalong.com/ohs/jpOHS.pdf. To discuss this project, particularly the jTME part of it, contact me at jackpark@thinkalong.com."] See: "(XML) Topic Maps."

  • [March 16, 2001] "XML Schemas: Best Practice. [Homepage.]" By Roger L. Costello (Mitre). March 13, 2001. Table of Contents: Motivation and Introduction to Best Practices; Default Namespace - targetNamespace or XMLSchema?; Hide (Localize) Versus Expose Namespaces; Global versus Local; Element versus Type; Zero, One, or Many Namespaces; Variable Content Containers; Creating Extensible Content Models; Extending XML Schemas. Roger says: "I have created a homepage containing all of our work. Also, based upon our recent discussions (especially on Default Namespace) I have updated all the online material and examples. In so doing I fixed a lot of typos, clarified things, etc. [You can] download Online Material Plus Schemas: I have zipped up all the online discussions, along with the schemas and instance documents that are referenced in the online material. Now you can download all this material and run all the examples. Also download Best Practice Book: I have put the Best Practice material into book form. You can download this book and print it out... In a few days I would like to start up again our discussions on Creating Extensible Schemas..." For schema description and references, see "XML Schemas."

  • [March 16, 2001] XML Encoding of XPath: DTD. Work in progress from Wayne Steele and others. See also XML Encoding of XPath: Examples, and the XML-DEV thread. Also from Ingo Macherius: A JavaCC parser for XPath and XSLT patterns. 'Here is another XPath-JavaCC grammar. I think Paul's [JavaCC grammar of Xpath] is clearer (e.g., does not use LOOKAHEAD), while ours is more complete and Unicode aware. Maybe you want to mix them, so: just in case part 2.'

  • [March 16, 2001] "Microsoft.NET." Special Issue of InternetWorld devoted to the Microsoft .NET Program. March 15, 2001. 18+ separate articles. "Microsoft.NET is big. Very big. Microsoft's evangelists and corporate communications directors have had difficulty explaining .Net to the financial and lay press. It's not easy to reduce the strategic vision of the largest software company in the world to a single sound bite. 'Where do you want to go today?' doesn't tell you much. We hope the analysis that follows will..."

  • [March 16, 2001] "Dissecting .NET. .Net may be the biggest change to Microsoft's strategy since it introduced Windows 3.0." By Leon Erlanger. In InternetWorld (March 15, 2001), pages 30-35. "Microsoft.NET is an infrastructure, a set of tools and services, and a flood of applications. Above all, it is a vision of a new user experience. From the user's perspective, there are four main principles: (1) The Internet will become your personal network, housing all your applications, data, and preferences. Instead of buying software in shrink-wrapped form, your organization will rent it as a hosted service. (2) The PC will remain your principal computing device, but you will have 'anywhere, anytime' access to your data and applications on the Internet from any device. (3) You will have many new ways to interact with your application data, including speech and handwriting. (4) The boundaries that separate your applications from each other and from the Internet will disappear. Instead of interacting with an application or a single Web site, you will be connected to what Microsoft calls a 'constellation' of computers and services, which will be able to exchange and combine objects and data to provide you with exactly the information you need... It is heavily dependent on four Internet standards: (1) HTTP, to transport data and provide access to applications over the Internet; (2) XML (Extensible Markup Language), a common format for exchanging data stored in diverse formats and databases; (3) SOAP (Simple Object Access Protocol), software that enables applications or services to make requests of other applications and services across the Internet; and (4) UDDI (Universal Description, Discovery, and Integration), a DNS-like distributed Web directory that would enable services to discover each other and define how they can interact and share information..."

  • [March 16, 2001] ".NET Framework. Something for Everyone? When .Net arrives, will Java fans JUMP or run?" By Jacques Surveyor. In InternetWorld (March 15, 2001), pages 43-44. "In order to accommodate the shift to pervasive computing that uses a Web-distributed model and browser interface as the dominant mode of corporate development, Microsoft has embraced three Net technologies which they previously resisted or adopted only reluctantly: Java, fully object-oriented programming (OOP), and open, standardized XML... In almost contradictory fashion, Microsoft is thus far sticking to an open and standardized version of XML as the third pillar of its .Net strategy. XML is being embraced throughout the .Net Framework to make data and processes more interchangeable and interoperable. Working with IBM and other W3C participants, including the often combative Sun Microsystems, Microsoft has helped to define some key XML extensions, including SOAP, for invoking remote processes through XML, and deployed UDDI as a universal directory of Web services. In addition, the company has ceded its own W3C recommendations and adopted such XML standards as XML Schema (XSDL) for extended XML schema definitions of documents. This is in strong contrast to Microsoft's treatment of related W3C recommendations such as HTML, CSS, DOM, and other browser-based standards, where Microsoft Internet Explorer 5.5 now lags behind Netscape 6.0. In the XML arena, Microsoft has been fairly well behaved, vigorously proposing standard alternatives or updates but adhering closely to W3C final recommendations. Microsoft's adoption of XML in the .NET Framework and the .NET Enterprise Servers will help close an interoperability gap. Currently, Microsoft does not directly support either CORBA or Java 2 Enterprise Edition, including such common Web development technologies as Java Servlets and Enterprise JavaBeans. 
Although other independent software vendors support these technologies on Windows and other OS platforms, Microsoft will now be able to offer its own direct-connect solution based on XML and SOAP. Combining this with an easier-to-use ASP and the creation of Web Services on its own Windows 2000 platform, Microsoft will have a compelling .Net message."

  • [March 16, 2001] "Got SOAP? XML as a distributed computing protocol." By David F. Carr. In InternetWorld (March 15, 2001), pages 72-74. "Microsoft is promoting SOAP as a way for developers to apply the same techniques for distributed computing on an intranet, adding capabilities to a Web site, or publishing a Web service on the intranet. But it's also careful to say that it's not abandoning DCOM, the distributed version of the Component Object Model. For one thing, too many existing applications rely on DCOM. But it's also true that for many applications that are closely tied, DCOM will provide a tighter linkage and higher performance than would be possible with SOAP. Like DCOM, CORBA and Java RMI use binary protocols. That means speedier transmission across the network and more instantaneous processing by the recipient. XML messages suck up more bandwidth and have to be run through a parser before processing. However, XML messaging fans argue that those disadvantages are negligible, given the rapidly increasing speed of parsers and CPUs. Besides, they point to the success of the Web, which is also based on relatively verbose protocols but achieved far more widespread adoption than any competing network computing technology. And because XML messages are inherently open-source, a developer who is struggling with the subtleties of an API doesn't have to rely on the published documentation: He can intercept some sample messages and study them using a variety of XML tools. A protocol that is simple and open also stands the best chance of being implemented on a wide variety of operating systems and programming languages. The place where there's a clear case for XML messaging is on the Internet, where traffic in protocols like DCOM, CORBA IIOP, and RMI is rare. For one thing, firewalls tend to block them. 
But SOAP piggybacks on other Internet protocols that are already ubiquitous, meaning HTTP primarily, but also SMTP, FTP, and secure Web protocols such as SSL and TLS..."

  • [March 16, 2001] ".NET Analysis. Microsoft Is Not Alone: Web Services Initiatives Elsewhere in the Industry." In InternetWorld (March 15, 2001). "Microsoft is doing such a good job of identifying itself as a leader in the Web services movement that you'd think it invented the idea of delivering services over the network... such network computing stalwarts as Sun Microsystems Inc. and Oracle Corp. were classified as laggards in a Gartner Group analysis of the Web services trend published in October. IBM was on the rise, as it joined with Microsoft and others to define emerging standards such as SOAP. The 'visionaries' on Gartner's trademark Magic Quadrant were Hewlett-Packard, Microsoft Corp., and Bowstreet, a startup rubbing elbows with the big platform vendors. As for leaders, there are none yet in the sense that none have yet demonstrated the ability to execute on the vision. Presumably, a few more products are going to have to come out of beta before that happens... [1] HP has paid particular attention to the problems of securing Web services and authenticating users, creating a protocol of its own called Session Level Security (SLS). The SOAP specifications themselves don't specify how messages should be secured, and the simplest solution is probably to send them over a Web security protocol such as SSL. [2] Sun has probably done more than anyone over the years to promote the idea of delivering services over the network, popularizing terms like 'Web tone' (from 'dial tone') to describe a telecommunications-like environment where getting computing resources is no more complicated than picking up the phone. 
Sun's Jini technology also has a lot in common with the Web services approach advocated by Microsoft, including the idea of services that are published on the network and registered in searchable directories [but, says Sun] 'It became clear to us two years ago that Jini was not the appropriate technology to deliver widescale services, so we jumped on the ebXML bandwagon.' [3] John McGee, director of Internet platform marketing at Oracle, expresses similar reservations about SOAP, while claiming that Oracle is way ahead of Microsoft in delivering on the general concept of Web services. [4] IBM is more enthusiastic about SOAP, having joined Microsoft, UserLand, and DevelopMentor in co-authoring the specification. It's also participating in the development of many related technologies, such as UDDI and the Web Services Description Language (WSDL). [5] Bowstreet, a three-year-old company that was among the first to promote the concept of Web services, created its tools for aggregating and reorganizing services before the current crop of emerging standards took shape. Bowstreet has also been active in the development of standards such as Directory Services Markup Language (DSML), Transaction Authority Markup Language (XAML), and UDDI, and it plans to turn its Businessweb.com directory of services into a UDDI registry..."

  • [March 16, 2001] "The .Net Initiative: Dave Winer. The President of UserLand and SOAP Co-Creator Surveys the Changing Scene." By David F. Carr. In InternetWorld (March 15, 2001), pages 53-58. "UserLand Software Inc. President Dave Winer is one of the co-authors of SOAP, the remote procedure call (RPC) now being popularized by Microsoft Corp. He has also promoted XML-RPC, an earlier spinoff of his collaboration with Microsoft and DevelopMentor. Later, IBM and its Lotus division also got involved in the development of SOAP. Now a long list of corporate supporters are backing SOAP as the foundation of the World Wide Web Consortium's XML Protocol (XP) project. And, of course, it is the foundation for distributed computing in Microsoft.NET, largely replacing DCOM (the distributed object computing version of Microsoft's Component Object Model) and challenging technologies such as Java Remote Method Invocation (RMI) and the Object Management Group's Common Object Request Broker Architecture (CORBA). Winer -- equal parts industry gadfly and software-development guru -- originally made his mark creating software for the early Apple PCs. He created several commercial hits for Living Videotext, which later became part of Symantec Corp. He has concentrated most of his development efforts on software for organizing and publishing information. He is also a prolific writer whose essays on everything from code to politics and culture appear in many industry publications as well as his self-published newsletters and the DaveNet Web site..."

  • [March 16, 2001] "The Internet World Interview: Jeffrey Richter. Wintellect's co-founder on teaching .Net programming to the Microsoft workforce." By Jonathan Hill. In InternetWorld (March 15, 2001), pages 61-63. "When you go in and train Microsoft employees, they have varying skill sets and backgrounds. What are some of the hot-button items, the things that are the toughest to explain? Richter: The .NET Framework is an object-oriented programming platform, and some people don't have a strong object-oriented foundation. Visual Basic programmers, for example -- they'll have some difficulties picking up some of the concepts, such as inheritance, polymorphism, and data extraction, which are the three tenets of object-oriented programming. The platform is incredibly rich and large, so in the class we cover many topics, and it happens very quickly. I'm sure that a lot of people walk out and need to go back to documentation. They won't remember everything I say, because there's so much material. IW: Do you find that the object-oriented concepts are things that you need to go over a lot, or do you refer people? Richter: No. I give them a reading list. But object-oriented programming really started to get into favor in the early '80s, so it's over 20 years old now. I think even Visual Basic programmers who may not have worked with it have had some exposure to it. I've also had some VB programmers come into the class where they do the labs in C#, Microsoft's new programming language, and they had no problem doing that. So, in certain cases, yes, I need to review with them and show them polymorphism, what it means. But I think they're able to pick it up pretty quickly..."

  • [March 16, 2001] "C#: Not Just Another Programming Language." By Jeff Prosise. In InternetWorld (March 15, 2001). "Microsoft intends to provide five language compilers for .NET: Visual Basic, C++, C#, JScript, and MSIL. Third parties are actively working on .Net compilers for about 25 other languages, including Smalltalk, Perl, Python, Eiffel, and yes, even COBOL. But the language that has garnered the most attention by far is C# ('C-Sharp'). C# has become a lightning rod of sorts for the anti-Microsoft camp and is frequently characterized, fairly or not, as Microsoft's answer to Java. In reality, C# is a relatively minor player in the .Net initiative. It's one of many languages that a developer can use to write .Net apps. It's arguably the best language as well, because it's the only one built from the ground up for .Net. But at the end of the day, arguing the merits of C# versus Java is a red herring. It's the .NET Framework -- the combination of the CLR and the FCL -- that is the essence of .Net. C# is merely the cherry on top. These points notwithstanding, C# could become one of the most popular programming languages ever if developers embrace .Net. Few C++ programmers that I know write .Net code in C++; most use C# instead. It's an easy transition, and C# code is more elegant and understandable than the equivalent code written in C++. Even a few Visual Basic developers I know are moving -- or are considering moving -- to C#. In all likelihood, the vast majority of .Net developers will do their work in either VB or C#. If .Net is in your future, then there's a good chance that C# is, too."

  • [March 16, 2001] "Mainframe .NET." By Don Estes. In eAI Journal (March 2001), pages 35-40. ['Although mainframe strategies aren't well documented in the blueprints from Redmond, the architecture may be ideal for bringing legacy systems into the distributed computing world. The way forward is XML encapsulation, which can be surprisingly easy.'] "Microsoft has published a blueprint for the next generation of computing services, the '.NET' strategy. This isn't a proprietary vision. Microsoft is joining industry visionaries and other vendors to describe how Internet-coupled computing will function. Selected users, through a discovery process, will access available services on each Internet site. External parties will use the services, which will be based on loosely coupled transactions and will provide a robust, fault-resilient, low-cost replacement for Electronic Data Interchange (EDI) implementations. Most important, use of eXtensible Markup Language (XML) as the foundation of the data exchange allows for commonly accepted dictionaries of data semantics. This provides a practical and scalable solution for many-to-many data exchange... .NET recognizes that old architectures appropriate for local computing don't scale to the Web. Some organizations have legacy mainframes that support as many as 1,000 variant data exchange formats with nearly 20 EDI customers. Keeping the system working smoothly on their end requires constant attention, not to mention the effort at each customer's site. As the number of point-to-point data exchange partners increases, the sum total effort to keep everything working smoothly increases geometrically. Point-to-point, or two-tier strategies cannot scale to the Web, where 20 clients could become 20,000 or 20 million. What's required is a three-tier data exchange strategy. Each client transforms their data into a universally accepted (or at least industry accepted) format.
Then, the correspondent transforms received data from the universal format into their own local format. The total effort of supporting a three-tier strategy scales linearly with the number of formats used for data exchange at each site. This provides a practical solution. XML is key to scalability of the .NET strategy, and here we begin to see why. XML reduces the effort to implement three-tier data exchange in two ways. First, universal dictionaries of XML data tags and their semantic definitions provide the intermediary for three-tier exchange. Second, a subset of XML, the XML Stylesheet Language Transformations (XSLT) process, provides standard engines for translating from one dialect to another. Internet scale issues also accrue for computing services. Providing human readable menus or documentation similarly cannot scale to the Web. The .NET and similar visions provide for publishing available services in a dialect of XML, Web Services Description Language (WSDL), and a discovery process to navigate through available services. The process of redeploying legacy applications as XML-encapsulated, trusted components in a .NET or similar architecture can be surprisingly easy. There are first-generation solutions available providing XML encapsulation via middleware solutions. There are also second-generation solutions with native XML logic providing the encapsulation and componentization. Reviewing a COBOL program that has been subjected to the XML encapsulation transformations, the immediate response may be raised eyebrows and a comment to the effect that, 'This is pretty simple code!' Because the n-tier architectures are new, there's a tendency to think of them as too complex to merit study, given your growing daily workload. But the truth is that this approach represents a simple, straightforward, and sensible strategy to evolve valued legacy programs into the Web-based future of computing. 
What should you do when you need to reorganize 20- or 30-year-old legacy applications for e-business in the Internet age? The options available to renovate legacy applications to enable e-commerce are surprisingly rich. Although Microsoft's .NET strategy of encapsulating legacy applications via XML isn't (and won't be) the only vision of future computing worth consideration, it's clear that it can cost-effectively deliver trusted processes into the Internet age. Does it make more sense to evolve legacy applications or build new? With XML encapsulation, there's no technical reason to throw away valued applications. Considering the risks involved in replicating critical business processes precisely, preserving legacy applications is sensible. It's easy, inexpensive, and low risk. So if your legacy applications are still fulfilling their business purpose, XML encapsulation may be the best strategy, particularly if you can also resolve any other structural issues during the implementation. If, on the other hand, your business has moved so far in another direction that your legacy applications only partially fulfill business needs, you should seriously consider wholesale replacement of those systems and weigh the cost, benefit, and risk."
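
The scaling argument in the Estes excerpt above can be made concrete with a little arithmetic. The sketch below is only an illustration under a simplifying assumption that is not in the article: every partner uses its own local format and exchanges data with every other partner. The function names are hypothetical.

```python
def point_to_point_translators(partners: int) -> int:
    """Two-tier case: one translator per ordered pair of distinct partners."""
    return partners * (partners - 1)


def shared_format_translators(partners: int) -> int:
    """Three-tier case: each partner maps to and from one shared format."""
    return 2 * partners


if __name__ == "__main__":
    for n in (20, 20_000):
        print(n, point_to_point_translators(n), shared_format_translators(n))
```

Under this assumption the two-tier count grows with the square of the number of partners, while the three-tier count grows linearly, which is the contrast the excerpt draws.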

  • [March 16, 2001] Extended DumbDown for Dublin Core metadata. From Stefan Kokkelink. Experimental. "I have set up an online demonstration of an (extended) dumb-down algorithm for Dublin Core metadata. There are several examples available, try the E[1-6] buttons. RDF documents using DC properties should be responsible for seeing that for every DC property (or subProperty) a meaningful literal value can be calculated by the algorithm described below. Documents respecting this algorithm can use any rdfs:subPropertyOf or any additional vocabularies (e.g. for structured values) they want: the algorithm ensures that these documents can be used for simple resource discovery however complex their internal structure may be. Extended DumbDown algorithm: This algorithm transforms an arbitrary RDF graph containing Dublin Core properties (or rdfs:subPropertyOf) into an RDF graph whose arcs are all given by the 15 Dublin Core elements pointing to an 'appropriate literal'..."
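
The general shape of such a dumb-down step might look like the following minimal sketch. It is not Kokkelink's algorithm, whose details are elided in the excerpt. The assumptions here are: triples are plain Python tuples, each property has at most one rdfs:subPropertyOf parent, any property in the Dublin Core element namespace counts as one of the 15 elements, and the 'appropriate literal' of a structured value is taken from its rdf:value arc. The `Literal` class and all function names are hypothetical.

```python
DC = "http://purl.org/dc/elements/1.1/"
RDF_VALUE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#value"
SUBPROP = "http://www.w3.org/2000/01/rdf-schema#subPropertyOf"


class Literal(str):
    """Marks a string as an RDF literal rather than a resource identifier."""


def dc_ancestor(prop, sub_of):
    """Follow rdfs:subPropertyOf links until a Dublin Core element is reached."""
    seen = set()
    while prop is not None and prop not in seen:
        if prop.startswith(DC):
            return prop
        seen.add(prop)
        prop = sub_of.get(prop)
    return None


def literal_for(node, triples):
    """Return the node itself if it is a literal, else its rdf:value literal."""
    if isinstance(node, Literal):
        return node
    for s, p, o in triples:
        if s == node and p == RDF_VALUE and isinstance(o, Literal):
            return o
    return None


def dumb_down(triples):
    """Keep only arcs that reduce to a DC element pointing at a literal."""
    sub_of = {s: o for s, p, o in triples if p == SUBPROP}
    result = []
    for s, p, o in triples:
        element = dc_ancestor(p, sub_of)
        value = literal_for(o, triples)
        if element is not None and value is not None:
            result.append((s, element, value))
    return result
```

For example, a document whose `ex:creatorName` property is declared a subproperty of `dc:creator` and points at a structured node with an rdf:value of "Stefan" would reduce to a single `dc:creator` arc with that literal.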

  • [March 16, 2001] "Querying and Transforming RDF." By Stefan Kokkelink. "QAT basic observation: The data model of XML is a tree, while the data model of RDF is a directed labelled graph. From a data model point of view we can think of XML as a subset of RDF. On the other hand XML has a strong influence on the further development of RDF (for example XML Schema <-> RDF Schema) because it is used as serialization syntax. Applications should take into account this connection. We should provide query and transformation languages for RDF that are as far as possible extensions of existing (and proven) XML technologies. This approach automatically implies to be in sync with further XML development." See the working papers: (1) "Quick introduction to RDFPath" and (2) "Transforming RDF with RDFPath" ['The Resource Description Framework (RDF) enables the representation (and storage) of distributed information in the World Wide Web. Especially the use of various RDF schema leads to a complex and heterogeneous information space. In order to efficiently deploy RDF databases, we need simple tools to extract information from RDF and to perform transformations on RDF. This paper describes two approaches for transforming RDF using the RDF path language RDFPath. The first approach realizes transformations within an Application Programming Interface (API) and the second approach describes a declarative transformation language for RDF (analogously to XSLT for XML).'] From the 2001-03-16 posting: "After investigating the currently available techniques for querying and transforming RDF (for example see [1]) I would like to propose an alternative approach that is connected more closely to the XML development. Basically I would like to have the counterparts of XPath, XSLT and XQuery in the RDF world: RDFPath, RDFT and RQuery.
This approach has (in my opinion) some advantages: (1) benefit from the lessons learned from XML; (2) don't reinvent the wheel: copy and paste as long as possible, extend if necessary; (3) be in sync with XML development. This approach is feasible because from a *data model* point of view XML (tree) is a subset of RDF (directed labelled graph)..." See "Resource Description Framework (RDF)."

  • [March 16, 2001] ".Net Gets XML Right." By Jim Rapoza. In eWEEK (March 12, 2001). "Perhaps creating a product in a new field where there are no established leaders to catch up to (or copy) is a good thing for Microsoft Corp. The company's BizTalk Server 2000 is an excellent platform for managing XML data processing among businesses and is one of the best first-version offerings eWeek Labs has seen from Microsoft. Although BizTalk Server 2000 includes a server element for handling data transfers, its real strength lies in its suite of tools, which provide powerful, intuitive interfaces for creating and transforming Extensible Markup Language files and for collaborative creation of business processes. The product is one of the most important in Microsoft's .Net initiative because XML is at the core of .Net. Despite its still less-than-perfect support for standards, we believe BizTalk Server 2000 sets an impressive standard for functionality and usability in XML processing. For these reasons, it is an eWeek Labs Analyst's Choice. BizTalk Server 2000, which shipped last month, comes in a $4,999-per-CPU standard edition that supports up to five applications and five external trading partners, and in a $24,999 enterprise edition with unlimited support for applications and trading partners. Like most .Net servers, the product runs only on Windows 2000 Advanced Server and requires SQL Server 7.0 or later. BizTalk Server also requires Microsoft's Visio 2000 charting application and its Internet Explorer 5.0 Web browser or later. One core tool in the product is BizTalk Editor, which makes it very simple for users to create schemas specific to their business needs using an intuitive, tree-based builder interface. Another useful tool in tests was BizTalk Mapper, which let us transform XML and other data documents such as electronic data interchange and text files, using a straightforward interface to map the documents into proper formats. 
BizTalk Mapper then generates an Extensible Stylesheet Language Transformations file to manage the document transformations. By default, BizTalk Server 2000 is still based on Microsoft's XML-Data Reduced schema. However, the product includes a command-line conversion utility to convert data to the World Wide Web Consortium's XSD (XML Schema Definition) standard. Although this works, we would like to have XSD support built into the tools to make the server easier to integrate with other XML data systems. The server also supports Simple Object Access Protocol, an XML-based protocol for issuing remote calls... Companies that expect XML to become the lingua franca of business data interactions will find BizTalk Server 2000 to be an excellent translator. The product provides some of the most powerful and intuitive tools available for creating, managing and distributing XML data, making it an Analyst's Choice."

  • [March 16, 2001] "Introducing the 4Suite Server. An XML data server for Unix." By Uche Ogbuji. In UnixInsider (March 2001). ['Over the last few months, Uche Ogbuji has covered XML and its applicability to Unix professionals in various articles for Unix Insider. In this feature, Uche continues to share his work on XML with our readers by introducing the 4Suite Server, the tool that most nearly realizes XML's goal of standardizing and simplifying data processing.'] "The 4Suite Server is an open source XML data server, designed for maximum integration into other tools. Its XML capabilities are meant to complement your existing application framework and provide XML services as they're needed. While it does run on Windows, we recommend Unix for its superior OS architecture and implementation. In this article, I take a hands-on look at the 4Suite Server, illustrating a variety of XML processing tasks, using built-in facilities such as the command-line utilities. You don't have to know any particular programming language to follow the examples in this article, but you do need a basic understanding of XML. In addition, an understanding of XSLT and RDF are helpful for the last section, and gentle introductions to these concepts are in the Resources section... Standards have always been a major issue for the Unix community, but open standards are the key to lowering the notoriously high long-term cost of maintaining computer software. The 4Suite Server provides implementations of many XML-related standards in order to make it easier for users to begin using the more exciting technologies that build on XML, and to make it easier to plug into other software tools and frameworks. Interoperability and flexibility in working with other software packages is a major design goal of the 4Suite Server. Here we discuss some of the facilities of the 4Suite Server based on XML and related standards. 
[1] Resource Description Framework: Resource Description Framework (RDF) is a standard for Internet-based metadata that is recommended for use alongside XML and even HTML. It provides an abstract model for statements about resources (which are URIs) and relationships among resources. It also defines an XML-based syntax for exchanging descriptions and a schema mechanism for expressing logical statements about resources. The 4Suite Server includes RdfServer, which maintains an RDF model efficiently, using a choice of persistence backend similar to that of the repository itself. You can import and export XML serializations of RDF, for instance, as well as those used by the Open Directory project, the rpmfind database system, or the WordNet project. It allows RDF statements to be made or managed directly or to be extracted automatically based on rules whenever an XML document is added or modified. In addition, a special query language for RDF metadata is provided. RDF Inference Language (RIL) is an open specification developed by Fourthought for rules-based querying of RDF models. [2] Extensible Stylesheet Language for Transforms: The built-in Extensible Stylesheet Language for Transforms (XSLT) transform engine allows translations from one XML format to another or from XML to HTML for viewing with legacy Web browsers. [3] Extensible Linking: XML's linking standard, Extensible Linking, or XLink, starts by implementing HTML's simple unidirectional hyperlinks and adds a thatch of features, including links with multiple end points, bidirectional links, and links that aren't expressed in the document at either end point. In addition, there's XPointer support. XPointer allows access to partial constructs within an XML document. The XML repository also allows users to define custom URI handlers, permitting sneaky substitution or redirection of URIs for effects such as proxies, filters, or notation systems...
The 4Suite Server is a quick way to get into XML as a learning tool, a rapid prototyping tool, or a final deployment platform for XML-based apps."

  • [March 16, 2001] "XML and WAP." By John Evdemon (Chief Architect, XML Solutions). January, 2001. A presentation given to the Washington Area SGML/XML Users Group. 54 slides, PDF format. "Basic Definitions: [Wireless Application Protocol (WAP), eXtensible Markup Language (XML), Wireless Markup Language (WML)]; WAP's Differentiators: [Bluetooth, DoCoMo i-mode; Combining XML with a wireless protocol standard]; The Trouble with WAP; The Wireless Future. What is WAP? WAP is a technology based on Internet technologies for use by digital phones WAP is backed by major vendors: Nokia, Ericsson, Motorola, Microsoft, IBM. WAP Forum is open for all: Over three hundred companies have joined the WAP Forum. WAP supports several wireless systems: GSM, IS-136, CDMA, PDC etc. WAP has a layered architecture: The same application can be used via several systems. WAP 2.0: Next generation of WAP will include XHTML (with backwards compatibility to WML); TCP support; Color graphics; Animation; Large file downloading; Location-smart services; Streaming media; Data synchronization with desktop PIM. Specs are being built in anticipation of Network evolution and Handheld evolution..." See: "WAP Wireless Markup Language Specification (WML)."

  • [March 16, 2001] "WSDL Specification Sent to W3C." By Christopher McConnell. In ent - The Independent Newspaper for Windows NT Enterprise Computing [Online]. (March 15, 2001). "A key specification for Microsoft Corp.'s .NET initiative has been submitted for review to the World Wide Web Consortium (W3C). The Web Services Description Language (WSDL) provides a grammar for XML, enabling computer-to-computer transactions via the web. A number of Microsoft partners have joined in co-submitting the spec to the W3C. Companies range from database vendor Oracle Corp. to ERP giant SAP AG to purveyors of development tools such as BEA Systems Corp. and Ariba Corp. to OEMs Compaq Computer Corp. and Hewlett Packard Co. WSDL complements the Simple Object Access Protocol (SOAP) by describing the nature of a transaction through XML. With a WSDL implementation, programs can understand what types of data are transferred and how to use the data. Microsoft has aggressively pushed key .NET specifications to the W3C. It submitted SOAP for review in May, 2000, and is preparing UDDI for review..." See discussion.

  • [March 16, 2001] "Can XML Succeed Where EDI Has Failed?" By Lauren Gibbons Paul. From IDGNet. March 01, 2001. "Envera provides an electronic link between chemical companies and their customers. Sounds simple, but why should the XML-based platform succeed where EDI failed? ... Clearly, the way chemical companies conduct transactions is in need of an overhaul. The question remains whether a marketplace such as Envera -- using XML as the linchpin -- can provide the answer. Last March, Mooney, Mike Giesler, then Ethyl's CIO, and two other cofounders began knocking on colleagues' doors, talking about creating an electronic hub for the chemical industry they called Envera (roughly translated from Latin, envera means "in truth"). Envera would differ from other electronic trading exchanges that were then making headlines, such as the chemical industry's CheMatch.com and the auto industry's Covisint, in that it would not attempt to match sellers with buyers. Rather, it would serve only as an electronic platform on which already-established business partners could conduct their transactions. Envera would not take a piece of each transaction that it hosted but instead would charge members an annual subscription fee of between $5,000 and $300,000, depending on company size. Mooney and Giesler got a warm reception from their peers, snagging funding from 11 companies. Things moved quickly after that. Giesler and Mooney left Ethyl in July and by the end of the summer Envera had hammered out XML document definitions for eight basic business processes in conjunction with an industry standards group. By the fall, the initial phase of Envera was up and running, with partners such as Lubrizol and Occidental Chemical beginning to conduct business online. To date, only a tiny number of transactions have taken place on Envera. Giesler expects business to jump once Envera's 40 trading partners come online this spring... 
Just because Envera has made it out of the starting gate is hardly a guarantee of its eventual success. Like all electronic trading exchanges and hubs, Envera faces enormous obstacles. For starters, it has new competition: a similar online exchange for the chemical industry dubbed Elemica. Elemica, a Philadelphia-based e-marketplace that went online in a test phase this past January, is also based on an XML platform, and it is backed by 22 of the largest chemical companies, including BASF, Dow Chemical and DuPont. With Elemica in the picture, Envera may find it harder to sign up more companies as subscribers. Whether Envera can grow beyond its initial image as an extension of Ethyl presents another challenge. The e-hub will succeed only if industry companies see it as a neutral platform that exists for the benefit of all companies. The fact that the nine Envera owners are also its users could become a problem down the road. Despite the uncertainty surrounding electronic exchanges, Envera has earned modest praise from some industry watchers... But just providing a standard language is not enough -- syntax is needed too. For XML to be truly useful requires the definition of standard documents, such as a purchase order, to be used within the industry. And once those business documents have been defined, they must be widely adopted. In the chemical industry -- as everywhere -- multiple groups with multiple agendas are pursuing multiple standards. Envera has made quick progress on its eight initial XML documents, but a potential battle looms with competitor Elemica. XML has an obvious advantage over EDI in that it leverages existing infrastructure (such as the Internet) and is therefore not expensive to adopt. And it does have some technical advantages... This time around, there are hopeful signs. Envera has agreed to share its eight initial XML document definitions for use with CIDX for use by any company in the industry.
However, the ability of competitors to coalesce around standards was immediately tested when Elemica announced last summer that it too was working on XML document definitions for a purchase order and an order acknowledgment, among others. Representatives from Envera and Elemica gathered around the bargaining table and hammered out common definitions for the good of all. For their part, Mooney and Giesler say they'll do what's necessary to work out a common standard or arrange for Envera to map to different standards, as needed..."

  • [March 16, 2001] "IBM Package Boosts Standards in WebSphere." By Ed Scannell. In InfoWorld (March 15, 2001). "Touting its ability to provide optimized delivery for Web services, IBM on Wednesday unveiled WebSphere Technology for Developers, which supports several Web standards such as the Universal Description Discovery and Integration (UDDI) specification. By supporting UDDI and the Simple Object Access Protocol (SOAP), IBM's package helps corporate users create e-business applications and services that are better able to interact with other Web-based applications. IBM officials believe Web services are spearheading a new era in e-business where the Internet will be shaped and driven by more robust applications. With the new product, IBM officials claim the company is the first to implement and fully integrate HTTPS, which combines SSL (Secure Sockets Layer) with HTTP as well as HTTP Authentication and SOAP security. The new set of capabilities includes support for digital signatures and the ability to enable end-to-end authentication, integrity, and nonrepudiation for SOAP messages. The new environment also includes Sun Microsystems' Java 2 Enterprise Edition (J2EE), which will give developers the ability to create the foundations of business-oriented applications that can operate across multiple platforms and environments. The program also offers the Web Services Description Language (WSDL), which is able to describe programs accessible over the Internet and the message formats and protocols that are used to communicate with them. IBM believes WSDL is particularly important because it allows Web services to describe their capabilities in a standard way, making it easier for them to interoperate with other Web services and development tools... Separately, IBM announced a new version of WebSphere for its z/OS and OS/390 mainframe operating systems. It also includes support for J2EE.
The new version includes WebSphere Application Server for z/OS and OS/390 and CICS Transaction Server 2.1." See the announcement.

  • [March 16, 2001] "IBM Advances Web Services Strategy." By Mary Jo Foley. In CNET News.com (March 14, 2001). "IBM announced Wednesday the next phase of its Web services game plan. Big Blue is shipping a new version of its WebSphere application server -- fortified with support for the leading Web services protocols and standards -- that it plans to make available to developers for free. As the battle for developer mind share in the Web services market is heating up, each of the major software companies is attempting to play to its strength. In IBM's case, that means its middleware Internet infrastructure software and related development tools... Giga Information Group analyst Mike Gilpin agreed with Hebner's assessment. 'IBM is really the first to (make generally available) tools like these that are needed for Web services to take off,' Gilpin said. Gilpin added that widely available tools, such as those in IBM's WebSphere Technology for Developers release, will likely take the pain and expense out of hand-coded Web services. Most existing payment, insurance and travel Web services have been built by hand from scratch, Gilpin said. WebSphere Technology for Developers includes built-in support for XML (Extensible Markup Language); UDDI (Universal Description and Discovery Integration) standard; SOAP (Simple Object Access Protocol); WSDL (Web Services Description Language); and J2EE (Java 2 Enterprise Edition) technology. XML is the new lingua franca of the Web, designed to make sharing data easier. UDDI acts like a Yellow Pages for Web services by exposing them and helping developers to locate them. SOAP is an emerging standard for distributed computing interoperability. WSDL is an XML format aimed at improving Web services messaging-interoperability technology. And J2EE is a standard technology for developing and launching enterprise applications. 
To obtain a free copy of WebSphere Technology for Developers, a developer must be 'referred' to IBM as a potential WebSphere customer by either an IBM salesperson or an IBM partner. Developers can contact IBM for a referral. IBM announced the WebSphere release in conjunction with its weeklong WebSphere 2001 trade show in Las Vegas. IBM also announced on Wednesday availability of a version of its WebSphere Internet infrastructure software that has been written to run on its eServer z900 and OS/390 mainframes..." See the announcement.

  • [March 16, 2001] "Commentary: IBM takes lead in services. [Gartner Viewpoint.]" By Massimo Pezzini, Gartner Analyst. In CNET News.com (March 14, 2001). "IBM announced WebSphere Technology for Developers because it wants to keep building the credibility of its e-business middleware strategy and to assert leadership in the emerging Web service arena. No less important is catching up with Java 2 Enterprise Edition competitors. Although several application server vendors have committed to supporting Web services in their J2EE platforms, IBM's announcement on Wednesday makes it the first to deliver a real -- albeit functionally limited -- product. The announcement further validates the notion of Web services. It follows the announcement of IBM's WebSphere strategy in November and positions IBM as a serious candidate for leadership in both J2EE and Web service technology. WebSphere Technology for Developers is not production-ready, but rather a preview of WebSphere v.4, the next major update of the WebSphere Application Server family. It runs only on Windows NT and DB2 and supports Web service protocols such as SOAP, UDDI, WSDL and XML, along with related development tools. The WebSphere 4 product set also will include the zSeries run-time version--that is, WebSphere Application Server for z/OS and OS/390, also announced Wednesday--and still unannounced Unix/Windows 2000 versions likely to be available in the second quarter. The WebSphere release will allow Java developers to familiarize themselves with, and start developing applications for, WebSphere 4 and to experiment with Web service technology. IBM has trailed other vendors in support for J2EE specifications. For example, WebSphere Advanced Edition 3.5 does not support Enterprise JavaBeans 1.1. WebSphere Technology for Developers fills this gap by being the first WebSphere version officially certified by Sun Microsystems as J2EE-compliant. When available, WebSphere 4 will be, too. 
Thanks to WebSphere Technology for Developers, IBM changes from a follower into a leader. In fact, vendors such as BEA Systems, Hewlett-Packard/Bluestone Software and iPlanet will have to catch up by quickly delivering SOAP/UDDI capabilities in their application servers--or be marked as technology laggards..." See the announcement.

  • [March 16, 2001] "About Multimodal ZVON." By Jiri Jirat. From the ZVON project. March 2001. Abstract: "Multimodal ZVON is a demonstration of a site powered by XML/XSLT technology. The same XML sources have been used to create several different output formats (including graphics). Moreover, a sophisticated search and a site map have been very easily implemented, since all sources are XML." Description: "The 'Multimodal view of ZVON' is a demonstration project which shows how XML with XSLT can be used to create and maintain a website. Multiauthoring is very easy, thanks to the transparency of the XML format. Now the website is completely built using XML and in the following few pages we will briefly describe the framework. The whole process consists of the following main steps: (1) Defining data layout - storage of different data types in XML files, directory and data structure definition. (2) Creating presentation layout: pictures (SVG), various text formats (HTML, XML, PDF) (3) Using secondary information (metadata) - creating specialized search and a site map." [cache]
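
The single-source idea described in the ZVON entry above can be illustrated with a small sketch. ZVON itself uses XSLT. Because Python's standard library has no XSLT processor, this sketch swaps in `xml.etree.ElementTree` to derive two outputs (an HTML fragment and a site map) from one XML source. The element names `site`, `page`, `title`, and `body` are invented for the illustration and are not ZVON's actual data layout.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical source document; the ZVON data layout is not shown in the entry.
SOURCE = """<site>
  <page id="intro"><title>Intro &amp; Overview</title><body>Hello</body></page>
  <page id="refs"><title>References</title><body>Links</body></page>
</site>"""


def to_html(root):
    """Render every page as a heading plus paragraph."""
    parts = []
    for page in root.findall("page"):
        title = escape(page.findtext("title"))
        body = escape(page.findtext("body"))
        parts.append(f"<h1>{title}</h1><p>{body}</p>")
    return "\n".join(parts)


def to_site_map(root):
    """List (id, title) pairs, reusing the same source as the HTML view."""
    return [(page.get("id"), page.findtext("title")) for page in root.findall("page")]


if __name__ == "__main__":
    root = ET.fromstring(SOURCE)
    print(to_html(root))
    print(to_site_map(root))
```

Both views read the same parsed tree, so a change to the XML source shows up in each output without editing either renderer.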

  • [March 16, 2001] "Comparing Beeyond with XML." By VU/Beeyond Staff. "Beeyond does not support XML, because there is currently no universal standard which defines what XML tags represent. Individual companies have published standards related to their proprietary technologies, but these are merely product specific APIs. In order for XML to really fulfill its promise as a generic solution for data exchange between companies and programming languages, tag definitions must be agreed upon industry wide. When that occurs, and if XML is widely accepted by developers, then we will support it in Beeyond. In the meantime, however, Beeyond provides its own solution for inter-company data exchange that gets around some of the problems that will plague XML even after standards are developed. In particular, it simplifies the process of prior agreement between companies on message definitions and allows messages and their associated applications to be updated without system language programming. ['Beeyond (from Virtual Unlimited (VU), an Internet software development company based in Veldhoven, The Netherlands.) is a system for building and running secure documents and database-backed network applications with Java user interfaces. It is powerful and inexpensive enough to be used for all kinds of applications. In addition to secure applications and documents with Java user interfaces, Beeyond contains a unique class of messages called BeeXchange. These messages allow companies to exchange data easily and securely and form the basis for Beeyond's B2B application functionality. Beeyond's strong authentication and encryption also enable BeeHive servers to create secure, virtual private networks on the Internet. Third, Beeyond uses a new application development model that allows applications to be built quickly and changed easily. A few of the features that set Beeyond apart from other products: Simplicity, Message-oriented, Scriptable applications, Database brain.']

  • [March 15, 2001] "Issues Raised in DFAS XBRL Proof of Concept." Federal CIO Council XML Working Group. March 2001. "Under the guidance of the Defense Information Systems Agency, the Defense Information Infrastructure Common Operating Environment (DII COE) Chief Engineer has established an XML management framework and web based registry for DoD XML users. This framework is designed to address the causes of non-interoperable XML and ballooning management overhead resulting from proliferation of XML groups. One mechanism for addressing data disambiguation (collisions/conflicts) is the namespace concept. Within the DoD XML namespace and registry, the Defense Finance and Accounting Service (DFAS) is the manager for DoD finance and accounting data items (tags). In carrying out this responsibility, DFAS is developing a technology adoption strategy for the XML family of technology. Using this approach, DFAS identifies a technology "area of opportunity" and proactive sponsor. Our first area of opportunity identified was financial reporting and the DFAS sponsor was Accounting Directorate. The Agency selected this area as a 'proof of concept' because the eXtensible Business Reporting Language (XBRL) is a market based framework that provides a method to prepare, publish, extract and exchange commercial financial statements, and that a Federal taxonomy was under development for Department/Agency reporting. Proof of Concept: KPMG Consulting, LLC was contracted by DFAS to provide an XBRL proof of concept. XML and XBRL are emerging technologies, and a proof of concept was necessary to assess the applicability of XBRL in meeting DFAS strategy. The proof of concept was designed to demonstrate the ability to dynamically create DoD financial statements in HTML through the use of XML technology and more specifically XBRL. The HTML was selected as the final output based on its ability to be viewed by a large audience through the use of a browser.
In order to demonstrate this, KPMG started with the XBRL (in process) Federal Taxonomy, which restates the OMB Form and Content for Federal financial statements. This taxonomy was extended for differences in the DoD Form and Content. Next, actual balance sheet data was populated in the XML document to create an XBRL instance document. XSLT style sheets were then created to dynamically generate three HTML documents representing a consolidated and consolidating balance sheet as well as one footnote. Additionally, links were inserted in the balance sheets to demonstrate the capability to "drill down" into detailed data from a summary presentation. In summary, the DFAS XBRL proof of concept project demonstrated the ability to render financial data in HTML through the use of XBRL. By using XBRL, source data modifications resulted in dynamic updates to the HTML documents..." The XBRL Consortium's responses will be posted ca 2001-03-23 at http://xml.gov/documents_work_in_progress.cfm. Note in this connection the presentation 'eXtensible Business Reporting Language (XBRL)' by Zach Coffin, Sergio De la Fe, and Chris Moyer, summarized in Federal CIO Council XML Working Group Meeting Minutes, October 18, 2000. See: "Extensible Business Reporting Language (XBRL)." [cache]

  • [March 15, 2001] "Getting Started with XML-RPC in Perl, Part 1. Using XML-RPC for Web services." By Joe Johnston (Senior Software Engineer, O'Reilly and Associates). From IBM developerWorks, Web services. March 2001 ['Creating an XML-RPC Web service with Perl is almost as easy as CGI scripting. This article will bring you up to speed on what XML-RPC is and how to use Perl's Frontier::RPC library to create simple clients and servers.'] "Remember the thrill of watching your first HTML form work? Perhaps you simply e-mailed the contents of the form to yourself or displayed another HTML page with whatever information the user entered. Whatever you did, you created what an information architect would call a two tiered or client/server system. With just a little additional work, the input gathered from a Web form can be stored in a database. In this way, multiple clients can interact with a single database using only their browser. The information stored in the database can be formatted into an appropriate HTML display on demand by CGI scripts. A typical Web application that uses this sort of architecture is a Weblog like SlashDot. The code that generates the HTML page is called the front end and the part that contains the database and business logic is called the back end. This system works very well until either your database or your Web server can no longer handle the traffic. If the bottleneck lies with your Web server, you may decide to simply add more Web machines to your network. If you connect to your database with native Application Programming Interface (API) calls in your front end, it becomes difficult to change the back end implementation. Switching database vendors or trying to cluster the database servers would mean changing all your front end code. The solution is to separate the presentation logic of the front end from the business logic of the back end, but they still need to be connected. 
The software that provides the conduit between the front end and the back end is called middleware. And one very simple, open architecture middleware protocol that works well in Web applications is XML-RPC. XML and RPCs: Remote Procedure Calls (RPC) are not a new concept. A client/server system, RPCs have traditionally been procedures called in a program on one machine that go over the network to some RPC server that actually implements the called procedure. The RPC server bundles up the results of the procedure and sends those results back to the caller. The calling program then continues executing. While this system requires a lot of overhead and latency, it also allows less powerful machines to access high powered resources. It also allows applications to harness the computational muscle of a network of machines. A familiar example of this type of distributed computing is the SETI@Home project. Dave Winer, of Frontier and Userland fame, helped extend the concept of RPC with XML and HTTP. XML-RPC works by encoding the RPC requests into XML and sending them over a standard HTTP connection to a server or listener piece. The listener decodes the XML, executes the requested procedure, and then packages up the results in XML and sends them back over the wire to the client. The client decodes the XML, converts the results into standard language datatypes, and continues executing..." Available also in PDF format. See: "XML-RPC." [cache]
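
The request/response cycle described in the excerpt above can be sketched without a network. The article itself uses Perl's Frontier::RPC; the sketch below swaps in Python's standard xmlrpc.client module, and the procedure name "sample.add" and the procedures table are hypothetical stand-ins for real server code.

```python
import xmlrpc.client

# Client side: encode a call to the (hypothetical) procedure "sample.add" as XML.
request_xml = xmlrpc.client.dumps((2, 3), methodname="sample.add")

# Listener side: decode the XML, look up the procedure, and run it.
params, method = xmlrpc.client.loads(request_xml)
procedures = {"sample.add": lambda a, b: a + b}  # stand-in for server-side code
result = procedures[method](*params)

# Listener side: package the result as an XML-RPC response.
response_xml = xmlrpc.client.dumps((result,), methodresponse=True)

# Client side: decode the response back into native datatypes.
(decoded,), _ = xmlrpc.client.loads(response_xml)
print(decoded)
```

In a real deployment the two XML strings would travel as the body of an HTTP POST and its reply; only the encoding and decoding steps are shown here.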

  • [March 15, 2001] "The [NEL] Newline Character." By Susan Malaika (IBM). W3C Note 14-March-2001. "The omission of [NEL], the newline character defined in Unicode 3.0, from the End-of-Line Handling section in the XML 1.0 specification causes significant difficulty when processing XML documents and DTDs in IBM mainframe systems. Problem areas include: (1) Processing XML documents or DTDs generated on OS/390 systems, with XML 1.0 compliant parsers. (2) Processing XML documents or DTDs, using native OS/390 system tools. (3) Processing XML documents or DTDs retrieved from OS/390 database or file systems, in non-OS/390 environments. XML documents that contain [NEL] characters are declared invalid or not well-formed by XML 1.0 compliant parsers. We urge the W3C to include [NEL] as a legal line ending in XML, and hence as a legal white space character, in accordance with Unicode 3.0." See also (1) submission request and (2) W3C staff comment. From the W3C staff comment by C. M. Sperberg-McQueen: "XML 1.0 specifies special handling and normalization for line-boundary character sequences, in an effort to control the complexity which results from the variety of ways in which different operating systems and software products mark line boundaries in data streams. This submission describes an unfortunate but apparently reparable consequence of a design decision taken by the then SGML WG of the W3C in the fall of 1996 in specifying this part of XML 1.0, and outlines a simple and relatively non-intrusive means of making the necessary repair in a way compatible with related work. This submission will be referred to the XML Core WG for action. 
In light of the background, which suggests that the design of this part of XML relies on what has turned out to be a false factual assumption, the XML Core WG may choose to include the suggested change (as well as a change specifying that PS and LS should be treated as space characters, and clarifying whether they should also be treated as line-separators) in XML. The XML Core Working Group has the responsibility for deciding whether to modify the line-end handling rules of XML or to leave them unmodified in any future version of XML prepared by that Working Group." [latest version URL]
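
One workaround implied by the Note, pending any change to XML itself, is to normalize NEL to a line feed before the document reaches an XML 1.0 parser. The following is a minimal sketch of that idea in Python; the helper name normalize_line_ends and the sample document are illustrative assumptions, not anything defined in the Note.

```python
import xml.etree.ElementTree as ET

NEL = "\u0085"  # Unicode NEXT LINE, the character discussed in the Note

def normalize_line_ends(text: str) -> str:
    """Replace CR LF, bare CR, and NEL line endings with a single LF."""
    return text.replace("\r\n", "\n").replace("\r", "\n").replace(NEL, "\n")

# Illustrative document whose line breaks are NEL characters, including one
# inside a start-tag, where XML 1.0 requires genuine white space.
raw = (
    f'<?xml version="1.0"?>{NEL}'
    f'<doc{NEL}  id="1">{NEL}'
    f"  <item>text</item>{NEL}"
    f"</doc>"
)

clean = normalize_line_ends(raw)
root = ET.fromstring(clean)
print(root.get("id"), root.find("item").text)
```

The normalization is lossy by design: after it runs, the original choice of line-ending character can no longer be recovered from the document.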

  • [March 15, 2001] "TAXI to the Future." By Tim Bray. From XML.com. March 14, 2001. ['Tim Bray presents TAXI, a Web application architecture that utilises the power of XML to deliver a responsive user environment.'] "There's not much new about TAXI. I'll claim that if you polled the original group of a dozen or so people, led by Jon Bosak, that defined XML 1.0, you'd find out that something like TAXI was what most of us had in mind. As we all know, XML has mostly been used in backend and middleware data interchange, not in front of the user the way its designers intended and the way TAXI does it. It's long past time for the TAXI model to catch on... TAXI: Transform, Aggregate, send XML, Interact. My claim is that TAXI delivers many of the benefits, and hardly any of the problems, of the previous generations of application architecture discussed above. Let's walk through it... Transform: A lot of business logic boils down to one kind of data transformation or another: applying transactions, generating reports, updating master files. The right place to do most of this work is on the server, where you can assume a rich, high-powered computing environment. Aggregate: The next architectural principle is the aggregation of enough data from around the server to support some interaction with the user. An example would be a list of airplane flights that could be sorted and filtered. Send XML: Once you've gathered an appropriate amount of data together on the server side, you encode it in XML and send it off to the client over HTTP. There's no need to get fancy; we generate XML using printf statements in C code. If you're fortunate, there'll be a well-established XML vocabulary available that someone else invented for use in your application; but probably not, and you'll have to invent your own. Interact: Once the XML has arrived in the client, probably a Web browser, you'll need to parse it.
Your browser probably has this built-in; it may be more convenient to compile in Expat or Xerces or one of the other excellent processors out there... Why TAXI is a Good Idea: First, it comes at the user through the browser, something that they've proved they want. Second, the application can run faster and scale bigger than traditional web applications in which the server does all the work. Third, the system is defined from the interfaces out, so nobody can lock it up, and you can switch your black-box clients and servers around with little difficulty or breakage."
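
The "Send XML" and "Interact" steps can be sketched as follows, using Bray's own example of a flight list that the client sorts and filters. This is a rough illustration in Python rather than the C and browser code the article has in mind; the flight records, the element and attribute names, and the functions to_xml and cheapest are all invented for the example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

# Aggregate: made-up flight records gathered on the server.
flights = [
    {"number": "XX100", "dest": "Vancouver", "price": 320},
    {"number": "XX200", "dest": "Boston", "price": 450},
    {"number": "XX300", "dest": "Seattle", "price": 280},
]

def to_xml(rows):
    """Send XML: encode the rows with plain string formatting, printf-style."""
    lines = ["<flights>"]
    for row in rows:
        number = quoteattr(row["number"])
        dest = escape(row["dest"])
        lines.append(f'  <flight number={number} price="{row["price"]}">{dest}</flight>')
    lines.append("</flights>")
    return "\n".join(lines)

def cheapest(xml_text, max_price):
    """Interact: parse once on the client, then filter and sort locally."""
    root = ET.fromstring(xml_text)
    hits = [f for f in root.iter("flight") if int(f.get("price")) <= max_price]
    hits.sort(key=lambda f: int(f.get("price")))
    return [(f.get("number"), f.text) for f in hits]

payload = to_xml(flights)
print(cheapest(payload, 400))
```

Because the client holds the whole payload, changing max_price or the sort key needs no further round trip to the server, which is the responsiveness argument the article makes.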

  • [March 15, 2001] "EXSLT 1.0 Drafts." From Jeni Tennison. Posting to XSL-List. (1) Common - EXTENSIONS TO XSLT 1.0 (EXSLT 1.0) - COMMON. "This document describes the common set of EXSLT 1.0. EXSLT 1.0 is a set of extension elements and functions that XSLT authors may find helpful when creating stylesheets. The common set of EXSLT 1.0 are those extension elements and functions that provide a base level of common functionality that the rest of EXSLT can build on. XSLT processors are free to support any number of the extension elements and functions described in this document. However, an XSLT processor must not claim to support EXSLT 1.0 - Common unless all the extensions described within this document are implemented by the processor. An implementation of an extension element or function in an EXSLT namespace must conform to the behaviour described in this document." (2) Functions - EXTENSIONS TO XSLT 1.0 (EXSLT 1.0) - FUNCTIONS. "This document describes EXSLT 1.0 - Functions. EXSLT 1.0 is a set of extension elements and functions that XSLT authors may find helpful when creating stylesheets. EXSLT 1.0 - Functions are those extension elements and functions that allow users to define their own functions for use in expressions and patterns in XSLT." (3) Sets - EXTENSIONS TO XSLT 1.0 (EXSLT 1.0) - SETS. "This document describes EXSLT 1.0 - Sets. EXSLT 1.0 is a set of extension elements and functions that XSLT authors may find helpful when creating stylesheets. EXSLT 1.0 - Sets covers those extension elements and functions that provide facilities to do with set manipulation." (4) Math - EXTENSIONS TO XSLT 1.0 (EXSLT 1.0) - MATH. "This document describes EXSLT 1.0 - Math. EXSLT 1.0 is a set of extension elements and functions that XSLT authors may find helpful when creating stylesheets. EXSLT 1.0 - Math covers those extension elements and functions that provide facilities to do with maths." For related resources, see "Extensible Stylesheet Language (XSL/XSLT)."

  • [March 15, 2001] "XML-Deviant: Extensions to XSLT." By Leigh Dodds and Jeni Tennison. From XML.com. March 14, 2001. ['Members of the XSL mailing list have started a community-based project to standardize extensions for XSLT.'] "The community has discussed alternatives to the contentious <xsl:script> element... the major concerns over xsl:script were that it would encourage scripting code, authored in Java, Javascript, VBScript and other languages, to be embedded inside XSLT stylesheets hampering usability and (potentially) interoperability. The discussion led to suggestions that XSLT extension functions might usefully be implemented in XSLT itself, rather than, or perhaps in parallel to, implementations in other languages. A number of ways of achieving this functionality were suggested, resulting in Jeni Tennison gathering together the alternatives to further focus the debate and achieve progress: 'There seems to be a reasonable amount of support for user-defined functions written in XSLT, whether to sweeten the syntax of xsl:call-template or to allow XPaths previously only dreamed about. If we're going to move ahead with this, we need to agree on a syntax for (1) declaring the functions and (2) calling the functions. In this email, I'm going to lay out the major designs that have been suggested so far so that we can discuss them and hopefully come up with some kind of resolution that's acceptable to everyone...' What are EXSLT's advantages given that XSLT already provides a user extension mechanism? First, it ensures that the extension functions are well defined in a community-drafted specification, avoiding the need for XSLT developers to rely on proprietary definitions of similar functions provided by their stylesheet engine. Also, while the implementation language for a function may vary, developers can ensure that functions will remain consistent across processors.
Second, by providing the means to define their own functions in XSLT, stylesheet authors can create truly portable stylesheets that rely only on a conformant XSLT 1.0 processor that implements the elements defined in EXSLT - Functions... A deciding factor in the success of EXSLT will be whether it's supported by XSLT engines. The prospects are promising, particularly as 4XSLT has already adopted the proposed functions. If the developers of Xalan and other stylesheet engines adopt it, then the future looks decidedly rosy for XSLT developers." For related resources, see "Extensible Stylesheet Language (XSL/XSLT)."

  • [March 15, 2001] "Transforming XML: Entities and XSLT." By Bob DuCharme. From XML.com. March 14, 2001. ['Using XML entities can be tricky -- this article covers their usage with XSLT in both input and output documents.'] "In XML, entities are named units of storage. Their names are assigned and associated with storage units in a DTD's entity declarations. These units may be internal entities, whose contents are specified as a string in the entity declaration itself, or they may be external entities, whose contents are outside of the entity declaration. Typically, this means that the external entity is a file outside of the DTD file which contains the entity declaration, but we don't say 'file' in the general case because XML and XSLT work on operating systems that don't use the concept of files. A DTD might declare an internal entity to act like a constant in a programming language. For example, if a document has many copyright notices that refer to the current year, declaring an entity cpdate to store the string '2001' and then putting the entity reference '&cpdate;' throughout the document means that updating the year value to '2002' for the whole document will only mean changing the declaration. Internal entities are especially popular to represent characters not available on computer keyboards... Because an XSLT stylesheet is an XML document, you can store and reference pieces of it using the same technique, but you'll find that the xsl:include and xsl:import instructions give you more control over how your pieces fit together. . . All these categories of entities are known as parsed entities because an XML parser reads them in, replaces each entity reference with the entity's contents, and parses them as part of the document. XML documents use unparsed entities, which aren't used with entity references but as the value of specially declared attributes, to incorporate non-XML entities.
When you apply an XSLT stylesheet to a document, if entities are declared and referenced in that document, your XSLT processor won't even know about them..." For related resources, see "Extensible Stylesheet Language (XSL/XSLT)."
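
The cpdate example from the excerpt can be tried directly. The sketch below uses Python's standard ElementTree parser rather than an XSLT processor, so it only illustrates the parsing half of the point: the parser replaces the entity reference before any downstream tool sees the document. The notice element and "Example Corp." are made up for the illustration.

```python
import xml.etree.ElementTree as ET

# An internal entity declared in the document's internal DTD subset,
# used like a constant, as in the article's copyright-year example.
doc = """<!DOCTYPE notice [
  <!ENTITY cpdate "2001">
]>
<notice>Copyright &cpdate; Example Corp.</notice>"""

root = ET.fromstring(doc)

# The parsed tree holds only the replacement text; the reference is gone.
print(root.text)
```

Changing the declaration to "2002" would update every place the reference appears, and nothing in the parsed tree records that an entity was ever involved.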

  • [March 15, 2001] "More Light Shed on Microsoft, eBay Deal." By Grant Du Bois and Roberta Holland. In eWEEK (March 15, 2001). "The deal between Microsoft Corp. and eBay Inc. announced this week could provide developers with an easier way to add auctioning and e-commerce capabilities to their Internet applications. In the process, the partnership will enable both Microsoft and eBay to expand their roles as toll collectors on the information superhighway. Under the multifaceted alliance, eBay, of San Jose, Calif., will offer its commerce engine API to Web developers as a Simple Object Access Protocol-based, Extensible Markup Language Web service on the .Net platform, said officials of both companies. SOAP is a Microsoft-sponsored standard that allows different applications and Web sites to communicate over the Internet. In addition, eBay will deploy Windows 2000 Server on all of its front-end Web servers and enable its users to register and sign in on eBay using Microsoft's Passport Internet authentication service. eBay will not replace its Oracle Corp. database for Microsoft's SQL Server database, eBay officials said. Microsoft, of Redmond, Wash., will use the Web service to incorporate the trading functionality of eBay's online marketplace into several Internet properties, including the MSN.com network, the Carpoint online automotive service and WebTV, officials said. Microsoft also will integrate eBay's trading services into its bCentral small-business portal, in which merchants can extend e-commerce by posting items for sale directly on eBay's marketplace from within bCentral, officials said. In addition, bCentral and eBay will provide for full transactional integration between bCentral customer commerce Web sites and eBay, including the ability to determine the status of listings on eBay and view transaction history directly from bCentral, officials added.
By driving Internet Web traffic through the eBay commerce engine and Microsoft Passport service, both companies will be able to collect additional licensing and transaction revenues. Some .Net beta users said the deal validates their decision to work with the emerging platform, but some IT managers at Internet World felt that eBay shouldn't extend its marketplace platform to a business-to-business model because that's not its area of expertise." See the announcement.

  • [March 15, 2001] "Microsoft schedules online appointments for .Net." By Mary Jo Foley. In CNET News.com (March 14, 2001). ['Microsoft is preparing to add online appointment scheduling to its suite of services for its upcoming software-as-a-service strategy, according to sources.'] "Microsoft's addition of WebAppoint, which allows for online scheduling for such items as car repair or dentist appointments, is a crucial element in Microsoft's ambitious software-as-a-service strategy, known as .Net. WebAppoint links consumers and companies and offers extra features, such as confirmation of appointments via phone or fax. The start-up's service was launched in the fall 1999. Microsoft is expected to advance the WebAppoint technology and initially launch it later this year as one of its services on Microsoft's bCentral small-business Web site, according to sources. Company representatives confirmed on Wednesday the purchase of WebAppoint.com. Microsoft already has two pilot projects in place where it is testing WebAppoint... A few other services have also fallen in step with .Net plans, such as Microsoft's Passport Internet authentication service and its ClearLead lead-management product. These services have the potential of being worked in to more complex Web applications and services... Further complicating the picture is the imminent arrival of Hailstorm, which is a set of .Net building-block technologies that Microsoft is expected to position as a key part of its overall .Net initiative. Hailstorm, which Microsoft is expected to unveil officially March 19, will incorporate next-generation versions of a number of Microsoft's existing services -- such as its Hotmail e-mail, MSN Messenger instant messaging, and Passport products -- and make them available to developers building XML-based Web services." 
[Netdocs: "...According to sources, Netdocs is a single, integrated application that will include a full suite of functions, including email, personal information management, document-authoring tools, digital-media management, and instant messaging. Microsoft will make Netdocs available only as a hosted service over the Internet, not as a shrink-wrapped application or software that's preloaded on the PC. Netdocs will feature a new user interface that looks nothing like the company's Internet Explorer Web browser or Windows Explorer. Instead, Netdocs is expected to offer a workspace based on Extensible Markup Language (XML), where all applications are available simultaneously. This interface is based on .Net technology that Microsoft, in the past, has referred to as the 'Universal Canvas'."]

  • [March 15, 2001] "Build Stateless Components With XML. Implement stateless components to track and store data from multiple Web clients concurrently." By Mark J. Collins and David John Killian. From DevX.com. March 2001. "All but the most trivial Web applications require you to maintain state data -- client-specific information that must be preserved between successive requests. The question is: Where do you keep the state? You've probably encountered a few standard techniques to preserve state in Web-based applications. Some applications use cookies that reside on the client, some pass state information as parameters in the URL. Active Server Pages (ASPs) frequently use the Session object to store state. But none of these techniques help with server-based components, especially when the amount and complexity of the state data is high... Server-based components can be 'stateful' or 'stateless.' Stateful components must be dedicated to a single client. Stateless components, on the other hand, can support many concurrent clients. If you're developing components to run on a server and support a potentially large number of clients, consider making your components stateless. In this article, we'll show you how to create complex stateless components that track client information efficiently, improving performance. Our solution uses some of the more advanced features of COM and ATL and assumes at least some familiarity with XML and the Standard Template Library (STL). First you'll turn a stateful class into a stateless one by moving its member attributes outside the class. Then you'll implement nested interfaces using subordinate implementation classes because real-world solutions require more complex state data. Finally, you'll use an XML document to store this complex state data so it can be easily retrieved on subsequent client requests. We've pulled these ideas together into a sample Order component you might find in an e-commerce application.
The stateless component collects summary information about the order, the customer, and a list of requested items..." ['Stateful components must be dedicated to a single client. If a particular server operation takes half a second to complete and each client calls it every five seconds, for example, you'd need 10 instances to support 10 active clients -- and each instance will be idle 90 percent of the time. Every instance uses valuable system resources such as memory or database connections, even when idle, bogging down the server without providing the throughput potential. Stateless components, on the other hand, can support many concurrent clients. A single instance of a stateless server component in this scenario could support 10 active clients with no client-perceptible performance degradation.']
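
The core idea, moving state out of the component and into an XML document that accompanies each request, can be sketched briefly. The article builds its Order component with COM and ATL in C++; the Python functions below (new_order, add_item, item_count) and the order/item vocabulary are hypothetical substitutes meant only to show the shape of the technique.

```python
import xml.etree.ElementTree as ET

def new_order(customer: str) -> str:
    """Create the initial state document for one client's order."""
    order = ET.Element("order", customer=customer)
    return ET.tostring(order, encoding="unicode")

def add_item(state_xml: str, sku: str, quantity: int) -> str:
    """Stateless operation: read the state passed in, return the updated state."""
    order = ET.fromstring(state_xml)
    ET.SubElement(order, "item", sku=sku, quantity=str(quantity))
    return ET.tostring(order, encoding="unicode")

def item_count(state_xml: str) -> int:
    """Total quantity across all items recorded in the state document."""
    order = ET.fromstring(state_xml)
    return sum(int(item.get("quantity")) for item in order.iter("item"))

# Two clients interleave calls to the same functions; neither call sequence
# leaves anything behind in the component itself.
alice = new_order("alice")
bob = new_order("bob")
alice = add_item(alice, "sku-1", 2)
bob = add_item(bob, "sku-9", 1)
alice = add_item(alice, "sku-2", 3)
print(item_count(alice), item_count(bob))
```

Since each function depends only on its arguments, a single instance can serve any number of clients, which is the throughput benefit the authors describe.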

  • [March 14, 2001] "An XML Encoding of Simple Dublin Core Metadata." Edited by Dave Beckett, Eric Miller, and Dan Brickley. Dublin Core Metadata Initiative Proposed Recommendation. 2000-12-01 or later. "The Dublin Core Metadata Element Set V1.1 (DCMES) can be represented in many syntax formats. This document explains how to encode the DCMES in XML, provides a DTD to validate the documents and describes a method to link them from web pages." Appendix A contains the DTD for Dublin Core Metadata Element Set 1.1 in XML; Appendix B [Informational] provides an XML Schema for Dublin Core Metadata Element Set 1.1 in XML. Details: "The Dublin Core Metadata Element Set V1.1 (DCMES) can be represented in many syntax formats. This document gives an encoding for the DCMES in XML, provides a DTD to validate the documents and describes a method to link them from web pages. This document describes an encoding for the DCMES in XML subject to these restrictions: (1) The Dublin Core elements described in the DCMES V1.1 reference can be used; (2) No other elements can be used; (3) No element qualifiers can be used; (4) The resulting XML cannot be embedded in web pages. The primary goal for this document is to provide a simple encoding, where there are no extra elements, qualifiers, optional or varying parts allowed. This allows the resulting data to be validated against a DTD and guaranteed usable by XML parsers. A secondary goal was to make the encoding also be valid RDF which allows the document to be manipulated using the RDF model. We have tried to limit the RDF constructs to the minimum, and the result is a mostly standard header and footer for every document. We acknowledge that there will be further documents describing other encodings for DC without these restrictions however this one is for the simplest possible form. One result of the restrictions is that the encoding does not create documents that can be embedded in HTML pages. 
Please refer to other Dublin Core documents that can describe how to do that. This document is based on previous work such as (1) DTDs for the Dublin Core Element Set [Eric Miller]; (2) Bath Profile Appendix D [Extensible Markup Language (XML) Document Type Definition for Dublin Core Simple]; (3) Museum records transfer DTD ['The use of XML as a transfer syntax for museum records during the CIMI Dublin Core test bed: some practical experiences,' Bert Degenhart Drenth]; (4) CIMI Dublin Core DTD." See: "Dublin Core Metadata Initiative (DCMI)."
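
A record in the spirit of this encoding might be generated as sketched below. The sketch assumes the RDF-compatible layout the excerpt alludes to (an rdf:RDF wrapper around one rdf:Description holding unqualified dc: elements) and the usual RDF and Dublin Core 1.1 namespace URIs; the function name simple_dc and the sample values are invented, and the Proposed Recommendation itself remains the authority on the exact syntax.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("dc", DC_NS)

# The fifteen elements of the Dublin Core Metadata Element Set, version 1.1.
DC_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}

def simple_dc(about: str, fields) -> str:
    """Build a simple record: only DCMES elements, no qualifiers."""
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    desc = ET.SubElement(root, f"{{{RDF_NS}}}Description",
                         {f"{{{RDF_NS}}}about": about})
    for name, value in fields:
        if name not in DC_ELEMENTS:
            raise ValueError(f"not a DCMES 1.1 element: {name}")
        ET.SubElement(desc, f"{{{DC_NS}}}{name}").text = value
    return ET.tostring(root, encoding="unicode")

record = simple_dc("http://example.org/doc", [("title", "Example"), ("creator", "A. Author")])
print(record)
```

Rejecting any name outside the fixed set mirrors restrictions (1) through (3) quoted above, and it is that closed vocabulary which makes DTD validation of the result possible.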

  • [March 14, 2001] "XHTML: What You Should do About it, and When." By David R. Guenette and Sebastian Holst (Artesia Technologies). In The Gilbane Report on Open Information & Document Systems Volume 9, Number 1 (February 2001), pages 1-9. ['Last month we covered what was hot at the big annual XML event the GCA produces. At XML 2000, XHTML was out-buzzed by the Semantic Web, Topic Maps, Schemas, and XSLT. Yet XHTML is certainly more important than some of those popular topics, and should be looked at carefully by anyone thinking strategically about web applications. We know many of you are already struggling with multiple versions of HTML, and some of you with mixed XML, HTML, and even SGML marked-up content. No doubt you are wondering what the relation is between XML and XHTML, and why we need yet another markup language. The very short answer is that, while XML allows you to build applications that extend well beyond the limitations of HTML, it would be a whole lot easier and less costly if there were something more flexible and robust than HTML to start from. This is true even, or especially, for simply publishing to multiple channels (e.g., wireless devices). This month guest contributor Sebastian Holst joins David to describe how XHTML fits into the evolution of the web, and provide advice on what to do about it.'] "... Developers who migrate their content to XHTML 1.0 will realize the following benefits: (1) XHTML documents are XML conforming. As such, they are readily viewed, edited, and validated with standard XML tools. Conformance is critical to interoperability between systems and underlies XHTML's promise to support access to the Web from non-PC devices. (2) XHTML documents can be written to operate as well or better than they did before in existing HTML 4-conforming user agents, as well as in new XHTML 1.0 conforming user agents.
This backward compatibility provides a much needed bridge between the current generation of browsers and future XHTML-based user agents. This interoperability between HTML and XHTML is what will give users a few years before XHTML migration is an absolute requirement. (3) XHTML documents can utilize applications (e.g., scripts and applets) that rely upon either the HTML Document Object Model or the XML Document Object Model. This is a similar interoperability feature that preserves programming done today against HTML documents. As the XHTML family evolves, documents conforming to XHTML 1.0 will be more likely to interoperate within and among various XHTML environments. As XHTML is supported in more and more exotic and atypical devices and appliances it is likely that content and applications developed today will be supported with little or no conversion required...XHTML is a major initiative exactly in line with the overarching charter of W3C to achieve interoperability across all peoples, geographies, and access devices. XHTML is already playing an increasingly important role in best practices for the entire content lifecycle. To the extent that your organization has been postponing improving the quality of your existing content, or optimizing your content and web site development workflows, and/or planning for a world where the majority of users will be accessing the web through non-PC devices, XHTML is one more important reason to stop procrastinating and start now."

  • [March 14, 2001] "XML Joins the IT Workforce. [Data Management.]" By Rich Seeley. In Application Development Trends Volume 8, Number 3 (March 2001), pages 47-50. "Diploma in hand, and ready to get down to business, XML has quickly found its niche in business computing. Like a graduating college senior, XML has been in the academic realm long enough. The conceptual work is done and now it is time for XML to go to work. Two years ago, XML was a fitting subject for a Scientific American article, but now it is moving from the theoretical to the practical. Business leaders, IT management and the analysts who advise them are looking to find some jobs XML can do in the real world. The prime place to look for a job for XML is in B2B, where the theoretical promise of the language is most likely to be realized. The most significant and widespread use of XML can be found in the B2B space and in the back-end EAI projects required to make B2B work, according to Massimo Pezzini, a GartnerGroup analyst based in Italy. He sees XML as a good way to expose interfaces to business components. Something like XML was needed, he pointed out, since the traditional way of exposing interfaces through Interface Definition Languages (IDL) had a major drawback... While not all technology standards have staying power, the consensus among vendors, developers and theorists is that XML is not only here to stay but is poised to become as ubiquitous in business computing as HTML and ASCII. Most companies that do not have XML incorporated into current applications and tools promise that the standard will be part of the next release. The skeptic would be hard pressed to find any authority in computer science who doubts that e-business can progress without XML. Regardless of the theoretical value of meta data standards, the practical reality is that B2B creates information management challenges that require XML compliance in data exchanges between all participants. 
XML will soon be as crucial to advancing the conduct of business over the Web as the Dewey Decimal System once was to the organization of great libraries..."

  • [March 14, 2001] "Bill Inmon Sees Advantages and Limitations of 'Fed XML'. [XML Report.]" By Rich Seeley. In Application Development Trends (March 2001). "XML is like Federal Express. According to Bill Inmon, the consultant and author known as the Father of Data Warehousing, it provides an envelope and delivers data from point A to point B. Continuing the overnight delivery analogy during a seminar at the Data Warehousing Institute World Conference Winter 2001 in Palm Springs, Inmon said that once you receive your FedEx package, the important issue is making sense of the contents inside. 'When it comes to the semantics of understanding metadata, XML doesn't do a thing,' he said. The semantics problem involves getting departments within the enterprise and business partners and vendors outside to all agree on definitions of what terms such as 'revenue' mean. 'XML solves the problem of getting data from one place to the next,' he said. 'But it doesn't begin to solve the problem of business vs. technical data, differences of opinion as to what "revenue" means. XML doesn't solve those problems. Isn't designed to solve those problems'..." [Note on XML Report: it "provides the latest news, information, and expert analysis on the state of XML tools and technologies. Produced by the editorial team behind Application Development Trends, Java Report, and The Journal of Object Oriented Programming, XML Report will provide developers and development managers with strategic information on emerging standards -- and potential pitfalls -- in the fast-growing XML marketplace."]

  • [March 14, 2001] "The Real Impact of XML." By John K. Waters. In Application Development Trends Volume 8, Number 3 (March 2001), page 9. Waters summarizes a Zona report on XML, "XML: The Dash for Dot.com Interoperability." See below. "... at least one prominent observer believes 'that history may regard XML as a more important development than HTML and even the Web'."

  • [March 14, 2001] "XML: The Dash for Dot.com Interoperability." Zona Research Reports Online, Issue 42 (January 2001). "When the history of Web-based ecommerce is written, XML may be regarded as a more important development than HTML in accelerating business on the Web. The reason is that XML promises to do for Web application interaction what HTML did for the human reading of Web-based documents. XML will be able to bridge the islands of information locked away in incompatible computing systems to provide a freer interchange of data between these formerly isolated systems. Through the efforts of many industry consortia, XML has found a place in industries as diverse as medicine, insurance, electronic component trading hubs, petrochemicals, forestry and finance, to name a few. The promise of XML is multifaceted and huge, but has it achieved serious acceptance in the corporate world? To answer this question, Zona Research announces the release of its latest Zona Market Report, XML: The Dash for Dot.com Interoperability. This report is packed with primary research from interviews with enterprise decision makers who are currently deploying XML based solutions or plan to do so during 2001. The report explores the state of XML deployments from the users' perspective and answers these questions, amongst others: Past Approaches Have Fallen Short: Electronic Data Interchange (EDI); Past Approaches Have Fallen Short: Extended Intranets; XML Fundamentally Changes the Speed of Business; XML Fundamentally Changes the Cost of Business; XML Fundamentally Lessens the Pain of Change; XML is Politics; XML Standards: Half Baked, Fully Baked, and Incrementally Baked; XML as a Business Process Disruptor; SOAP and UDDI: A Model for Finding and Acquiring Web Services; Primary Research: What do Users Think?; Summary: What Does It All Mean?..."

  • [March 13, 2001] Codes for the Representation of Languages for Information Interchange. ANSI/NISO Z39.53-200X; ISSN:1041-5653, Revision of ANSI/NISO Z39.53-1994. A Draft American National Standard Developed by the National Information Standards Organization. Status: For Ballot February 9, 2001 - March 23, 2001. The specification provides "a standardized 3-character code to indicate language in the exchange of information is defined. Codes are given for languages, contemporary and historical. The purpose of this standard is to provide libraries, information services, and publishers a standardized code to indicate language in the exchange of information. This standard for language codes is not a prescriptive device for the definition of language and dialects but rather a list reflecting the need to distinguish recorded information by language." From the Foreword: "This standard was originally prepared by Standards Committee C, Language Codes, which was organized in 1979. Charged with 'providing a standard code for indicating languages for information interchange purposes,' the committee produced a standard based on the list of MARC language codes developed by the Library of Congress in cooperation with the National Agricultural Library and the National Library of Medicine. This code list is now published as the MARC Code List for Languages. Practical application of the MARC language codes has shown that in order to serve as an appropriate retrieval device for information, a standard list of language codes must reflect the linguistic content of the universal collection to which it is applied, with language codes assigned as needed to distinguish information in a given language or group of languages. The MARC language codes constitute such a list. 
The committee's decision to base the standard on the existing MARC list took into account these contributing factors: (a) several years' successful application of the MARC language codes resulting in many millions of bibliographic records containing the accepted MARC codes, (b) the mnemonic relationship of the MARC codes to the English language names of the languages with English being the operational language of most American libraries, information services, and publishers, and (c) the flexibility inherent in a three-character code. The MARC list may be consulted for references from alternative forms of language names, as well as for the assignments to collective codes of languages for which individual codes have not been established. This revised edition reflects a thorough review of the document and includes changes which are a result of requests and demonstrated need from users and implementors. In addition, it includes numerous changes necessary for compatibility with bibliographic language codes in ISO 639-2 (Codes for the representation of names of languages: alpha-3 code). The MARC code list is kept consistent with both ANSI/NISO Z39.53 and ISO 639-2/B." See the Z39.53-200X description and comment form. On the broader issues of language identification using ISO 639, RFC 1766, etc., see also "Language Identification and IT: Addressing Problems of Linguistic Diversity on a Global Scale," by Peter Constable and Gary Simons. Reference: "Names of Languages - ISO 639." [cache]

  • [March 13, 2001] "Scope of a Cohesion Protocol Specification." From Alastair Green and Peter Furniss (Choreology Ltd). 12 March 2001. 8 pages. For the OASIS Business Transactions Technical Committee: Choreology Ltd submission to the Inaugural BT Meeting on 13 March 2001, at San Jose, CA. "Long-running 'business transactions' which may be processed by discrete organizations across the public internet differ from classical atomic transactions in requiring increased protocol security and interoperability, and relaxable atomicity, isolation and durability properties. A protocol is required which is independent of communications mechanism, is capable of supporting fully ACID transaction processing, yet is also capable of supporting different AID qualities of service. Such a protocol would provide 'appropriate transactionality' to applications. 'Cohesive' actions (cohesions) could be processed as a superset of atomic actions, thus enabling a clean integration of legacy transactional resources and services, when appropriate... It is widely perceived that inter-organizational long-duration business transactions require a new protocol or protocols to assist applications in providing reliable service and consistent results. However, the complex logic of extended machine-to-machine conversations will necessarily be an application concern, often assisted by a business process manager. This paper is intended to help in drawing a reasonable boundary between protocol (the concern of the BT TC) and process (the concern of other standardization efforts or of the application). Our principal guide is that a BT protocol should not be aware of application control flow, application message content, or operation algorithms and effect. This criterion excludes business process definition and trading protocols (such as ebXML, in that role). 
We believe that this leaves three distinct problems which are the potential concern of the Technical Committee: communications, interoperability/security, and ACIDity. In this initial submission we restrict our comments largely to the latter two problems..." See (1) the OASIS Business Transactions Technical Committee and (2) the BEA Proposed Business Transaction Protocol Version 1.0. See also "OASIS Business Transactions Technical Committee." [source]

  • [March 13, 2001] "Abstracting the interface, Part II. Extensions to the basic framework." By Martin Gerlach (Software Engineer, Computer Science Department, IBM Almaden Research Center). From IBM developerWorks. March 2001. ['Martin Gerlach shows you how to build application back ends with XML and XSLT, and describes how to prepare applications to go online.' See the first developerWorks article on Abstracting the interface: "Building an adaptable Web app front end with XML and XSL." 'This article, a sequel to the author's December article on Web application front ends, describes extensions to the basic framework for XML data and XSL style sheets. It focuses on the back end this time, covering National Language Support (NLS), enhancements of the view structure, and performance issues. While the first article demonstrated the basic architecture for building Web applications using XML and XSLT, this article shows you how to make the applications ready to go online.'] "As in the first installment, this article assumes you are familiar with Web applications based on the HTTP protocol and the HTTP request-response mechanism, and that you have built Web pages using HTML and possibly JavaScript. You should know the Java programming language and how to use Java servlets as access points for Web applications... Where we are going now: I'll also discuss three topics in this article. These are not as tightly connected to each other as the three steps in the first article, and there is no particular order to them. But each of the topics is important in getting the Web application ready to go online. The three topics are: (1) National Language Support (NLS): If you are addressing customers in many countries, you will want to provide a multilingual user interface. You want the system to automatically pick up the user's preferred language and display any NLS relevant information in that language.
(2) Enhancing the application structure: Web applications typically are composed of a number of "views." In the example application WebCal, views show the events of a user, or they display input forms for new events. The login and registration pages are also views. Views can be grouped according to different criteria. In WebCal, the day, week, and month views can be seen as calendar views. They share the same subnavigation -- a link for entering a new event. By identifying ways to group views, developers can avoid redundant coding, thereby easing application development and making it less error prone. (3) Performance: You want your Web application to be fast. Users do not want to wait after they follow a link, so you will want to employ caching mechanisms to serve them faster and more reliably..." Also in PDF format. [cache]
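The NLS point in the entry above (picking up the user's preferred language automatically) can be illustrated with a small sketch. The article builds on Java servlets and its actual mechanism is not quoted here, so everything below is an assumption: the function name `pick_language` is hypothetical, and the sketch assumes the preference arrives in an HTTP `Accept-Language` header.

```python
def pick_language(accept_language, supported, default="en"):
    """Return the best supported language for an Accept-Language header value.

    Hypothetical helper for illustration; not taken from the article.
    """
    candidates = []
    for part in accept_language.split(","):
        piece = part.strip()
        if not piece:
            continue
        lang, _, params = piece.partition(";")
        weight = 1.0  # a language listed without q= counts as weight 1.0
        params = params.strip()
        if params.startswith("q="):
            try:
                weight = float(params[2:])
            except ValueError:
                weight = 0.0  # malformed weights are treated as "not wanted"
        candidates.append((weight, lang.strip().lower()))

    # sorted() is stable, so header order is kept among equal weights.
    for weight, lang in sorted(candidates, key=lambda c: -c[0]):
        if weight <= 0:
            continue
        if lang in supported:
            return lang
        primary = lang.split("-")[0]  # fall back from "de-ch" to "de"
        if primary in supported:
            return primary
    return default
```

A front end could call this once per request and use the result to select the matching resource bundle or style sheet.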

  • [March 13, 2001] "XHTML Tags Reference." By Michael Classen. From WebReference.com. March, 2001. "XHTML is a reformulation of HTML 4 as an XML 1.0 application. The stricter nature of XML requires you to follow more rules than before when creating documents..." See: "XHTML and 'XML-Based' HTML Modules."

  • [March 13, 2001] "XML CATALOGS." By [OASIS Entity Resolution Technical Committee.] Edited by Norm Walsh. Revision date: 13 March 2001. Abstract: "The requirement that all external identifiers in XML documents must provide a system identifier has unquestionably been of tremendous short-term benefit to the XML community. It has allowed a whole generation of tools to be developed without the added complexity of explicit entity management. However, the interoperability of XML documents has been impeded in several ways by the lack of entity management facilities: (1) External identifiers may require resources that are not always available. For example, a system identifier that points to a resource on another machine may be inaccessible if a network connection is not available. (2) External identifiers may require protocols that are not accessible to all of the vendors' tools on a single computer system. An external identifier that is addressed with the ftp: protocol, for example, is not accessible to a tool that does not support that protocol. (3) It is often convenient to access resources using system identifiers that point to local resources. Exchanging documents that refer to local resources with other systems is problematic at best and impossible at worst. The problems involved with sharing documents, or packages of documents, across multiple systems are large and complex. While there are many important issues involved and a complete solution is beyond the current scope, the OASIS membership agrees upon the enclosed set of conventions to address a useful subset of the complete problem. To address these issues, this specification defines an entity catalog that maps an entity's external identifier to a URI." See also the updated issues list and the TC web site.
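The core idea in the abstract above, a catalog that maps an entity's external identifier to a URI, can be sketched as a simple lookup table. This is only an illustration: the `EntityCatalog` class, its method names, and the lookup order (system identifier first, then public identifier) are assumptions made for the sketch and do not reproduce the catalog file format or the resolution rules of the OASIS specification.

```python
class EntityCatalog:
    """Toy catalog mapping external identifiers to replacement URIs.

    Hypothetical class for illustration; not the OASIS catalog format.
    """

    def __init__(self):
        self._public = {}  # public identifier -> URI
        self._system = {}  # system identifier -> URI

    def add_public(self, public_id, uri):
        self._public[public_id] = uri

    def add_system(self, system_id, uri):
        self._system[system_id] = uri

    def resolve(self, public_id, system_id):
        """Return the mapped URI, or the original system identifier if unmapped."""
        # Assumption of this sketch: a system-identifier match wins over a
        # public-identifier match.
        if system_id is not None and system_id in self._system:
            return self._system[system_id]
        if public_id is not None and public_id in self._public:
            return self._public[public_id]
        return system_id


catalog = EntityCatalog()
# Redirect a remote DTD to a local copy so parsing works without a network.
catalog.add_public("-//EXAMPLE//DTD Sample//EN", "file:///local/sample.dtd")
local_uri = catalog.resolve("-//EXAMPLE//DTD Sample//EN",
                            "http://example.org/sample.dtd")
```

A tool that consults such a table before fetching an external entity avoids the unavailable-resource and unsupported-protocol problems listed in the abstract.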

  • [March 13, 2001] "XML Messaging Framework." By Timothy Dyck. In eWEEK (March 11, 2001). "Realizing the cart can't go before the horse, Microsoft Corp. has developed a comprehensive set of proposed standards about how to use XML to send and receive business-to-business messages online. The BizTalk Framework 2.0 specification, released in December, updates its 1.0 predecessor adding ways to check for reliable message delivery, and it includes information on how to use MIME (Multipurpose Internet Mail Extension) and Secure MIME to securely send BizTalk-based Extensible Markup Language messages over e-mail. HTTP delivery of messages is also described in detail. Another big change is that BizTalk Framework has been redesigned to conform to Simple Object Access Protocol 1.1 and XML Schema standards proposals. It also includes XML tags described using the older, nonstandard XML-Data Reduced format. It's possible that vendors other than Microsoft will support the BizTalk messaging framework and thus allow interoperability between Microsoft's own BizTalk Server and non-Microsoft products. It's too soon to tell if this will happen, though. BizTalk Server itself has not caught up to the XML standards that BizTalk Framework relies upon, as BizTalk Server uses XML-Data Reduced-formatted messages internally, not XML Schema (though a separate command-line tool is provided with BizTalk Server to convert XML-Data Reduced-formatted messages to an XML Schema format). The specifics of BizTalk Framework are fairly simple because they describe only the BizTalk message envelope and message characteristics. The items described are sender and receiver names, unique message identifier, time stamps indicating when a message was sent and will expire, topic, request for confirmation of message delivery, request for confirmation of message processing commitment, attachment data, and optional business-specific message information..." 
See "BizTalk Framework" and the news item "BizTalk.Org Web Site Upgraded."

  • [March 13, 2001] "GXS Works on B2B Integration Issues." By Renee Boucher Ferguson. In eWEEK (March 11, 2001). "Despite promises of end-to-end solutions, many companies are still having difficulties integrating with e-marketplaces, buyers and suppliers. GE Global Exchange Services is out to change that with three initiatives it is using from its e-commerce community -- a community that boasts more than 100,000 trading partners that conduct 1 billion transactions a year. GXS, a subsidiary of General Electric Co. USA, is planning to release an Adaptor Developer Kit next quarter that simplifies the handshake necessary in back-end integration. The idea -- one that GXS competitors such as IBM have already capitalized on -- is that developers can use the kit to shortcut technology integration to the GXS platform without having to customize code. Another initiative, JMS (Java Messaging Service), promises to make it easier to transport objects among data fields within a back-end system. GXS partnered with Progress Software Corp. last year to embed JMS in Progress' SonicMQ Messaging Server and incorporate it internally with GXS' integration products. By combining SonicMQ with JMS, a standardized open architecture is added to GXS' integration products, in effect shortening the time it takes to integrate partners with Web-to-legacy and application-to-application integration. The project will be in production in the next 60 days. The company's third initiative, also scheduled for release in the next 60 days, is NBT (Network-Based Translation). That service allows GXS, of Gaithersburg, MD., to take the EDI (electronic data interchange) or XML (Extensible Markup Language) data format from companies and convert it into a standardized schema in real time. The NBT service can also translate EDI schema for offline companies. 
Lita Fulton, president of Fulton & Associates Inc., a full-service system and telecommunications technology company in Fairfax, Va., is beta testing NBT for a large government project. 'The way the process works is we have to establish a data model first and determine how we're going to house it,' Fulton said. 'For us, it's only one portion because all vendors are not all EDI. That's what made GXS valuable for us because they can do XML translations, too'."

  • [March 13, 2001] "Microsoft's Ballmer Touts XML Web Standard." By Charles Cooper. From CNET News.com (March 12, 2001). "Microsoft CEO Steve Ballmer [speaking at the quadrennial meeting here of the Association for Computing Machinery] said Monday that the spread of the XML software standard will constitute the 'next revolution' in personal computing. Speaking before a gathering of scientists and technical professionals, Ballmer said the acceptance of XML (Extensible Markup Language) as the new 'lingua franca' of cyberspace would effectively clear away lingering barriers blocking companies from exchanging information over the Internet. 'This will be a much bigger deal' than Java, Ballmer said. He added that the adoption of a common approach embodied by XML will provide a foundation 'so that everyone's work can leverage and build upon' the work of others. 'With the XML revolution in full swing,' he said 'software has never been more important.' Ballmer's two-fisted stump speech was not surprising, given that XML is the linchpin of the Microsoft.Net strategy for software-as-a-service. 'The whole gist of XML relates to the way that things (on the Internet) can talk together,' Ballmer said. In a related vein, Ballmer spoke of the benefits of SOAP (Simple Object Access Protocol) in this next phase of the development of the Internet. SOAP, which is essentially a way to deliver XML payloads around the Internet, was co-developed by Microsoft in association with IBM and UserLand Software and has since been widely adopted by many leading developers." See also referenced here the online video, "Ballmer talks up XML, .Net." [alt URL]

  • [March 12, 2001] "A Request for Proposals: OpenGIS Feature Geometry." From the Open GIS Consortium, OGC Technical Committee, Geometry Working Group. Request Number 12. RFP Issue date: March 2, 2001. Letter Of Intent Due Date: 10-August-2001; Submission Due Date: 10-September-2001. "The purpose of this Request for Proposals (RFP) is to obtain proposals for technologies and needed interfaces required to access and manipulate geospatial information modeled with OpenGIS Feature Geometry. The scope of this RFP includes technologies that create, query, modify, translate, access and transfer geospatial information in the form of Open GIS feature geometry objects or collections of feature geometry objects. Of special interest are open interfaces that conform to the standards of CORBA, DCOM, SQL, and Internet standards such as JAVA and XML. Description of Item: OpenGIS Feature Information Access and Encoding using XML. By 'information encoding and service request using XML' we mean an XML compliant set of rules for the creation, population, query and response to query for the interoperable handling of feature operations, attributes, geometry, and geometry collections. Proposal Guidelines and Conventions Specific to XML: There are at least two distinct ways to use XML in an OGC Feature environment. The first is as a simple encoding and data transfer mechanism. The second is as a message format for the transmittal of requests for services and for the transmittal of the responses to those requests. The submitters must address both issues in their response to this item. Requirements Specific to XML: A proposal for Open GIS Feature Access and Encoding using XML shall additionally include (1) XML SR1: An outline how the specification might be modified to take advantage of ongoing proposals to change or extend XML, such as GML. (2) XML SR2: The specification should indicate the type of XML compliance required. 
(3) XML SR3: The specification should indicate how profiles (subsets) of the base standard can be defined to allow for simplified version of the XML for applications with specific requirements of compactness or performance (4) XML SR4: It should be possible to define the current GML 2.0 as a profile of the proposed XML encoding specification. (5) XML SR5: It should be possible to define the current Catalog Implementation Specification XML messages as a profile of the proposed XML messaging specification..." See: "Geography Markup Language (GML)."

  • [March 09, 2001] "Open-Source Company Dives Into Web Services SOUP." By Mary Jo Foley. In CNET News.com (March 09, 2001). "While tech kingpins such as Microsoft and Oracle have rushed to one-up each other in introducing Web-delivered software, Ximian is doing work behind the scenes to make sure Web services can run on the Linux and Unix operating systems. Ximian, an open-source software company formerly known as Helix Code, believes it can help achieve Web services compatibility by porting the Simple Object Access Protocol (SOAP) distributed-computing protocol to the Gnome user interface for Linux and Unix systems. Ximian and the Gnome project were both launched by open-source evangelist Miguel de Icaza. The goal is to allow Web-delivered software -- such as the much-touted Microsoft.Net strategy -- from different companies to work on all operating systems, from Windows to Unix and Linux. Ximian has dubbed its resulting technology 'SOUP,' not an acronym but a play on the SOAP name. SOAP, in and of itself, is an interoperability mechanism, explained Aaron Skonnard, an author and trainer with DevelopMentor, a company that trains individuals in distributed-systems technology. 'Toolkit interoperability is more of an issue than SOAP interoperability,' said Skonnard. 'As long as tools are 100 percent SOAP compliant, there's no problem, but people aren't implementing 100 percent to spec.' Web services are software applications delivered as a service over the Web. They can be standalone or integrated. They can be simple, such as automatically updated stock tickers, or more complicated, such as geographically- and device-aware travel services that could reschedule travelers on later flights before their late connection hits the ground. But the full promise of Web services won't be realized unless services developed for one software maker's environment will work with those developed using tools and software from another company. 
That's where Ximian's SOUP could come into play. Ximian is creating a tool that will allow Web services written for Linux to be compiled for SOAP. De Icaza said the compiler could be available to developers within two months. A compiler changes the software code into language a computer can understand, allowing the computer to run the program. The company also is writing some gateway software that will allow Web services that are written to comply with Gnome's Bonobo object architecture to talk to SOAP clients and servers. Ximian plans to incorporate this middleware into the Gnome 2.0 desktop and its Evolution groupware later this year, de Icaza said. Ximian is being neither helped nor hindered in its efforts by Microsoft or other SOAP backers, de Icaza said. Microsoft representatives said the company is aware of Ximian's work but declined further comment on the significance of SOUP to Microsoft.Net. They noted that a number of companies are developing tools for making Microsoft.Net available on platforms other than those sold by Microsoft..."

  • [March 09, 2001] "VoiceXML and the Voice-driven Internet." By David Houlding (The Technical Resource Connection). In Dr. Dobb's Journal Volume 26, Issue 4 (April 2001), pages 88-94. ['David Houlding examines the concept of voice portals, and shows how simple design patterns -- together with XML and XSL- can be used to deliver Internet content to web browsers and wireless devices.'] "Wireless data services are growing at a phenomenal rate, driven to a large extent by the popularity of the Internet services they are delivering. These wireless-enabled Internet services are generally accessible not only by standard web browsers, but also by some mix of web phones, two-way pagers, and wireless organizers. The adoption of these modes of Internet access is being accelerated by the effects of mainstream Internet usage maturing from an initial novelty/hype phase into a ubiquitous set of services we use as common tools in everyday life. In this mode of use, how information is presented is less important than being able to get to the particular information you require easily, when and where you need it... Voice portals leverage both the most natural form of communication -- speech -- and the most pervasive and familiar communications network -- the global telephone network. This network is accessible by either standard wired or mobile cellphones users already have, together with service plans, so no additional cost needs to be incurred for users to access Internet services via voice portals. This eliminates the expense barriers that are currently limiting the penetration of wireless services into the marketplace. Phones also permit eyes- and hands-free operation, enabling Internet service usage via voice portals in situations where wireless devices will not suffice. In this article, I'll discuss the concept of voice portals and the associated architecture. 
I'll then show how simple design patterns -- together with XML and XSL -- can be used to deliver Internet content and services cost effectively not only to web browsers and various wireless devices, but also to any telephone via VoiceXML (for more information on the VoiceXML Standard, see http://www.voicexml.org/). I'll then present an implementation of this architecture that uses software that is freely available on the Internet. Finally, I'll examine key business and technical issues associated with voice-driven applications. VoiceXML is a new standard with significant industry backing. It promises to create a level playing field on which voice portals may compete for outsourcing the hosting of voice applications. This will drive down cost and improve quality of service for both application providers and their customers. From the application providers standpoint, creating voice applications using VoiceXML has the advantage that content is portable across different voice portals, delivering flexibility with respect to choosing voice portals to host voice applications. Voice portals driven by VoiceXML provide a powerful complementary new mode of access that empowers users with more options regarding when, where, and how they consume Internet services. Using speech as the most natural form of communication, the existing familiar global telephone network as the most pervasive communications network, and enabling eyes- and hands-free operation, this new mode of access promises to further accelerate the growth and maturity of Internet services into a ubiquitous set of tools we use every day." Additional resources include listings and source code. See "VoiceXML Forum."

  • [March 09, 2001] "Programmer's Toolchest. SAX2: The Simple API For XML." By Eldar A. Musayev. In Dr. Dobb's Journal Volume 26, Number 2 (February 2001), pages 130-133. ['SAX, the "Simple API for XML," is an efficient and high-performance alternative to the Document Object Model.'] Additional resources include 'sax2.txt' listings and source code. "Just as Perl became the duct tape for the Web, XML is becoming the duct tape for e-business. As a universal data format, XML glues together disparate e-business systems that, in the process of conducting everyday business, need to perform hundreds of transactions per second without outages or crashes. Such systems need XML processors that provide high performance with a small footprint. That's what SAX offers. The article describes SAX, then shows how you can use it in Visual Basic applications via the Microsoft XML (MSXML) parser."
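To show what the event-driven SAX style looks like in practice, here is a minimal sketch. The article itself uses Visual Basic with MSXML; this example swaps in Python's standard `xml.sax` module instead, and the `<order>`/`<item>` vocabulary and the `OrderTotalHandler` name are invented for illustration.

```python
import xml.sax


class OrderTotalHandler(xml.sax.ContentHandler):
    """Adds up the 'price' attribute of every <item> as events stream past."""

    def __init__(self):
        super().__init__()
        self.items = 0
        self.total = 0.0

    def startElement(self, name, attrs):
        # Called once per start tag; no document tree is ever built.
        if name == "item":
            self.items += 1
            self.total += float(attrs.get("price", "0"))


def summarize(xml_bytes):
    """Parse the document and return (item count, price total)."""
    handler = OrderTotalHandler()
    xml.sax.parseString(xml_bytes, handler)
    return handler.items, handler.total
```

Because the handler keeps only two counters, memory use stays flat however large the input is, which is the small-footprint property the article attributes to SAX.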

  • [March 09, 2001] "XML Document Production Tools." Prepared by Eric Prud'hommeaux (W3C). 2001-03-09. Pointers to spec-production DTDs, schemas, example documents, and tools. "This is a quick list of XML document production tools taken from Charles McCathieNevile and a quick poll..." Covers (1) XMLSpec-based Tools and (2) XHTML-based Tools.

  • [March 09, 2001] "Representing UML in RDF." By Sergey Melnik. "A testbed converter that supports automatic translation from UML/XMI to RDFS/RDF/XML is available. The UML community developed a set of useful models for representing static and dynamic components of software-intensive systems. UML is an industry standard and serves as a modeling basis for emerging standards in other areas like OIM, CWM etc. As of today there exist a variety of UML vocabularies for describing object models, datatypes, database schemas, transformations etc. The goal of this work is to make UML 'RDF-compatible'. This allows mixing and extending UML models and the language elements of UML itself on the Web in an open manner. XMI, the current standard for encoding UML in XML by OMG, does not offer this capability. It is based upon a hard-wired DTD. For example, if a third party were to refine the concept 'Event' defined in UML statecharts into say 'ExternalEvent' and 'InternalEvent', it would not be possible to serialize the corresponding event instances in XMI." [Referenced in the 'xmlschema-dev@w3.org' list: "I'd like to support your initiative. In addition to the applications you mentioned, I see UML as well-established schema language that can be used on the Semantic Web along with RDF Schema, XML Schema, DAML-O etc. Webizing UML allows leveraging a broad spectrum of tools and existing UML schemas. A while ago I took a crack at setting up UML on top of RDF and making it interoperate with other schema languages: http://www-db.stanford.edu/~melnik/rdf/uml/." This post from Sergey was in response to a message by David Ezell on a 'Proposed UML Interest Group.'] See (1) "XML Metadata Interchange (XMI)" and (2) "Resource Description Framework (RDF)."

  • [March 09, 2001] "Mapping between ASN.1 and XML." By Takeshi Imamura and Hiroshi Maruyama. Pages 57-64 (with 18 references) in Proceedings 2001 Symposium on Applications and the Internet, edited by K. Ikeda. Los Alamitos, CA: IEEE Computer Society, 2001. [SAINT 2001 Symposium on Applications and the Internet, San Diego, CA, USA, 8-12 January 2001.] "Abstract Syntax Notation One (ASN.1) is a framework for representing tree structured data. Since ASN.1 data are structured data, it should be possible to represent the same information in Extensible Markup Language (XML). The translation between ASN.1 and XML will enable us to manipulate efficient ASN.1 data in a user-friendly manner. We develop a Java library for such translation, called ASN.1/XML translator. We also confirm actual ASN.1 data were translated into expected XML documents and these documents were translated back into the original data if the data were encoded according to Distinguished Encoding Rules (DER). Moreover we discuss still existing issues and try to address them, especially support of XML Schema..." See discussion of the ASN.1/XML Translator in the IBM Security Suite: "Abstract Syntax Notation One (ASN.1) is a framework for representing tree structured data. It is widely used in communication protocols (e.g., SNMP and LDAP), security protocols (e.g., X.509), data formats (e.g., PKCS#7), and so on. ASN.1 is designed for efficiency and the data is usually packed into byte boundaries, and hence is not very readable and is hard to manipulate. Since ASN.1 data is structured data, it should be possible to represent the same information in Extensible Markup Language (XML). XML is not particularly efficient in terms of data length, but is more readable, and it has many off-the-shelf free tools (e.g., XML processors for parsing and generation, XSL processors for rendering, XML editors for authoring, and so on). 
For such reasons, the translation between ASN.1 and XML will enable us to manipulate efficient ASN.1 data in a user-friendly manner. This is a Java library for such translation. Using this library, ASN.1 can be translated into XML and vice versa..." See also: "ASN.1 Markup Language (AML)."

  • [March 09, 2001] "XML Grammars." By Jean Berstel (Institut Gaspard-Monge, Laboratoire d'informatique Université de Marne-la-Vallée, France) and Luc Boasson (Laboratoire d'informatique algorithmique: fondements et applications - LIAFA). Pages 182--191 (with 7 references) in Mathematical Foundations of Computer Science 2000 = Lecture Notes Computer Science #1893, edited by M.Nielsen, B. Rovan. Proceedings of 25th International Symposium on Mathematical Foundations of Computer Science [MFCS 2000], Bratislava, Slovakia (28 Aug.-1 Sept. 2000). Germany: Springer-Verlag, 2000. "XML documents are described by a document type definition (DTD). An XML-grammar is a formal grammar that captures the syntactic features of a DTD. We investigate properties of this family of grammars. We show that an XML-language basically has a unique XML-grammar. We give two characterizations of languages generated by XML-grammars: one is set-theoretic, the other is by a kind of saturation property. We investigate decidability problems and prove that some properties that are undecidable for general context-free languages become decidable for XML-languages...The paper is organized as follows. The next section [2] contains the definition of XML-grammars and their relation to DTD. Section 3 contains some elementary results, and in particular the proof that there is a unique XML-grammar for each XML-language. It appears that a new concept plays an important role in XML-languages. This is the notion of surface. The surface of an opening tag a is the set of sequences of opening tags that are children of a. The surfaces of an XML-language must be regular sets, and in fact describe the XML-grammar. The characterization results are given in Section 4. They heavily rely on surfaces, but the second one also uses the syntactic concept of a context. Section 5 investigates decision problems. 
It is shown that it is decidable whether the language generated by a context-free language is well-formed, but it is undecidable whether there is an XML-grammar for it. On the contrary, it is decidable whether the surfaces of a context-free grammar are finite. The final section is a historical note. Indeed, several species of context-free grammars investigated in the sixties, such as parenthesis grammars or bracketed grammars are strongly related to XML-grammars. This relationship is sketched..." [cache]

  • [March 09, 2001] "Formal Properties of XML Grammars and Languages." By Jean Berstel (Institut Gaspard-Monge, Laboratoire d'informatique Université de Marne-la-Vallée, France), and Luc Boasson. Detailed version of "XML Grammars" cited above. "XML (Extensible Markup Language) is a format recommended by W3C in order to structure a document. The syntactic part of the language describes the relative position of pairs of corresponding tags. This description is by means of a document type definition (DTD). In addition to its syntactic part, each tag may also have attributes. If the attributes in the tags are ignored, a DTD appears to be a special kind of context-free grammar. The aim of this paper is to study this family of grammars. One of the consequences will be a better appraisal of the structure of XML documents. It will also illustrate the kind of limitations that exist in the power of expression of XML. Consider for instance an XML-document that consists of a sequence of paragraphs. A first group of paragraphs is being typeset in bold, a second one in italic. It is not possible to specify, by a DTD, that in a valid document there are as many paragraphs in bold as in italic. This is due to the fact that the context-free grammars corresponding to DTDs are rather restricted. As another example, assume that, in developing a DTD for mathematical documents, we require that in a (full) mathematical paper, there are as many proofs as there are statements, and moreover that proofs appear always after statements (in other words, the sequence of occurrences of statements and proofs is well-balanced). Again, there is no DTD for describing this kind of requirements. Pursuing in this direction, there is of course a strong analogy of pairs of tags in an XML document and the \begin{object} and \end{object} construction for environments in Latex. The Latex compiler merely checks that the constructs are well-formed, but there is no other structuring method. 
The main results in this paper are two characterizations of XML-languages. The first (Theorem 4.2) is set-theoretic. It shows that XML-languages are the biggest languages in some class of languages. It relies on the fact that, for each XML-language, there is only one XML-grammar that generates it. The second characterization (Theorem 4.4) is syntactic. It shows that XML-languages have a kind of 'saturation property'. As usual, these results can be used to show that some languages cannot be XML. This means in practice that, in order to achieve some features of pages, additional nonsyntactic techniques have to be used. ... Most of the XML languages encountered in practice are in fact regular. Therefore, it is interesting to investigate this case. The main result is that, contrary to the general case, it is decidable whether a regular language is XML. Moreover, XML-grammars generating regular languages will be shown to have a special form: they are sequential in the sense that its nonterminals can be ordered in such a way that the nonterminal in the lefthand side of a production is always strictly less than the nonterminals in the righthand side..." [cache]

  • [March 09, 2001] "A Workshop on the Making of America II DTD and Digital Object Encoding." By Alexander Egger (Universitätsbibliothek Graz, Austria). In METAe-news [Newsletter for the Metadata Engine Project] Number 1 (January 2001). ['Introductory text for the meeting to be held in February 2001 at New York University, USA., - Digital Library Federation A Workshop on the Making of America II DTD and Digital Object Encoding.'] "The Making of America II Testbed Project developed a well-documented set of metadata elements needed for digital object management. This metadata set achieved its technological expression through an XML document type definition, the MOA2 DTD. But the MOA2 DTD was only designed to allow for the encoding of a limited range of digital objects, including diaries, still images, ledgers, and letterpress books. The DTD also lacks adequate provisions for encoding of descriptive metadata, provides no support for audio, video, and other time dependent media, and provides only very minimal internal and external linking facilities. The workshop will extend the MOA2 DTD to allow it to support a wider range of digital library objects and operations... Despite its shortcomings, the MOA2 DTD represents a significant step towards developing both a standard set of data elements for describing and managing digital library objects, and a technological mechanism for expressing that information. This workshop will provide an opportunity to build on and extend the MOA2 DTD to allow it to support a wider range of digital library objects and operations, and to discuss what further steps might be taken to further develop and maintain the DTD in the future." Note: The Metadata Engine Project (with 14 institutional partners) is a 5th Framework Programme Project (Digital Heritage and Cultural Content) funded by the European Union. 
It focuses upon "systematic extraction of metadata from the layout as well as from structural and segmental elements of books simultaneously to the digitisation process...The objective of METAe will be to develop a software which is able to extract as much metadata as possible from the layout of a book and to transform it into XML structured text. In addition to the text METAe will generate Dublin Core metadata and the digital facsimile of the document... the METAe software is intended to be a major step towards the objective of facilitating the digitisation of books and journals in order to turn digitisation into a reliable and standard technology for preserving and accessing books and journals." See: "The Making of America II Project."

  • [March 09, 2001] "XML: The Digital Library Hammer." By Roy Tennant (Manager, eScholarship Web & Services Design, California Digital Library). In [Digital] Library Journal. March 15, 2001. "Abraham Maslow once said, 'When the only tool you own is a hammer, every problem begins to resemble a nail.' Once you understand XML and the opportunities it offers for creating and managing digital library services and collections, you will begin seeing nails everywhere. XML (Extensible Markup Language) is born of a marriage of SGML (Standard Generalized Markup Language) and the web. HTML can't do much more than describe the look of a web page, whereas SGML is too complicated and unwieldy for most applications. XML achieves much of the power of SGML without the complexity and adds web capabilities beyond HTML... [XML and software:] If you use software such as the Cocoon publishing framework, when a user requests an XML document from your web server, the request is passed to special software. The software then applies the XML style sheet transformations to produce the HTML version that is sent to the client along with the HTML style sheet. If you don't use special software on the server for these operations, the client software (typically a web browser) must attempt to process the XML file. The latest versions of Microsoft Internet Explorer will attempt to process the file, but you're unlikely to be pleased with the result. Don't even try with Netscape. Few people know this, but any library with an integrated library system from Innovative Interfaces (with Update D) can view XML versions of catalog records. Kyle Bannerjee of Oregon State University has used this capability to provide information essential to relocating 50,000 items to a storage facility. Bannerjee also uses it to solve problems that many other libraries face, as with his program ILL ASAP (Interlibrary Loan Automatic Search and Print). 
Bannerjee says that 'XML and XSLT are the most significant developments in information management since relational databases and SQL.' Bibliographies are commonplace in libraries, whether as lists of books by a particular author or pathfinders by subject. What are bibliographic citations but a structured set of textual elements? XML is made for this..." See also 'Electronic Discussion Forum on the Use of XML in Libraries'

  • [March 09, 2001] "Setting the Standard: XML on Campus." By Mike Rawlins. In Syllabus Magazine Volume 14, Number 8 (March 2001), pages 30-32. ['XML standards are on the horizon, and a serious long-term campus IT strategy should take them into account.'] See also "PostSecondary Electronic Standards Council XML Forum for Education."

  • [March 08, 2001] "XML." Edited by Steve Litt. In Troubleshooting Professional Magazine Volume 5 Issue 3 (March 2001). "What's up with XML? Is it a revolutionary technology destined to be our livelihood the next few years, or a passing fad? Is it a universal standard specified by the W3C, or has it been usurped and proprietarized by Microsoft? And for some, the most nagging question is "how the heck do I learn it?". This issue of Troubleshooting Professional will attempt to answer all 3 questions. But for those who turn to the last page of the book, let me answer the questions now: (1) XML is a revolutionary technology destined to be our livelihood the next few years. (2) XML is a universal standard specified by the W3C. (3) You can learn the basics of XML in this issue of Troubleshooting Professional. XML was detected by trade mags' radar in 1997 or 1998. It was proclaimed a world changing technology. Learn it and you're rich. We were all skeptical. After all, the trades had predicted similar futures for push technology, ATM, and a hundred other technologies we've all forgotten. But the trades get it right sometimes. Witness Java and Linux. And definitely XML. It's 2001. XML is being incorporated in all sorts of projects. The reason you don't hear about it constantly is the application that reads, writes, changes and renders the XML is written in a traditional language such as Java, Perl, Python or C++. In that respect XML is data. But used correctly, much of an application's logic can be stored as easily modified XML. The actual C++, Java, Python or Perl code then becomes primarily the user interface. Imagine how nice it would be to implement your business rules as XML. You can! Then there's the Microsoft connection. Microsoft is gung-ho about XML. Does that make XML an unwise move? Probably not. 
Even if Microsoft does what they do best, and somehow manage to proprietarize some dialects of XML, it will be easy to reverse engineer, and may even be legal to do so in spite of UCITA supported anti-reverse engineering license language. Meanwhile, the rest of us can use our own dialects. "Dialects" are numerous. As will be explained later in this magazine, XML itself is just an extremely intuitive general specification for how to declare something that could be considered hierarchical data, or markup language, depending on your viewpoint. Within that specification, an implementer specifies his own set of rules for naming XML elements, and what other elements each element can contain. That specification can be implemented on paper, or technologically enforced with a DTD or schema. If this paragraph loses you don't worry -- everything in this paragraph will be explained in detail in this magazine..."

  • [March 08, 2001] "XML: Like The Air We Breathe?" By Martin Marshall (Zona Research). In InformationWeek (March 05, 2001), pages 47-53. "XML is poised to affect just about everything corporate IT does, from e-commerce applications to legacy data. But the pervasive changes it will bring about won't become apparent until the XML products under development hit the market later this year. IT managers expect XML to fundamentally improve the speed, cost, and flexibility of their business applications. It's also expected to alter the way they build new applications and integrate data from current systems. XML will have a profound effect on business processes, easing the task of exchanging data with trading partners. To some, XML is a business-process catalyst that will pick up where electronic data interchange and extended intranets fell short. Zona Research predicted early last year that the percentage of e-commerce transactions using XML would rise from .5 percent in early 2000 to more than 40 percent by the end of 2003. In a Zona Research Market Report, "XML: The Dash For Dot.com Interoperability," released last month, a survey of more than 200 companies indicates that IT managers expect XML to dramatically improve the adaptability of their businesses. XML is much more than a markup language; it's a fundamental mechanism for the automated exchange of data and the processes that act on that data. XML's data-transformation mechanisms go beyond operating environments, transport protocols, and the arcane barriers of the applications to present true interapplication communication. XML covers everything from data and data-transformation processes to schema, development tools, XML servers, and components. XML also takes into account business-process mechanisms, layered architectures, and vertical-industry bodies that make decisions about XML data representations and process definitions for their industries. 
XML could supplant EDI as a mechanism for transferring data between businesses and their applications. EDI has been the main way that companies exchange business forms. EDI transactions total about $750 billion per year, with about $2 billion a year spent on EDI development and deployment, according to Zona Research. EDI does the job for bidirectional interaction, but it's expensive to implement, and the embedded business rules are rigid. EDI is a point-to-point solution that must be reengineered every time a company adds a business partner. The mapping of data sets and procedures between two trading partners in an EDI environment is generally accomplished by custom coding. There's a growing movement toward converting EDI systems to XML, according to Zona Research's survey. Among the 72 percent of respondents who use EDI at their companies, seven out of eight plan to convert EDI into XML at some point. The largest group, 30 percent, will convert some of their EDI to XML this year, while 14 percent will do some conversion next year or later. They'll do it on a selective basis, however; very few will convert all of their EDI to XML by either 2001 (2 percent) or 2002 or later (4 percent). About one in eight will convert EDI to XML on an as-needed basis. XML Solutions Corp. is an early implementer in converting EDI to XML. Its XEDI product claims to be able to handle all of the ANSI X.12 EDI interfaces. With its many technical twists, it's easy to overlook the political movement behind XML. As such, it's not born of rosy optimism about global cooperation, but rather about the expedience of operating in trading communities rather than as closed systems. Each vertical industry has a major XML effort under way to define the data term definitions and schemas for industrywide exchange of data..."

  • [March 08, 2001] "XML e-Business Standards Agreement in Final Stages." By Chris Preimesberger (Senior Editor). From DevX.com. March 07, 2001. "Hallelujah! It looks as though agreement on international trading standards is near at hand, and we have good old-fashioned business cooperation to thank for it. As the old gospel song says, 'There will be peace in the valley, for me, some day.' It certainly appears that the seeds of peace in Silicon Valley are being planted. This is now a bona fide trend. Companies that were former sworn enemies are now looking past the shortsighted 'it's us or them' philosophy of business and taking the words of Rodney King, 'Why can't we all just get along?' to heart. Many seem to finally realize that for work to get done and profits to be made via that great connector, the Internet, everybody has to give a little in order to earn a lot. Have a look at some of the events of the last few months: (1) Sun Microsystems and Microsoft Corp. settled a nasty, four-year lawsuit over the use of Java. (2) Apple Computer Co. is making an about-face, embracing the developer community it alienated years ago like it never has before. (3) Long-established companies such as IBM and Hewlett-Packard Co. have de-emphasized many of their proprietary ways, smoked the peace pipe with the renegade open-source community, and are adopting open-source technologies for everyday use in their businesses. (4) The World Wide Web Consortium (W3C) and OASIS, a group of powerful companies working closely with the United Nations, have agreed to support each other after having spent millions of dollars developing competing technologies for Internet business standards. (5) And finally, alliances EbXML.org and UDDI -- which had been in a virtual cold war for superiority in e-business standards leadership -- are about to come to an agreement with the aforementioned W3C on final standards for Internet commerce. 
For more than a year, developers have been charting the progress of discussions about international standards. Most of these standards are based on XML documents, which eventually will become 'templates' used to standardize electronic contracts, to power product-and-service transactions, and to enable trading partnerships. As more companies go online and these new specs become formulated, developers watch from the sidelines, waiting for key questions to be answered... The groundwork has been done. The next step is for these organizations, together with the UDDI and W3C, to come to the table this June and shake hands on final specifications that will be rolled out to developers all over the world for use in the creation of software and services for future e-business. Let's hope the spirit of cooperation that now exists in the e-business sector will expedite that event..."

  • [March 08, 2001] "BEA Integration Platform Now Supports XML Trading Standard. Collaborate for RosettaNet provides tools for building PIPs." By Elizabeth Montalbano. In Computer Reseller News (March 06, 2001). "BEA Wednesday unveiled a new version of its Collaborate XML-based B2B integration platform that supports the RosettaNet standard for fulfilling trading-partner transactions in XML. According to BEA, WebLogic Collaborate for RosettaNet provides tools to help solution providers create partner interface processes (PIPs), which RosettaNet designed as standardized processes to fulfill business transactions--such as order processing--between business partners. With these tools, Collaborate for RosettaNet helps solution providers build RosettaNet-based trading exchanges and quickly bring them to market, said Louise Smith, vice president of marketing for the BEA E-Commerce Integration Division, in a press statement. BEA WebLogic Collaborate for RosettaNet is immediately available for download from www.bea.com. Collaborate is built on BEA's WebLogic application server and also includes WebLogic Process Integrator, a Java process tool that enables IT staff and business users to graphically model, adapt and store business processes. RosettaNet is a non-profit consortium formed two years ago to standardize how XML is used to define business processes in B2B. Its main work is concerned with standardizing those processes in vertical markets. Nearly 200 of the industry's leading vendors and solution providers, including BEA, Sun Microsystems, IBM, Oracle, Cisco, SilverStream Software, Andersen Consulting, Computer Sciences Corp. (CSC) and Deloitte and Touche, as well as leaders in various vertical industries, support RosettaNet..." See also the announcement. References: see "RosettaNet."

  • [March 08, 2001] "Jena: Implementing the RDF Model and Syntax Specification." By Brian McBride (Hewlett Packard Laboratories Bristol, UK). ['Some aspects of W3C's RDF Model and Syntax Specification require careful reading and interpretation to produce a conformant implementation. Issues have arisen around anonymous resources, reification and RDF Graphs. These and other issues are identified, discussed and an interpretation of each is proposed. Jena, an RDF API in Java based on this interpretation, is described.'] "Since the W3C's Resource Description Framework (RDF) Model and Syntax specification completed its path to W3C recommendation several implementations have been developed. These differ in some aspects of their interpretation of the specification. There has been much discussion of these issues on the RDF Interest Mailing List [refs], which so far, has not produced resolution. Inter-mixed with those discussions, have been others about changes and extensions to the specification. All this has caused confusion and uncertainty that is inhibiting the acceptance and deployment of RDF. Tool builders wish to build tools that are correct and conformant. This they cannot do, because it is not clear what it means to be correct and conformant. Similarly producers and consumers of RDF wish to produce RDF whose interpretation is well defined. Uncertainty of interpretation inhibits them from doing so. One reason for the lack of resolution is that issues are discussed individually. The issues themselves however, are interlinked. It is hard for a community discussing, say the subtleties of reification to agree when they have fundamentally different views on the nature of resources and their identification. An implementer setting out to develop an implementation of an RDF tool must have an interpretation of the specification. This paper describes the interpretation developed for Jena, an RDF API in Java. 
The guiding principle for this interpretation was to implement, as far as possible, the specification as it is, without embellishment. It is documented here in the hope it will prove helpful to other developers." See "Resource Description Framework (RDF)."

  • [March 08, 2001] "Building the Semantic Web." By Edd Dumbill. From XML.com. March 07, 2001. ['Tim Berners-Lee's vision of the Semantic Web is undoubtedly exciting, but its success will lie in the extent to which it solves real-world problems.'] "The range of people working under the broad umbrella of the Semantic Web come from many diverse communities, from the Web-focused to experienced researchers in the fields of artificial intelligence and knowledge representation. Ultimately the skills of all those involved will be required, and it's definitely beyond the scope of any one group to provide the expertise necessary to build the ultimate Semantic Web. For me, the key thing about the Semantic Web is the word 'Web'. It's our essential starting point, and the Web at large is the ecology in which the primordial Semantic Web must grow. I spend most of my time working with the Web, as a developer and a writer, and also in involvement with the community of developers and publishers that use the Web. So, as I approach the Semantic Web (or 'SW' from here on), I'm always asking the question 'how do we get this started?' There are many interesting and exciting possibilities in the realms of logic and proofs, but getting them running on the Web must be preceded by getting more basic machine processible content out there. The evolving form of the SW has to crawl before it can run. In this article I introduce the SW vision and explore the practical steps that we need to be taking to build it. The essential aim of the SW vision is to make Web information practically processible by a computer. Underlying this is the goal of making the Web more effective for its users. This increase in effectiveness is constituted by the automation of things that are currently difficult to do: locating content, collating and cross-relating content, drawing conclusions from information found in two or more separate sources. 
In the software world we can often get so enthusiastic about the systems that we're creating that we stray from a focus on the user's requirements. One of the great things about the Web is that it's unforgiving when we ignore the user. Create a site that's hard to use and nobody will come. Create a technology for page markup that's difficult to grasp and nobody will use it. In fact, you might see the creation and implementation of the SW as a near impossible task: it's still difficult to get people to use as little metadata as the <title> tag in their web pages. Clearly, to get off the starting blocks, the SW has to offer enough in reward to make it worth people's time to learn new skills and to more carefully deploy their content on the Web..." References: see: (1) W3C Semantic Web Activity and (2) "XML and 'The Semantic Web'."

  • [March 08, 2001] "Knowledge Technologies 2001: Conference Diary." By Edd Dumbill. From XML.com. March 07, 2001. ['The inaugural GCA Knowledge Technologies conference brought together members of diverse communities, all concerned with managing knowledge: from RDF and Topic Maps to AI.'] "The first ever Knowledge Technologies conference, hosted by the GCA, is taking place in Austin, Texas this week. It is attended by a mixed audience of librarians, AI experts, knowledge management technologists, and the Web community. As far as XML is concerned, this means people from the RDF, Dublin Core, and Topic Maps worlds. This article is a report from the first day of the conference. Opening keynote sessions included Doug Lenat from Cycorp. Doug has gone against the flow where artificial intelligence is concerned. Twenty years ago, when others were gung-ho for AI, Lenat was a pessimist. As disillusionment has set in over recent years, Lenat reports he is now an optimist. A lot of this good feeling comes from the work he's done with CYC (pronounced "psyche"). Lenat has been steadily feeding his system facts about the world for 15 years, and reports that it's starting to get to the stage where the system can help with its own development. CYC uses a codification of natural language into a formal logical language..." Note: see also the news item on OpenCyc.

  • [March 08, 2001] "XML-Deviant: Toward an XPath API." By Leigh Dodds. From XML.com. March 07, 2001. ['Since XSLT and XPointer rely on XPath, developers are asking whether an XPath API should be created.'] "While the XML-DEV storms of the last few weeks show little sign of abating, some developers have been discussing the potential for an XPath API. Over the last few weeks the XML-Deviant has reported on a number of controversies surrounding the recent activities of the W3C, and a rise in the complexity and interdependence between specifications forming the 'XML family'. The debates have continued this week. Threads on XML-DEV have discussed a possible 'fork in the road' of XML's development. The press has reacted with articles like 'Why 90% of XML Standards Will Fail' and 'The relentless march of abstraction'. Simon St.Laurent noted that the current discussions echo the 'Simplified XML' debate that raised hackles on XML-DEV at the end of 1999. Leading to the formation of the SML-DEV mailing list, the split also led to the 'Common XML Specification' and the appearance of simple tools like Pyxie...It would be good to maintain the early interest in SAXPath in order to formulate a suitable solution to these issues. There is likely to be lots of prior art that can be mined for additional ideas. Implementations of XPath can be found in many open source XSLT engines, and thus it's likely that if an API can be agreed upon, implementations would follow very quickly."

  • [March 07, 2001] "Mapping W3C Schemas to Object Schemas to Relational Schemas." By Ronald Bourret (The Open Healthcare Group). March 2001. "This paper summarizes two different mappings. The first, part of the process generally known as XML data binding, maps the W3C's XML Schemas to object schemas. The second, known as object-relational mapping, maps object schemas to relational database schemas. The two mappings can be joined (and the intermediate object schema eliminated) to create a mapping from XML Schemas to database schemas. This is not shown, but left as an exercise to the reader. Note that because individual XML Schema structures can often be mapped to multiple object structures, and because individual object structures can often be mapped to multiple database structures, there are usually multiple possible mappings from XML Schemas to database schemas. The mapping is described in terms of the data model presented in XML Schemas Part 1: Structures, rather than the XML syntax used to describe schemas. Although I might eventually add a section describing the mapping based on the XML syntax, this is currently left as a (non-trivial) exercise for the reader...The purpose of this paper is to help people write code that can automatically generate object and database schemas from XML Schemas, as well as transferring data between XML documents, objects, and databases according to mappings between them. Because the set of possible mappings from XML Schemas to object schemas is fairly large, I do not expect any software to support all possible mappings any time soon, if ever. A more reasonable strategy is for the software to pick a subset of mappings that make sense for its uses and implement those." [Introduction on XML-DEV: 'I've posted a paper mapping a (very slight) variant of the data model in W3C schemas to object schemas, and then mapping object schemas to relational schemas. 
The first part of the paper -- mapping XML schemas to object schemas -- is likely to be of most interest to people. It is undoubtedly similar to Sun's XML data binding (JSR-31) and Veo Systems work with SOX. In fact, I wrote it because neither of those specifications seems to be publicly available. The work also appears to be a superset of the mappings in Bill La Forge's Quick and Enhydra's Zeus project. Please note that the paper is rather terse and assumes you understand the general ideas behind the mapping from XML schemas / DTDs to object schemas. If not, see the presentation "Mapping DTDs to Databases", available from: http://www.rpbourret.com/xml.'] For schema description and references, see "XML Schemas."

  • [March 06, 2001] "Extending XML Schemas." By Roger L. Costello (et al.). XML-DEV post March 06, 2001. Topic: 'What is Best Practice of checking instance documents for constraints that are not expressible by XML Schemas?' "XML Schemas - Strive to be All Powerful? As XML Schemas completes version 1 and begins work on version 2, the question comes to mind: 'should XML Schemas strive in the next version to be all powerful?' Programming languages seem to have that goal - to enable a programmer to express any problem using the language. Perhaps the goal of version 2 of XML Schemas should be to provide enough flexibility that any constraint may be expressed. Alternatively, perhaps XML Schemas should just provide a core set of constraint expressing mechanisms (as it does today), and let the marketplace create a technology (technologies?) to supplement XML Schemas. Then version 2 of XML Schemas would have few changes from version 1..." For schema description and references, see "XML Schemas."
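
One common answer to the question posed in the post is to validate with a schema first and then run a second, programmatic pass for the remaining rules. The sketch below shows such a second pass for a cross-field rule (an end date must not precede a start date), a kind of co-occurrence constraint that XML Schema 1.0 does not express. The element names `booking`, `start`, and `end` and the function `check_date_order` are hypothetical, not taken from the post.

```python
import xml.etree.ElementTree as ET


def check_date_order(xml_text):
    """Return a list of error strings for bookings whose end precedes start.

    Assumes ISO 8601 dates (YYYY-MM-DD), which compare correctly as strings.
    """
    root = ET.fromstring(xml_text)
    errors = []
    for index, booking in enumerate(root.findall("booking"), start=1):
        start = booking.findtext("start")
        end = booking.findtext("end")
        if start is None or end is None:
            errors.append(f"booking {index}: start or end is missing")
        elif end < start:
            errors.append(f"booking {index}: end {end} precedes start {start}")
    return errors


# Hypothetical instance document: the second booking violates the rule.
SAMPLE = """<bookings>
  <booking><start>2001-03-06</start><end>2001-03-08</end></booking>
  <booking><start>2001-03-10</start><end>2001-03-09</end></booking>
</bookings>"""

problems = check_date_order(SAMPLE)
```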

  • [March 06, 2001] "Introducing DocBook." By Norman Walsh. 5 Mar 2001 or later. "I've put the slides from my "Introducing DocBook" presentation online... This material was originally presented by Norman Walsh on 7-Mar-2001 at the WinWriters Online Help Conference in Santa Clara, CA. [1] The slides were produced from a single XML source document using XSLT. The presentation of these slides uses Cascading Style Sheets; for best results, use a browser which can display CSS formatting. You can page through the slides one at a time, or use the frames view which offers a simultaneous table of contents. [1] Well, supposed to have been presented, actually. I was unable to travel to Santa Clara due to inclement weather." See "DocBook XML DTD."

  • [March 06, 2001] "RFC: A Little IDL." By Dave Winer (UserLand Software). "I've been staring with incomprehension at various Interface Definition Languages (or IDLs) for XML-over-HTTP protocols, and wondering why they're so complicated. I thought it might have something to do with the kinds of languages and editing environments they're designed for. To find out where the disconnect is, I decided to define a simple interface definition language in XML that's suitable for scripting environments, and see if people find holes in its functionality, or if it's useful, or something we want to do. That's why I called this ALIDL, so no one could confuse it with the efforts of a standards body. It's little and human-readable. The goal is to have it work with scripting systems that are wired up to XML-RPC or SOAP 1.1..." [Note on XML-DEV: 'Motivation: WSDL appears relatively difficult or impossible for (some) scripting environments to support. I wanted to start a public exploration of IDLs, so we can learn what the issues are, and the benefits, and to spark development of aggregators and directories. I also wanted to support XML-RPC so the fresh SOAP and deep XML-RPC communities get to know each other and can work with each other. Comments are requested on the XML-RPC discussion group and/or XML-RPC mail list. Pointers to both are at the bottom of the spec.']

  • [March 06, 2001] "XML messaging, Part 1. Write a simple XML message broker for custom XML messages. [XML Tutorial.]" By Dirk Reinshagen. In JavaWorld Magazine (March 2001). ['In this article, the first of three, Dirk Reinshagen discusses XML messaging, specifically the basic premise of XML messaging, what it is, and why it is useful. Further, he presents a simple XML message broker for custom XML messages. In the course of developing this broker, he introduces general broker development strategies. In Part 2 and Part 3, Dirk will discuss the two emerging standards for XML messaging: SOAP and ebXML.'] "XML messaging represents a rapidly growing, dynamic area of IT, a situation that makes it exciting and tiresome at the same time. As B2B exchanges and other forms of inter-business electronic communication grow, XML messaging will be more widely deployed than ever. In this article, we'll first explore XML messaging and why it is useful. Then we'll delve into specific XML messaging features, including message routing, transformation, and brokering. Finally, we'll finish up with a simple example of an XML broker. After you read and understand the concepts, you should clearly understand which scenarios lend themselves to implementing an XML messaging solution. To start our exploration, we need to understand the basic premise of XML messaging and what the term messaging implies. For purposes of this article, I define message as follows: 'A collection of data fields sent or received together between software applications. A message contains a header (which stores control information about the message) and a payload (the actual content of message).' Messaging uses messages to communicate with different systems to perform some kind of function. We refer to the communication as being message-oriented because we would send and receive messages to perform the operation, in contrast to an RPC (Remote Procedure Call)-oriented communication. 
A simple analogy may help: think of messaging as email for applications. Indeed, messaging possesses many of the attributes of individuals sending email messages to one another. In the past, when you were using or working on a message-oriented system, it meant that you were using some kind of MOM (message-oriented middleware) product like Tibco's Rendezvous, IBM's MQSeries, or a JMS provider to send messages in an asynchronous (one-way) fashion. Messaging today doesn't necessarily mean that you are using a MOM product, and it doesn't necessarily mean that you are communicating asynchronously. Rather, messaging can be either synchronous (two-way) or asynchronous and use many different protocols such as HTTP or SMTP, as well as MOM products... Keep in mind that the power of XML messaging makes it increasingly pervasive in software development. So whether you're developing a simple online invoicing system or a large-scale B2B exchange, or if you're simply an end user to one of these systems, it's likely that XML messaging will play a crucial role."
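
The article's definition of a message (a header carrying control information plus a payload) can be sketched as a very small broker that routes on a header field. This is not the broker developed in the article, which is written in Java; the `message`, `header`, `type`, and `payload` element names and the `Broker` class are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


class Broker:
    """Route custom XML messages to handlers keyed by the header's type field."""

    def __init__(self):
        self._handlers = {}

    def register(self, message_type, handler):
        self._handlers[message_type] = handler

    def dispatch(self, xml_text):
        message = ET.fromstring(xml_text)
        message_type = message.findtext("header/type")  # control information
        payload = message.find("payload")               # actual content
        handler = self._handlers.get(message_type)
        if handler is None or payload is None:
            raise ValueError(f"cannot route message of type {message_type!r}")
        return handler(payload)


def handle_invoice(payload):
    # Hypothetical handler: reads one field from the payload.
    return "invoice total: " + payload.findtext("total")


broker = Broker()
broker.register("invoice", handle_invoice)
reply = broker.dispatch(
    "<message><header><type>invoice</type></header>"
    "<payload><total>42.50</total></payload></message>"
)
```

Here `dispatch` returns the handler's result directly, which corresponds to the synchronous (two-way) style; an asynchronous variant would queue the message instead.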

  • [March 05, 2001] "Comparing W3C XML Schemas and Document Type Definitions (DTDs). [XML Matters #7.]" By David Mertz, Ph.D. (Idempotentate, Gnosis Software, Inc.). From IBM developerWorks, XML Library. March 2001. ['Many developers expect that XML schemas will soon supplant DTDs for specifying XML document types. David Mertz is skeptical that schemas will replace DTDs, though he believes that XML schemas are an invaluable tool in a developer's arsenal. This installment of the "XML Matters" column steps up to the challenge of comparing schemas and DTDs and clarifying just what is going on in the XML schema world.'] "While there are a number of instances where W3C XML Schemas excel, there remain, nonetheless, a number of areas where DTDs are better. Developers are continually left with tough choices... Much of the point of using XML as a data representation format is the possibility of specifying structural requirements for documents: rules for exactly what types of content and subelements may occur within elements (and in what order, cardinality, etc.). In traditional SGML circles, the representation of document rules has been as DTDs -- and indeed the formal specification of the W3C XML 1.0 Recommendation explicitly provides for DTDs. However, there are some things that DTDs cannot accomplish that are fairly common constraints; the main limitation of DTDs is the poverty in their expression of data types (you can specify that an element must contain PCDATA, but not that it must contain, for example, a nonNegativeInteger). As a side matter, DTDs do not make the specification of subelement cardinality easy (you can compactly specify 'one or more' of a subelement, but specifying 'between seven and twelve' is, while possible, excessively verbose, or even outright contorted). In answer to various limitations of DTDs, some XML users have called for alternative ways of specifying document rules. 
It has always been possible to programmatically examine conditions in XML documents, but the ability to impose the more rigid standard that, 'a document not meeting a set of formal rules is invalid,' essentially, is often preferable. W3C XML Schemas are one major answer to these calls, but not the only schema option out there... At least two fundamental and conceptual wrinkles remain for any 'schemas everywhere' goal. The first issue is that the W3C XML Schema Candidate Recommendation, which just ended its review period on December 15, 2000, does not include any provision for entities; by extension, this includes parametric entities. The second issue is that despite their enhanced expressiveness, there are still many document rules that you cannot express in XML schemas (some proposals offer to utilize XSLT to enhance validation expressiveness, but other means are also possible and in use). In other words, schemas cannot quite do everything DTDs have long been able to, while on the other hand, schemas also cannot express a whole set of further rules one might wish to impose on documents. At a more pragmatic level, tools for working with XML schemas are less mature than those for working with DTDs... W3C XML Schemas let XML programmers express a new set of declarative constraints on documents for which DTDs are insufficient. For many programmers, the use of XML instance syntax in schemas also brings a greater measure of consistency to different parts of XML work; others disagree, of course. Schemas are certainly destined to grow in significance and scope as they become more familiar, and as developers enhance more tools to work with them. One way to get a jump start on schema work is to automate the conversion of existing DTDs to XML schema format. Obviously, automated conversions cannot add the new expressive capabilities of XML schemas themselves; but automation can create good templates from which to specify the specific typing constraints one wishes to impose." 
For schema description and references, see "XML Schemas."
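
The cardinality point is easy to see by generating the two notations side by side. The sketch below builds a DTD content model for "between m and n occurrences" (the optional tail is nested so the model stays deterministic) and the corresponding XML Schema particle using `minOccurs` and `maxOccurs`. The function names are hypothetical, the `xs` prefix is assumed to be bound to the XML Schema namespace, and a real element declaration would also need a type or `ref`.

```python
def dtd_content_model(name, minimum, maximum):
    """Build a DTD content model allowing minimum..maximum occurrences of name.

    Assumes 1 <= minimum <= maximum. Optional occurrences are nested, e.g.
    (a, (a, (a)?)?), because a flat (a, a?, a?) is not deterministic.
    """
    optional = ""
    for _ in range(maximum - minimum):
        optional = f"({name}, {optional})?" if optional else f"({name})?"
    parts = [name] * minimum + ([optional] if optional else [])
    return "(" + ", ".join(parts) + ")"


def schema_particle(name, minimum, maximum):
    """Return the equivalent XML Schema particle as a string."""
    return f'<xs:element name="{name}" minOccurs="{minimum}" maxOccurs="{maximum}"/>'


small_model = dtd_content_model("item", 2, 4)
seven_to_twelve = dtd_content_model("item", 7, 12)
particle = schema_particle("item", 7, 12)
```

For the "between seven and twelve" case the DTD model names `item` twelve times, while the schema particle stays one line long.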

  • [March 05, 2001] "Information Modelling using RDF. Constructs for Modular Description of Complex Systems." By Graham Klyne (Content Security Group, Baltimore Technologies). 24 pages, with 25 references. "This paper describes some experimental work for modelling complex systems with RDF. Basic RDF represents information at a very fine level of granularity. The thrust of this work is to build higher-level constructs in RDF that allow complex systems to be modelled incrementally, without necessarily having full knowledge of the detailed ontological structure of the complete system description. The constructs used draw on two central ideas: statement sets as contexts (based in part on ideas of McCarthy [Notes on Formalizing Context] and Guha [Contexts: A Formalization and Some Applications]) to stand for a composition of individual RDF statements that can be used in certain circumstances as a statement, and a system of 'proper naming' that allows entity prototypes to be described in a frame-like fashion, but over a wider scope than is afforded by class- and instance- based mechanisms... Articulated visions for the Semantic Web require that anyone must be able to say anything about anything. It is unreasonable to expect everyone to adopt exactly the same ontological structure for making statements about an entity; apart from political and perceptual differences, that approach cannot scale. This leads to my assertion that practical modelling of complex systems requires statements that can stand independently of finer ontological details. This is not a dismissal of ontological structures; work on ontological frameworks such as OIL and DAML-O is needed to underpin verification of web-based information. In due course, I would expect a theory to emerge that relates descriptions based on incomplete ontologies to more rigorously complete frameworks. 
I view basic RDF as a kind of 'assembly language' for information modelling, and see this use of contexts and proper naming as a parallel to procedures and formal parameters in programming languages, used to aid the construction of complex object descriptions without adding new formal capabilities. The constructs presented here are being used in the following ongoing experimental developments: (1) A graphical tool for RDF modelling. (2) An experimental RDF-driven expert shell. We also aim to develop mechanisms for trust modelling and inference; modelling social trust structures and overcoming the brittleness of purely cryptographically based approaches to trust in e-commerce, etc. Another area for investigation is the design of mechanisms for managing non-monotonic reasoning, and other logical extensions of contexts. In messages to the RDF interest group, Dan Brickley has proposed an alternative approach to labelling anonymous RDF resources; i.e., resources whose formal URI or URI reference is unknown. The outcome of these discussions may affect the exact form of naming preferred." Note from GK: 'I've finally got around to putting on the web a note I drafted some time ago describing some thoughts about using RDF and contexts for modelling complex objects and concepts. It draws heavily on my earlier note about contexts and RDF, and adds some other thoughts about modularizing complex RDF graphs.' - Posted to the RDF interest group. See "Resource Description Framework (RDF)." [cache]

  • [March 03, 2001] "BEA's 'Silversword' To Offer XML Toolkits For Web-Services Creation. Next Version of WebLogic Server Will Have Full Support for SOAP, UDDI, WSDL." By Elizabeth Montalbano. In Computer Reseller News (February 26, 2001). "BEA Systems is expected to unveil a new version of its WebLogic Server in June that has full support for XML simple messaging and description standards. The product, code-named Silversword, will provide product-level toolkits for solution providers to create Web services with Simple Object Access Protocol (SOAP), Universal Description, Discovery and Integration (UDDI) and Web Services Description Language (WSDL), says John Kiger, director of product marketing for BEA's E-Commerce Server Division. Kiger says supporting these technologies for 'the building blocks of basic Web services' is just one part of the company's Web services strategy, which BEA Chairman and CEO Bill Coleman unveiled in his keynote Monday at BEA's eWorld conference held here. Similar to an announcement made by Sun Microsystems several weeks ago, Coleman outlined a Web-services strategy based on the Java 2, Enterprise Edition (J2EE) specification and XML standards such as SOAP (recently adopted as a subset of the ebXML initiative), WSDL and UDDI. A beta version of BEA's SOAP toolkit currently is available for its WebLogic Server. Kiger says that the other aspect of BEA's Web-services strategy will be to support XML-based standards for deploying more complex, transactional-based Web services, such as the ebXML standard. Collaborate already contains support for Business Transaction Protocol (BTP), a technology BEA created and submitted to the standards consortium OASIS as a possible standard to be used in conjunction with the ebXML standard. BTP provides a standard way to define guaranteed message delivery, security and the semantics of transactions for Web services. 
Louise Smith, vice president of marketing for BEA's E-Commerce Integration Division, says that Collaborate also currently supports the RosettaNet standard for interfacing with trading partners in XML..." See the announcement: "BEA Unveils Comprehensive Web Services Strategy and Support for Widest Range of Web Services Standards in the Industry. Web Services Architecture of BEA WebLogic E-Business Platform Enables Real Business-to-Business Transactions and Collaboration over the Internet."

  • [March 03, 2001] "Petition to withdraw xsl:script from XSLT 1.1." By Clark C. Evans, Peter Flynn, Alexey Gokhberg, et al. See the posting of 2001-03-01. "XSLT provides an extension mechanism whereby additional functionality can be identified with a URI reference and implemented in a manner defined by a particular XSLT processor. This mechanism provides an opaque layer between the extension function's usage and its implementation -- allowing for many implementations of an extension function regardless of language or platform. This extension facility provides a rich playground where new features can be prototyped and even put into production. However, to balance this much-needed flexibility, the syntax makes it clear that such added functionality is, in fact, an 'extension' and therefore may not be portable across XSLT implementations. Success of this extension mechanism has brought about request for change by several parties. One change is the official Java and Javascript extension function binding. Although this petition does not specifically challenge this addition, some question the wisdom of this decision. An official binding could encourage wholesale importation of constructs from Java and Javascript into XSLT without thought as to how those constructs would or should be supported in XSLT proper. A second change, the addition of xsl:script, is what we challenge with this petition. As users and implementers of XSLT, we request that the W3C withdraw section 14.4, Defining Extension Functions from the current XSLT 1.1 working draft for the following reasons [...]" Note: this petition created some controversy on the XSL-List.

  • [March 03, 2001] "Ostensible Markup Language. Using OML to create a little language for device name characterization." By Rich Morin. In UnixInsider (March 2001). ['The Meta Project's file-tree browser is supposed to recognize path names and supply descriptive information, but in cases like /dev/*, this can be a real challenge. Using Perl and OML, an informal variant of XML, however, Rich Morin has pieced together a solution, and in this month's Silicon Carny, he shares it with you.'] "Extensible Markup Language (XML) is, like Java, a strongly hyped language. I have even seen it presented as the way to standardize all computer-to-computer communication. Bosh. Nonetheless, XML can be a very useful addition to your bag of programming tricks. In particular, there's an informal variant of XML that's a really handy way to encode control files, intermediate data, etc. Never one to resist a pun, I call this variant Ostensible Markup Language (OML). Hype, fancy tools, and standardization aside, XML is simply a convenient format for serializing data structures. It handles hierarchical structures with ease, can be coerced into handling cross-linkages, and is very friendly to the addition of new fields. In short, it solves many of the limitations found in the traditional Unix flat file format. OML, by my informal definition, looks enough like XML to pass muster with parsers such as XML::Simple, but it may not have Document Type Definitions (DTDs), style sheets, or other niceties. It may also contain things, such as Perl regular expressions, that aren't considered kosher by normal XML standards. OML is easy for programs to generate, reasonable for humans to read (and edit, if need be), and trivial for programs to ingest. If you don't find it to be all of these things, you're probably doing something wrong! I won't pretend that this is particularly elegant, but it gets the job done in a small and reasonably simple amount of code. 
Part of the reason for this brevity lies in the expressive power of Perl. The CGI script as a whole, moreover, benefits from the convenience of Perl's many handy modules. Another benefit comes from using OML as a tool to build a little language. By creating OML-based parsing macros, complete with embedded regular expressions, I was able to encode some fairly complex notions in a very compact, yet malleable, format. To see the demo in action, start at the Meta Demo Help Page."

  • [March 03, 2001] "Tutorial: Answering the Namespace Riddle. An Introduction to the Resource Directory Description Language." By Leigh Dodds. From XML.com. February 28, 2001. ['Dodds introduces RDDL, the Resource Directory Description Language, the result of a recent project conducted by the XML developer community to make XML namespaces easier to use.'] "This tutorial introduces the Resource Directory Description Language (RDDL), which is the result of a recent project conducted by the XML-DEV community. It provides an overview of RDDL's very simple vocabulary and the benefits it can bring to XML applications. Namespaces are now a common feature of any new XML vocabulary. While their use is spreading, there is still a great deal of controversy associated with them. The controversy has generally focused on the choice and use of URIs, or more commonly URLs, as Namespace identifiers. While the XML Namespaces specification notes that URIs were selected merely as a unique identification system, it is silent on the issue of what, if anything, those URIs should point to. The received wisdom, as documented in the Namespace FAQ, is that these URIs are not meant to point to anything. Paste one into your web browser, and you'll likely get a '404 Not Found' error. URLs have become synonymous with web resources -- developers and Internet users alike expect to be able to point their browsers at these URLs and obtain something intelligible. In response to this expectation, many developers have begun placing useful resources at a namespace URL. For example, the RSS 1.0 specification is found at the RSS namespace URI. Other XML applications place XML schemas, of different varieties, at these URLs, giving a handy place for applications to retrieve schemas during processing. This unregulated practice, and more importantly the mixture of resources that might appear at these URLs, has led to controversy in XML circles. 
Some believe that the practice should be deprecated, others that it must be regulated in some way. Following a recent resurgence of this debate on XML-DEV a consensus was finally reached. A namespace URL should point to a directory of resources rather than a single web page or schema. Thus RDDL was born... The real benefits of RDDL will be realized as namespace URIs are populated with RDDL documents; and as applications can routinely rely on them as a source of required resources. In the short term the lack of use can be encapsulated within RDDL APIs until the network effect takes hold. RDDL can also facilitate the development of new types of application. By traversing a number of RDDL documents, it's possible for an XML processor to piece together a pipeline of transformations that may be applied to a given document. If the processor needs to transform a document from language X into language Y, it may be able to find one or more intermediate transforms if no direct route is available -- a kind of simple inference which may be key to Semantic Web applications. The XML community has long desired a true XML browser, an application that can display and manipulate any kind of XML document. A key XML browser component is a way to download dynamically new behaviors to deal with unknown document types. With appropriate application code referenced from RDDL resource elements, this kind of dynamic adoption becomes much easier. In short, RDDL is an elegant little language that can resolve much of the confusion and debate over XML Namespaces, while providing the means to build some very interesting applications." See "Resource Directory Description Language (RDDL)."
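
The "pipeline of transformations" idea can be sketched as a shortest-path search over the transforms a processor has discovered. The sketch assumes the transforms have already been collected (for example, from RDDL documents) into a dictionary keyed by source and target language; that data structure, the stylesheet names, and the `find_pipeline` function are hypothetical, and no RDDL parsing is shown.

```python
from collections import deque


def find_pipeline(transforms, source, target):
    """Return the shortest list of stylesheets leading from source to target.

    transforms maps (from_language, to_language) to a stylesheet name.
    Returns None when no chain of transforms connects the two languages.
    """
    queue = deque([(source, [])])
    seen = {source}
    while queue:  # breadth-first search, so the first hit is the shortest chain
        language, path = queue.popleft()
        if language == target:
            return path
        for (src, dst), stylesheet in transforms.items():
            if src == language and dst not in seen:
                seen.add(dst)
                queue.append((dst, path + [stylesheet]))
    return None


# Hypothetical transforms: there is no direct route from X to Y.
KNOWN = {
    ("X", "Z"): "x-to-z.xsl",
    ("Z", "Y"): "z-to-y.xsl",
    ("X", "W"): "x-to-w.xsl",
}

pipeline = find_pipeline(KNOWN, "X", "Y")
```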

  • [March 03, 2001] "XML Ain't What It Used To Be." By Simon St. Laurent. From XML.com. February 28, 2001. ['Current XML development at the W3C threatens to obliterate the original promise of XML by piling on too many features and obscuring what XML does best.'] "Current XML development at the W3C threatens to obliterate the original promise of XML -- a clean, cheap format for sharing information -- by piling on too many features and obscuring what XML does best. While users may demand some of those features for some applications, features for some users are turning into nightmares for others. Rather than creating modules users can apply when appropriate, the W3C is growing a jungle of specifications which intertwine, overlap, and get in the way of implementors and users. Various W3C activities seem to be converting XML documents from labeled, structured content to labeled, structured, and typed content. The primary mechanism for performing this transformation is the W3C XML Schema Definition Language, the most complex and controversial of all of the XML specifications, and the only one that's generated credible competition hosted at other organizations (RELAX through ISO, TREX through OASIS). Widespread grumbling about W3C XML Schemas is a constant feature of the XML landscape, with no sign of fading... The release of the Requirements for both XSLT 2.0 and XPath 2.0 suggest that the W3C plans to drive W3C XML Schema technologies deeply into the rest of XML. The requirements describe operations which both require a "post-schema validation infoset" (PSVI) and depend on parts of the W3C XML Schema spec, like the regular expression syntax defined in Appendix E of XML Schema: Datatypes. This interweaving of specifications has a number of consequences. First, it raises the bar yet again for developers creating XML tools. While borrowing across specifications may reduce some duplication, it also requires developers to interpret tools in new contexts. 
(As the recent XPointer draft demonstrates, there can be unexpected consequences.) Developers with existing code bases now have to teach that code about complex types. Since none of these documents offer conformant subsets, they have to be swallowed in large chunks..."

  • [March 03, 2001] "XML-Deviant: Does XML Query Reinvent the Wheel?" By Leigh Dodds. From XML.com. February 28, 2001. ['XML developers contend that the overlap between XML Query and XSLT is so great that they aren't separate languages at all.'] "Debates on the XML-DEV and XSL mailing lists over the last two weeks concern the futures of XSLT, XPath, and, the latest addition to the W3C XML toolkit, XML Query. There are no signs of these debates ending this week. Discussion on XML-DEV about the design of XML Query rages on... Both sides of the debate have made convincing arguments. It's obviously desirable to factor out common features between specifications, as Evan Lenz has suggested. But having multiple tools available when tackling a job is often beneficial, which suggests that XML Query should not be dismissed out of hand. Additional lessons may also be learned from tackling similar problems from a different perspective, although to benefit in the long-term, refactoring may still be required at a later date. The common topics in the recent discussions demonstrate that the community has a number of concerns. Hopefully these can be adequately addressed if the XML Query and XSLT Working Groups further coordinate their efforts. In reality, these concerns are over early draft specifications and experience has shown that significant revisions may occur to a specification as it moves from Working Draft to Recommendation."

  • [March 02, 2001] "Representing vCard Objects in RDF/XML." By Renato Iannella (IPR Systems). A submission to the World Wide Web Consortium from IPR Systems Pty Ltd. Reference: W3C Note 22-February-2001. "This note specifies a Resource Description Framework (RDF) encoding of the vCard profile defined by RFC 2426 and to provide equivalent functionality to its standard format. The motivation is to enable the common and consistent description of persons (using the existing semantics of vCard) and to encode these in RDF/XML." Details: "This note specifies a Resource Description Framework (RDF) expression that corresponds to the vCard electronic business card profile defined by RFC 2426. This specification provides equivalent functionality to the standard format defined by VCARD Version 3.0. RDF is an application of the Extensible Markup Language. Documents structured in accordance with this RDF/XML encoding may also be known as 'RDF vCard' documents. This specification is in no way intended to create a separate definition for the vCard schema. The sole purpose for this note is to define an alternative RDF/XML encoding for the format defined by VCARD. The RDF vCard does not introduce any capability not expressible in the format defined by VCARD. However, an attempt has been made to leverage the capabilities of the XML and RDF syntax to better articulate the original intent of the vCard authors. RDF uses the XML Namespace to uniquely identify the metadata schema and version. For vCard, the following URI is defined to be vCard Namespace: http://www.w3.org/2001/vcard-rdf/3.0#." [Staff comment]: "The Submission relates to the following W3C Activities: Semantic Web: (1) In the RDF Interest Group, which tracks RDF experience, applications, and deployment. (2) In the RDF Core WG, which is responsible for addressing open issues and is chartered to consider an update to the RDF Model and Syntax Recommendation. 
The submission will be brought to the attention of the RDF Interest Group." See the Submission Request and W3C Staff Comment. See also "vCard Electronic Business Card."

  • [March 02, 2001] "Synchronized Multimedia Integration Language (SMIL 2.0) Specification." W3C Working Draft 01-March-2001. Edited by Jeff Ayars (RealNetworks); Dick Bulterman (Oratrix); Aaron Cohen (Intel); Ken Day (Macromedia) et al. Latest version URL: http://www.w3.org/TR/smil20. "This document specifies the second version of the Synchronized Multimedia Integration Language (SMIL, pronounced 'smile'). SMIL 2.0 has the following two design goals: (1) Define an XML-based language that allows authors to write interactive multimedia presentations. Using SMIL 2.0, an author can describe the temporal behavior of a multimedia presentation, associate hyperlinks with media objects and describe the layout of the presentation on a screen. (2) Allow reusing of SMIL syntax and semantics in other XML-based languages, in particular those who need to represent timing and synchronization. For example, SMIL 2.0 components are used for integrating timing into XHTML and into SVG." "SMIL 2.0 is defined as a set of markup modules, which define the semantics and an XML syntax for certain areas of SMIL functionality. SMIL 2.0 deprecates a small amount of SMIL 1.0 syntax in favor of more DOM friendly syntax. Most notable is the change from hyphenated attribute names to mixed case (camel case) attribute names, e.g., clipBegin is introduced in favor of clip-begin. The SMIL 2.0 modules do not require support for these SMIL 1.0 attributes so that integration applications are not burdened with them. SMIL document players, those applications that support playback of "application/smil" documents (or however we denote SMIL documents vs. integration documents) must support the deprecated SMIL 1.0 attribute names as well as the new SMIL 2.0 names." [cache]
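
The attribute renaming described above (clip-begin becoming clipBegin) can be sketched mechanically in Python. This is only an illustration under stated assumptions: it renames every hyphenated, non-namespaced attribute it finds, whereas the specification lists the specific deprecated names, and it does not touch attribute values, whose syntax may also differ between SMIL 1.0 and 2.0. The function names and the sample document are hypothetical.

```python
import re
import xml.etree.ElementTree as ET


def hyphen_to_camel(name):
    """Turn 'clip-begin' into 'clipBegin'."""
    return re.sub(r"-([a-z])", lambda m: m.group(1).upper(), name)


def upgrade_attribute_names(elem):
    """Rename hyphenated attribute names to camel case, in place.

    Assumption: every hyphenated, non-namespaced attribute is a candidate;
    a real converter would consult the list of deprecated SMIL 1.0 names.
    """
    for node in elem.iter():
        for old in list(node.attrib):
            if old.startswith("{"):
                continue  # leave namespaced attributes alone
            new = hyphen_to_camel(old)
            if new != old and new not in node.attrib:
                node.set(new, node.attrib.pop(old))
    return elem


doc = ET.fromstring(
    '<smil><body>'
    '<video src="clip.rm" clip-begin="5s" clip-end="20s"/>'
    '</body></smil>'
)
upgrade_attribute_names(doc)
video = doc.find("body/video")
print(video.attrib)
```

A player that must accept both spellings, as the draft requires of SMIL document players, could instead look up the new name first and fall back to the old one.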

  • [March 02, 2001] "Microsoft Releases XML Kit, Specification." By Margret Johnston. In InfoWorld (March 02, 2001). "Microsoft on Friday released a beta version of its XML for Analysis software development kit and an updated XML for Analysis protocol specification, giving developers tools needed to write XML-based applications aimed at spurring the deployment of sophisticated analytical databases across multiple platforms. XML for Analysis is a new online analytical processing protocol that enables the transfer of information between analytical databases and client applications, regardless of the language used to write the application, Microsoft said in a release. It leverages not only the open Internet standard XML but also SOAP (Simple Object Access Protocol) and HTTP. The new protocol is designed to standardize the data access interaction between a client application and an analytical data provider such as OLAP (online analytical processing) and data mining. More than 50 industry-leading vendors contributed to XML for Analysis, which Microsoft described as a vendor- and platform-independent extension to its OLEDB (object linking and embedding database) for OLAP and OLEDB for Data Mining protocols. With the release of XML for Analysis, developers are able to add analytic capabilities to any client for any device or platform using any major programming language, Microsoft said." See the announcement: "Microsoft Delivers First XML-Based Protocol for Cross-Platform Analytics."

  • [March 02, 2001] "Mapping the XTM Syntax to the XTM Conceptual Model." By Daniel Rivers-Moore. Posted to the XTM mailing list. 2001-03-02. "Attached is my work in progress towards a formal expression (in UML) of the mapping from the XTM Conceptual Model to the XTM Interchange Syntax. This is intended as a suggestion of an approach and a start towards a mapping, not as a completed piece of work... The diagrams used in this section are 'class diagrams', using the conventions of the Unified Modelling Language (UML). In a class diagram, each rectangle represents a class of objects (a kind of thing that can exist), and the words in the rectangle are the name of that class. The lines and arrows between the rectangles represent relationships that exist or can exist between instances of those classes (individual things of those kinds). In an object diagram, each rectangle represents an individual object, and the words in the rectangle are the name of the individual, followed by a colon, followed by the name of the class of which it is an instance. The lines between the rectangles represent relationships that exist between those individual objects..." See (1) TopicMaps.Org, (2) XTM Document Web site, and (3) "(XML) Topic Maps."

  • [March 02, 2001] "What Kind of Language is XSLT? An Analysis and Overview." By Michael H. Kay (Software AG). From IBM developerWorks. February, 2001. ['What kind of a language is XSLT, what is it for, and why was it designed the way it is? These questions get many different answers, and beginners are often confused because the language is so different from anything they are used to. This article tries to put XSLT in context. Without trying to teach you to write XSLT style sheets, it explains where the language comes from, what it's good at, and why you should use it.'] "I originally wrote this article to provide the necessary background for a technical article about Saxon, intended to provide insights into the implementation techniques used in a typical XSLT processor, and therefore to help users maximize the performance of their style sheets. But the editorial team at developerWorks persuaded me that this introduction would be interesting to a much wider audience, and that it was worth publishing separately as a free-standing description of the XSLT language. What is XSLT? The XSLT language was defined by the World Wide Web Consortium (W3C), and version 1.0 of the language was published as a Recommendation on November 16, 1999. I have provided a comprehensive specification and user guide for the language in my book XSLT Programmer's Reference and I don't intend to cover the same ground in this paper. Rather, the aim is simply to give an understanding of where XSLT fits in to the grand scheme of things. The role of XSLT: XSLT has its origins in the aspiration to separate information content from presentation on the Web. HTML, as originally defined, achieved a degree of device independence by defining presentation in terms of abstractions such as paragraphs, emphasis, and numbered lists. As the Web became more commercial, publishers wanted the same control over quality of output that they had with the printed medium. 
This gradually led to an increasing use of concrete presentation controls such as explicit fonts and absolute positioning of material on the page. The unfortunate but entirely predictable side effect was that it became increasingly difficult to deliver the same content to alternative devices such as digital TV sets and WAP phones (repurposing in the jargon of the publishing trade)... As a programming language, XSLT has many features -- from its use of XML syntax to its basis in functional programming theory -- that are unfamiliar to the average Web programmer. That means there is a steep learning curve and often a lot of frustration. The same was true of SQL in the early days, and all that this really proves is that XSLT is radically different from most things that have gone before. But don't give up: It's an immensely powerful technology, and well worth the effort of learning." For related resources, see "Extensible Stylesheet Language (XSL/XSLT)."

  • [March 02, 2001] "Saxon: The Anatomy of an XSLT Processor. What is current state of the art in XSLT optimization?" By Michael H. Kay (Software AG). From IBM developerWorks. February, 2001. [' This article describes how an XSLT processor, in this case the author's open-source Saxon, actually works. Although several open-source XSLT implementations exist, no one, as far as we know, has published a description of how they work. This article is intended to fill that gap. It describes the internal workings of Saxon, and shows how this processor addresses XSLT optimization. It also shows how much more work remains to be done. This article assumes that you already know what XSLT is and how it works.'] "I hope this article serves a number of purposes. First, I hope it will give style sheet authors a feel for what kind of optimizations they can expect an XSLT processor to take care of, and by implication, some of the constructs that are not currently being optimized. Of course, the details of such optimizations vary from one processor to another and from one release to another, but I'm hoping that reading this account will give you a much better feel for the work that's going on behind the scenes. Second, it describes what I believe is the current state of the art in XSLT technology (I don't think Saxon is fundamentally more or less advanced than other XSLT processors in this respect), and describes areas where I think there is scope for further development of techniques. I hope this description might stimulate further work in this area by researchers with experience in compiler and database optimization. Finally (last and also least), this article is intended to be a starting point for anyone who wants to study the Saxon source code. It isn't written as a tour of the code, and it doesn't assume that you want to go into that level of depth. 
But if you are interested in getting a higher-level overview than you can get by diving into the JavaDoc specs or the source code itself, you'll probably find this useful... I try in this article to give an overview of the internals of the Saxon XSLT processor, and in particular of some of the techniques it uses to improve the speed of transformation. In the 18 months or so since I released the first early versions of Saxon, performance has improved by a factor of 20 (or more, in the case of runs that were thrashing for lack of memory). Perhaps the biggest research challenge is to write an XSLT processor that can operate without building the source tree in memory. Many people would welcome such a development, but it certainly isn't an easy thing to do." For related resources, see "Extensible Stylesheet Language (XSL/XSLT)."

  • [March 02, 2001] "EXSLT 1.0 - Common, Sets and Math." Posting from Jeni Tennison to the XSL-List. March 02, 2001. "Thanks to those of you that commented on the last EXSLT draft. I've put up a new draft for user-defined functions and a couple of handy extension functions at: http://www.jenitennison.com/xslt/exslt/common/. There's a list of changes to the last draft there, but also of interest is that I've created a couple more documents at: http://www.jenitennison.com/xslt/exslt/sets/ and http://www.jenitennison.com/xslt/exslt/math/ that hold some extension functions. These are intended to be a starting point for a number of groups of standard (built-in) functions. The most important issues for developing these functions are (a) whether there are other sets of functions that we should define and (b) what functions we should have in them. These documents are just a starting point - please post any comments and suggestions here..." For related resources, see "Extensible Stylesheet Language (XSL/XSLT)."

February 2001

  • [February 28, 2001] "The Upper Cyc Ontology in XTM." Edited by Murray Altheim (Sun Microsystems). Reference: Sun Microsystems Technical Report 27-February-2001. "This Technical Report documents research and development of an XML Topic Map (XTM) representation of the Upper Cyc Ontology, including a distribution of five XTM topic maps based on features of the ontology. This Technical Report plus any associated software and/or documentation may be submitted to TopicMaps.Org with the goal of promoting XML Topic Maps (XTM) as a suitable ontological framework, as well as a source of XTM Published Subject Indicators (PSIs). This Technical Report is a Sun Microsystems Working Draft, intended for review and comment by interested parties. It is a very preliminary 'work in progress' release, currently has no formal status, and its publication should not be construed as endorsement by either Sun Microsystems, Inc. or any other body." See the discussion and "(XML) Topic Maps."

  • [February 28, 2001] Geography Markup Language (GML) 2.0. Edited by Simon Cox (CSIRO Exploration & Mining), Adrian Cuthbert (SpotOn MOBILE), Ron Lake (Galdos Systems, Inc.), and Richard Martell (Galdos Systems, Inc.). OGC Document Number: 01-029. February 20, 2001. Also in PDF and in .ZIP format. The Open GIS Consortium, supporting Geospatial and Information Technology Industries with open standards specifications, has now released Geography Markup Language (GML) 2.0 with a complete W3C XML Schema notation. Abstract: "The Geography Markup Language (GML) is an XML encoding for the transport and storage of geographic information, including both the spatial and non-spatial properties of geographic features. This specification defines the XML Schema syntax, mechanisms, and conventions that (1) Provide an open, vendor-neutral framework for the definition of geospatial application schemas and objects; (2) Allow profiles that support proper subsets of GML framework descriptive capabilities; (3) Support the description of geospatial application schemas for specialized domains and information communities; (4) Enable the creation and maintenance of linked geographic application schemas and datasets; (5) Support the storage and transport of application schemas and data sets; (6) Increase the ability of organizations to share geographic application schemas and the information they describe. Implementers may decide to store geographic application schemas and information in GML, or they may decide to convert from some other storage format on demand and use GML only for schema and data transport." See "Geography Markup Language (GML)." [cache HTML/ZIP; cache, PDF]

  • [February 28, 2001] "XMML: Standards-Compliant Transport Of Geoscientific Data Online In The Exploration And Mining Industry." By CSIRO Exploration and Mining (Dr Simon Cox, PO Box 437, Nedlands WA 6009) and Fractal Graphics p/l (Dr Nick Archibald). 25 pages. 2000-02-22. "We propose to develop the eXploration and Mining Markup Language XMML, a web-compatible XML based exploration and mining data transfer format. This will use a sophisticated geology domain model built on the ISO geographic standards, OpenGIS Consortium implementations, and World Wide Web Consortium encoding recommendations. Because the geology model is built merely as a 'schema' on top of a generic geospatial infrastructure, it will be compatible with both generic (e.g., GIS, CAD, DBMS, spreadsheet, web-browser) and specialised (geology modelling, mechanics and fluid-flow, resource estimation, mine-planning etc) software for analysis, modelling, visualisation and transfer. The system will be capable of describing rich 3D geology, including boreholes, geophysics and analytical data, so that data can easily be exchanged between software applications, between offices, and between explorers, contractors, data-managers and regulators on a transactional basis. The self-describing plain-text form of XML documents also makes them ideal for archival purposes, overcoming the problem of loss of data because of software incompatibilities." See: "Exploration and Mining Markup Language (XMML)." [cache]

  • [February 27, 2001] "Working with XML: The Java API for XML Parsing (JAXP) Tutorial." By Eric Armstrong. [Updated: "Remember that all the package names have changed! So none of the examples will work, for the moment. However, most of the information is still applicable."] "This tutorial covers the following topics: (1) Part I: Understanding XML and the Java XML APIs explains the basics of XML and gives you a guide to the acronyms associated with it. It also provides an overview of the Java XML APIs you can use to manipulate XML-based data. To focus on XML with a minimum of programming, follow The XML Thread, below. (2) Part II: Serial Access with the Simple API for XML (SAX) tells you how to read an XML file sequentially, and walks you through the callbacks the parser makes to event-handling methods you supply. (3) Part III: XML and the Document Object Model (DOM) explains the structure of DOM, shows how to use it in a JTree, and shows how to create a hierarchy of objects from an XML document so you can randomly access it and modify its contents. This is also the API you use to write an XML file after creating a tree of objects in memory. (4) Additional Information contains a description of the character encoding schemes used in the Java platform and pointers to any other information that is relevant to, but outside the scope of, this tutorial..." See "Java API for XML Parsing (JAXP)."
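
The contrast between serial access (SAX) and tree access (DOM) summarized above can be sketched with the SAX and DOM modules in Python's standard library. This is an analogue for illustration only, not the JAXP API covered by the tutorial; the sample document and the SlideCounter class are hypothetical.

```python
import xml.sax
from xml.dom import minidom

SAMPLE = "<slideshow><slide>Intro</slide><slide>Overview</slide></slideshow>"


class SlideCounter(xml.sax.ContentHandler):
    """SAX style: the parser calls back into methods we supply, in document order."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def startElement(self, name, attrs):
        if name == "slide":
            self.count += 1


# Serial access: nothing is kept in memory except what the handler records.
handler = SlideCounter()
xml.sax.parseString(SAMPLE.encode("utf-8"), handler)

# Tree access: the whole document becomes a hierarchy of objects that can be
# visited in any order, modified, and written back out.
dom = minidom.parseString(SAMPLE)
slides = dom.getElementsByTagName("slide")
slides[0].firstChild.data = "Welcome"
new_slide = dom.createElement("slide")
new_slide.appendChild(dom.createTextNode("Summary"))
dom.documentElement.appendChild(new_slide)
output = dom.documentElement.toxml()

print(handler.count)
print(output)
```

The SAX half suits large inputs that are read once; the DOM half suits documents that must be edited or revisited, at the cost of holding the tree in memory.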

  • [February 27, 2001] "xlinkit: A Consistency Checking and Smart Link Generation Service." By Christian Nentwich, Licia Capra, Wolfgang Emmerich, and Anthony Finkelstein. (Department of Computer Science, University College London). Research Note RN/00/06, submitted for publication. [February 2001.] "xlinkit is a lightweight application service that provides rule-based link generation and checks the consistency of distributed web content. It leverages standard Internet technologies, notably XML and XLink. xlinkit can be used as part of a consistency management scheme or in applications that require smart link generation, including portal construction and management of large document repositories. In this paper we show how consistency constraints can be expressed and checked. We describe a method for generating links based on the result of the checks and we give an account of our content management strategy. We present the architecture of our service and the results of a substantial 'real-world' evaluation... This paper describes xlinkit, a lightweight application service that provides rule-based link generation and checks the consistency of distributed web resources. The paper is supplemented by the on-line demonstrations at http://www.xlinkit.com. The operation of xlinkit is quite simple. It is given a set of distributed XML resources and a set of potentially distributed rules that relate the content of those resources. The rules express consistency constraints across the resource types. xlinkit returns a set of XLinks, in the form of a linkbase, that support navigation between elements of the XML resources. The precise link generation behaviour is determined by link building annotations on the rules. xlinkit leverages standard Internet technologies. It supports document distribution and can support multiple deployment models. It has a formal basis and evaluation has shown that it scales, both in terms of the size of documents and in the number of rules. 
With this thumbnail description in mind it is easiest to motivate and to explain xlinkit by reference to a simple example..." See the discussion of the UML (XMI) checker which uses xlinkit. [cache]

  • [February 27, 2001] "System Desiderata for XML Databases." By Airi Salminen and Frank Wm. Tompa (Department of Computer Science, University of Waterloo, Waterloo, ON, Canada). Submitted for presentation at the VLDB 2001 conference [Roma, Italy]. February, 2001. 12 pages, with 38 references. "There has been much progress made towards defining query languages for structured document repositories, but emerging prototypes, products, and even proposed specifications too often assume overly simplistic data models and application needs. In this paper we explore the requirements for a general-purpose XML database management system, taking into account not only document structure and content, but also the presence of XML Schemas, Namespaces, XML entities, and URIs. Furthermore, the requirements accommodate applications that create, modify, and maintain complex units of data and metadata that co-exist with numerous versions and variants. Our discussion addresses issues arising from data modelling, data definition, data manipulation, and database administration... Two extreme positions can be heard regarding the role of XML in databases. One view is that XML is merely an encoding representation for exchanging data; therefore an XML database system is one that is able to import and export data or programs and to convert them to and from internal forms. The other extreme is that XML is merely an encoding representation for formatting documents; therefore an XML database system is one that is able to store such documents and to retrieve them on demand in order to present them to a browser. Our vision is for a database system that can manage XML data on behalf of applications that are far more demanding than either of these extremes. An XML database is a collection of XML documents and their parts, maintained by a system having capabilities to manage and control the collection itself and the information represented by that collection. 
It is more than merely a repository of structured documents or semi-structured data. As is true for managing other forms of data, management of persistent XML data requires capabilities to deal with data independence, integration, access rights, versions, views, integrity, redundancy, consistency, recovery, and enforcement of standards. Even for many applications in which XML is used as a transient data exchange format, there remains the need for persistent storage in XML form to preserve the communications between different parties in the form understood and agreed to by the parties... A problem in applying traditional database technologies to the management of persistent XML data lies in the special characteristics of such data, not typically found in other databases. XML documents are complex units of information, consisting of formal and natural languages, and often including multimedia entities. The units as a whole may be important legal or historical records. The production and processing of XML documents in an organization may create a complicated set of versions and variants, covering both basic data and metadata. Applications depend on systematic, controlled, and long lasting management technologies for all such collections. These characteristics of document data also applied to SGML documents [Gol90], long before XML evolved. However XML imposes yet further demands: (1) Closely related specifications that extend the capabilities specified in XML 1.0 [BPS00], such as XML Namespaces [BHL99] and XML Schema [Fal00, TBM00, BiM00], must be accommodated when developing XML database solutions, since they are expected to be widely used. (2) Because references in XML documents refer to internet resources, general-purpose XML database systems must include internet resource management. In the following sections we explore many capabilities needed to manage XML databases. 
After further elaborating on the special characteristics of XML data, the discussion addresses database system characteristics required for appropriate data definition, data manipulation, and database administration... Database systems were designed to manage large bodies of information about one enterprise at a time and to provide integrity for the information despite many users sharing the data and changes in technology. More recently XML emerged as a universal metalanguage to be used as a common format for information in various application areas. In many environments collections of XML documents will be carriers of large bodies of information related to a particular enterprise or crossing enterprise boundaries. The information must be securely accessible, often for a long time, despite continuing changes both in technology and in participating enterprises, and despite heterogeneity in the user community. The special characteristics of XML data cause problems when adapting database management principles and systems to XML data. In this paper we have discussed these characteristics and derived a set of desired features for XML database management systems. We addressed some of the major requirements for appropriate data definition, data manipulation, and database administration and demonstrated the complexity of the area. The purpose of the paper is to initiate discussion of the requirements for XML databases, to offer a context in which to evaluate current and future solutions, and to encourage the development of proper models and systems for XML database management. A well-defined, general-purpose XML database system cannot be implemented before database researchers and developers understand the needs of document management in addition to the needs of more traditional database applications. We look forward to innovative solutions being developed to address the problems identified, including problems of equivalence, versions, and variants for XML data..." 
See: "XML and Databases." [cache]

  • [February 27, 2001] "RDDL Me This: What Does a Namespace URL Locate?" By Elliotte Rusty Harold. ['Harold reports that the answer was nothing, until the Resource Directory Description Language, or RDDL, was conceived. This article introduces RDDL and namespace URIs. It includes resource information and downloadable code depicting an RDDL document.'] "No W3C specification has been as controversial as Namespaces in XML. It was the first specification the W3C (World Wide Web Consortium) recommended over significant (though far from unanimous) opposition from the affected community. At the time, there were numerous objections. Some developers didn't like the attribute-based syntax that replaced the processing instructions used in earlier working drafts. Others were concerned with the near incompatibility of DTDs (Document Type Definitions) and namespaces. However, since the recommendation was published in early 1999, one problem has grown to tower over all others: the question of what one finds at the end of a namespace URI (Uniform Resource Identifier). The W3C has an answer to this question: There is nothing at all at the end of a namespace URI, except perhaps a 404 Not Found error. One reason the W3C wanted to avoid putting anything at the end of a namespace URI is that it wasn't clear what to put there. Should it be a DTD? a schema? a Java-content handler for processing the document? a style sheet? something else? There are simply too many good choices to limit developers to any one of these. Unfortunately, this answer seems to be one that developers are unable or unwilling to hear. With few exceptions, namespace URIs are URLs, Uniform Resource Locators, and it's not unreasonable for users to expect a URL to locate a resource. However, namespace URLs are identifiers, not locators. 
Thus, the error logs of the W3C and other purveyors of XML specifications have been filling up with requests for nonexistent pages at the end of namespace URLs, and XML mailing lists are besieged with questions about whether one can run an XML parser on a machine that isn't connected to the Internet and is thus unable to resolve the namespace URL. After addressing this issue again and again on various mailing lists, Tim Bray and Jonathan Borden decided to do something about it. If they couldn't convince developers that there wasn't anything at the end of a namespace URL, then maybe it would be a good idea to invent something to put there. What they came up with was the Resource Directory Description Language, RDDL for short. RDDL is an extensible XML application for documents that live at namespace URLs. RDDL is designed to allow both human readers and software robots to find any sort of resource associated with a particular namespace. Instead of putting one thing at the end of a namespace URI, RDDL puts a document there that lists all the machine-processable documents that might be available, including, but not limited to: (1) DTDs; (2) Schemas in a variety of languages -- including RELAX, Schematron, the W3C Schema language, TREX, and others; (3) CSS, XSLT, and other style sheets; (4) Specification documents..." See "Resource Directory Description Language (RDDL)."
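
To make the idea of a resource directory concrete, the following Python sketch reads a small RDDL-style document and lists the resources it advertises. It is a minimal sketch under stated assumptions: resources are taken to be rddl:resource elements carrying XLink attributes, with xlink:role naming the nature of the resource and xlink:arcrole its purpose; the sample document, its file names, and the role and arcrole URIs are illustrative and should be checked against the RDDL specification.

```python
import xml.etree.ElementTree as ET

RDDL_NS = "http://www.rddl.org/"
XLINK_NS = "http://www.w3.org/1999/xlink"

# Hypothetical directory document, as it might be served at a namespace URL.
SAMPLE_RDDL = """\
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:rddl="http://www.rddl.org/"
      xmlns:xlink="http://www.w3.org/1999/xlink">
  <body>
    <rddl:resource xlink:href="example.dtd"
        xlink:role="http://www.isi.edu/in-notes/iana/assignments/media-types/application/xml-dtd"
        xlink:arcrole="http://www.rddl.org/purposes#validation"/>
    <rddl:resource xlink:href="example.xsl"
        xlink:role="http://www.w3.org/1999/XSL/Transform"
        xlink:arcrole="http://www.rddl.org/purposes#transformation"/>
  </body>
</html>
"""


def list_resources(rddl_text):
    """Return one dict per rddl:resource element found in the document."""
    root = ET.fromstring(rddl_text)
    found = []
    for res in root.iter(f"{{{RDDL_NS}}}resource"):
        found.append({
            "href": res.get(f"{{{XLINK_NS}}}href"),
            "nature": res.get(f"{{{XLINK_NS}}}role"),
            "purpose": res.get(f"{{{XLINK_NS}}}arcrole"),
        })
    return found


resources = list_resources(SAMPLE_RDDL)
for entry in resources:
    print(entry["href"], entry["purpose"])
```

A software robot could filter this list by purpose to pick, say, a validation resource, while a human reader sees the same document as an ordinary XHTML page.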

  • [February 27, 2001] "XML Hauls Freight For Marketplace." By Ted Kemp. In InternetWeek (February 26, 2001). "E-marketplaces and community sites make money serving the needs of particular industries, and that means understanding each industry's specialized vocabulary. CarrierPoint Inc., a marketplace that links companies selling raw commodities with trucking firms that can move their freight, has learned that an industry's 'language' also includes the formats in which it writes its documents. In the coming weeks CarrierPoint will upgrade the software that it uses to translate into XML the electronic shipping manifestos that have typically been exchanged in a traditional EDI format... . Overland transport jobs usually are accompanied by documents called 204s that detail a shipper's contract offer, including the shipper's name and address, billing address, customer name and address, weight of the products to be shipped and special handling concerns. The problem for CarrierPoint is that the traditional EDI used by shippers and trucking firms to share data from 204s is less flexible than XML and harder to translate into points of data on a Web site... . The marketplace is upgrading to a new version of translation software from American Coders Ltd. that it already uses to parse data points in EDI documents to an XML protocol. The principal benefit of the upgraded package is that it makes more efficient use of its own memory. Typically XML documents are much larger than their EDI equivalents because they define the nature of the information contained in each data field. The basic software is available for free on an open-source basis from American Coders..."

  • [February 27, 2001] "IBM Beefs Up Content Manager." By Barbara Darrow. In InternetWeek (February 26, 2001). "Managing information in its myriad forms has become a huge business -- estimated to hit the $10 billion mark by 2004, according to one researcher. With enhancements to its Content Manager software, IBM Corp. said it handles more information types than anyone else. Content Manager Version 7.1 adds new XML interfaces, the ability to handle Xerox metacode format, and integrates tightly with Siebel Systems' Call Center application, the company said. It also supports MPEG-2, Hot Media, and QuickTime streaming formats... The new software, available for Windows NT and Windows 2000, as well as AIX, will be unveiled Monday at IBM's Partnerworld Conference in Atlanta. Analysts said IBM has done a good job fleshing out its offering, and ensuring that it will interoperate with various third-party products from Vignette, Documentum, and Interwoven. Still, reliance on multiple vendors to fill an application void makes some corporations nervous. The proliferation of data types and distribution vehicles -- print, web, etc. -- has made management increasingly complex. Currently, IBM partners with a variety of vendors to fill gaps in its own lineup. Wittle said the goal of Content Manager is to work well with a bevy of third party offerings. Pricing for Content Manager Version 7.1 starts at $15,000 per server plus $2,000 per concurrent user..."

  • [February 27, 2001] "User-Defined Extension Functions in XSLT." By Jeni Tennison. February, 2001. [A draft document that summarises recent public discussions on user-defined extension functions written in XSLT; informed and inspired by discussions on XSL-List with David Carlisle, Joe English, Clark C. Evans, Dave Gomboc, Yevgeniy (Eugene) Kaganovich, Mike Kay, Steve Muench, Miloslav Nic, Francis Norton, Dimitre Novatchev, Uche Ogbuji, and David Rosenborg.] "This document describes a method for defining user extension functions using XSLT in XSLT 1.0. XPath contains a number of functions that allow you to perform manipulation of strings, numbers, node sets and so on. While these cover basic functionality, there are often situations where stylesheet authors either need to or want to do more within an XPath. Most XSLT applications offer a range of extension functions. However, using only an implementation's extension functions limits the stylesheet author to those thought of and implemented by a particular vendor. It also means that the stylesheet itself is limited to that vendor. Allowing users to define their own extension functions enables them to create the functions that they need for their particular application and enhances the portability of their stylesheets. Stylesheet authors need to have a way of defining their own functions. These definitions may be in any programming language, but it is likely that different XSLT processors will support different languages. The one language that all XSLT processors support is XSLT. It therefore makes sense to allow stylesheet authors to define extension functions using XSLT - the implementation may not be as efficient as it would be in, say, Java, but at least it can be supported across platforms and implementations, and limits the number of languages that stylesheet authors have to learn... This document is a first draft for review by the implementers of XSLT processors and the XSLT stylesheet authors. 
It is based on discussions on XSL-List. Comments on this document should be sent to XSL-List..." For related resources, see "Extensible Stylesheet Language (XSL/XSLT)." [cache]

  • [February 27, 2001] "FXPath - Functional XPath." By David Rosenborg (Pantor Engineering AB). February 27, 2001. ['A comment on the document "User-Defined Extension Functions in XSLT" (called EXSL here), written by Jeni Tennison.'] "The purpose of this document is to outline an alternative approach to writing extension functions in XSLT. The EXSL document and this document result from a recent discussion on the XSL mailing list (xsl-list@lists.mulberrytech.com). The EXSL document is in large an excellent compilation and presentation of the ideas and issues discussed on the XSL list. However, the EXSL document presents one of two rather different approaches on how to implement the extension functions. This document tries to present the other. The EXSL approach is to retrofit some XSLT instructions so that they can deal with all types in XPath, notably node sets. This document wants to show that there is a more natural way to accomplish the same result: write extension functions in XPath to deal with XPath types. Since XPath 1.0 lacks some vital constructs to do this, this document presents a superset, called Functional XPath (FXPath), that makes this possible in a convenient way. This document has a much narrower scope than the EXSL specification. It is concentrated around how to actually define the extension functions. The issues on calling functions, defining sets of common extension functions etc are well covered in EXSL and are not handled here. However, the set of example functions are reimplemented here, in FXPath, to enable a side by side comparison..."

  • [February 27, 2001] "The Relentless March of Computer Abstraction." By Frank Willison (O'Reilly editor in chief). From the O'Reilly Network. February 23, 2001. "I spent February 21-23 at XML DevCon in London. It's been a great opportunity for me to immerse myself in developments of this exciting technology. The two major directions of XML, based on what I've learned at this conference, seem to be: (1) Abstraction (2) Metadata. I'll write a section on each of these big ideas. Then I'll have comments on some interesting ideas or controversies that came out of the sessions I attended. Then I'll explain to you how, by attending XML DevCon, I've learned how the world is going to end..."

  • [February 26, 2001] "Budding B2B Standard Faces Big Problems. [CPExchange] Specification for Sharing Consumer Data Has No Users, Faces FTC Review." By Patrick Thibodeau. In ComputerWorld (February 12, 2001). "A data standard created to act as a high-tech lubricant for the exchange of customer information is facing problems, including a just-announced review by the Federal Trade Commission (FTC) and, perhaps more important, a lack of big end-user acceptance so far. The Customer Profile Exchange standard, or CPExchange, offers companies a way around numerous data types and the custom-designed interfaces needed to translate them. If the standard doesn't take off, the process may not improve, proponents say. 'At this point, we do data exchanges that are disastrous. Everybody speaks a different language, everybody has ways of pushing information -- from text files to XML. It is very, very nasty,' said Henri Asseily, chief technology officer at Los Angeles-based BizRate.com, a company that provides customer-generated ratings of e-commerce sites and one of 70 companies that is a member of the CPExchange Network. The first version of CPExchange was published in October, but so far, no company has adopted it. Most of its backers are vendors, with IBM being the largest. Only a few major end-user companies were involved in the standard's development, and two of those companies have apparently distanced themselves from this effort: First Union Corp. in Charlotte, N.C., and Charles Schwab & Co. in San Francisco. Both companies say they have no plans to implement the standard. Asseily said he believes the standard can solve the data exchange problems, but the 127-page specification is 'so complicated that it's very, very difficult for companies to make heads or tails of it.' 
The FTC announced earlier this month that it will hold a workshop on March 13 on the potential data privacy issues raised by company-to-company exchanges of customer information, prompted in part by a letter from Sen. Richard C. Shelby (R-Ala.). Shelby claims that the CPExchange technology gives companies a vastly improved ability to share and exploit personal information in pursuit of profit... A major selling point for proponents of the CPExchange is the standard's ability to incorporate an individual's privacy preferences. For instance, a company that needs to transmit consumer data to a supplier could attach privacy restrictions that set limits on the use of the data, such as third-party sharing." See "CPExchange Network."

  • [February 26, 2001] "Where XML Specifications are Clicking." By Valerie Rice. In eWEEK (February 25, 2001). "If you like an argument, you'll love XML. Extensible Markup Language is touted as the ultimate solution to every industry's dreams of business-to-business process efficiency. But first, of course, there has to be agreement on the schemata and tags that go into industry-specific XML standards... Some XML initiatives in industries such as insurance and health care were able to get there quickly because they'd already shed meeting-room blood over EDI (electronic data interchange) standards. Other initiatives -- such as RosettaNet, the high-tech industry's XML standard for manufacturing -- started early and were given a boost by the sudden rise of e-commerce. Here are a few vertical industry initiatives that have made remarkable progress and the lessons they can teach the rest: (1) Health care: health care companies early on recognized the B2B benefits that XML could bring, said Liora Alschuler, an independent XML consultant who works with HL7, the health care standards body. HL7 is made up of health care providers, vendors, consultants and other groups with an interest in using XML to share clinical, patient, financial and other information online. HL7's first area of focus was what Alschuler called the 'XMLification' of EDI... (2) Financial services: Among the industries driving XML standards, financial companies -- banks, credit card companies and so forth -- are making a lot of progress, said Wes Rishel, an analyst at Gartner Group Inc., in Stamford, Conn. Unlike in other industries, however, not all the work is being done by standards organizations. Vendors such as VeriSign Inc., of Mountain View, Calif., and large institutions such as Visa International are playing active roles in driving the standards. 
That could be one of the reasons that the financial services industry's standard for digital signatures, the so-called XKMS -- for XML Key Management Specification -- was pulled together so quickly... (3) Insurance: In October 1998, the insurance industry started a project to create XML for life insurance information and processes. The industry followed it nine months later with a property and casualty effort. And so far it's all been fairly quick and painless... The ACORD XML spec for life insurance and the separate spec for property and casualty allow insurance companies to handle inquiries and quotes, as well as submit new business and process claims online... (4) High-tech manufacturing: RosettaNet is arguably the mother of all industry-specific XML efforts, with 300 consortium members and standards covering the gamut of products, from raw materials to electronic components. All told, the RosettaNet initiative identified more than 100 processes and created standard XML definitions and tags to support them. Standards exist today covering everything from inventory management to order management and design wins..." See also Chart: The Case for XML.

  • [February 26, 2001] "Why 90 Percent of XML Standards Will Fail." By John R. Rymer (President and founder of Upstream Consulting). In eWEEK (February 26, 2001). "Those who are making XML standards are reliving the mistakes of past standards bodies. I can see what's coming and it is a whole lot less than any of us would like or need. I think 90 percent of the current activities will not produce meaningful technology. In my view, that's failure. Pardon my skepticism, but I've lived through too many can't-miss, can't-live-without-it standards efforts. There was the gargantuan effort to create an alternative to TCP/IP by the International Standards Organization (ISO), the tortured efforts to standardize the Unix operating system, the Open Software Foundation's DCE debacle, and the gun-to-the-head tactics of the Object Management Group (OMG). Of these, only the OMG's CORBA can be called a commercial success. Each of these efforts suffered from one or two mistakes that doomed it to failure... Mistake #1: Nonalignment; Mistake #2: Over-promise; Mistake #3: Overdo it; Mistake #4: Overreach... Pardon me for being cranky about this, but the net effect of XML standards has been to slow adoption of XML products and technology. There's too much noise, too much hype, too many promises--too much risk. Shouldn't we know better by now?"

  • [February 24, 2001] "Interfacing With XML." By [Editor] Ajit Sagar (VerticalNet Solutions). In XML-Journal (March 2001). "A couple of weeks ago I participated in several technical meetings to define the next phase of the architecture of our current products. As usual, any initiatives for a new architecture include requirement considerations for open APIs, platform independence, and loose coupling between components as the basic criteria for the design of the platform components. Our architecture is based on J2EE and XML. The APIs that are exposed by the infrastructure can be categorized into the programmatic APIs that are exposed through object methods and structural APIs. J2EE offers the available programmatic (method-call based) APIs as a programmatic interface. XML offers an effective way of exposing structural APIs. It also provides an elegant mechanism for achieving configuration for the deployment of applications. XML offers an effective way of bridging data transfers between decoupled components or applications. The hierarchical nature of XML documents allows for the exchange of data, which retains object-style relationships such as aggregation and inheritance. Subsequently, XML offers the ability to create flexible and extensible APIs that enable applications to expose their functional capabilities. At the same time, XSLT and XPATH provide processing capabilities such as a search based on pattern matching, and the ability to match data based on matching algorithms. These functional capabilities manifest themselves as XML-based APIs that are universally understood by disparate applications. After all, XML expresses data in a string format, which is human-readable..."

  • [February 24, 2001] "Microsoft Commits To XML." By Wei Meng Lee (The Centre for Computer Studies, Ngee Ann Polytechnic, Singapore). In XML-Journal (March 2001). "Beginning with Internet Explorer 4.0 (IE 4), Microsoft has provided users of its operating system with a unique way of viewing XML documents. If you're running IE5, you already have the Microsoft XML parser (MSXML) installed. The MSXML parser has come a long way, beginning with version 2.0 (IE5) up to the latest version, 3.0. Depending on the software and operating system, you most likely have MSXML 2.0 (IE5) or 2.5 (Windows 2000) on your system. Since January 2000, Microsoft has showed its commitment to XML by releasing a new XML parser every other month (preview release). That early release was version 2.6, renamed version 3.0 last March. Each preview release contains improvements in performance as well as support for the W3C XML 1.0 specifications. The long-awaited production version of MSXML 3.0 was finally released last November. In this article I'll discuss some of its features, and, specifically, show you how to get started using it. In subsequent installments I'll go into greater detail on each of its components. This article covers the following: (1) Installing MSXML 3.0 on your system; (2) Using XSLT and XPath; (3) Using the Internet Explorer tools to validate XML documents... Conclusion: With MSXML 3.0, Microsoft has once again proved its commitment to the XML technologies. This isn't surprising since XML is the foundation technology for many of Microsoft's future products. In this article I've tried to avoid bogging you down with all the technical jargon related to MSXML. In a forthcoming article I'll describe how the MSXML DOM can be manipulated programmatically. I'll rewrite the XSLT stylesheet with ASP and DOM and show you how they can be used to achieve the same purpose."

  • [February 24, 2001] "Designing An Attribute Search Engine For B2B Negotiations." By Stephen Rao and Mary Xing. In XML-Journal (March 2001). "More and more companies are building B2B systems to conduct business on the Internet. These systems are different from catalog-based B2C Web sites. Among other things, B2B systems usually need to provide stronger negotiating capabilities. XML documents are flexible and self-explanatory and are now the preferred solution for B2B information exchanges. A good application of the technology is to use XML files as workflow documents to convey the attributes in trading negotiations. We found that while XML documents are flexible and easy to understand, searching information from such plain text files is difficult. A B2B trading system needs both flexible negotiations and convenient query capabilities. We designed an Attribute Search Engine (ASE) for XML trading negotiations using Java EJBs and a relational database (RDBMS). It's based on the generic concepts we distilled from the use of attributes and enables powerful data searches for XML attribute documents. The resultant system has the strength of both sides and delivers the functions needed in a practical B2B system. The attribute concepts we established are generic. The engine can be applied in many other places... In e-commerce negotiations the parties need the flexibility to use various attributes in a workflow document to describe their commodities and terms. We conclude that implementing those documents with XML text files is better than the conventional RDBMS tables. First, relational tables have a finite number of predefined data entities. In negotiations many new things may come up dynamically. It's impractical to predefine them. Second, relational tables usually require high data integrity for transactions as they tend to have tight data constraints among one another. Negotiating documents should be loosely composed with fewer data restrictions. 
Third, with normalized data, the content of one document is often scattered in a number of tables in RDBMS. The danger of unintentionally changing historic data is greater. XML is effective in modeling document-oriented trading processes because the negotiating parties can add conditions/attributes at will to the same self-explanatory document with greater flexibility. Buyers and suppliers may pass around an XML document in a negotiation until a deal is reached. After the deal they can simply archive the XML as a single file, hence the information is retained independent of other variables... Conclusion: Searching and analyzing B2B trading details are important to a successful e-commerce provider and its users. While XML offers users flexible attribute negotiations, an ASE makes information search and analysis easy. They work together nicely to provide the capabilities desired in a practical B2B trading system."

  • [February 24, 2001] "Converting Your Client/Server Applications To The Internet." By Victor Rasputnis (CTI) and Anatole Tartakovsky (CTI). In XML-Journal (March 2001). "IT projects closely follow the path of technology. For example, the number of Java/XML/HTML projects is increasing, replacing PowerBuilder or VisualBasic systems developed just a few years ago. And developers are asking themselves the question: Do I have to write the same app from scratch? Again? Integrating existing systems developed in previous millennium environments with the Internet is a costly and difficult task. The other approach would be to "convert" existing applications to native Internet technologies. Sound complex? It's not. Legacy systems contain rich metadata, although in a proprietary format such as PowerBuilder's PBL or VisualBasic's FRM files. All graphic controls - list boxes, buttons, labels, and so on - show up in the metadata with all their positions, sizes, colors, and other attributes. Database queries allow reconstruction of the original SQL select statements or stored procedure calls. Code scripting of the events is also available. Suppose we learn how to read the metadata and put it into XML format. What can we do with it? We can generate systems for the Internet by automatically converting existing legacy code. In this article we outline the design of the 'magic wand' that converts client/server programs into a Java/XML/XSL solution. In particular, we'll demonstrate how you can leverage investments in all your PowerBuilder DataWindows, migrating them to sites residing on J2EE application servers... We demonstrated the working approach to migrating client/server applications into the J2EE environment. It enables the automatic magic-wand conversion of databound legacy control to cutting-edge Internet technologies. The cornerstone of the solution - code generation from XML metadata - extends it far beyond the conversion process. 
Indeed, where the metadata is coming from is irrelevant. Combined with a proper graphic design tool, this solution may become a full-scale IDE for creating Internet applications. This proposed approach puts to work the Model/View/Controller paradigm, enforcing strict separation of the data model (XML) from presentation (XSL). In our opinion that alone should bring developers' productivity back to the level of RAD tools. In addition, the XSL-based approach to code generation provides limitless possibilities for end-user customization. The authors maintain a free online implementation of the conversion software at www.xmlsp.com."

  • [February 23, 2001] "The XML Meta-Architecture [and What the XML Application Interface Looks Like]." Presentation slides. By Henry S. Thompson (HCRC Language Technology Group University of Edinburgh; World Wide Web Consortium). Keynote presentation at XML DevCon Europe in London, England. February 21, 2001. Conclusion: "(1) Think about things in terms of Infosets and Infoset pipelines: Modular, Powerful, Scalable. (2) Use XML Schema and its type system to facilitate mapping: Unmarshalling is easy; Marshalling takes a little longer." Abstract: "The XML technology core has grown rapidly since the announcement of the XML 1.0 itself just over three years ago. First there was XML Namespaces, then DOM and XPath and XSLT, now XLink/XPointer/XBase, XML Infoset and XML Schema are (nearly) here, before long we'll have XSL-FO, XML Query and XML Protocols. I believe that XML Infoset is fundamental to understanding the relationship between all these parts. In this talk I'll present my take on an emerging perspective on the meta-architecture of XML, where each XML technology can be understood as defining a class of infoset transducers. On this account an XML application is a pipeline of infoset processing composed of such transducers, for example parser->schema processor->linker->schema processor->query processor. I'll suggest how I think this vision will impact on standards development, and conclude by looking at the XML/Application interface from this perspective." Also online: the Technetcast from DDJ. [cache]

  • [February 23, 2001] IAS XBRL Taxonomy Draft. Presented by David Prather. February 20, 2001. Context: At the first global meeting of XBRL.org in London, the XBRL member organization International Accounting Standards Committee (IASC) announced a "draft taxonomy of XBRL for Financial Statements to members of XBRL.org for review." The IASC taxonomy is an XML-based specification for the 'Commercial and Industrial' sector that allows users and suppliers of financial information to exchange financial statements across all software and technologies, including the Internet. The draft/beta taxonomy is available as an XML schema and in Microsoft Access database format. Some details are documented in a Meeting Presentation: "IAS-XBRL PowerPoint presentation from the February 20, 2001 International XBRL meeting in London." By Kurt Ramin (IASC, London), Ian Wright (PricewaterhouseCoopers, London), David Prather (IASC, London), Bruce Mackenzie (Deloitte & Touche, London). The IAS XBRL Taxonomy Draft was presented by David Prather (IAS XBRL Project Manager, IASC). "IAS XBRL Approach: October [yielded] a way forwards: produce 'trees' to identify the elements; work shared by all of Big 5. In November [PIs] agreed to key principles: elements for items in IAS standards; structure should be based on the minimum formats in IAS 1 (B/S +I/S), IAS 7; general ledger closing balances [B/S items in balance sheet section etc]; all cash movements in cash flow section; all other items in the notes; detail as required or recommended by IAS cross references to IAS paragraphs; used trees to confirm complete. Expected key benefits: IAS users are familiar with IAS standards; IAS standards define or explain many of the elements; IAS is translated into 13 languages; elements are directly linked to paragraphs in standard so assist users to use the correct item..." See details.

  • [February 23, 2001] "Content Management Moves Ahead." By Stephanie Sanborn. In InfoWorld Volume 23, Issue 8 (February 19, 2001), page 38. "XML, the Internet, and global collaboration are all changing the still-evolving industry. Content Management's roots may lie in document management, but its future will likely lie on the Web and beyond as its evolution pushes the concept of what content is and how it can be used for e-business. The Web gave content management and the life cycle of content itself a boost as companies began to realize that although running business on the Web has many benefits, it also requires making content useful and relevant online. Companies are finding a need to collaborate around content, and that often means bringing together users and content from different parts of the globe... As you get much richer in your applications and provide more content, more inventory, and a broader set of services to a broader set of people you can reach through the Web, the whole problem of managing that content becomes much greater because you have much more of it and you need to describe it much more effectively, explains Robert Perry, a senior analyst at The Yankee Group in Boston... NextPage plans to capitalize on distributed content by adding peer-to-peer technology from its acquisition of netLens to content management. Incorporating netLens' Peer Space product into NextPage's NXT3 content platform products will create a 'virtual space where all connections are able to be established and the communication can happen,' along with alerts to notify users when changes are made to content they are interested in, says Darren Lee, NextPage's vice president of marketing and product strategy. 'I need a way to connect repositories together and provide integrated access for an end-user across all of that information, not just giving them a view to the Web site. Inherent in that is that [the content] is distributed,' Lee says. 
'And therefore p-to-p as an architecture is a perfect fit. It's more about information finding you than you finding information.' Another technology sure to play a big role in the future of content management is XML, which 'is starting to become the lingua franca of business, and it's starting to become customized based on the industry you're in,' Zarghamee says, noting that XML's capability of describing context is invaluable to content management. 'That becomes very powerful, and systems can start exchanging content and take action on the content. So you can truly get into e-business networks and dynamic trading partners. Those ideas have been around, but there was really no technology that enabled it [until XML].' To Interwoven's Ruck, 'XML is like Java in the sense that it's going to be a pervasive technology that's going to be adopted throughout product lines' and is particularly important for areas such as b-to-b, content syndication, and wireless, where content will be deployed 'over multiple customer touch points.' 'There are a lot of different distribution destinations now, different channels to support, often in different countries, that all need the same brand,' Perry explains. 'That's what content management can really help with: creating a single blueprint and pushing it out'..."

  • [February 23, 2001] "IBM, Microsoft Settle E-Commerce Standards Dispute." By Siobhan Kennedy. In InfoWorld (February 23, 2001). "A group backed by International Business Machines Corp., the world's largest computer hardware company, agreed this week to adopt an electronic-commerce standard being developed by software giant Microsoft Corp., settling a high-stakes dispute that has been rumbling for more than a year. By bringing the incompatible standards together, the two sides are seeking to provide companies with a common format for doing business over the Internet, a market expected to explode in the next few years. AMR Research in Boston predicted the market for business-to-business transactions will skyrocket to $5.7 trillion by 2004 from $581 million in 2001 as more and more companies use the Web to buy products and services. 'If you don't have a standard way of communicating, then people will create lots of different ways of doing it,' Bob Sutor, IBM's director of e-business standards strategy, told Reuters on Thursday. 'And that will create big interoperability problems.' That is exactly what has happened so far. With IBM pushing one standard and Microsoft another, the result has been a sometimes bitter war of words between the two. IBM has dismissed the rival effort as lightweight and too Microsoft-centric, and Microsoft has criticized the IBM group for taking too long to get its standard out the door... OASIS' standard, called ebXML (for electronic business XML), is a series of specifications that define how businesses should communicate with each other in buying and selling goods over the Internet. XML (Extensible Markup Language) is a popular Web standard that businesses use to exchange information with each other online... John Montgomery, lead product manager for Microsoft's .Net framework, said OASIS' decision to adopt SOAP is a clear validation of the approach both Microsoft and the World Wide Web Consortium have taken with XML standardization. 
'Microsoft has consistently said that the (consortium) is where XML standardization should occur,' he added. Sutor, who is also vice chairman of the ebXML group, said the OASIS members will continue to develop the ebXML standard, an overarching effort that includes a lot more work than the small part that overlapped with Microsoft's SOAP." See the ebXML announcement.

  • [February 23, 2001] "Standards Groups Reach E-Business Accord." By Wylie Wong. In CNET News.com (February 22, 2001). ['A brewing controversy over e-business standards may have been averted Thursday after one Web standards group agreed to support the work of another.'] "The World Wide Web Consortium (W3C), the gatekeeper of many Internet standards, and OASIS, a group of technology companies backed by the United Nations, have been developing competing technologies to allow businesses to link over the Internet and conduct e-commerce. OASIS on Thursday announced it is ceasing its effort to build a communications protocol for e-commerce business communications in favor of a competing specification under development by the W3C. The W3C recently began building an XML-based protocol based on technology developed by Microsoft, IBM and others, called the Simple Object Access Protocol (SOAP). At issue is the need to build an XML-based communications protocol that serves as a common format for businesses to swap information with each other. XML (Extensible Markup Language) is a popular Web standard for businesses to exchange information with each other via the Web. The result is one uniform standard for exchanging XML messages and less confusion among software developers on what standard to use in the future, said Bob Sutor, IBM's program director for e-business standards strategy. Companies conducting business over the Web need a common format to send information to one another, much like the post office has a standard way for people to send mail, Sutor said. People are required to write addresses and place stamps on the same places on an envelope so the post office knows where to send mail. Until now, OASIS and the W3C differed in their definitions of that format. Before OASIS' support of SOAP, both OASIS and the W3C had said they would create connectors so that the two differing communications protocols could communicate. Now, that work will be unnecessary. 
'We don't need unnecessary duplication,' Sutor said of the competing efforts. 'It means that software that businesses use can be simpler because they will have fewer specifications for messaging. People can devote more time on creating really new software to make their businesses better, rather than being mired down in the details of supporting yet another messaging protocol.' OASIS, which includes IBM, Sun Microsystems, BEA Systems and others, has been working with a United Nations organization to develop a blueprint for businesses in different industries to use XML. The ebXML effort is aimed at allowing companies that use older data-exchange technology, called Electronic Data Interchange, or EDI, to start using more flexible and potentially cheaper XML-based software over the Internet." See the ebXML announcement.

  • [February 22, 2001] "XMLTrans: a Java-based XML Transformation Language for Structured Data." By Derek Walker, Dominique Petitpierre, and Susan Armstrong (ISSCO, University of Geneva, 40 blvd. du Pont d'Arve CH-1201 Geneva 4, Switzerland). Abstract: "The recently completed MLIS DicoPro project addressed the need for a uniform, platform-independent interface for accessing multiple dictionaries and other lexical resources via the Internet/intranets. Lexical data supplied by dictionary publishers for DicoPro was supplied in a variety of SGML formats. In order to transform this data to a convenient standard format (HTML), a high level transformation language was developed. This language is simple to use, yet powerful enough to perform complex transformations not capable with other standard transformation tools. XMLTrans provides rooted/recursive transductions, similar to transducers used for natural language translation. XMLTrans is written in standard Java and is available to the general public... The goal of DicoPro [April 1998 to Sept 1999] was the development of a uniform, cross-platform client-server tool to enable translators and other language professionals connected to an intranet to consult dictionaries and related lexical data from multiple sources. Dictionary data was supplied by participating dictionary publishers in a variety of proprietary formats. One important DicoPro module was a transformation language capable of standardizing the variety of lexical data. The language needed to be easy enough for a nonprogrammer to master, yet powerful enough to perform all the necessary transformations to achieve the desired output. We initially looked at available SGML transformation tools, XML transformation tools, and finally decided to develop our own. We began to examine available XML transduction resources. The budding standard at the time that our project began, XSL, was still not mature enough to rely on as a core for the language. 
In addition XSL does not provide for rooted, recursive transductions needed to convert the complex data structures found in DicoPro's lexical data. Edinburgh's Language Technology Group has produced a number of useful SGML/XML manipulation tools. Unfortunately none of these matched our specific needs. For instance, sgmltrans does not permit matching of complex expressions involving elements, text, and attributes... Given the large number of XML APIs developed for Java, this seemed to be a promising venue. The API model which best suited our needs was the Document Object Model (DOM) with an underlying SAX parser. This provides the core of the XMLTrans parser. The transducer was designed for the processing of large XML files, keeping only the minimum necessary part of the document in memory at all times. In effect, XMLTrans processes lexical entries from a dictionary that are independent of each other and that have a few basic formats. It takes as input a well-formed XML file and a file containing transformation rules and gives as output the application of the rules on the input file. [In this paper] We begin with a simple example to illustrate the kinds of transformations performed by XMLTrans. Then we introduce the language concepts and structure of XMLTrans rules and rule files... The XMLTrans transducer was used to successfully convert all the lexical data for the DicoPro project. There were 3 bilingual dictionaries and one monolingual dictionary totalling 140 Mb in total (average size of 20 MB), each requiring its own rule file (and sometimes a rule file for each language pair direction). Original SGML files were preprocessed to provide XMLTrans with pure well-formed XML input. Inputs were in a variety of XML formats, and the output was HTML. Rule files had an average of 178 rules, and processing time per dictionary was approximately 1 hour... The code is portable and should be runnable on any platform for which a Java runtime environment exists.
A free version of XMLTrans can be downloaded from http://issco-www.unige.ch/projects/dicopro_public/index.html. See other details. [cache]
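
The memory-conscious processing described in the abstract, where independent dictionary entries are handled one at a time, can be sketched in Python. This is not XMLTrans or its rule language: the element names `entry`, `hw`, and `def`, the `entries_to_html` helper, and the shape of the HTML output are all assumptions made up for illustration.

```python
import io
import xml.etree.ElementTree as ET
from html import escape

def entries_to_html(source, entry_tag="entry"):
    """Yield one HTML fragment per dictionary entry, streaming the input.

    Element names (entry, hw, def) are hypothetical, not DicoPro's format.
    """
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != entry_tag:
            continue
        headword = escape(elem.findtext("hw", default=""))
        senses = "".join(
            f"<li>{escape(d.text or '')}</li>" for d in elem.findall("def")
        )
        yield f"<p><b>{headword}</b></p><ol>{senses}</ol>"
        # Discard the finished entry's content so it is not kept in memory.
        elem.clear()

SAMPLE = b"""<dictionary>
  <entry><hw>chat</hw><def>cat</def></entry>
  <entry><hw>chien</hw><def>dog</def><def>hound</def></entry>
</dictionary>"""

fragments = list(entries_to_html(io.BytesIO(SAMPLE)))
```

Because each entry is converted and cleared as soon as its closing tag is seen, the working set stays close to the size of a single entry rather than the whole dictionary file.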

  • [February 22, 2001] "K2/Kleisli and GUS: Experiments in Integrated Access to Genomic Data." By Susan Davidson, Jonathan Crabtree, Brian Brunk, Jonathan Schug, Val Tannen, Chris Overton, and Chris Stoeckert. In IBM Systems Journal, March 2001. 23 pages, with 76 references. "The integration of heterogeneous data sources and software systems is a major issue in the biomedical community and several approaches have been explored: linking databases, 'on-the-fly' integration through views, and integration through warehousing. In this paper we report on our experiences with two systems that were developed at the University of Pennsylvania: an integration system called K2, which has primarily been used to provide views over multiple external data sources and software systems; and a data warehouse called GUS which downloads, cleans, integrates and annotates data from multiple external data sources. Although the view and warehouse approaches each have their advantages, there is no clear 'winner'. Therefore, users must consider how the data is to be used, what the performance guarantees must be, and how much programmer time and expertise is available to choose the best strategy for a particular application. Our experiences also point to some practical tips on how updates should be published by the community, and how XML can be used to facilitate the processing of updates in a warehousing environment... Conclusions: Both the K2/Kleisli view and GUS warehouse strategies have proven useful for genomic applications within the Center for Bioinformatics. Kleisli was used for some time to implement several web-based, parameterized queries that were written for specific user groups. 
Users could query views that integrated many important on-line data sources (such as GenBank, GSDB, dbEST, GDB, SRS-indexed databases, KEGG and EcoCyc) and application programs (such as BLAST) by supplying values for parameters; the data sources and application programs were then accessed on demand to provide answers to the parameterized queries... it is now up to individual data source owners or third parties to modify their data sources or to provide wrappers to their data sources so that they conform to these specifications. However, we believe that the standardization of all genomic data sources is an unrealistic goal given their diversity, autonomy, and rapid schema changes. This is evidenced by the fact that interest in CORBA seems to have waned over the past year and to have been superseded by XML. As a universal data exchange format, XML may well supplant existing formats such as EMBL and ASN.1 for biological data, and as such will simplify some of the lower-level driver technology that is part of K2/Kleisli and other view integration systems. There is an abundance of freely available parsers and other software for XML, and heavy industry backing of XML. The question is whether it will do more than function as an exchange format. It may, in fact, become a basis for view integration systems by using one of the query languages developed for semistructured data or XML. However, before it becomes a good basis for an integration system we believe that several things must happen: (1) Some kind of schema technology must be developed for XML. DTDs function as only a rough schema for XML. For example, there are no base types other than PCDATA (so the integer 198 cannot be distinguished from the string "198"), no ability to specify keys or other constraints on the schema, and the reliance on order makes representing tuples (in which the order of attributes is unimportant) tricky.
The recent XMLSchema proposal addresses many of these problems by providing a means for defining the structure, content and semantics of XML documents. (2) An agreement must be reached on the use of terms, or there must be a mechanism to map between terms. The discussions in this paper have sidestepped one of the most difficult parts of data and software integration: semantic integration. Semantic integration focuses on the development of shared ontologies between domains of users, and on the resolution of naming conflicts (synonyms and homonyms). In the TAMBIS project, although Kleisli was used for the low-level (syntactic) integration, a major effort of the project was to develop an ontology through which researchers could navigate to find information of interest. The view layer K2MDL in K2 aids in semantic integration by providing a means for mapping between concepts in different databases, and has proven extremely useful in the integration projects for SmithKline Beecham. For XML to be useful in data integration, either the usage of tag labels must be uniform across the community, or a semantic layer must be available. (3) A standard for XML storage must be adopted. Several storage techniques are currently being explored based on relational and object-oriented technologies; new technologies are also being considered. However, there is no agreement on what is best. Warehouse developers must currently therefore provide their own mapping layer to store the result of an integration query. These issues are of current interest in the database research community, and within the near future we expect to see preliminary solutions." [cache]

  • [February 22, 2001] "XML on the Move." By Edd Dumbill. [Trip Report.] February 21, 2001 "On the first day of XML DevCon Europe in London, England, speakers highlighted the growth of XML in its three years of existence. Henry Thompson from the University of Edinburgh (and zealous editor of the W3C's XML Schema specification) noted in his opening keynote that XML had grown from one specification to a family of technologies. He focused on the emerging centrality of the XML Infoset and XML Schema. David Orchard of Jamcracker taught a session on web services, XML, and UDDI. Despite XML's growth in the area of program-to-program communication, there's still much to build..."

  • [February 22, 2001] "Keys for XML." By Peter Buneman, Susan Davidson, Wenfei Fan, Carmem Hara, and Wang-Chiew Tan. Presentation prepared for WWW10 (2001). With 21 references. "We discuss the definition of keys for XML documents, paying particular attention to the concept of a relative key, which is commonly used in hierarchically structured documents and scientific databases... If XML documents are to do double duty as databases, then we shall need keys for them. In fact, a cursory examination of existing DTDs reveals a number of cases in which some element or attribute is specified -- in comments -- as a 'unique identifier'. Moreover a number of scientific databases, which are typically stored in some special-purpose hierarchical data format which is ripe for conversion to XML, have a well-organized hierarchical key structure. Various forms of key specification for XML are to be found in the XML standard, XML Data, and XML Schema. Through the use of ID attributes in a DTD, one can uniquely identify an element within an XML document. However, it is not clear that ID attributes are intended to be used as keys rather than internal 'pointers'. For example, ID attributes are not scoped. In contrast to keys, they are unique within the entire document rather than among a designated set of elements. As a result, one cannot, for example, allow a student (element) and a person (element) to use the same SSN as an ID. Moreover using ID attributes as keys means that we are limiting ourselves to unary keys and, of course, to using attributes rather than elements. Finally, one can specify at most one ID attribute for an element type, while in practice one may want more than one key. XML Data introduces a notion of keys explicitly. However, its keys can only be specified in types and moreover, can only be defined for element types rather than for certain collections of elements. XML Schema has a more elaborate proposal, which is the starting point of this paper.
The proposal extends the key specification of XML Data by allowing one to specify keys in terms of XPath expressions. There are a number of technical problems in connection with XPath. XPath is a relatively complex language in which one can not only move down the document tree, but also sideways or upwards, not to mention that predicates and functions can be embedded as well. The main problem with XPath is that questions about equivalence or inclusion of XPath expressions are, as far as the authors are aware, unresolved; and these issues are important if we want to reason about keys as we do in relational databases. Yet until we know how to determine the equivalence of XPath expressions, there is no general method of saying whether two such specifications are equivalent. Another technical issue is value equality. XML Schema restricts equality to text, but the authors have encountered cases in which keys are not so restricted. A more detailed discussion can be found in section 7.1. However, the main reason for writing this note is that none of the existing key proposals address the issue of hierarchical keys, which appear to be ubiquitous in hierarchically structured databases, especially in scientific data formats. A top-level key may be used to identify components of a document, and within each component a secondary key is used to identify sub-components, and so on. Moreover, the authors believe that the use of keys for citing parts of a document is sufficiently important that it is appropriate to consider key specification independently of other proposals for constraining the structure of XML documents." A related paper: "Reasoning About Keys for XML." University of Pennsylvania. Technical Report MS-CIS-00-26, 2000; local copy. Also in PDF and Postscript. [cache]
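
The difference between a document-wide ID and a relative (hierarchical) key can be illustrated with a small Python check. It is a sketch only: the `find_key_violations` helper, its `context_path`, `target_path`, and `key_field` parameters, and the sample library/book/chapter document are assumptions for illustration, not the paper's formal notation.

```python
import xml.etree.ElementTree as ET

def find_key_violations(root, context_path, target_path, key_field):
    """Return (context_index, key_value) pairs where a relative key is violated.

    Within each element matched by context_path, the elements matched by
    target_path must have distinct values for the key_field attribute.
    """
    violations = []
    for index, context in enumerate(root.findall(context_path)):
        seen = set()
        for target in context.findall(target_path):
            value = target.get(key_field)
            if value in seen:
                violations.append((index, value))
            seen.add(value)
    return violations

DOC = ET.fromstring("""
<library>
  <book isbn="A"><chapter number="1"/><chapter number="2"/></book>
  <book isbn="B"><chapter number="1"/><chapter number="1"/></book>
</library>
""")

# Chapter numbers may repeat across books but must be unique inside one book.
violations = find_key_violations(DOC, "book", "chapter", "number")
```

Here chapter number "1" appears in both books without conflict, which a document-wide ID attribute would forbid; only the repeated number inside the second book is reported.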

  • [February 22, 2001] "[Draft/experimental] Formal Specification of the RDF Syntax." By Brian McBride (HPLabs). February 22, 2001. "[This is] an experiment in doing a formal specification of the RDF Syntax. The goal is to more formally define the triples that any given RDF XML represents. The idea is to annotate the grammar with attributes. Each production in the grammar takes attributes as arguments and can return attributes as a result. A production emitTriple, which always succeeds but has the side effect of emitting a triple is introduced. There is a trivial transformation from the annotated grammar to an equivalent XSLT transform, thus in effect enabling an executable specification." ["There was a suggestion on the list last summer that transformation grammars could be used to formally specify the translation of RDF XML to triples. A quick search around the web revealed that such grammars were proposed in natural language processing, but I didn't find anything immediately useful. I liked the idea of formal definition of the transformation. I also thought it would be good to have an 'executable' specification, i.e. one that we could execute against test cases and would spit out the triples. My first thought was to use XSLT to define the transform, but that turned out pretty unreadable. My second attempt has been to produce an attribute grammar for RDF. It turns out that an attribute grammar works reasonably well, and there is a simple, possibly automatable, way of turning the attribute grammar into an XSLT transform. The attribute grammar can be found in the header comment, and the XSLT is executable, though probably buggy."] See "Resource Description Framework (RDF)."
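
The idea of an executable mapping from RDF XML to triples can be sketched in Python. This is neither the attribute grammar nor the XSLT transform from the posting: the `emit_triples` function is a hypothetical helper that covers only a tiny subset of the syntax (rdf:Description elements with rdf:about, and property elements holding literal text or an rdf:resource reference), and the sample document is invented for illustration.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

def emit_triples(rdf_xml):
    """Return (subject, predicate, object) triples for a tiny RDF/XML subset.

    Handles only rdf:Description elements with rdf:about, whose children are
    property elements carrying either rdf:resource or literal text.
    """
    triples = []
    root = ET.fromstring(rdf_xml)
    for description in root.iter(f"{{{RDF_NS}}}Description"):
        subject = description.get(f"{{{RDF_NS}}}about")
        for prop in description:
            # ElementTree tags look like "{namespace}local"; the predicate URI
            # is the namespace and local name joined together.
            predicate = prop.tag.replace("{", "").replace("}", "")
            obj = prop.get(f"{{{RDF_NS}}}resource")
            if obj is None:
                obj = (prop.text or "").strip()
            triples.append((subject, predicate, obj))
    return triples

SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
                    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://example.org/doc">
    <dc:title>Example</dc:title>
    <dc:relation rdf:resource="http://example.org/other"/>
  </rdf:Description>
</rdf:RDF>"""

triples = emit_triples(SAMPLE)
```

Running such a function against a set of test documents is one way to get the 'executable' check against test cases that the posting describes, although a full treatment would need the many abbreviated forms this sketch ignores.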

  • [February 22, 2001] "Getting the Tags In: Vendors Grapple with XML-Authoring, Editing and Cleanup. [XML Authoring: Reviewing the Latest Tools.]" By Liora Alschuler. In The Seybold Report on Internet Publishing Volume 5, Number 6 (February 2001), pages 1, 5-10. [NB. A summary does not do justice to this detailed, informative review. -rcc] "Publishers of all types are seeking ways to encode their content in XML, and vendors are responding with a variety of specialized tools that run the gamut, from structured authoring to post-production data conversion. We survey the options and expand on three products that work with Microsoft Word, the leading text-editing tool in publishing today. Since the first structured editing applications surfaced some 15 years ago, the technology for adding tags to documents has struggled to earn its place alongside WYSIWYG word-processing applications. Structured authoring still hasn't reached the mainstream, but the rise in XML use on the Web and increasing demand for cross-media source files are gradually reshaping this market. Those who want to implement XML-authoring can choose from a mixed bag of specialized tools, ranging from structured input to post-production conversion. Our overview starts with updates on XML editing tools from the leading established players: Arbortext and SoftQuad. We then review the two top Microsoft Word plug-ins for creating valid XML as you type in the popular word processor: HyperVision's WorX SE and I4I's S4. Lastly, reviewing the options for post-production conversion, we take our first look at an exciting Word plug-in designed for manuscript production editors: Inera's eXtyles... The field of vendors offering XML word processors has shrunk to -- and a handful of straggling wannabes. Both of the veteran leaders reported healthy sales growth last year... WorX SE adds administration tools: WorX SE is a Microsoft Word plug-in introduced in spring 2000.
The plug-in application preserves the interface and functionality of Word while adding real-time, interactive structure conversion and feedback. Users see dynamic document structure displayed in a graphic tree, as in a native structured editor, indicating which parts are valid according to either an XML DTD or an XDR schema... I4I (Infrastructures for Information) has released a product based on its long-standing development toolkit, S4/Text, which the company calls a 'tagless editor for XML.' Like WorX SE, S4/Text is a Word add-on that operates on keystroke and mouse capture through the Word API. In real time, it validates input against an XML DTD and guides users so that they create valid XML using the Word user interface... Inera aids production editors with eXtyles: In contrast with structured-authoring tools, eXtyles from Inera Corporation is not a writing tool, but an editorial and production tool designed to clean up and apply presentation and editorial styles to Word documents. EXtyles does three types of processing: it inserts publication-specific meta-data, cleans up noise such as double line-endings, hard hyphenations and misspellings, and exports tagged XML files. EXtyles can export well-formed XML documents or XML documents that have been validated against a DTD... [Conclusion:] This market is morphing rather than exploding. Buyers are demanding easy-to-use interfaces for creating reusable content, pushing even the traditional structured editor vendors to accommodate the Word interface, while Microsoft, unable to make or buy a satisfactory bolt-on to Word, licenses XMetaL. The new Word add-ons are demanding serious evaluation, but, for now at least, the perception that ultimately data conversion and WYSIWYG interfaces don't scale and don't satisfy users in mature applications leaves this market searching for solutions that don't yet exist."
Related: see also the article by David Mertz which provides an "up-to-date review of a half-dozen leading XML editors."

  • [February 22, 2001] "NewsML Lays Groundwork for Next-Generation News Systems. [Report from the Edge.]" By Aimee Beck and Luke Cavanagh. In The Seybold Report on Internet Publishing Volume 5, Number 6 (February 2001), pages 15-18. ['NewsML Sets Stage For Future News Systems. Passed by the IPTC in October, NewsML -- the XML-based header for multipart, multimedia news feeds -- is a dramatic step forward from the wire service standards that have been in place for decades. We outline the new standard (and its text counterpart, NITF), find out how users are gearing up to support them and gauge vendor readiness.'] "Last fall, wire services and other leading news publishers approved NewsML, an XML-based standard for transmitting news. Featuring generic markup in both header and text content and support for multimedia news feeds, NewsML paves the way for new generations of editorial systems, ones designed for digital media. Major players in the news wire industry spent more than eight years developing the News Industry Text Format (NITF), a successor to the venerable ANPA wire-story format. Originally developed in SGML and then reshaped as XML about 18 months ago, NITF was on its way to being adopted when Reuters took the idea a step further and conceived NewsML. NewsML completed a short trip through the standards gauntlet and was ratified by the IPTC this past October. With the consensus-demanding standardization process finally over, it's time to assess the implications of NewsML: Who will it affect, and how soon? What benefits will it bring? What vendors will support it? What influence will the spec have outside of the news industry? With these questions in mind, we set out to see if NewsML was gaining any traction in the field. Aside from early adopters, most notably Reuters, we found few examples of NewsML projects underway. However, change is afoot. Awareness of and demand for XML is rising rapidly in the news industry, and vendors are responding. 
The transition to an XML-enabled editorial environment will take years to permeate the entire global news industry, but we believe its impact ultimately will be akin to the change that the AP Leafdesk brought to U.S. newspapers in the early 1990s. Once papers had a digital wire-photo receiver, electronic pagination suddenly changed from pipedream to obtainable goal. In the same way, once papers get their hands on a system that can process XML-encoded multimedia feeds, they'll be a giant step closer to delivering well-coded files to their webmasters and in a much better position to deliver their own multimedia news coverage to consumers... [Conclusion:] The utility of XML, both in the header and the text itself, has become apparent, not only to newspapers but to a wide array of Web publishers and media companies handling news. Reuters has seeded the market with an open-source toolkit to encourage its customers to adopt NewsML, but, for NewsML to take off, the rest of the newswires and system vendors will have to support it as well. The time is right for that to happen. The blessing of NewsML by the IPTC sets a global standard for how news will be delivered in the future. With an approved spec in hand, vendors have the blueprint they need to begin implementations. Few Web content-management systems currently feature built-in support for wire services, but adding such support would mesh with their need to support XML text and variable metadata. Newspaper editorial system suppliers that have been dragging their heels on the XML issue must now catch up, or risk losing out to those that anticipated this change." See "NewsML and IPTC2000."

  • [February 22, 2001] "Journeying to the XML Promised Land. [Letter from the Editor.]" By Mark E. Walter, Jr. In The Seybold Report on Internet Publishing Volume 5, Number 6 (February 2001), page 2. "The Extensible Markup Language (XML) has been a lightning rod in publishing since its debut at the annual SGML conference in the fall of 1996. It galvanized the entire industry, and within five years, the promise that the Web community would adopt a simplified form of SGML more readily than the old standard has been fulfilled. All publishers have to do to reap the riches of the Promised Land is convert to an XML-based process. Why then, are so many firms wandering the desert in search of such a process? Is it the inability to change people, or the lack of decent tools? Wherever blame lies, one thing is certain: the tools are changing. That's why in this month's cover story we look at the options for authoring and editing XML documents. Those options include not only established vendors, such as Arbortext and SoftQuad, but also a handful of lesser-known firms that we believe our readers will want to check out. Another factor that will drive XML implementation will be its adoption by coalitions of key vendors or users. In the news business, for example, the ratification of NewsML by the wire services will be the first major update to wire-copy headers in decades, a change that will impact virtually every newspaper-system supplier and wire-service customer -- from newspapers to radio stations to Web portals..."

  • [February 22, 2001] "Ecosystems' The Environment: Product Development Interface For XML Content Management." By Mark Walter. In The Seybold Report on Internet Publishing Volume 5, Number 6 (February 2001), pages 11-14. ['Cool product-development interface for XML content management. An application layer on top of Astoria/Eclipse sets a new user interface benchmark for XML-based collaborative editing, production and new product development in reference, book, corporate and education applications.'] "The Ecosystems Environment is an application layer that sits on top of Chrystal Software's Astoria or Eclipse SGML/XML-aware document-management systems that have Web-based access for participants. While Astoria provides the basic library facilities common to content-management systems -- check-in/check-out for collaborative authoring, versioning and so forth -- it lacks a user interface for many publishing-specific functions. The Environment provides this user interface. The defining feature of The Environment is 'LiveOutline,' a tool for building new documents and products. LiveOutline manages the document assembly and modification process by creating a complex web of managed elements. LiveOutline can then be used independent of the source SGML/XML content to update, change or compare the evolution of the content by tracking 16 different states for each SGML/XML element... The Environment provides a built-in SGML/XML viewer that shows you the content if you want, as you browse the repository, without having to export it to a word processor. If you are in the Visual Difference feature, the viewer will display the structural and content changes redlined. That's important because in a collaborative XML-editing environment documents may be shredded to a level of granularity that makes it more difficult to know exactly which part of the document to check out, or if it has changed or been modified since the last time you referenced the document...
The Environment furnishes an essential user interface, and, after seeing it several times, we view it as an essential component for any Astoria customer, and -- if Ecosystems is able to hook it up to other repositories -- for users of other high-end XML-aware content-management systems as well. While it may not be the product that brings XML-based content management to the masses, The Environment sets a new user interface standard among high-end, component-based, content-management systems in reference and textbook editorial settings..."

  • [February 22, 2001] "XQuery: Reinventing the Wheel?" By Evan Lenz (XYZFind Corp.). February 2001. "There is a tremendous amount of overlap in the functionality provided by XQuery, the newly drafted XML query language published by the W3C, and that provided by XSLT. The recommendation of two separate languages, one for XML query and one for XML transformations, if they don't have some sort of common base, may cause confusion as to which language should be used for various applications. Despite certain limitations, XSLT as it currently stands may function well as an XML query language. In any case, the development of an XML query language should be informed by XSLT... The proliferation of XML as a data interchange format and document format is creating new problems and opportunities in the field of information retrieval. While much of the world's information is housed in relational database management systems, not all information is able to fit within the confines of the relational data model. XML's hierarchical structure provides a unified format for data-centric information, document-centric information, and information that blurs the distinction between data and documents. Accordingly, a data model for XML could provide a unified way of viewing information, whether that information is actually stored as XML or not. Access to, extraction from, and manipulation of this information together comprise the problem of an XML query language. This paper explores some issues, advantages, and disadvantages of using XSLT as a query language for XML. It attempts to show that the basic building blocks of an XML query language can be found in XSLT, by way of an introduction to and comparison with XQuery, the newly drafted XML query language published by the W3C XML Query Working Group. This paper is not a proposal for a specific implementation. 
[Conclusion:] In the long run, the XML Query Working Group is probably doing the right thing in first formally defining the semantics of the query language. To attain the sophistication of query optimization that we currently have with SQL, an XML query language's underlying mathematics must be well understood. But these semantics should not be developed in a vacuum. However well understood a particular set of semantics is, we will not truly understand which set of semantics is useful in an XML query language until people have built real applications involving XML query. This is the reason why XSLT should be seriously addressed: it is the most widely used and implemented XML query language yet." Note 'This paper is adapted from what I'll be presenting on 'XSLT as a query language' at XSLT-UK.' See also the related posting on XQuery. On XSLT-UK: see the events listing. Related references in "XML and Query Languages." [cache]

  • [February 22, 2001] "XML-Deviant: Time to Refactor XML?" By Leigh Dodds. From XML.com. February 21, 2001. ['The growing interdependency between XML specifications is causing concern among XML developers -- is this just a case of sensible reuse, or are we creating a dangerously tangled web of standards?'] "The W3C has been particularly busy over the last few weeks, releasing a flurry of new Working Drafts. While welcoming this progress, some members of XML-DEV have expressed concern over the new direction that these specifications have taken. Intertwined Specifications: A succession of new Working Drafts have appeared on the W3C Technical Reports page. The list includes requirements documents for XSLT 2.0, XPath 2.0 and XML Query, a data model and an algebraic description for XML Query, and a resurrection of the XML Fragment Interchange specification. The most striking aspect of these specifications is not their sudden appearance but, rather, their mutual interdependence: (1) XSLT 2.0 must support XML Schema datatypes; (2) XPath 2.0 must support the regular expressions defined in XML Schema datatypes, as well as the XML Schema datatypes; (3) XML Query and XPath 2.0 will share a common data model; (4) XML Query may itself use XML Fragments; (5) XML Query must support XML Schema datatypes; (6) Both XPath and XML Query must be modeled around the Infoset, and particularly the 'Post Schema Validation Infoset'; (7) XML Schema itself depends on XPath to define constraints. As this list shows, dependence on the XML Schema datatypes and the Post Schema Validation Infoset are particularly prominent. This has produced a few furrowed brows on the XML-DEV mailing list... Refactoring and iteration have become common features in many development methodologies. Extreme Programming is an example.
Acknowledging that it's hard to get things right the first time, and allowing changes in requirements, is fundamental to complex development processes, including the XML standards process that many are keen to see take shape."

  • [February 22, 2001] "Corporate Users Cool Toward XML for Supply Chains." By Michael Meehan. In ComputerWorld (February 19, 2001). "XML may be the future technology underpinning of online business-to-business trading, but many companies are in no hurry to get there. At the EC Forum here, a number of companies with large electronic data interchange (EDI) systems acknowledged that they're only in the investigation phase for using XML tags in their electronic purchases and sales. Many attendees at the conference echoed that hesitance about XML, noting that established corporations for the most part already have working supply chains. Amy Hedrick, senior e-business integration analyst at AMR Research Inc. in Boston, said companies aren't going to abandon 15 years of EDI development to move to a system reliant upon XML, especially since there are no widely used standards for the data-tagging language and more than 100 variants of it. XML has also seen slow adoption in certain markets. Chris Maxwell, an e-commerce systems manager at Dallas-based Pepsico Inc., said the food and beverage world is still rooted in EDI transactions. General Electric's Global eXchange Services (GXS) division hosted today's event. Last year, GXS took the step from being an EDI partner with 100,000 companies toward creating an XML-based electronic public marketplace. GXS CEO Harvey Seegers said the migration has been slow, and he expects that it will continue to be slow. He estimated that about 1% of the transactions GXS facilitated last year were of the browser-based XML variety. GXS plans to support both established EDI networks and upstart XML initiatives -- and its executives remain split as to when XML will prove a solid return on investment for businesses with legacy systems and defined supply chains..."

  • [February 21, 2001] "A Practical Comparison of XSLT and ASP.NET." By Chris Lovett. From Microsoft MSDN, 'Extreme XML' Column. February 19, 2001. ['Columnist Chris Lovett uses MSDN Online's table of contents to compare XSLT and ASP.NET, complete with pros and cons for each approach.'] "People are using XML to manage the publishing process for large, complex Web sites. Examples include an interview with Mike Moore on how www.microsoft.com has used XML to manage its complex needs and 'Streamlining Your Web Site Using XML', a high-level overview of how companies such as Dell use XML to streamline their entire publishing process. The questions I am getting from a lot of customers are: Should they dump XML/XSL and go write a bunch of C# ASP.NET code instead? If they have already heavily invested in XSLT, does ASP.NET provide any value to them? Is there some middle ground where they can get the best of both worlds? If so, what are the strengths and weaknesses of each technology and when should people use them? I will drill in on a specific example so I can compare and contrast XSLT versus ASP.NET. The example is the MSDN TOC. MSDN found that XML was ideal for managing its large table of contents (TOC). The contents of this TOC come from hundreds of groups around the company. The XML format provided a way to glue together disparate back-end processes that would have been much harder to change. XML/XSL also made it possible to reach different browsers on different platforms. Given that developers are finding that XML is the best way to manage the back-end data that goes into a Web site, let's take a look at how you take this XML data and turn it into HTML. First I will look at the ASP.NET solution... So which solution performs better? On my machine the XSLT version gets 33 requests per second (using MSXML 3.0). The C# version gets about 120 requests per second. A preliminary test of a .NET Beta 2 version of XslTransform does about 47 requests per second.
So clearly the C# code is faster. This is understandable, given that the C# code is hand-optimized XmlNavigator code that minimizes the number of XPath evaluations. However, XSLT can also be performed on Internet Explorer 5.x clients, although this particular style sheet requires MSXML 3.0, which will not be integrated until a future version of Internet Explorer. When XSLT is offloaded to the client, the server is then just publishing static HTML pages. These HTML pages still have to fetch the XSLT style sheet, but this gets cached on the client side. Internet Information Server 5.0 can do around 1,000 static HTML pages per second on my machine, depending on their size... There is no clear winner. There are pros and cons to each solution. Developers will have to write application-specific code anyway (such as my XmlMenuResolver class), so I could certainly see the argument for staying in a C#-only environment and saving on the XSLT training costs. On the other hand, if customers have already invested heavily in XSLT, as www.microsoft.com has, and they have clear business value already derived from that, then integrating the XSLT solution into an ASP.NET environment as I have shown here can provide the best of both worlds."
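
    The hand-coded side of the comparison above can be pictured with a small sketch. It is not Lovett's C# code: it is a Python illustration that assumes a hypothetical TOC format of nested <node> elements with title and href attributes, and the helper name render_nodes is invented for this example. It walks the XML tree directly and emits nested HTML lists, which is the kind of work a stylesheet would otherwise do.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical TOC document; the element and attribute names are assumptions.
TOC_XML = """
<toc>
  <node title="XML" href="/xml">
    <node title="XSLT" href="/xml/xslt"/>
    <node title="DOM &amp; SAX" href="/xml/dom"/>
  </node>
  <node title=".NET" href="/net"/>
</toc>
"""

def render_nodes(parent):
    """Recursively render the child <node> elements of parent as a nested HTML list."""
    children = parent.findall("node")
    if not children:
        return ""
    items = []
    for node in children:
        link = '<a href="{}">{}</a>'.format(
            escape(node.get("href", "#"), quote=True),
            escape(node.get("title", "")),
        )
        # Each list item holds its own link followed by any nested sub-list.
        items.append("<li>" + link + render_nodes(node) + "</li>")
    return "<ul>" + "".join(items) + "</ul>"

html_toc = render_nodes(ET.fromstring(TOC_XML))
print(html_toc)
```

    Direct traversal like this avoids stylesheet evaluation altogether, which is one plausible reading of why the hand-optimized version in the article ran faster, at the cost of moving presentation logic into application code.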

  • [February 21, 2001] "Microsoft Windows XP: What's in It for Developers?" By Kyle Marsh, Dave Massy, and John Boylan. MSDN Library. February 2001. "This article explores some of the features of Microsoft Windows XP and looks at the effect these changes have on software developed for Windows. The discussion focuses on the new Windows XP visuals and ComCtl32, side-by-side component sharing, and fast user switching... With Windows XP, there's an infrastructure to support assemblies and isolated applications (both COM+ and Win32). A code change should not be required to get at side-by-side assemblies from Win32 applications. Applications can use the latest system assemblies without global impact. In short, isolated applications are valuable because they are more reliable. They are built and shipped with all needed components and are not affected by changes that other applications make. Isolated applications use a manifest, which is an XML file containing information that self-describes an assembly or an application. All binding and activation metadata, such as COM classes, interfaces, and type libraries, is now stored in the manifest, rather than the registry. There are two types of manifest files: applications manifests, which describe isolated applications, and assembly manifests, which describe individual assemblies..."

  • [February 21, 2001] "Chemical Giant Embraces XML For Direct Links To Suppliers. [E-Business Applications.]" By Michael Alexander. In InternetWeek (February 19, 2001), pages 23, 26. "Eastman Chemical Company has completed a yearlong project and met its goal of setting up system-to-system connections to 15 of its customers and suppliers. A key to the effort has been an emerging XML-based standard, called eStandard, which has been developed by the chemical industry. 'What is really interesting to us about XML is not only does it enable more robust intersystem connections but XML also can be used to paint browser screens, update databases, send data to printers and other capabilities,' said Bill Graham, integrated direct program lead at Eastman. 'With one protocol for data descriptions, we have multiple avenues to reach multiple trading partners.' For example, Eastman, with annual revenue of $4.6 billion, is evaluating building an extranet based on XML for dozens of its small suppliers, Graham said. At least 80 percent of transactions between chemical companies are conducted under contract, and putting transactions online saves time and money. Eastman said it expects that by 2002 company-to-company links will account for half of its e-business revenue, with the other half divided between extranets and B2B marketplaces. Eastman used B2B integration modules from webMethods' enterprise application integration software based on XML to link its SAP R3 application to the ERP systems of its 15 trading partners. WebMethods tools are used for secure enveloping, message delivery and other functions necessary to guarantee secure connections. The chemical company has a minority stake in webMethods, as well as investments in online marketplaces OneChem, e-Chemical and Shipchem. Eastman also has set up trading links with five customers through OneChem and Envera exchanges. 
Though Eastman favors setting up direct-system links, it has contracts with Koch Chemical, Vulcan Chemical and other customers that use those exchanges, Graham said. 'In this case, we connect once and gain direct transactional access to both suppliers,' he added. The XML pilot program, which Eastman concluded in January, was designed to prove the feasibility of setting up secure company-to-company links using XML and to work out interoperability issues in each partner's infrastructure and business processes. The time it takes to process purchase orders fell from a week and a half to a matter of minutes or seconds, Graham said." See "Eastman Chemical Company and webMethods Successfully Launch Business-To-Business Integration (B2BI) Solution."

  • [February 21, 2001] "Sun Eyes New Auction Application. XML technology makes it easier to put excess inventory up for bid." By Ted Kemp. In InternetWeek (February 19, 2001), pages 23, 26. "As it enters its second year of selling products on popular auction sites, Sun Microsystems is mulling an upgrade to the service it uses to put items up for bid. Sun began selling workstations, enterprise servers, workgroup servers and software in December 1999 on such auction sites as eBay. Auctions help to clear older inventory and provide an outlet for products lacking state-of-the-art technology in terms of, say, microprocessors or DVD systems. Sun is considering migrating to a service that would manage an XML connection from Sun's product database to one or more online auction sites, and a second XML link from the auctions' transaction engines back to a Sun checkout page or fulfillment system. The application -- a sort of auction middleware -- is hosted by GoTo Auctions, a unit of search engine GoTo.com. Sun now uses a free service that handles such simple tasks as image-hosting; the more complex GoTo app is built around spidering technology, which locates the pages on auction sites that let sellers enter product data. It then fills out virtual forms and registration pages with product and pricing data that the seller enters into the system through a secure Web interface. A still more complex option would crunch historical selling data, product by product, and give Sun guidance on the best ways to spread goods across auctions. The new system can work with several auction sites, though Rublowsky said Sun sells almost exclusively on eBay. GoTo's fee ranges from 3 percent to 10 percent of gross sales, depending on the complexity of the app that the client selects and the products put up for bid."

  • [February 21, 2001] "XML Puts the Pedal to the Metal. [Real World XML Speeds Shipping. Web and ePublishing.]" By Lowell Rapaport. In Imaging and Document Solutions Volume 10, Number 3 (March 2001), pages 44-50. ['Extensible markup language is widely touted as the bedrock of future supply chain and data interchange initiatives. Here's how two shipping companies are putting XML into real-life applications.'] "This magazine has lavished a great deal of attention on content management - systems designed to assemble, manage, recombine and deliver information on intranets and the Internet. XML's style and presentation advantages play an important role in these types of solutions. This article addresses XML's advantages for data interchange, which are driving a wave of cybermarket and supply chain initiatives that will transform the way many companies do business. (1) R+L wanted to minimize the processing downtime, and it found an answer in a suite of solutions from GOSOF.com (Go Save on Freight), Ocala, FL. GOSOF.com is a specialized application service provider for the freight industry. Among its offerings are document image capture, forms processing, data warehousing, and support for mobile workstations that transmit and receive bills of lading via cellular network. GOSOF.com's goal is to shorten the communication distance between shipping companies like R+L and its customers. GOSOF.com set up an XML-based data collection system that lets shippers generate and submit bills of lading electronically, either directly in XML, through their ERP systems, or through an interactive Web site powered by Dialog Server software, from Action Point, San Jose, CA. If a shipper still relies on paper bills of lading, the documents can be scanned in through Action Point's InputAccel capture and forms processing software. 
This part of the solution collects document images, extracts data and converts the information to XML-tagged content identical to any other bill of lading submission. GOSOF.com has even developed a portable document capture system complete with scanner, computer and cellular phone link... Since all bills of lading are converted to XML, whether they originate electronically or on paper, R+L's computers can understand and process data directly. The information doesn't have to be converted from one file type to another or retyped manually. (2) Just like EDI, XML can be used for business-to-business communications. With no expensive proprietary software to pay for, almost any company can join in an XML-based online exchange. Users log into an information exchange and submit orders or inventory notes in XML format. Relying on standardized XML DTDs and schema, information can be read by any user of the exchange. Once a buyer and seller are brought together, negotiations to complete the deal can be conducted over a secure connection. Boston-based Keane is an information technology consulting firm that helps businesses make the transition from EDI to open XML-based solutions. With so many different entities working together, it's unrealistic to expect that they will all use the same application software. XML schema and DTDs offer a standardized way for carriers, inspectors, freight forwarders, consignees and others involved in a shipment to send messages to each other. The process starts when shippers log onto Optimum Logistics' system to create shipping orders. Optimum Logistics collects the shipping information in XML format and then forwards it to all the entities that will be involved in the shipment. Those entities send back acknowledgements, also in XML. 
There are usually contracts between shippers, carriers and consignees, but if a carrier cannot accept a shipment, Optimum Logistics offers a business-to-business exchange that helps the shipper locate a substitute carrier..." See also the sidebar "XML/EDI Convergence: Speaking Your Partner's Language," by Andy Yeomans [page 44].

  • [February 21, 2001] "Relearn Old Lessons Before Embracing XML. [At Your Service.]" By Julie Gable. In Imaging and Document Solutions Volume 10, Number 3 (March 2001), page 27. "XML's strengths for enabling business-to-business e-commerce often eclipse its advantages for internal content management. The Gartner Group, Stamford, CT, says XML's strength lies in "the process of integrating digitized data of multiple types in multiple formats and from multiple sources so that users can access a cohesive set of relevant information about a topic." Users in knowledge-producing organizations recognized the potential of XML early on. In 1999, a University of Michigan study on the feasibility of publishing dissertations electronically in XML estimated that it would cost about $67 per document to convert to XML vs. about $2 per document to convert to Adobe PDF. Yet the study recommended conversion to XML. Why? XML allows the same content to be customized for specific audiences and presented in different ways, including screen display, print, Braille and so forth. Documents in XML are modular in nature, so users can execute searches across specifically tagged sections rather than entire documents, resulting in more relevant search results. XML is also an excellent archival format for preserving documents over the long term because of its ability to render content regardless of platform, without relying on specific application software or hardware that is subject to obsolescence. If XML is the new ASCII, why haven't document management vendors flocked to provide XML product sets? The answer may lie in what the document management industry has already learned from prior experience: the procedural infrastructure is often the hardest part of implementation in the internal content realm, not the technology. Consider the following examples..."

  • [February 21, 2001] "XML: Business Beacon or Tower of Babel? [Open Platform.]" By David Weinberger. In Imaging and Document Solutions Volume 10, Number 2 (February 2001), page 55. "XML is on the verge of plummeting down the celebrity curve. We already hear refrains such as: "You know, it doesn't do everything we said it was going to do," and "just doing something in XML doesn't mean it's really open." XML's weaknesses are rooted in its very being. There should be no illusions about it. Yet illusions there are, brought about by the media's need to generate headlines and the vendors' need to differentiate their wares in an undifferentiated market. The most overinflated expectation for XML comes from the media touting it as a standard. In fact, it's a standard for writing standards. XML is like an alphabet and a grammar: Now that we agree on the letters and that sentences will consist of nouns and verbs, we can begin to create different languages. So, XML by its very nature can be a beacon, or - if an industry is excessively greedy - it can be a Tower of Babel, breeding competing standards that don't know how to talk with one another. Inevitably, both have happened. XLink is one of many examples of beacon-hood. Web-based forms are an excellent example of Babel-onia. The forms example is illustrative of the venal forces that work against what XML offers. A tiny company, PureEdge (formerly UWI.com), made a name for itself early on by proposing an XML standard, XFDL, for encoding forms - anything from a purchase order to a mortgage application. This is a good use of XML because the essence of a form is the data it's capturing, and XML is quite strong on data-capture. There are lots of considerations when designing an XML standard for forms, including capturing acceptable entry ranges and allowing for the conditional display of fields (e.g., if the house you're buying costs more than $500,000, you may have to fill in some blanks for additional insurance coverage). 
PureEdge's XML design was well thought out and seemed to be at least a good start..."

  • [February 21, 2001] "XML Enables Dynamic Technical Publishing." By Lowell Rapaport. In Imaging and Document Solutions Volume 10, Number 2 (February 2001), page 14. "XyEnterprise has developed Content@XML. Content@XML supports XML authoring environments such as Arbortext Epic, SoftQuad's XMetal and HyperWorx, an XML authoring add-on for Microsoft Word. XyEnterprise has also developed XML Professional Publisher, an XML composition engine for creating PDF files from XML content. The PDF files can be printed or delivered electronically... Content@XML retains the strengths XyEnterprise developed for paper document management over the years, providing a production environment incorporating workflow management, collaboration and integrated security. "This is a system that can compete with products like Documentum, but without the high-end deployment costs," Parsons claims. Future plans for Content@XML include improving access security for use with the Internet. XyEnterprise expects to continue serving the legal and financial markets as well as its core base of industrial publishers such as Tweddle Litho... Content@XML combines XML components in a comprehensive data management and workflow application. It supports a number of XML editing applications and manages XML/SGML tagged data in a project-centric workflow environment."

  • [February 21, 2001] "Small Suppliers: Weak Retail Link." By Ted Kemp. In InternetWeek (February 19, 2001), pages 1, 77. "Retailers are rethinking how to coax their smallest suppliers onto the Web, applying a mixture of training and inexpensive technology. They're finding the Web is a better fit than proprietary EDI links, but it's still not a quick supply chain fix. For decades, retailers have strived to connect with even their smallest suppliers using EDI, sometimes resorting to brute coercion. Today, the benefits of electronic links are becoming more obvious to small companies as open Internet technology makes such connectivity simpler and less expensive. But many of those same companies are still in no rush to automate. Sears later this year will test XML connectivity with suppliers of all sizes because it's more flexible and easy to use than EDI -- exactly what small suppliers need. XML data tags can be written in plain English, allowing Web pages to function like database records. EDI uses more arcane communications methods that are harder to learn and implement for enterprise-to-enterprise communications. It also requires a far bigger financial commitment. This is the latest in an ongoing series of initiatives at Sears to make electronic communications easier for suppliers. Today, the giant retailer gives its 3,000 small and midsize suppliers several billing and ordering options that it manages through vendor SPS Commerce. Suppliers without PCs or Internet access can fax invoices to SPS, which reformats them into EDI and passes them along to Sears. The reverse operation takes place with purchase orders... From the suppliers' perspective, a simple lack of know-how is keeping many small firms from linking to their big retail customers via XML or EDI, and small and midsize suppliers are finding that such links often require them to revamp their entire businesses. 
For example, most retailers want visibility into inventory levels and the flow of goods within supplier operations, but many small suppliers lack inventory management systems that can provide that data in any form. 'Fundamental business processes need to be reconfigured, and small companies usually aren't geared to do that,' said Deepinder Sahni, vice president of AMI Partners, a research firm that specializes in small and midsize business issues. Some 44 percent of U.S. businesses with fewer than 100 employees don't even have Internet access in their offices, and an additional 37 percent don't operate a Web site, he noted. Retail experts agree that Web links between retailers and suppliers speed up the order process..."

  • [February 21, 2001] "IM, XML Will Work Together To Unleash B2B Transactions." By Jamie Lewis [The Burton Group]. In InternetWeek (February 13, 2001). "Although it was first used by teenagers for AOL gab sessions, instant messaging (IM) is becoming a valuable tool for communicating within and between organizations. And its role is poised to expand further. Some companies have started using AOL's services, and both Microsoft and Lotus have moved to integrate IM functions into Exchange and Domino. But there's another role emerging for IM services that may have a profound effect on how enterprises enable distributed computing across the Internet. In short, message-oriented communications mechanisms like IM may well provide the 'software backplane' that many applications and services use to communicate in B2B transactions as well as in consumer-oriented services. As XML assumes its role as the standard syntax for encoding business transactions and communications, message-oriented protocols will enable not only application-to-application but also person-to-person communications, providing the XML-oriented pipe for routing business information online... As it matures, XML will provide that common ground. With XML, the requirement to configure distributed applications in advance before they can communicate is much lower. XML allows applications to exchange objects (or 'documents' in XML parlance) whose intended receive-side processing is self-explanatory. Ideally, a 'self-describing' XML-based B2B message would contain all the content and context that two dissimilar endpoints need to exchange the message. Senders and recipients of XML-based B2B messages would be free to process them as they wish, without being tightly bound to each other's programmatic interfaces. Combining XML's power with a message-oriented approach to application communications holds a great deal of promise. In that light, IM services take on the function of message brokers and routers. 
Such brokers can enable asynchronous conversations across platforms and support dynamic, message-oriented 'publish-and-subscribe' models for application-to-application communications. This doesn't mean AOL IM will become the foundation for all B2B application communications. But it does mean a new generation of products and services that looks an awful lot like IM systems will emerge to serve these needs. Jabber, for example, is an open-source IM client based on XML and managed by Jabber.org. At its core, Jabber consists of several components, including what is in essence an XML router, and other services such as presence management (which allows a communicating party to find out if another party is online). With the right security model, integration with directory services and other key functions, Jabber (or other systems like it) may well become the foundation for a message-oriented communications infrastructure that moves XML messages between applications..." ['The core of Jabber (www.jabber.org) is a vibrant community of developers working at the intersection of XML, presence, and real-time messaging. This community is building a set of common technologies for further development, including servers, clients, libraries, services, and applications. Jabber is fully based on XML, so it provides an extensible architecture for creating the next generation of services and applications on the Internet. The benefits of using Jabber include presence management, transparent interoperability, and real-time routing of structured information.'] See: "Jabber XML Protocol."

  • [February 20, 2001] "Canonical XTM: A canonical serialization format for XML topic maps. Version 0.1." By Lars Marius Garshol, with contributions by Geir Ove Grønmo. Posted to XTM Mailing list 2001-02-20. ['I've now written up a proposal for a Canonical XTM specification, which is appended here. It is submitted for the consideration of topicmaps.org, in the hope that it may be useful. It has already been implemented and is now used internally by Ontopia for testing purposes.'] "This specification describes a serialization format for XML topic maps which has the property that all logically equivalent topic maps have the exact same byte-by-byte representation in this format. This can be used to test the conformance of XTM processors. The specification describes the serialization of a topic map into an output document, but does not concern itself with where that topic map came from. It is NOT a goal to ensure that the canonical topic map can be successfully read into an XTM processor, but merely to confirm that all processing defined by the XTM 1.0 specification has been performed correctly. The topic map must before serialization be processed into consistent topic map, as defined by XTM 1.0. When applying canonicalization to XTM documents no string normalization such as Unicode canonical decomposition must be performed..." See: "(XML) Topic Maps."
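
    The testing idea behind a canonical format (serialize two inputs canonically, then compare the bytes) can be tried with generic XML canonicalization. The sketch below is not the Canonical XTM algorithm: it uses the C14N 2.0 support in Python's standard library (xml.etree.ElementTree.canonicalize, Python 3.8 or later), and the element and attribute names are invented for illustration rather than taken from XTM 1.0.

```python
from xml.etree.ElementTree import canonicalize

# Same content, different surface syntax: attribute order, quote style,
# and empty-element form all differ between doc_a and doc_b.
doc_a = '<topicMap><topic id="t1" kind="person"/><topic id="t2" kind="place"/></topicMap>'
doc_b = "<topicMap><topic kind='person' id='t1'></topic><topic kind='place' id='t2'></topic></topicMap>"

# Same topics as doc_a, but listed in the opposite order.
doc_c = '<topicMap><topic id="t2" kind="place"/><topic id="t1" kind="person"/></topicMap>'

canon_a = canonicalize(doc_a)
canon_b = canonicalize(doc_b)
canon_c = canonicalize(doc_c)

# Syntactic variation disappears after canonicalization.
print(canon_a == canon_b)

# Element order is significant to generic XML canonicalization, so a
# reordered document still differs.
print(canon_a == canon_c)
```

    The second comparison suggests why a topic-map-specific canonical form is useful: if the order of topics carries no meaning in the data model, only a serialization defined at that level can make logically equivalent maps byte-identical.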

  • [February 20, 2001] "Sun for ONE." By Charles Babcock. In Interactive Week (February 12, 2001). "Sun Microsystems last week announced its Open Net Environment, a software strategy that plays up its Java and Internet integration capabilities. While the software contains few new elements, it maneuvers Sun into a more competitive stance versus Microsoft as a developer platform. 'This announcement may appear boring but it has real significance,' said Frank Gillett, an analyst at Forrester Research. 'It marks the beginning of a new battle over Web services.' Sun said its Open Net Environment (ONE) approach allows developers to build 'smart services' -- software code that can recognize a customer visiting a Web site and interact with the customer in ways that match what he or she is trying to do, said Greg Papadopoulos, chief technology officer at Sun. The growing ONE software set -- which includes the Solaris operating system, iPlanet application and integration servers, as well as the Market Maker e-commerce applications and Webtop user interface -- represents an integrated product set for developers, said Scott McNealy, chief executive of Sun. Both Sun and Microsoft are emphasizing the use of eXtensible Markup Language in their product lines. One of the additions to Sun's lineup was support for Simple Object Access Protocol, a Microsoft-sponsored standard for XML-based instructions that can connect dissimilar computer systems. SOAP is under review as a standard by the World Wide Web Consortium. Also, Sun now supports Universal Description, Discovery and Integration, sponsored by Ariba, IBM and Microsoft, as a standard for an XML-based registry of online services. The Sun platform supports Java language, iPlanet servers, ONE Webtop interface and XML language. Strengths: Java has caught on as an enterprise and Web application language. Many network services and XML are built into iPlanet servers. 
Weaknesses: With the defection of Microsoft, Java is not everywhere -- it's not on the Windows desktop. Integration of Sun's Forte development tools is still to come in some respects. Sun's Star Office applications, on which Webtop is based, are not pervasive." See also the Sun Microsystems white paper.

  • [February 20, 2001] "NotifyMe Networks Launches With Alert Service." By Mindy Charski. In Interactive Week (February 12, 2001). "A company launching today, called NotifyMe Networks, enables businesses to send 'actionable' alerts to their employees and customers through devices including the telephone, PC and pager. While many enterprises have implemented messaging systems that can send instant alerts to employees and customers, NotifyMe is among the first to give recipients the opportunity to respond. The company's code is based on XML, so no proprietary software is necessary. Clients are charged maintenance fees and pay per minute or per alert, which Chief Executive Chuck Dietrick said amounts to 'a matter of cents.' NextJet is among the company's first clients. The package delivery service will use the NotifyMe alerts to keep up with changing airline schedules and dispatch couriers to their destinations. NotifyMe expects its technology to be used primarily within companies and between businesses, but there are consumer-oriented applications as well. CNET Network's CNET Auctions, for instance, will make the service available to bidders who wish to be notified through the telephone when they've been outbid on a product. The person can raise the bid by keying the new price into the telephone. CNET will not charge the customer for the service, which could lead to higher revenue for the site as bids increase and strengthens customer loyalty, Dietrick said. EnvoyWorldWide offers a similar alerting product, which now enables recipients to answer multiple-choice questions through a touch tone phone or keyboard."

  • [February 20, 2001] "Dynamically Generated XSL Revisited. [XML@Large.]" By Michael Floyd. In WebTechniques Volume 6, Issue 03 (March 2001), pages 66-69. ['You could write over 100 style sheets, or you could let some transformations do all the work. Michael Floyd gives you the short story.'] "In the January installment of this column, I demonstrated how to dynamically generate XSL style-sheet transformations, which can then be applied to XML documents. In that column, I assumed that the developer has intrinsic knowledge of the structure and organization, or "schema," of the data being transformed. That knowledge is important because style-sheet transformations often use simple step patterns (or even full-blown XPath expressions) to locate a given element or attribute in the document tree, then use <xsl:value-of> to retrieve the item's content. So, XSL style sheets are highly reliant on a document's structure. By moving from statically created style sheets to dynamically generated transformations, you shift responsibility from the style-sheet author to the DOM developer. However, if you can generalize the process, you can realize significant benefits from generating your XSL dynamically. The key to generalizing this process lies in the schema. If you have a formal schema, such as a DTD or XML Schema document, you should be able to discover enough about the organization and structure to generate a reasonable XSL style-sheet document. This month, I'll examine that process and discuss how far you can take it. I wrote this article with the assumption that you, the developer, are familiar with XML Data Reduced (XDR) schemas, the XSL Transformation language (XSLT), and the Document Object Model (DOM)... Style sheets are used to associate meaning with markup elements, preventing us from completely generalizing style-sheet transformations. You can't render a <bold> element unless you know how to generate the appropriate transformation. 
There are a few ways to solve this problem. One is to let the user associate meaning interactively. You might create a tool that scans the schema, presents a list of all elements (and appropriate attributes), and lets an end user assign a property or behavior. An even simpler method is to create a mapping in your code between markup elements and their transformations. Either way, by moving from statically created style sheets to dynamically generated transformations you can solve the problem of propagating style sheets, and reduce maintenance of them, while generalizing the overall process." For related resources, see "Extensible Stylesheet Language (XSL/XSLT)."
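
    The "mapping in your code" option described above can be sketched briefly. The following Python is an illustration only, not Floyd's implementation (his column works with XDR schemas and the DOM): it assumes a hypothetical table, ELEMENT_TO_HTML, that assigns an HTML element to each source element, and a hypothetical helper, build_stylesheet, that emits one XSLT template per entry.

```python
import xml.etree.ElementTree as ET

XSL_NS = "http://www.w3.org/1999/XSL/Transform"

# Hypothetical mapping from source markup elements to HTML elements.
ELEMENT_TO_HTML = {"bold": "b", "title": "h1", "para": "p"}

def build_stylesheet(mapping):
    """Return an XSLT 1.0 stylesheet, as a string, with one template per mapped element."""
    ET.register_namespace("xsl", XSL_NS)
    root = ET.Element("{%s}stylesheet" % XSL_NS, {"version": "1.0"})
    for source_name, html_name in mapping.items():
        # <xsl:template match="bold"><b><xsl:apply-templates/></b></xsl:template>
        template = ET.SubElement(root, "{%s}template" % XSL_NS, {"match": source_name})
        wrapper = ET.SubElement(template, html_name)
        ET.SubElement(wrapper, "{%s}apply-templates" % XSL_NS)
    return ET.tostring(root, encoding="unicode")

stylesheet = build_stylesheet(ELEMENT_TO_HTML)
print(stylesheet)
```

    Python's standard library has no XSLT processor, so the generated stylesheet would need to be handed to an external engine to be applied. In a fuller version the element names would presumably be discovered by scanning the schema, leaving only the right-hand side of the mapping for a person to supply.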

  • [February 20, 2001] "An Application Server with No Strings Attached. [Product Review: Enhydra 3.5 for Wired and Wireless Devices.]" By Michiel de Bruijn. In WebTechniques Volume 6, Issue 03 (March 2001), pages 70-71. [Review of Enhydra 3.5 for Wired and Wireless Devices, Lutris Technologies.] "When the prerelease version of the latest incarnation of the Enhydra application server hit my desk, two questions came to mind: What's the "wireless" moniker doing in its name, and -- because, after all, Enhydra is an open-source project -- why do developers have to pay for it? [...] Wireless sounds cool, but on closer examination it isn't very impressive. Because these markup languages tend to be based on either HTML or XML, the actual functionality required to support them is minimal. Enhydra's feature list also mentions multiple markup language support. And true enough, it supports the functional equivalent of selecting different document object models (DOMs) to serve up XML-like data over TCP/IP. While this is useful in some situations, it's hardly exciting. A very helpful non-J2EE Web development feature is the innovative XML support through Enhydra's XML Compiler (XMLC). Because it outputs dynamically recompilable Java classes based on your HTML/XML documents instead of regular JavaScript code, XMLC already significantly enhances the performance of your Web apps. However, version 2.0 takes things even further with its 'lazy' DOM parser. The lazy DOM uses a read-only template DOM that's shared across all application instances -- where data is copied into a specific instance only when it's required. This works well if only particular nodes are accessed and the instance doesn't traverse the entire document. Other items of interest include the Presentation, Session, and Database Managers -- collectively powering what Lutris calls SuperServlets. 
These make it easy to associate Java code with URLs, keep session state (with or without using cookies), and access JDBC-compliant databases. And, as I've mentioned before, documentation for all of the tools included with Enhydra is excellent. Without making you wade through too much verbiage, it does a good job of explaining both application server basics and more advanced topics. Since Enhydra is an open-source product, you might have some misgivings about the availability and quality of support. After all, not everyone is interested in just taking the source code and fixing problems themselves. Lutris' boxed version solves that problem quite nicely: It comes with all the amenities you'd expect from a commercial vendor, including a list of supported software, technical support (pay-as-you-go after a number of free incidents), training, and even consultancy. The bottom line is that whether you go for the free download (which may lack some documentation) or the nonfree -- but still affordable -- packaged version, Enhydra offers excellent value. Even if you've been disappointed by a higher-priced solution in the past, Enhydra just might work for you."
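
The "lazy DOM" idea described in the review (a shared read-only template, with nodes copied into an instance only when they are touched) can be sketched in a few lines. This is an illustration of the general copy-on-access technique only, not XMLC's actual implementation; the class `LazyDocument`, its methods, and the sample markup are all hypothetical.

```python
import copy
import xml.etree.ElementTree as ET

# One template tree, parsed once and shared by every instance (treated as read-only).
TEMPLATE = ET.fromstring(
    '<html><body><span id="greeting">Hello</span>'
    '<span id="footer">bye</span></body></html>'
)


class LazyDocument:
    """Per-request view that copies template nodes only on first access."""

    def __init__(self, template_root):
        self._template = template_root
        self._copies = {}  # element id -> private copy for this instance

    def get(self, element_id):
        """Return this instance's private copy of the element, creating it lazily."""
        if element_id not in self._copies:
            source = self._template.find(f".//*[@id='{element_id}']")
            if source is None:
                raise KeyError(element_id)
            self._copies[element_id] = copy.deepcopy(source)
        return self._copies[element_id]

    def copied_count(self):
        """Number of template nodes that have been copied so far."""
        return len(self._copies)


page_a = LazyDocument(TEMPLATE)
page_b = LazyDocument(TEMPLATE)
page_a.get("greeting").text = "Hello, Alice"
```

Only the `greeting` node is copied for `page_a`; `page_b` and the shared template are untouched, which is why the approach pays off when a request accesses a few nodes rather than traversing the whole document.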

  • [February 20, 2001] "XML meets semantics: The reality. [Thinking XML, #1.]" By Uche Ogbuji (CEO and principal consultant, Fourthought, Inc.). From IBM developerWorks, XML library. February 19, 2001. ['This discussion of XML and semantics kicks off a column by Uche Ogbuji on knowledge management aspects of XML, including metadata, semantics, Resource Description Framework (RDF), Topic Maps, and autonomous agents. Approaching the topic from a practical perspective, the column aims to reach programmers rather than philosophers.'] "This new column, 'Thinking XML', will cover the intersection of XML and knowledge architecture (KA). Knowledge architecture sounds like something tossed out by a jargon bot, but it's really just an umbrella term for some very useful technologies that are emerging now that XML is entering its adolescence. Metadata management, semantic transparency, and autonomous agents are hardly concepts unique to XML, but the promise of XML to unify the syntax of structured and semistructured data helps turn the next-to-impossible into the feasible. The key feature that will distinguish this column from much of the discussion of such topics is that I'll address programmers, not philosophers. I'll focus on development tools and techniques that allow developers to use XML to better collect and navigate the knowledge latent in data, whether in corporate databases or on the Web itself. This sounds quite grandiose, but the column installments will really be an incremental procession, never leaving common sense too far behind. This first column and the next set the scene, so they will diverge a bit from my ground rule of "lots of code, little philosophy." These first two columns cover the semantics of XML and related vocabularies. I'll discuss only initiatives with existing work products for the developer to take a look at, but I won't be presenting a lot of hands-on code and techniques just yet."
See: (1) "XML and 'The Semantic Web'"; (2) "Conceptual Modeling and Markup Languages."

  • [February 20, 2001] "Practical XML with Linux, Part 3. XML database tools for Linux. Hierarchical, relational, and object databases." By Uche Ogbuji (CEO and principal consultant, Fourthought, Inc.). In LinuxWorld (February 2001). ['Your stash of XML documents is probably growing exponentially. Uche Ogbuji provides an overview of database types, then surveys the wide range of tools available for storing and managing XML data stores.'] "There are almost as many uses of XML as there are XML users, but there are only two ways of looking at how XML documents are organized. XML's roots lie in SGML, which was originally conceived as a way of structuring documents for machine preparation and human consumption. XML has inherited much of that bias toward documents, and is often used for presentation-oriented publishing (POP). Examples include books, slide presentations, and company Websites. POP formats tend to have elements and text that flow in a flexible and free-form manner. XML has also gained popularity as the basis for data formats suitable for exchange between computer programs: consumed by machines but able to be inspected by humans. This is known as messaging-oriented middleware (MOM) because of its role in the infrastructure of applications. Examples include serialized objects, automated purchase orders, and Mozilla bookmark files. MOM formats tend to be highly regular, with elements making up well-defined fields with content according to strict data typing. MOM and POP formats often impose different needs on XML databases, based on the differences in usage patterns and format. We will decide whether certain Linux database technologies are more appropriate for MOM or POP documents. There are many ways of structuring databases. The relational model, used by well-known DBMSs like PostgreSQL and Oracle, is probably the most popular for new systems, but there are many other approaches. 
Databases can be: Hash-based systems; Hierarchical databases; Relational and object/relational databases; Object databases; Multi-dimensional databases; Semistructured databases... support the notion that it is impractical to have a rigid schema for data that models the real world, given the fluidity of the real world. Many of its concepts are a natural fit for XML and related technologies like the Resource Description Framework (RDF). There is a growing body of work on how to effectively manage XML data in hierarchical, relational, and object databases." See: "XML and Databases."

  • [February 20, 2001] "[XSLT Tutorial. Part 1]." By Henning Behme. From iX/Raven - iX - Magazin für professionelle Informationstechnik. February 19, 2001. "In order to present XML documents or data to the user in an attractive way in browser, mobile phone or PDF format, the original data must first be converted to the necessary formats. This is the purpose of XSLT as part of the style component of XML... The tutorial begins with the basics and finishes by trying out AxKit (v 1.2) for serving XML sources dynamically." The three-part tutorial series is also available in German. See details.

  • [February 20, 2001] "XML Standards Reference. [EXPLORING XML.]" By Michael Classen. From WebReference.com. February, 2001. "XML standards are defined at breathtaking speed these days. It is also difficult to keep up with the various versions of those standards. This short list focuses on the XML applications that should be of particular interest to webmasters and Web developers. It is not meant to be objective or exhaustive... Try these annotated links to XML standards, recommendations, and resources..."

  • [February 20, 2001] "Vignette's Bill Daniel tells where enterprise content management is headed." By Martin LaMonica and Tom Yager. In InfoWorld (February 19, 2001). ['As web publishing rushed onto the world scene, Vignette was an early leader in developing content management systems with personalization. Now the company has expanded its product base to be an e-business platform, addressing content management as well as integration and data analysis. That's only natural, says Bill Daniel, Vignette's senior vice president of products. Content management products are evolving from a soup-to-nuts suite to specialized applications that run on top of application servers. InfoWorld Executive Editor Martin LaMonica and East Coast Technical Director Tom Yager talked recently with Daniel about where enterprise content management is headed.'] "... let me give you a bold statement: I don't think there will be a discrete content management market in the future. We have said there are three broad sets of functionality that customer-driven applications require: communication, collaboration, and comprehension. And those relate specifically to content, content management, and delivery capabilities. [These are] integration capabilities and analysis capabilities. Those need to be features and functions within an application suite. I don't think any of those, over time, will completely stand alone as big markets. They will be in everything. [...] In the case of Vignette, we have a whole set of APIs so you can access the functionality. In addition to that, our content management systems utilize relational databases from Oracle and other vendors. So that we have, essentially, a very open way to get at the content, to get at the meta data. You can do it programmatically through the APIs or you can do it just through database connectivity. For the state of the art, it's all over the map. Some of our competitors have no APIs. There's no programmatic way to interface to their system. 
All you can do is have their systems pump information out and you can pick it up. And so that's the two extremes: from very open to literally proprietary. What you're going to see over time is that XML is going to provide the interchange, so that I can create an XML document and serve it to you. And you can unpack it and put it in your content management system. But what people are going to increasingly want is APIs that allow them to drive that information movement programmatically, as opposed to by a human."

  • [February 19, 2001] "UDDI4J: Matchmaking for Web services. Interacting with a UDDI Server." By Doug Tidwell (Web Services Evangelist, IBM). From IBM developerWorks, XML library. January 2001. ['UDDI4J is an open-source registry implementation from IBM. Follow Doug Tidwell as he shows how to build applications that can make use of a UDDI registry.'] "As part of its continued commitment to Web services, IBM has released UDDI4J, an open-source Java implementation of the Universal Description, Discovery, and Integration protocol (UDDI). In this article, we'll discuss the basics of UDDI, the Java API to UDDI, and how you can use this technology to start building, testing, and deploying your own Web services. The central idea behind the Web services revolution is that the Web will be populated with an assortment of small pieces of code, all of which can be published, found, and invoked across the Web. One key technology for the service-based Web is SOAP, the Simple Object Access Protocol. Based on XML, SOAP allows an application to interact with remote applications. That's all well and good, but how do we find those applications in the first place? That's wh