 XML Articles and Papers. January - March 2001.

## XML General Articles and Papers: Surveys, Overviews, Presentations, Introductions, Announcements

References to general and technical publications on XML/XSL/XLink are also available in several other collections.

The following list of articles and papers on XML represents a mixed collection of references: articles in professional journals, slide sets from presentations, press releases, articles in trade magazines, Usenet News postings, etc. Some are from experts and some are not; some are refereed and others are not; some are semi-technical and others are popular; some contain errors and others don't. Discretion is strongly advised. The articles are listed approximately in the reverse chronological order of their appearance. Publications covering specific XML applications may be referenced in the dedicated sections rather than in the following listing.

### March 2001

• [March 31, 2001] DeltaXML XML Schema software. Posting from Robin LaFontaine describes online availability of 'schema comparator'. "Monsell's DeltaXML XML Schema software compares XML Schema files taking into account the fact that elements, attributes etc. can be in any order. So only significant changes are identified - even down to ignoring a change in the order of a 'choice' item, ... but a change in a 'sequence' is identified... The free trial version works for small schemas. If you want a full trial for larger schemas let me know and I will provide an evaluation license. If you have data files to compare, DeltaXML Markup will compare any XML files and identify changes for you, representing these changes in XML of course..." See the DTD changes in XML Schema from CR to PR, and XSU - Upgrade for XML Schema documents [20000922 to PR 20010316]. Also (1) "Revised Online Validator for XML Schema (XSV) and XML Schema Update Tool (XSU)" and (2) "XML Schemas."
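The posting above describes an order-aware comparison: reordering inside a 'choice' is ignored, while reordering inside a 'sequence' counts as a change. A minimal sketch of that idea follows. It is not DeltaXML's algorithm; the function names and the `ordered_tags` parameter are hypothetical, and the sample markup uses plain element names rather than the real XML Schema namespace.

```python
import xml.etree.ElementTree as ET

def canonical(elem, ordered_tags=frozenset()):
    """Summarise a subtree as nested tuples.

    Children are sorted (order ignored) unless the parent's tag is listed
    in ordered_tags, in which case document order is kept.
    """
    children = [canonical(child, ordered_tags) for child in elem]
    if elem.tag not in ordered_tags:
        children.sort()
    text = (elem.text or "").strip()
    attrs = tuple(sorted(elem.attrib.items()))
    return (elem.tag, attrs, text, tuple(children))

def same_structure(xml_a, xml_b, ordered_tags=frozenset()):
    """True if the two documents match once insignificant order is ignored."""
    a = canonical(ET.fromstring(xml_a), ordered_tags)
    b = canonical(ET.fromstring(xml_b), ordered_tags)
    return a == b

choice_1 = '<choice><element name="a"/><element name="b"/></choice>'
choice_2 = '<choice><element name="b"/><element name="a"/></choice>'
seq_1 = '<sequence><element name="a"/><element name="b"/></sequence>'
seq_2 = '<sequence><element name="b"/><element name="a"/></sequence>'

choice_equal = same_structure(choice_1, choice_2, ordered_tags={"sequence"})
sequence_equal = same_structure(seq_1, seq_2, ordered_tags={"sequence"})
```

Under these assumptions the reordered 'choice' compares as equal and the reordered 'sequence' does not. A real comparator would also need to report where the documents differ.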

• [March 30, 2001] "A Framework for Implementing Business Transactions on the Web." Hewlett-Packard initial submission to OASIS BTP work. By Dr. Mark Little (Transactions Architect, HP Arjuna Labs, Newcastle upon Tyne, England), with Dave Ingham, Savas Parastatidis, Jim Webber, and Stuart Wheater. 20 pages (with 11 notes). [See the posting from Mark Little.] "An increasingly large number of distributed applications are constructed by being composed from existing applications. The resulting applications can be very complex in structure, with complex relationships between their constituent applications. Furthermore, the execution of such an application may take a long time to complete, and may contain long periods of inactivity, often due to the constituent applications requiring user interactions. In a loosely coupled environment like the Web, it is inevitable that long running applications will require support for fault-tolerance, because machines may fail or services may be moved or withdrawn. A common technique for fault-tolerance is through the use of atomic transactions, which have the well known ACID properties, operating on persistent (long-lived) objects. Transactions ensure that only consistent state changes take place despite concurrent access and failures... From the previous discussions it should be evident that there are a range of applications that require different levels of transactionality. Many types of business transaction do not have the simple commit or rollback semantics of an ACID transaction, and may complete in a number of different ways that may still be interpreted as successful but which do not imply everything that the business transaction did has occurred. We have shown that a flexible and extensible framework for extended transactions is necessary, then in addition to standardising on the interfaces to this framework, we also need to work on specific extended transaction models that suit the Web. 
We would not expect applications to work at the level of Signals, Actions and SignalSets, as these are too low-level. Higher-level APIs are required to isolate programmers from these details. However, from experience we have found that this framework helps to clarify the requirements on specific extended transaction implementations. We have given examples of the types of Web applications that have different requirements on any transaction infrastructure, and from these we believe it should be possible to obtain suitable extended transaction models." Other issues that will need to be considered when implementing many business transactions include: (1) Security and confidentiality... (2) Audit trail... (3) Protocol completeness guarantee... (4) Quality of service..." See "OASIS Business Transactions Technical Committee."

• [March 30, 2001] "OASIS Security Services TC: Glossary." By the OASIS Security Services Technical Committee (SSTC). Edited by Jeff Hodges. "A New Oasis-SSTC-Draft is available from the on-line SSTC document repository. This draft is presently a work item of the Use Cases and Requirements subcommittee, and of the SSTC as a whole. This document comprises an overall glossary for the OASIS Security Services Technical Committee (SSTC) and its subgroups. Individual SSTC documents and/or subgroup documents may either reference this document and/or 'import' select subsets of terms." Background may be read in the mailing list archives (1) security-use and (2) security-services. Document also in PDF format. See the Technical Committee web pages.

• [March 30, 2001] "A Brief History of SOAP." By Don Box (DevelopMentor Inc.). March 30, 2001. "... For the most part, people have stopped arguing about SOAP. SOAP is what most people would consider a moderate success. The ideas of SOAP have been embraced by pretty much everyone at this point. The vendors are starting to support SOAP to one degree or another. There are even (unconfirmed) reports of interoperable implementations, but frankly, without interoperable metadata, I am not convinced wire-level interop is all that important. It looks like almost everyone will support WSDL until the W3C comes down with something better, so perhaps by the end of 3Q2001 we'll start to see really meaningful interop. SOAP's original intent was fairly modest: to codify how to send transient XML documents to invoke/trigger operations/responses on remote hosts. Because of our timing, we were forced to tackle issues that the schemas WG has since solved, which caused the S in SOAP to be somewhat lost. At this point in time, I firmly believe that only two things are needed for mid-term/long-term convergence: (1) The XML Schemas WG should address the issue of typed references and arrays. Adding support for these two 'synthetic' types would obviate the need for SOAP section 5. These constructs are broadly useful outside the scope of messaging/rpc applications, so it makes sense (to me at least) that the Schemas WG should address this. (2) Define the handful of additional constructs needed to tie the representational types from XML Schemas into operations and SUDS-style interfaces/WSDL-style portTypes. WSDL comes close enough to providing the necessary behavioral constructs to XML Schemas, and I am cautiously optimistic that something close to WSDL could subsume SOAP entirely. I strongly encourage you to study the WSDL spec and submit comments/improvements/errata so we can get convergence and interop in our lifetime..." 
See "Simple Object Access Protocol (SOAP)" and "Web Services Description Language (WSDL)."

• [March 30, 2001] "A Busy Developer's Guide to SOAP 1.1." By Dave Winer and Jake Savin (UserLand Software). March 28, 2001. "This specification documents a subset of SOAP 1.1 that forms a basis for interoperation between different environments much as the XML-RPC spec does. When we refer to 'SOAP' in this document we're referring to this subset of SOAP, not the full SOAP 1.1 specification. What is SOAP? For the purposes of this document, SOAP is a Remote Procedure Calling protocol that works over the Internet. A SOAP message is an HTTP-POST request. The body of the request is in XML. A procedure executes on the server and the value it returns is also formatted in XML. Procedure parameters and returned values can be scalars, numbers, strings, dates, etc.; and can also be complex record and list structures..." See also the political background [Dave's SOAP Journal, part 2] and the compatible validator running on SoapWare.Org. See "Simple Object Access Protocol (SOAP)."
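As the entry above puts it, a SOAP message in this subset is an HTTP POST whose body is XML. The sketch below builds such a request with the Python standard library and does not send it. The method name, method namespace, parameter, and endpoint URL are hypothetical. The envelope namespace is the SOAP 1.1 one, and the `SOAPAction` header is part of the SOAP 1.1 HTTP binding.

```python
import urllib.request
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace

def build_soap_body(method, namespace, params):
    """Return a SOAP 1.1 envelope (as text) that calls `method` with `params`."""
    ET.register_namespace("SOAP-ENV", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{namespace}}}{method}")
    for name, value in params.items():
        # Each procedure parameter becomes a child element of the call element.
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")

def build_request(url, soap_action, xml_text):
    """Wrap the envelope in an HTTP POST request object (nothing is sent)."""
    return urllib.request.Request(
        url,
        data=xml_text.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8",
                 "SOAPAction": f'"{soap_action}"'},
        method="POST",
    )

# Hypothetical endpoint, namespace, and procedure, for illustration only.
xml_text = build_soap_body("getStateName", "urn:example:states", {"statenum": 41})
request = build_request("http://localhost:8080/soap", "/examples", xml_text)
```

Passing `request` to `urllib.request.urlopen` would perform the call; the response body would be a second envelope to parse the same way.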

• [March 30, 2001] "Expressing Qualified Dublin Core in RDF." Draft Version-2001-3-29. By Dublin Core Architecture Working Group. Authors: Stefan Kokkelink and Roland Schwänzl. Supersedes Guidance on expressing the Dublin Core within the Resource Description Framework (RDF). "In this draft Qualified Dublin Core is encoded in terms of RDF, the Resource Description Framework as defined by the RDF Model & Syntax Specification (XML namespace for RDF). RDF is a W3C recommendation. Also RDFS the RDF Schema specification 1.0 is used (XML namespace for RDFS). RDFS is a W3C candidate recommendation. Quite often the notion of URI (Uniform Resource Identifier) is used. The notion of URI is defined by RFC 2396. The notion of URI embraces URL and URN. We also discuss collaboration of qualified DC with other vocabularies and DumbDown. In this paper explicit encodings are provided for classical classification systems and thesauri. Additionally a procedure is discussed to create encodings for more general schemes. One of the major changes with respect to the data model draft is the more systematic use of RDF Schema. It is understood that all DC related namespace references are currently in final call at the DC Architecture Working Group. They will be fixed in a forthcoming version of the current draft..." For related work, see CARMEN (Content Analysis, Retrieval and MetaData: Effective Networking) and especially CARMEN AP 6: MetaData based Indexing of Scientific Resources. See: "Dublin Core Metadata Initiative (DCMI)."

• [March 29, 2001] "XSLT Processor Benchmarks." By Eugene Kuznetsov and Cyrus Dolph. From XML.com. March 28, 2001. ['The latest benchmark figures for XSLT processors show Microsoft's processor riding high, with strong performance from open source processors... XML.com is pleased to bring you the results of performance testing on XSLT processors. XSLT is now a vital part of many XML systems in production, and choosing the right processor can have a big impact. Microsoft's XSLT processor, shipped with their MSXML 3 library, comes top of the pile by a significant margin. After Microsoft, there's a strong showing from the Java processors, with James Clark's XT--considered by many an "old faithful" among XSLT engines--coming ahead of the rest. Still, speed isn't everything, and most XSLT processors are incomplete with their implementation of the XSLT 1.0 Recommendation. On this score, Michael Kay's Saxon processor offers good spec implementation as well as respectable performance.'] "XSLTMark is a benchmark for the comprehensive measurement of XSLT processor performance. It consists of forty test cases designed to assess important functional areas of an XSLT processor. The latest release, version 2.0, has been used to assess ten different processors. This article describes the benchmark methodology and provides a brief overview of the results... The performance of XML processing in general is of considerable concern to both customers and engineers alike. With more and more XML-encoded data being transmitted and processed, the ability to both predict and improve XML performance is critical to delivering scalable and reliable solutions. While XSLT is a big part of delivering on the overall value proposition of XML (by allowing XML-XML data interchange and XML-HTML content presentation), it also presents the greatest performance challenge. 
Early anecdotal evidence showed wide disparities in real-life results, and no comprehensive benchmark tools were available to obtain more systematic assessments and comparisons... Of the processors included in this release of the benchmark, MSXML, Microsoft's C/C++ implementation, is the fastest overall. The three leading Java processors, XT, Oracle and Saxon, have surpassed the other C/C++ implementations to take 2nd through 4th place respectively. This suggests that high-level optimizations are more important than the implementation language in determining overall performance. The C/C++ processors tend to show more variation in their performance from test case to test case, scoring some very high marks alongside some disappointing performance. XSLTC aside, the C/C++ processors won first place in 33 of the 40 test cases, in some cases scoring two to three times as well as their Java competitors (attsets, dbonerow). This suggests that there is a lot of potential to be gained from using C/C++, but that consistent results might be harder to obtain..." Tool: XSLTMark; see also Kevin Jones' XSLBench test suite. For related resources, see "Extensible Stylesheet Language (XSL/XSLT)."

• [March 29, 2001] "XSLT Benchmark Results." By Eugene Kuznetsov and Cyrus Dolph. From XML.com. March 28, 2001. ['The full results from the DataPower XSLT processor benchmarks.'] XSLTMark gauges the capabilities of XSLT processing engines by testing them on a common platform with a variety of stylesheets and inputs that sample the gamut of possible applications. See the XSLTMark overview for more information about the benchmark itself and how to download it. These results were obtained by DataPower on a Pentium III/500 machine running Linux. We encourage XSLT engine authors and users to submit benchmark results on their platforms, as well as drivers for new processors. Test results for the following XSLT processors are available: Overall Chart; 4Suite 0.10.2 (Fourthought); Gnome XSLT 0.5.0 (Gnome Project); MSXML 3.0 (Microsoft); Oracle XSLT 2.0 (Oracle); Sablotron 0.51 (Ginger Alliance); Saxon 6.2.1 (Michael Kay); TransforMiiX 0.8 (Mozilla Project); Xalan-C++ 1.1 (Apache Project); Xalan-Java 2.0.0 (Apache Project); XSLTC alpha 4(Sun); XT 19991105 (James Clark); Key." See previous article. For related resources, see "Extensible Stylesheet Language (XSL/XSLT)."

• [March 29, 2001] "XML Q&A: DTDs, Industry Markup Languages, XSLT and Special Characters." By John E. Simpson. From XML.com. March 28, 2001. ['John Simpson solves hairy problems with DTDs and 'special characters.' John also provides some pointers on where to start with using industry markup languages.']

• [March 29, 2001] "XML-Deviant: Schemas by Example." By Leigh Dodds. From XML.com. March 28, 2001. ['There has been a lot of activity in the area of XML schema languages recently: with several key W3C publications and another community proposed schema language. Another alternative schema language has emerged from the XML community, relying entirely on example instance documents.'] (1) "W3C XML Schema: The finish line is now in sight for the members of the W3C XML Schemas Working Group. The XML Schema specifications are an important step closer to completion with their promotion to Proposed Recommendation status. All that remains now is for Tim Berners-Lee, as Director of the W3C, to approve the specifications before they become full Recommendations. The road has been long and hard, and it's had a number of difficult sections along the way." (2) Examplotron: "Eric van der Vlist has been helping to realize Rick Jelliffe's vision of a plurality of schema languages by publishing Examplotron, a schema language without any elements. Examplotron's innovation lies in its 'schema by example' approach to schema generation. Rather than define a dedicated schema language with which a document can be described, Examplotron uses sample instance documents, annotated with several attributes that carry schema specific information such as occurrence of elements, and assertions about element and attribute content. Like Schematron before it, Examplotron is implemented using XSLT. An Examplotron instance document can be converted into a validating stylesheet by applying a simple transformation..." For schema description and references, see "XML Schemas."

• [March 28, 2001] "[XML Transformations] Part 2: Transforming XML into SVG." By Doug Tidwell (Cyber Evangelist, developerWorks XML Team). From IBM developerWorks, XML Education. Updated: March 2001. ['The first section of our tutorial showed you how to transform XML documents into HTML. We used a variety of XML source documents (technical manuals, spreadsheet data, a business letter, etc.) and converted them into HTML. Along the way, we demonstrated the various things you can do with the XSLT and XPath standards. In this section, we'll use the World Wide Web Consortium's emerging Scalable Vector Graphics format (SVG) to convert a couple of our original documents into graphics.'] "For our transformations, we'll use two of our original six source documents: some spreadsheet data and a Shakespearean sonnet. The other documents from our original set aren't easily converted to SVG; we'll discuss why later... SVG is a language for describing two-dimensional graphics in XML. You use SVG elements to describe text, paths (sets of lines and curves), and images. Once you've defined those images, you can clip them, transform them, and manipulate them in a variety of interesting ways. In addition, you can define interactive and dynamic features by assigning event handlers, and you can use the Document Object Model (DOM) to modify the elements, attributes, and properties of the document. Finally, because SVG describes graphics in terms of lines, curves, text, and other primitives, SVG images can be scaled to any arbitrary degree of precision... We've taken a couple of our documents and transformed them into SVG. The column and pie charts are really useful examples that demonstrate what SVG can do, and our transformed sonnet displays the sonnet and its rhyme scheme clearly. These transformations used several important concepts in stylesheets. 
We used parameters and variables, we added extension functions when we needed them, and we used the mode attribute to control how templates were invoked. All of these were necessary because of the kind of documents we were creating. Despite this, our approach to writing stylesheets remains the same: (1) Determine the kind of document you want to create. (2) Look at the contents of that target document, and determine what information you need to complete it. (3) Build a stylesheet that creates the elements of the target document, and either retrieve or calculate the information you need for each part of the target document. The more text-intensive documents demonstrate what SVG doesn't do very well. Anything that contains text that needs to be broken into lines and paragraphs is difficult to do with SVG. You have to calculate the line breaks yourself, and you have to figure out how tall each line of text should be. Furthermore, if you wanted to use rich text features in your SVG document (display certain words in other fonts, different type sizes, different colors, etc.), your job would be even more difficult. See also tutorial articles (1) "Transforming XML into HTML" and (2) "Transforming XML into PDF." See: "W3C Scalable Vector Graphics (SVG)."
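The three steps quoted above (decide on the target document, work out what information it needs, then build each element) can be sketched outside XSLT as well. The tutorial itself uses XSLT stylesheets; the sketch below swaps in plain Python with ElementTree. The `<sheet>`/`<row>` source format, the function name, and the layout parameters are hypothetical, and the code assumes at least one row with a positive value.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"

def column_chart(source_xml, bar_width=40, gap=10, height=200):
    """Turn <row label=".." value=".."/> elements into an SVG column chart."""
    # Step 2: collect the information the target document needs.
    rows = ET.fromstring(source_xml).findall("row")
    values = [float(row.get("value")) for row in rows]
    scale = height / max(values)  # tallest bar fills the chart height

    # Step 3: create the elements of the target document.
    ET.register_namespace("", SVG_NS)  # serialise with a default xmlns
    svg = ET.Element(f"{{{SVG_NS}}}svg", {
        "width": str(len(rows) * (bar_width + gap)),
        "height": str(height + 20),
    })
    for index, (row, value) in enumerate(zip(rows, values)):
        bar_height = value * scale
        x = index * (bar_width + gap)
        ET.SubElement(svg, f"{{{SVG_NS}}}rect", {
            "x": str(x), "y": str(height - bar_height),
            "width": str(bar_width), "height": str(bar_height),
        })
        label = ET.SubElement(svg, f"{{{SVG_NS}}}text",
                              {"x": str(x), "y": str(height + 15)})
        label.text = row.get("label")
    return ET.tostring(svg, encoding="unicode")

# Hypothetical spreadsheet-like input.
sheet = '<sheet><row label="Q1" value="10"/><row label="Q2" value="20"/></sheet>'
chart = column_chart(sheet)
```

Labels here are single short strings placed by hand, which sidesteps the line-breaking difficulty the tutorial notes for text-heavy documents.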

• [March 28, 2001] "Scalable Vector Graphics. [Integrated Design.]" By Molly E. Holzschlag. In WebTechniques Volume 6, Issue 4 (April 2001), pages 30-34. ['Scalable Vector Graphics is up for Candidate Recommendation before the W3C. 'Will it be a Flash killer?' wonders Molly E. Holzschlag.'] "Scalable Vector Graphics (SVG) is a perfect example of technology and design meeting on a level playing field. Via XML markup, you can create and implement graphic images, animations, and interactive graphic designs for Web viewing. Of course, browsers must support SVG technology, which is one reason that many developers haven't looked into it too seriously, or perhaps haven't heard of it. SVG is being developed under the auspices of the W3C. As a result, developers have worked to make it compatible with other standards including XML, XSLT, CSS2, Document Object Model (DOM), SMIL, HTML 4.0, XHTML 1.0, and sufficient accessibility options via the Web Accessibility Initiative (WAI). As of this writing, SVG's status is Candidate Recommendation. The working group responsible for SVG has declared it stable, and if it passes several more tests, it moves into the Recommendation phase. Perhaps the most important concept to grasp when first studying SVG is its scalability. Graphics aren't limited by fixed pixels. Like vector graphics, you can make scalable graphics larger or smaller without distorting them. This is very important for designing across resolutions. Scalable graphics adjust to the available screen resolution. This alone makes SVG attractive to Web designers, as it solves one of the most frustrating issues we face: creating designs that are as interoperable, yet as visually rich, as possible... While SVG support in browsers obviously isn't immediately available, it's a technology that's worth watching and using. The fact that major companies are investing time and money to create tools that support it is indicative of the hope SVG holds. 
What's more, the fact that standards compliance is being written into these tools early on is very exciting -- an unprecedented event when it comes to client-side markup! So while SVG might not be something you'll actually use for awhile, it's absolutely worth taking out for a test drive, if only for the sheer fun of it." See: "W3C Scalable Vector Graphics (SVG)."

• [March 28, 2001] "An SVG Tool Kit for Java: Batik SVG Toolkit. [Product Review.]" By Clayton Crooks. In WebTechniques Volume 6, Issue 4 (April 2001), pages 40-41. ['Pros: Offers Java developers an easy way to add SVG capabilities to their programs. Cons: Unless you're developing custom solutions, apps are limited.'] "Batik, an open-source project led by the Apache Software Foundation, is a Java-based tool kit for incorporating Scalable Vector Graphics (SVG) into applications. In addition to offering the developer tools that let you view, generate, or manipulate images, the Apache Software Foundation has released a set of applications with basic SVG functions that can be used with any standard application. The goal is to provide a complete set of core modules that can be used individually or together to develop SVG projects... Batik provides complete applications and modules, making it easy for Java-based applications to use SVG content. According to the Web site, using Batik's SVG Generator, you can develop a Java application to export any graphics format to the SVG format. Another application can be developed using Batik's SVG processor and Viewer to easily integrate SVG viewing capabilities. Still another application uses Batik's modules to convert SVG documents to various formats, such as popular raster formats like JPEG or PNG. Since its inception, Batik has been an open-source project. It was created when several groups working on independent SVG-related projects combined their efforts. The original teams included employees from industry giants like Eastman Kodak, Sun Microsystems, and IBM. The groups decided that their respective projects could benefit from the offerings of the others, and that combining the projects would result in a much more complete tool." See: "W3C Scalable Vector Graphics (SVG)."

• [March 28, 2001] "Zope: An Open-Source Web Application Server. [Review.]" By Brian Wilson (Harbro Systems in Santa Rosa, CA). In WebTechniques Volume 6, Issue 4 (April 2001), pages 80-81. 'Zope has rich set of content-management and database features; fairly steep learning curve.' "Many of the Web projects I work on are for nonprofit organizations, and I must lean heavily on volunteers who have little experience working on Web sites. As a result, I'm very interested in tools that help me set up and maintain a basic site layout, while letting beginners enter and maintain content. I heard that Zope could help me, so I decided to try it. Zope was developed by Digital Creations, which provides commercial support for it. The introduction to the online Zope Book says that Zope is a framework for building Web applications. It allows for powerful collaboration, simple content management, and Web component use. Sounds good so far. Because Zope is open source and runs on Red Hat Linux, I'll have access to updates and bug fixes. Zope is written in Python, making it portable across many platforms (www.python.org). Currently, it's available in binary format for Windows (9x/NT), Linux, and Solaris, plus it can be compiled on other Unix platforms. I used the pre-built Linux version for this article (Zope 2.2.4), which I tested on both versions 6.2 and 7.0 of Red Hat Linux... The heart of Zope is Document Template Markup Language (DTML). Yes, DTML requires that you learn yet another language, but it builds on HTML, so it should be familiar. It's also incredibly powerful. You can create pages through the Web interface, and use special Zope DTML tags to do things like iterate over the objects in a folder and insert them into a table. I began creating pages right away -- without knowing any DTML. . . Zope holds out the promise of being able to do everything I need for my Web sites. 
As with many open-source projects, Zope suffers from having a fabulously rich feature set that I cannot (yet) access because the documentation isn't finished. I know that in time, I could read through mailing list archives and scattered online docs to learn what I need to know, but that route is definitely no picnic. Although I found Zope impressive, I'm still fond of Apache. Hence, my next step will be to look at Midgard, which is based on Apache, MySQL, and PHP. It's definitely harder to install than Zope, but Midgard builds on the base of three tools I'm already using." See also "Zope Parsed XML Project Releases ParsedXML Version 1.0."

• [March 28, 2001] "Zope: Open Source Alternative for Content Management. Zope Proves Utility of Open-Source Web Tools." By Mark Walter and Aimee Beck. In The Seybold Report on Internet Publishing Volume 5, Number 7 (March 2001), pages 11-15. In depth review with case studies. ['SRIP looks at Zope, a free toolkit developed by Digital Creations that's gained favor among daily newspapers, corporations, government agencies and a host of Web startups. Included are details on Zope's new content-management framework, due out this spring.'] "With Net budgets plunging in parallel with the high-tech stock swoon, site managers are seeking lower-priced alternatives to premium content-management systems. That's good news for Digital Creations and Zope, its open-source Web publishing framework built on top of Python. This month Digital Creations is extending Zope even further, releasing a full-blown content-management system based on the Zope framework... Coming in the next release, due out later this spring, will be a simple syndication server that helps administrators set up automated polling for inbound feeds and lets authorized customers pull content for outgoing material. Also under development is an overhaul to the underlying presentation templates: Digital Creations plans to change its "document template markup language" and its reliance on custom tags to an XHTML-based scheme driven from custom attributes on standard tags. That change will make it much easier for template designers to get WYSIWYG feedback from within popular Web-design products, like Dreamweaver or GoLive... Every system has its limitations, and Zope, for all its power and flexibility, relies on Python, which at this point is not yet the language of the masses. The upside, of course, is that Zope is open source: If you're willing to roll up your sleeves, you can save considerable money on software. 
In following Linux, Digital Creations has confirmed the merits of the open source software model and garnered supporters from across the globe. With CMF, Digital Creations has taken a big step toward bringing Zope to an even wider audience. The downside to open-source products, compared to their commercial counterparts, is that users have to assume primary responsibility for support. In the Zope CMF, customers get a nice combination -- free code, and, in Digital Creations, a consultant with deep experience solving complex publishing problems. At a time when Web budgets are being trimmed, but the volume of content continues to rise, Zope could be poised for even faster growth. Fredericksburg.com's Muldrow concludes, 'I've honestly not seen a product that so completely improved the way we do things -- I built a product to post jobs online in less than a day. We haven't been able to do that with anything else'." See also "Zope Parsed XML Project Releases ParsedXML Version 1.0."

• [March 28, 2001] "Trailblazing with XPath. [XML@Large.]" By Michael Floyd. In WebTechniques Volume 6, Issue 4 (April 2001), pages 66-69. ['XPath will keep you from getting lost in your document trees whether you're using XSLT or the DOM. Michael Floyd provides guidance.'] "As in desert enduro, finding your way through XML documents isn't always a straightforward task. Fortunately, the designers of XML have included a mechanism, called XPath, that helps you navigate through documents. XPath partly defines a syntax that lets you easily traverse a tree's structure and select one or more of its nodes. Once you've selected a node or nodes, you can manipulate, reorder, or transform them in any way you desire. The mechanism that lets you select tree nodes is called a pattern. A pattern is actually a limited form of what XPath calls location paths. (We'll get to location paths in a moment.) Much of XPath's expression language was originally described in the early XSL specification. Eventually, however, the W3C broke the XSL specification into three parts: XSL, which describes the formatting objects used to display XML elements; the XSL Transformation Language, which lets you transform XML into other formats; and XPath. So it's easy to associate XPath expressions with XSLT. It turns out, however, that these expressions are also useful in other tree-related models, including the Document Object Model (DOM) and XPointer. You can also use XPath expressions as arguments to DOM function calls... Of course, there's a great deal more to XPath than I've described here. In future months, I'll cover the other functions, including number, Boolean, and node-set functions. More importantly, I'll show you how to use them in DOM work and in creating style sheets."
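To make the idea of location paths concrete, here is a small sketch using Python's `xml.etree.ElementTree`, which supports only a limited subset of XPath (child and descendant steps, attribute predicates, positional predicates). The sample document and its element names are hypothetical and are not taken from the article.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring("""
<trail>
  <waypoint id="start"><name>Trailhead</name></waypoint>
  <section>
    <waypoint id="mid"><name>Dry Wash</name></waypoint>
  </section>
  <waypoint id="end"><name>Summit</name></waypoint>
</trail>
""")

# Child step: only <waypoint> elements directly under the root.
top_level = [w.get("id") for w in doc.findall("waypoint")]

# Descendant step: every <waypoint> at any depth, in document order.
all_ids = [w.get("id") for w in doc.findall(".//waypoint")]

# Attribute predicate followed by a child step.
mid_name = doc.find(".//waypoint[@id='mid']/name").text

# Positional predicate: the last <waypoint> child of the root.
last_id = doc.find("waypoint[last()]").get("id")

print(top_level, all_ids, mid_name, last_id)
```

Full XPath 1.0 features such as the number, Boolean, and node-set functions mentioned in the article are outside this subset and would need a dedicated XPath or XSLT processor.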

• [March 24, 2001] "On XML Integrity Constraints in the Presence of DTDs." By Wenfei Fan (Bell Labs and Temple University), and Leonid Libkin (University of Toronto). Paper presented at PODS 2001. Twentieth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems (PODS). May 21 - 24, 2001. Santa Barbara, California, USA. With 32 references. Abstract: "The paper investigates XML document specifications with DTDs and integrity constraints, such as keys and foreign keys. We study the consistency problem of checking whether a given specification is meaningful: that is, whether there exists an XML document that both conforms to the DTD and satisfies the constraints. We show that DTDs interact with constraints in a highly intricate way and as a result, the consistency problem in general is undecidable. When it comes to unary keys and foreign keys, the consistency problem is shown to be NP-complete. This is done by coding DTDs and integrity constraints with linear constraints on the integers. We consider the variations of the problem (by both restricting and enlarging the class of constraints), and identify a number of tractable cases, as well as a number of additional NP-complete ones. By incorporating negations of constraints, we establish complexity bounds on the implication problem, which is shown to be coNP-complete for unary keys and foreign keys." Detail: Although a number of dependency formalisms were developed for relational databases, functional and inclusion dependencies are the ones used most often. More precisely, only two subclasses of functional and inclusion dependencies, namely, keys and foreign keys, are commonly found in practice. Both are fundamental to conceptual database design, and are supported by the SQL standard. They provide a mechanism by which one can uniquely identify a tuple in a relation and refer to a tuple from another relation. They have proved useful in update anomaly prevention, query optimization and index design. 
XML (eXtensible Markup Language) has become the prime standard for data exchange on the Web. XML data typically originates in databases. If XML is to represent data currently residing in databases, it should support keys and foreign keys, which are an essential part of the semantics of the data. A number of key and foreign key specifications have been proposed for XML, e.g., the XML standard (DTD), XML Data, and XML Schema. Keys and foreign keys for XML are important in, among other things, query optimization, data integration, and in data exchange for converting databases to an XML encoding. XML data usually comes with a DTD that specifies how a document is organized. Thus, a specification of an XML document may consist of both a DTD and a set of integrity constraints, such as keys and foreign keys. A legitimate question then is whether such a specification is consistent, or meaningful: that is, whether there exists a (finite) XML document that both satisfies the constraints and conforms to the DTD. In the relational database setting, such a question would have a trivial answer: one can write arbitrary (primary) key and foreign key specifications in SQL, without worrying about consistency. However, DTDs (and other schema specifications for XML) are more complex than relational schemas: in fact, XML documents are typically modeled as node-labeled trees, e.g. in XSL, XQL, XML Schema, XPath, and DOM. Consequently, DTDs may interact with keys and foreign keys in a rather nontrivial way, as will be seen shortly. Thus, we shall study the following family of problems, where C ranges over classes of integrity constraints... We have studied the consistency problems associated with four classes of integrity constraints for XML. We have shown that in contrast to its trivial counterpart in relational databases, the consistency problem is undecidable for C[K,FK], the class of multi-attribute keys and foreign keys.
This demonstrates that the interaction between DTDs and key/foreign key constraints is rather intricate. This negative result motivated us to study C{Unary}[K,FK], the class of unary keys and foreign keys, which are commonly used in practice. We have developed a characterization of DTDs and unary constraints in terms of linear integer constraints. This establishes a connection between DTDs, unary constraints and linear integer programming, and allows us to use techniques from combinatorial optimization in the study of XML constraints. We have shown that the consistency problem for C{Unary}[K,FK] is NP-complete. Furthermore, the problem remains in NP for C{Unary}[K-neg,IC-neg], the class of unary keys, unary inclusion constraints and their negations. We have also investigated the implication problems for XML keys and foreign keys. In particular, we have shown that the problem is undecidable for C[K,FK] and it is coNP-complete for C{Unary}[K,FK] constraints. Several PTIME decidable cases of the implication and consistency problems have also been identified. The main results of the paper are summarized in Figure 4. It is worth remarking that the undecidability and NP-hardness results also hold for other schema specifications beyond DTDs, such as XML Schema and the generalization of DTDs proposed in [Y. Papakonstantinou and V. Vianu. 'Type inference for views of semistructured data']. This work is a first step towards understanding the interaction between DTDs and integrity constraints. A number of questions remain open. First, we have only considered keys and foreign keys defined with XML attributes. We expect to expand techniques developed here for more general schema and constraint specifications, such as those proposed in XML Schema and in a recent proposal for XML keys. Second, other constraints commonly found in databases, e.g., inverse constraints, deserve further investigation.
Third, a lot of work remains to be done on identifying tractable yet practical classes of constraints and on developing heuristics for consistency analysis. Finally, a related project is to use integrity constraints to distinguish good XML design (specification) from bad design, along the lines of normalization of relational schemas. Coding with linear integer constraints gives us decidability for some implication problems for XML constraints, which is a first step towards a design theory for XML specifications." Note the longer version of the paper referenced on Wenfei Fan's web site. [cache]
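
The idea of coding a DTD and unary constraints as linear integer constraints can be illustrated with a toy case. The sketch below is not the paper's algorithm. It assumes a hypothetical specification in which the DTD requires at least one `teacher` element and exactly two `subject` elements per teacher, `subject.taught_by` is a key, and `subject.taught_by` is a foreign key referencing `teacher.name`. The key forces as many distinct `taught_by` values as there are subjects, and the foreign key bounds the number of distinct values by the number of teachers, so the element counts must satisfy `n_subject <= n_teacher`.

```python
def consistent_counts(max_nodes=50, subjects_per_teacher=2):
    """Search for element counts that satisfy the cardinality constraints.

    Returns (n_teacher, n_subject) for the first solution found, or None.
    The bounded search is only a demonstration, not a decision procedure.
    """
    for n_teacher in range(1, max_nodes + 1):  # DTD: at least one teacher
        # DTD: a fixed number of subject elements under each teacher.
        n_subject = subjects_per_teacher * n_teacher
        # Key on subject.taught_by: n_subject distinct values are needed.
        # Foreign key into teacher.name: at most n_teacher distinct values.
        if n_subject <= n_teacher:
            return n_teacher, n_subject
    return None


print(consistent_counts())                        # two subjects per teacher
print(consistent_counts(subjects_per_teacher=1))  # one subject per teacher
```

With two subjects per teacher the inequality `2 * n <= n` has no solution for `n >= 1`, so no finite document can satisfy this hypothetical specification, although the DTD and the constraints each look harmless on their own. With one subject per teacher a solution exists.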

• [March 24, 2001] "XML with Data Values: Typechecking Revisited." By Noga Alon (Tel Aviv University), Tova Milo (Tel Aviv University), Frank Neven (Limburgs Universitair Centrum), Dan Suciu (University of Washington), and Victor Vianu (UC San Diego). Paper presented at PODS 2001. Twentieth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems (PODS). May 21 - 24, 2001. Santa Barbara, California, USA. Abstract: "We investigate the typechecking problem for XML queries: statically verifying that every answer to a query conforms to a given output DTD, for inputs satisfying a given input DTD. This problem had been studied by a subset of the authors in a simplified framework that captured the structure of XML documents but ignored data values. We revisit here the typechecking problem in the more realistic case when data values are present in documents and tested by queries. In this extended framework, typechecking quickly becomes undecidable. However, it remains decidable for large classes of queries and DTDs of practical interest. The main contribution of the present paper is to trace a fairly tight boundary of decidability for typechecking with data values. The complexity of typechecking in the decidable cases is also considered." Details: "Databases play a crucial role in new internet applications ranging from electronic commerce to Web site management to digital government. Such applications have redefined the technological boundaries of the area. The emergence of the Extended Markup Language (XML) as the likely standard for representing and exchanging data on the Web has confirmed the central role of semistructured data but has also redefined some of the ground rules. Perhaps the most important is that XML marks the 'return of the schema' (albeit loose and flexible) in semistructured data, in the form of its Data Type Definitions (DTDs), which constrain valid XML documents. The benefits of DTDs are numerous. 
Some are analogous to those derived from schema information in relational query processing. Perhaps most importantly to the context of the Web, DTDs can be used to validate data exchange. In a typical scenario, a user community would agree on a common DTD and on producing only XML documents which are valid with respect to the specified DTD. This raises the issue of (static) typechecking: verifying at compile time that every XML document which is the result of a specified query applied to a valid input document, satisfies the output DTD... On the decidability side, we show that typechecking is decidable for queries with non-recursive path expressions, arbitrary input DTD, and output DTD specifying conditions on the number of children of nodes with a given label. We are able to extend this to DTDs using star-free regular expressions, and then full regular expressions, by increasingly restricting the query language. We also establish lower and upper complexity bounds for our typechecking algorithms. The upper bounds range from PSPACE to non-elementary, but it is open if these are tight. The lower bounds range from co-NP to PSPACE. On the undecidability side, we show that typechecking becomes undecidable as soon as the main decidable cases are extended even slightly. We mainly consider extensions with recursive path expressions in queries, or with types decoupled from tags in DTDs (also known as specialization). This traces a fairly tight boundary for the decidability of typechecking with data values... The main contribution of the present paper is to shed light on the feasibility of typechecking XML queries that make use of data values in XML documents. The results trace a fairly tight boundary of decidability of typechecking. In a nutshell, they show that typechecking is decidable for XML-QL-like queries without recursion in path expressions, and output DTDs without specialization.
As soon as recursion or specialization are added, typechecking becomes undecidable..." [cache]

• [March 24, 2001] "Representing and Querying XML with Incomplete Information." By Serge Abiteboul (INRIA), Luc Segoufin (INRIA), and Victor Vianu (UC San Diego). Paper presented at PODS 2001. Twentieth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems (PODS). May 21 - 24, 2001. Santa Barbara, California, USA. With 25 references. Abstract: "We study the representation and querying of XML with incomplete information. We consider a simple model for XML data and their DTDs, a very simple query language, and a representation system for incomplete information in the spirit of the representations systems developed by Imielinski and Lipski for relational databases. In the scenario we consider, the incomplete information about an XML document is continuously enriched by successive queries to the document. We show that our representation system can represent partial information about the source document acquired by successive queries, and that it can be used to intelligently answer new queries. We also consider the impact on complexity of enriching our representation system or query language with additional features. The results suggest that our approach achieves a practically appealing balance between expressiveness and tractability. The research presented here was motivated by the Xyleme project at INRIA, whose objective it is to develop a data warehouse for Web XML documents... The main contribution of this paper is a simple framework for acquiring, maintaining, and querying XML documents with incomplete information. The framework provides a model for XML documents and DTDs, a simple XML query language, and a representation system for XML with incomplete information. We show that the incomplete information acquired by consecutive queries and answers can be efficiently represented and incrementally refined using our representation system. Queries are handled efficiently and flexibly.
They are answered as best possible using the available information, either completely, or by providing an incomplete answer using our representation system. Alternatively, full answers can be provided by completing the partial information using additional queries to the sources, guaranteed to be non-redundant. Our framework is limited in many ways. For example, we assume that sources provide persistent node ids. Order in documents and DTDs is ignored, and is not used by queries. The query language is very simple, and does not use recursive path expressions and data joins. In order to trace the boundary of tractability, we considered several extensions to our framework and showed that they have significant impact on handling incomplete information, ranging from cosmetic to high complexity or undecidability. This justifies the particular cocktail of features making up our framework, and suggests that it provides a practically appealing solution to handling incomplete information in XML." See: "Xyleme Project: Dynamic Data Warehouse for the XML Data of the Web." [cache]

• [March 24, 2001] "Xyleme, une start-up de l'Inria pour structurer le Web en XML." From 01net.com. March 01, 2001. "Xyleme wants to structure the semantic data of the Web in XML. The goal? To build a professional search engine that can be queried from the enterprise's information system." ["The Web is moving from HTML to XML, with all the major players, Microsoft, IBM, Oracle, content providers, B2B enablers, behind this revolution. Xyleme exploits this revolution to create a new service through an indexed XML repository that stores Web knowledge and that is capable of answering queries from applications and users. The outcome is a seamless integration between the web and corporate information systems... Xyleme is designed to store, classify, index and monitor XML data on the Web. The emphasis is on high level services that are difficult or impossible to support with the current Web technologies. In particular, we consider more complex query processing than the simple keyword search of actual search engines, semantic data integration and sophisticated monitoring of changes..."] See: "Xyleme Project: Dynamic Data Warehouse for the XML Data of the Web."

• [March 24, 2001] "SCHUCS: A UML-Based Approach for Describing Data Representations Intended for XML Encoding." By Michael Hucka (Systems Biology Workbench Development Group ERATO Kitano Systems Biology Project). 'Version of 11 December 2000'. UML to XML Schema mappings. Note: this document supplements the SBML Level 1 final specification, which uses a simple UML-based notation to describe the data structures: "Systems Biology Markup Language (SBML) Level 1: Structures and Facilities for Basic Model Definitions." See the corresponding news item on SBML. "There are three main advantages to using UML class diagrams as a basis for defining data structures. First, compared to using other notations or a programming language, the UML visual representations are generally easier to read and understand by readers who are not computer scientists. Second, the visual notation is implementation-neutral -- the defined structures can be encoded in any concrete implementation language, not just XML but other formats as well, making the UML-based definitions more useful and flexible. Third, UML is a de facto industry standard, documented in many books and available in many software tools including mainstream development environments (such as Microsoft Visual Basic 5 Enterprise Edition). Readers are therefore more likely to be familiar with it than other notations. Readers do not need to know UML in advance; this document provides descriptions of all the constructs used. The notation presented here can be expressed not only in graphical diagram form (which is what UML is all about) but also in textual form, allowing descriptions to be easily written in a text editor and sent as plain-text email. The scope of the notation is limited to classes and their attributes, not class methods or operations. One of the goals of this effort has been to develop a consistent, systematic method for translating UML-based class diagrams into XML Schemas.
Another goal has been to maintain a reasonably simple notation and UML-to-XML mapping. An important side-effect of this is that the vocabulary of the notation is purposefully limited to only a small number of constructs. It is explicitly not intended to cover the full power of UML or XML. This limited vocabulary has nevertheless been sufficient for the applications to which it has been applied so far in the Systems Biology workbench project... The notation proposed in this document is based on a subset of what could be used and what UML provides. It is not intended to cover the full scope of UML or XML. The subset was chosen to be as simple as possible yet allow the expression of the kinds of data structures that need to be encoded in XML for the ERATO Kitano Systems Biology workbench. The notation proposed here is not carved in stone, and will undoubtedly continue to evolve..." See: "Systems Biology Markup Language (SBML)." [cache]

• [March 24, 2001] "RDF Protocol." By Ken MacLeod. March 24, 2001. "RDF Protocol is simple structured text alternative to standard ASCII line-oriented protocol (as used in FTP, NNTP, SMTP, et al.). RDF Protocol also subsumes the features of RFC-822-style headers as used in MIME, SMTP, and HTTP." Includes Core RDF Protocol; IRC in RDF Protocol; Replication in RDF Protocol. [From the posting: 'Toying With an Idea: RDF Protocol': "RDF Protocol really isn't a protocol so much as setting down some conventions for passing bits of RDF around. Well, ok, some of the bits work a lot like a protocol, so it's gotta look like that, but here goes... I'm playing with a Python implementation of the basic message read/write and using IRC as the example protocol to emulate, using Dave Beckett's IRC in RDF schema. In case anyone was wondering, there are no APIs and no RPCs at this layer, it's all XML instance passing, with RDF triples as the content..." See "Resource Description Framework (RDF)."

• [March 24, 2001] "DocBook TREX Schema V4.1.2.2." From Norman Walsh. 03-12-01. DocBook TREX Schema V4.1.2.2 "is the current experimental TREX Schema version of DocBook. This version was (mostly) generated automatically from the RELAX version. This version is available as a zip archive. Includes: docbook.trex (the DocBook TREX Schema); dbhier.trex (the DocBook TREX Schema 'hierarchy' module); dbpool.trex (the DocBook TREX Schema 'information pool' module); dbtables.trex (the DocBook TREX Schema tables module); text.xml (a test document). See: "Tree Regular Expressions for XML (TREX)." Also: (1) RELAX DocBook schema; (2) W3C XML DocBook schema. [cache]

• [March 23, 2001] "Software Verification and Functional Testing with XML Documentation." By Ernest Friedman-Hill. In Proceedings of the 34th Annual Hawaii International Conference on System Sciences (HICSS-34), edited by R. H. Sprague. Los Alamitos, CA, USA: IEEE Computer Society. Meeting: January 3-6, 2001. Maui, Hawaii. Abstract: "Continuous testing is an important aspect of achieving quality during rapid software development. By making the user documentation for a software product into part of its testing machinery, we can leverage each to benefit the other. The documentation itself can be automatically tested and kept in a state of synchronization with the software. Conversely, if the documentation can be machine interpreted, evaluation of the software's adherence to this description simultaneously verifies the documentation and serves as a functional test of the software. This paper presents an application of these ideas to a real project, the manual for Jess, the Java Expert System Shell. The Jess manual is rich in machine-interpretable information and is used in several distinct modes within Jess' extensive functional and unit test suites. The effort to maintain the accuracy and completeness of Jess's documentation has dropped significantly since this method was put in place." [Note: "Jess is a rule engine and scripting environment written entirely in Sun's Java language by Ernest Friedman-Hill at Sandia National Laboratories in Livermore, CA. Jess was originally inspired by the CLIPS expert system shell, but has grown into a complete, distinct Java-influenced environment of its own. Using Jess, you can build Java applets and applications that have the capacity to 'reason' using knowledge you supply in the form of declarative rules."] Details: "The Jess project is primarily a research project. 
While the basic syntax of the Jess language stays relatively constant, features are added and removed on a regular basis as requirements evolve and new ideas are tried out. Nevertheless, Jess is a small project, supported by one person working part-time. Taken together, the small project size, the dynamic nature of the software itself, and the large user base make the problem of maintaining up-to-date documentation for Jess particularly acute... It is also very easy to extend the Jess language with new commands written in Java or in Jess itself, and so the Jess language can be customized for specific applications. Jess is therefore used in a range of different ways, meaning that its documentation must cover many topics. The software is in use at hundreds of sites around the world in industries including e-commerce, insurance sales, telecommunications, and R&D, so the documentation must be of sufficient quality and completeness to satisfy the broad user base. If documentation were interpretable by computer, then the behaviour described in the documentation could be verified by the test machinery. Writing documentation would no longer be a 'superfluous' activity, but instead it would be an integral part of the development process. Inaccurate documentation becomes as serious as any other bug detected during testing. We have applied this technique to a real project, the ongoing development of Jess, the Java Expert System Shell, using XML as the documentation format. This paper describes this effort and suggests some potential enhancements for future work... The validation system described here proved itself to be very useful in the development process from Jess 4.0 to 5.1. The effort required to maintain good user documentation was greatly reduced. Approximately ten alpha and beta releases of Jess over the space of a year were made, and each shipped with a completely up-to-date manual.
All of the examples in each of the manuals were correct; conversely, the software always performed as described in the manual. Many extensions to this scheme are possible. The possibility for expanded use of <functiondef> has already been implied. If the argument and return-value descriptions were machine readable, then a series of simple tests for every documented function could be automatically generated to verify that the types and number of arguments, and the type and sometimes identity of the return value, adhered to the documentation. Another possibility would be the confirmation of the existence and signature of Java API functions mentioned in the manual. A special tag is already used to format such references in the printed documentation. Again, it should be possible to automatically generate some very simple unit tests for such functions."
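
The general technique of treating documentation examples as functional tests can be sketched briefly. The code below is not the Jess tooling: the `<example>`, `<code>`, and `<result>` tags are hypothetical, and a restricted Python `eval` stands in for the interpreter under test. It extracts each documented example, runs it, and reports every case where the manual and the software disagree.

```python
import xml.etree.ElementTree as ET

# Hypothetical manual fragment; the third example is deliberately wrong.
MANUAL = """
<manual>
  <example><code>1 + 2</code><result>3</result></example>
  <example><code>max(4, 9)</code><result>9</result></example>
  <example><code>len('jess')</code><result>5</result></example>
</manual>
"""


def evaluate(code):
    # Stand-in interpreter: Python expressions with a small whitelist of names.
    return eval(code, {"__builtins__": {}}, {"max": max, "len": len})


def check_manual(xml_text, interpreter):
    """Return (code, documented, actual) for every example that fails."""
    failures = []
    for example in ET.fromstring(xml_text).iter("example"):
        code = example.findtext("code")
        documented = example.findtext("result")
        actual = str(interpreter(code))
        if actual != documented:
            failures.append((code, documented, actual))
    return failures


failures = check_manual(MANUAL, evaluate)
print(failures)
```

A failure here means either the manual or the software is wrong, which is what lets one artifact verify the other.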

• [March 23, 2001] "Using XML/XMI for Tool Supported Evolution of UML Models." By F. Keienburg and Andreas Rausch (Institut für Informatik, Technische Universität München). In Proceedings of the 34th Annual Hawaii International Conference on System Sciences (HICSS-34). Edited by: R. H. Sprague. With 19 references. Los Alamitos, CA, USA: IEEE Computer Society, 2001. Meeting: January 3-6, 2001. Maui, Hawaii. Abstract: "Software components developed with modern tools and middleware infrastructures undergo considerable reprogramming before they become reusable. Tools and methodologies are needed to cope with the evolution of software components. We present some basic concepts and architectures to handle the impacts of the evolution of UML models. With the proposed concepts, an infrastructure to support model evolution, data schema migration, and data instance migration based on UML models can be realized. To describe the evolution path we use XML/XMI files." Details: "One needed important thing for delivering transparent model changes is a neutral model specification format. For reasons of currently becoming a respected standard and being adopted by a lot of UML Case Tools vendors, XMI is chosen in this architecture as a neutral exchange format between different Case Tools. In addition there is a explosion of tools for handling XML documents very comfortable. The XMI standard specifies with a Document Definition Type (DTD), how UML models are mapped into a XML file. Besides this functionality XMI also specifies how model changes can be easily mapped into an XML document. Therefore XMI is a very good solution for solving some of the requested requirements for UML model evolution... XMI specifies a possibility for transmitting metadata differences. The goal is to provide a mechanism for specifying the differences between documents in a way that the entire document does not need to be transmitted each time. 
This is especially important in a distributed and concurrent environment where changes have to be transmitted to other users or applications very quickly. This design does not specify an algorithm for computing the differences, just a form of transmitting them. Only occurring model changes are transmitted. In this way different instances of a model can be maintained and synchronized more easily and economically. The idea is to transmit only the changes made to the model together with the necessary information to be able to apply the necessary changes to the old model. With this information you have the possibility for model merging. This means you can combine difference information plus a common reference model to construct the appropriate new model. A important remark to this topic is that model changes are time sensitive. This means changes must be handled in the exact chronological order for achieving the wanted result... In this paper we have shown that modern middleware infrastructures for the development of distributed applications provide rich support for model based development and code generation. But there is almost no support in case of model evolution. We have introduced some concepts and architectures to realize a tool supporting model evolution and data migration and to integrate this tool in modern infrastructures. To specify the model evolution the developer should use an XMI based difference description. Based on this concepts we have already implemented a first prototype. This is a very primitive version but it is already integrated in our framework AutoMate. Based on this experience we have realized the new version of the tool called ShapeShifter. ShapeShifter is now a stand alone tool supporting model evolution and data migration on top of Versant's object-oriented database. With ShapeShifter you specify the model difference in XMI and the model and the database are automatically migrated. ShapeShifter is now used in a first industrial project. 
The next step will be a complete integration in a CASE tool. Currently one can export and import XMI model files from some CASE tools. But for a full integration of ShapeShifter we need more sophisticated tools to generate the XMI difference file from to XMI based model versions. Moreover we plan to integrate ShapeShifter into several Enterprise Java Beans Container." Paper also available in Postscript format. See "XML Metadata Interchange (XMI)." [cache]
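
The model merging described above, combining a common reference model with an ordered list of differences, can be sketched as follows. This is an illustration under stated assumptions, not the XMI format or the ShapeShifter implementation: the model is a plain dictionary keyed by element id, and the difference tuples `("add" | "replace" | "delete", element_id, payload)` are a hypothetical simplification of add, replace, and delete differences.

```python
import copy


def apply_differences(model, differences):
    """Apply differences in the given (chronological) order to a copy of model."""
    new_model = copy.deepcopy(model)
    for op, element_id, payload in differences:
        if op == "add":
            if element_id in new_model:
                raise ValueError(f"{element_id} already exists")
            new_model[element_id] = payload
        elif op == "replace":
            if element_id not in new_model:
                raise KeyError(element_id)
            new_model[element_id] = payload
        elif op == "delete":
            del new_model[element_id]  # raises KeyError if the element is absent
        else:
            raise ValueError(f"unknown operation: {op}")
    return new_model


# Hypothetical reference model and difference list.
reference = {"Customer": {"attributes": ["name"]}}
differences = [
    ("add", "Order", {"attributes": ["total"]}),
    ("replace", "Order", {"attributes": ["total", "date"]}),
    ("delete", "Customer", None),
]

merged = apply_differences(reference, differences)
print(merged)
```

Applying the same list in reverse order fails, because the replace step would run before the element it targets has been added. This is the time sensitivity the authors point out.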

• [March 23, 2001] "Tip: Using JDOM and XSLT. How to find the right input for your processor." By Brett McLaughlin (Enhydra strategist, Lutris Technologies). From IBM developerWorks. March 2001. ['In this tip, Brett McLaughlin tells how to avoid a common pitfall when working with XSLT and the JDOM API for XML developers working in Java. You'll learn how to take a JDOM document representation, transform it using the Apache Xalan processor, and obtain the resulting XML as another JDOM document. Transforming a document using XSLT is a common task, and JDOM makes the transformation go quite easily once you know how to avoid the missteps. The code demonstrates how to use JDOM with the new Apache Xalan 2 processor (for Java).'] "Being one of the co-creators of JDOM, I simply couldn't pass up the chance to throw in a few JDOM tips in a series of XML tips and tricks. This tip provides the answer to one of the most common questions I get about JDOM: 'How do I use JDOM and XSLT together?' People aren't sure how to take a JDOM Document object and feed it into an XSLT processor. The confusion often arises because most XSLT processors take either DOM trees or SAX events as input streams. In other words, there is not one obvious way to provide a JDOM Document as input in all cases. So how do you interface JDOM with those processors? The key to solving this problem is understanding the input and output options. First determine the input formats that your XSLT processor accepts. As I mentioned above, you'll usually be able to feed a DOM tree or I/O stream into the processor. But which of those is the faster solution? You're going to have to do a little digging to answer that question. That's right, I'm not going to give you a specific answer, but a method for figuring it out..."

• [March 23, 2001] "Examplotron 0.1." By Eric van der Vlist (Dyomedea). "The purpose of examplotron is to use instance documents as a lightweight schema language -- eventually adding the information needed to guide a validator in the sample documents. 'Classical' XML validation languages such as DTDs, W3C XML Schema, Relax, Trex or Schematron rely on a modeling of either the structure (and eventually the datatypes) that a document must follow to be considered as valid or on the rules that needs to be checked. This modeling relies on specific XML serialization syntaxes that need to be understood before one can validate a document and is very different from the instance documents and the creation of a new XML vocabulary involves both creating a new syntax and mastering a syntax for the schema. Many tools (including popular XML editors) are able to generate various flavors of XML schemas from instance documents, but these schemas do not find enough information in the documents to be directly useable leaving the need for human tweaking and the need to fully understand the schema language. Examplotron may then be used either as a validation language by itself, or to improve the generation of schemas expressed using other XML schema languages by providing more information to the schema translators..." From the XML-DEV posting: "Beating Hook, Rick Jelliffe's single element schema language has been quite a challenge, but I am happy to announce examplotron a schema language without any element. Although examplotron does include an attribute, this attribute is optional and you can build quite a number of schemas without using it and I think it fair to say that examplotron is the most natural and easy to learn XML schema language defined up to know ;=) ... The idea beyond examplotron -and the reason why it's so simple to use- is to define schemas giving sample documents.
Although examplotron can be used as a standalone tool, it can also be used to generate schemas for more classical -and powerful- languages and I don't think it will compete with them but rather complement them. Thanks for your comments..." See also: (1) the XML-DEV posting, and (2) "XML Schema Element and Attribute Validator." For schema description and references, see "XML Schemas."
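
The idea of using a sample document as the schema can be approximated in a few lines. The sketch below does not implement examplotron's actual rules. It assumes a much looser check in which a candidate is accepted if every element path it contains also occurs in the sample, and the sample document and function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def element_paths(elem, prefix=""):
    """Collect the set of root-to-element tag paths found under elem."""
    path = f"{prefix}/{elem.tag}"
    paths = {path}
    for child in elem:
        paths |= element_paths(child, path)
    return paths


def conforms(sample_xml, candidate_xml):
    """True if the candidate uses only element paths present in the sample."""
    allowed = element_paths(ET.fromstring(sample_xml))
    return element_paths(ET.fromstring(candidate_xml)) <= allowed


# Hypothetical sample document acting as the schema.
SAMPLE = "<order><item><name>pen</name><qty>2</qty></item></order>"

good = "<order><item><name>ink</name></item><item><qty>1</qty></item></order>"
bad = "<order><item><price>3</price></item></order>"

print(conforms(SAMPLE, good))
print(conforms(SAMPLE, bad))
```

This check ignores occurrence counts, ordering, and datatypes. That is the kind of information a sample document alone cannot supply, and the excerpt above notes that examplotron lets authors add such hints to the sample.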

• [March 22, 2001] "XOIP: XML Object Interface Protocol." By Morten Kvistgaard Nielsen and Allan Bo Jørgensen. Centre for Object Technology, COT/3-34-V1.0. 116 pages. [Master's Thesis, Department of Computer Science, Aarhus University, 2001.] "XOIP describes a way in which heterogeneous networked embedded systems can interface to a variety of distributed object architectures using XML. An implementation of XOIP is available for download. This document is a thesis for the Masters Degree in Computer Science at the University of Aarhus... In this thesis we shall present our solution to the problem of achieving interoperability between heterogeneous distributed object architectures and paradigms. What makes our solution special is that it is specifically designed to address the problems faced by embedded systems, where lack of system resources have hitherto prevented their participation in distributed object systems. Since embedded systems are more likely to be placed in heterogeneous object systems than their desktop counterparts, the two issues are naturally linked." [cache]

• [March 22, 2001] "Gates Unveils Hailstorm." By Barbara Darrow. In Computer Reseller News (March 19, 2001). "Microsoft Chairman Bill Gates Monday unveiled Hailstorm, one more step in the company's attempt to transform itself into a provider of software-as-services. Hailstorm -- which the company positions as a set of user-centric services to ease e-commerce and Web applications -- is not slated for production until 2002. These services theoretically will enable users with any Web-connected devices, including handheld machines and cell phones, to easily and securely access applications and information on the Net... Similar to Novell's DigitalMe service unveiled two years ago, Hailstorm will let a user log on once to the system, which would then remember critical information, including passwords to diverse Web sites and services. Other services will include calendar, address book, notification and authentication. CRN first broke the story of the Hailstorm platform, called by one source as Microsoft Passport on steroids, in January. Microsoft made a design preview of the service available Monday and brought a number of potential partners -- including eBay, Groove Networks and American Express -- onstage for demonstrations. By integrating Hailstorm services with its own auction APIs, for example, eBay would enable its own users to get realtime notification when someone has overbid them on a planned purchase. Similarly, American Express Blue Card users trying to order an out-of-stock book would receive notification from the merchant when the title is back in stock, and then click on that message to initiate the transaction... Certain base-level functionality -- such as single log-in -- will continue to be offered for free, but users will be charged for value-added services and on usage, company executives say. Still, it remains to be seen whether Microsoft, whose relationships with partners have been problematic at times, will be the partner of choice here."
See: "Microsoft Hailstorm."

• [March 22, 2001] "Interview: Tim Berners-Lee on the W3C's Semantic Web Activity." By Edd Dumbill. From XML.com. March 21, 2001. ['The World Wide Web Consortium has recently embarked on a program of development on the Semantic Web. This interview outlines the vision behind the new Activity, and how it relates to XML in general.'] "Tim Berners-Lee: The W3C operates at the cutting edge, where relatively new results of research become the foundations for products. Therefore, when it comes to interoperability these results need to become standards faster than in other areas. The W3C made the decision to take the lead -- and leading-edge -- in web architecture development. We've had the Semantic Web roadmap for a long time. As the bottom layer becomes stronger, there's at the same time a large amount falling in from above. Projects from the areas of knowledge representation and ontologies are coming together. The time feels right for W3C to be the place where the lower levels meet with the higher levels: the research results meeting with the industrial needs... We always design the Activity to suit the needs of the community at the time. Examples of infrastructural work in which we did this are the HTTP, URI, and XML Signature work. We wanted the attention of the community experts, and things required wide review. More of our Activities and working groups are moving toward a more public model; XML Protocol is a perfect example. SW needs to be really open, as many resources for its growth are from the academic world. We need people who may at some point want to give the group the benefit of their experience, without having a permanent relationship with the consortium. It's not particularly novel. It's combining the RDF Interest Group with W3C internal development stuff. We need to find what the Knowledge Representation community have got that's ripe for standardization, and what it hasn't and so on. Coordination will be very important." See: "XML and 'The Semantic Web'."

• [March 22, 2001] "Tutorial: An Introduction to Scalable Vector Graphics." By J. David Eisenberg. From XML.com. March 21, 2001. ['This introduction to SVG teaches you all you need to know about the W3C's vector graphics format in order to start putting it to use in your own web applications.'] "If you're a web designer who's worked with graphics, you may have heard of Scalable Vector Graphics (SVG). You may even have downloaded a plug-in to view SVG files in your browser. The first and most important thing to know about SVG is that it isn't a proprietary format. On the contrary, it's an XML language that describes two-dimensional graphics. SVG is an open standard, proposed by the W3C... This article gives you all the basic information you need to start putting SVG to use. You'll learn enough to be able to make a handbill for a digital camera that's on sale at the fictitious MegaMart..." [From the W3C SVG Web site: "SVG is a language for describing two-dimensional graphics in XML. SVG allows for three types of graphic objects: vector graphic shapes (e.g., paths consisting of straight lines and curves), images and text. Graphical objects can be grouped, styled, transformed and composited into previously rendered objects. Text can be in any XML namespace suitable to the application, which enhances searchability and accessibility of the SVG graphics. The feature set includes nested transformations, clipping paths, alpha masks, filter effects, template objects and extensibility. SVG drawings can be dynamic and interactive. The Document Object Model (DOM) for SVG, which includes the full XML DOM, allows for straightforward and efficient vector graphics animation via scripting. A rich set of event handlers such as onmouseover and onclick can be assigned to any SVG graphical object.
Because of its compatibility and leveraging of other Web standards, features like scripting can be done on SVG elements and other XML elements from different namespaces simultaneously within the same Web page."] See: "W3C Scalable Vector Graphics (SVG)."
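
As a rough illustration of the description above (SVG as an XML language in which shapes and text can be grouped and transformed), the following Python sketch builds a tiny SVG document with the standard library. The sizes, the `build_handbill` name, and the handbill text are invented for illustration and are not taken from Eisenberg's tutorial.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"  # the SVG namespace


def build_handbill() -> str:
    """Return a small SVG document as a string (illustrative content only)."""
    ET.register_namespace("", SVG_NS)  # serialize SVG as the default namespace
    svg = ET.Element(f"{{{SVG_NS}}}svg", {"width": "200", "height": "100"})
    # A group whose transform applies to both of its children.
    group = ET.SubElement(svg, f"{{{SVG_NS}}}g", {"transform": "translate(10,10)"})
    ET.SubElement(
        group,
        f"{{{SVG_NS}}}rect",
        {"width": "180", "height": "80", "fill": "none", "stroke": "black"},
    )
    text = ET.SubElement(group, f"{{{SVG_NS}}}text", {"x": "20", "y": "45"})
    text.text = "Digital camera sale"
    return ET.tostring(svg, encoding="unicode")


if __name__ == "__main__":
    print(build_handbill())
```

Saving the printed output to a `.svg` file should give a document that an SVG-capable viewer can render.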

• [March 22, 2001] "Perl & XML: Using XML::Twig." By Kip Hampton. From XML.com. March 21, 2001. ['XML::Twig provides a fast, memory-efficient way to handle large XML documents, which is useful when the needs of your application make using the SAX interface overly complex.'] "If you've been working with XML for a while it's often tempting to frame solutions to new problems in the context of the tools you've used successfully in the past. In other words, if you are most familiar with the DOM interface, you're likely to approach new challenges from a more-or-less DOMish perspective. While there's plenty to be said for doing what you know will work, experience shows that there is no one right way to process XML. With this in mind, Michel Rodriguez's XML::Twig embodies Perl's penchant for borrowing the best features of the tools that have come before. XML::Twig combines the efficiency and small footprint of SAX processing with the power of XPath's node selection syntax, and it adds a few clever tricks of its own..."

• [March 22, 2001] "Overcoming Objections to XML-based Authoring Systems." By Brian Buehling. From XML.com. March 21, 2001. ['When deploying an XML-based content management system, common misconceptions must be corrected. This article helps IT professionals do just that.'] "During a recent development effort, one of our clients was alarmed at the conversion costs of the proposed XML-based content management system compared to the existing MS Word-based process. This was just one instance of an alarming trend of balking at XML-based systems in favor of using public web folders, indexed by some full-text search engine, as part of a local intranet. In the short run, these edit, drop, and index solutions have some appealing features, including low development and conversion costs. But they are short-lived systems that either wither from lack of functionality or rapidly outgrow their design. Fortunately, the initial objections to the cost of building an XML-based content repository have become fairly predictable. In most cases they are based on misconceptions about XML or on an overly optimistic view of alternative approaches. Even though implementing an XML-based content management system is not always the best approach for an organization, any architectural decision should be made only after thoroughly overcoming the common misconceptions of the technology involved. The list of questions below is intended to be a guide for IT professionals to discuss intelligently the pros and cons of developing an XML document repository..."

• [March 20, 2001] "Microsoft's HailStorm Unleashed." By Joe Wilcox. In CNET News.com (March 19, 2001). "Microsoft on Monday launched a HailStorm aimed at upstaging rival America Online. The software giant unveiled a set of software building blocks, grouped under the code name HailStorm, for its .Net software-as-a-service strategy. Along with HailStorm, Microsoft marshaled out new versions of its Web-based Hotmail e-mail service, MSN Messenger Service, and Passport authentication service. The Redmond, Wash.-based software company is positioning HailStorm as a way of enticing developers to create XML (Extensible Markup Language)-based Web services deliverable to a variety of PC and non-PC devices such as handhelds and Web appliances. Microsoft said HailStorm is based on the company's Passport service and permits applications and services to cooperate on consumers' behalf. HailStorm also leans heavily on instant messaging services provided by MSN Messenger and on Microsoft's Hotmail e-mail service. Microsoft envisions HailStorm as a way for consumers and business customers to access their data -- calendars, phone books, address lists -- from any location and on any device. That model closely mirrors AOL's model by which members access AOL's service via a PC, handheld, or a set-top box to retrieve their personal information. Microsoft on Monday also disclosed five development partners for its .Net plan, including eBay, which announced its partnership last week. eBay and Microsoft entered into a strategic technology exchange that includes turning the eBay API (application programming interface) into a .Net service. HailStorm is based on Passport's user-authentication technology, which Microsoft uses for Hotmail, MSN Messenger, and some MSN Web services. The company describes the XML-based technology as user rather than device specific.
Rather than keeping information on a single device such as a PC, Microsoft envisions people accessing content and personal information through a number of devices created using XML tools. Microsoft is looking to launch two types of .Net services: broad horizontal building-block services such as HailStorm and application-specific services. HailStorm initially will comprise 14 software services including MyAddress, an electronic and geographic address for an identity; MyProfile, which includes a name, nickname, special dates and pictures; MyContacts, an electronic address book; MyLocation for pinpointing locations; MyNotifications, which will pass along updates and other information; and MyInbox, which includes items such as e-mail and voicemail. Microsoft said HailStorm will enter beta testing later this year and will be released next year. Rather than solely relying on Microsoft technology to become the standard for these services, the company is using established Web development languages such as XML, SOAP (Simple Object Access Protocol) and UDDI (Universal Description Discovery and Integration). IBM also is pushing XML, the emerging choice du jour for creating Web pages, and UDDI, a sort of Web services Yellow Pages for developers. IBM last week used XML and UDDI to beef up its WebSphere Application Server and has been aggressively using the tools to woo developers to its middleware software. Technology Business Research analyst Bob Sutherland said that while he expects competition between Microsoft and IBM will be fierce over XML, 'they will woo customers not so much on the benefits of the XML platform but what their products have to offer'." See: "Hailstorm."

• [March 20, 2001] "Microsoft Launches HailStorm Web-Services Strategy." By Tom Sullivan and Bob Trott. In InfoWorld (March 19, 2001). "Microsoft executives detailed a key piece of the company's strategy for delivering user-centric Web services here on Monday. The strategy, code-named HailStorm, is a new XML-based platform that lives on the Internet, and is designed to transform the user experience into one in which users have more control over their information. 'It's probably the most important .NET building block service,' said Microsoft Chairman Bill Gates. 'This is a revolution where the user's creativity and the power of all their devices can be used.' Currently, Gates said, users are faced with disconnected islands of data, such as PCs, cell phones, PDAs, and other devices. HailStorm is designed to combine the different islands and move the data behind the scenes so users don't have to move it themselves, thereby providing Microsoft's latest mantra of anytime, anywhere access to data from any device, according to Gates. To that end, Microsoft will provide a set of services under HailStorm, such as notifications, e-mail, calendaring, contacts, an electronic wallet, and favorite Web destination, designed for more effective communication. 'Stitching those islands together is about having a standard schema, in fact a rich schema, for tying all that info together,' he added. That schema will be constructed largely of XML, which Gates called the foundation of HailStorm. 'The kind of dreams people have had about interoperability in this industry will finally be fulfilled with the XML foundation,' he said. The first end point of HailStorm will be Microsoft's forthcoming Windows XP, the next generation of Windows 2000, due later this year. Gates said that XP makes it easier to get at HailStorm services. 'HailStorm is not exclusively tied to any particular OS,' he added. 
Although Microsoft said that HailStorm will work with platforms from other vendors, such as Linux, Unix, Apple Macintosh, and Palm, the company maintained that HailStorm services will work most effectively with Windows platforms... Microsoft plans to tap into the 160 million users of its Passport single-sign-on service as early users of HailStorm, and will offer them free services. Gates added that HailStorm will consist of a certain level of free services, but customers that want more will be charged for it..." See: "Hailstorm."

• [March 20, 2001] "Legal Storm Brewing Over Microsoft's HailStorm." By Aaron Pressman and Keith Perine [The Industry Standard]. In InfoWorld (March 20, 2001). "Even before Microsoft announced its new online services plan -- dubbed HailStorm -- on Monday, some of the company's leading competitors were quietly registering complaints about the effort with government antitrust regulators. The competitors, including AOL Time Warner and Sun Microsystems, allege that HailStorm and other pieces of Microsoft's .NET initiative are designed to limit their access to customers and further leverage Microsoft's dominant Windows market share... Microsoft denies that anything in its .NET plan is improper. The company's new HailStorm product is not limited to Windows and can be accessed by consumers running Linux, Apple's Macintosh operating system, or even on a Palm handheld device, Microsoft notes. The company also said HailStorm is built on open standards and is available for use by any Web site, including AOL. However, Microsoft plans to charge consumers, developers, and participating Web sites... The next version of Windows, called XP, will integrate HailStorm services into the operating system, encouraging consumers to sign up when they start their computers for the first time. The operating system also features an integrated media player and a copyright-protection scheme to prevent users from distributing copies of music purchased online. Competitors complain that XP won't allow consumers to choose a competing media player as the default program for playing music on their PCs."

• [March 20, 2001] "Shifting to Web Services." By Tom Sullivan, Ed Scannell, and Bob Trott. In InfoWorld Volume 23, Issue 12 (March 19, 2001), pages 1, 27. "Web services may be all the rage these days, but users, developers, and even vendors are only nibbling at the edges of what this still-unfolding shift in software architecture and delivery means to them. Microsoft on Monday will attempt to demystify Web services a bit more, when Chairman Bill Gates and other officials roll out a major technology component to their .NET strategy, dubbed Hailstorm, at an event in Redmond, Wash. Hailstorm, a Web-services development platform first unveiled last week at an exclusive conference for developers and partners, relies on industry standards XML, SOAP (Simple Object Access Protocol), and UDDI (Universal Description, Discovery, and Integration) and will include next-generation versions of Microsoft offerings such as Hotmail, MSN Messenger, and Passport, the software giant's Internet identification service. Developers can embed these and related services into their applications. One source, who requested anonymity, described Hailstorm as being a 'building block' approach to Web services that will open up new ways to communicate and transmit data in an instant message, peer-to-peer format. Microsoft rivals Sun Microsystems and IBM separately last week also tried to put some reality behind their own Web-services plays. Just how Web services will be used is shaping up to be the nascent market's million-dollar question. In the wake of the dot-com fadeout, brick-and-mortar companies are picking up the slack, hoping Web services will generate e-commerce revenue. But perhaps even more pertinent to enterprises is the potential to use the Web services model to tie together existing, in-house applications using XML standards. The coming Hailstorm: Microsoft's Hailstorm initiative will offer a platform for Web services. 
(1) Represents an expansion of instant-messaging-type p-to-p technology. (2) Allows developers to embed Web services, such as Passport, for identification in their apps. (3) Is based on XML, SOAP, and UDDI... Also, eBay, in San Jose, Calif., agreed to support .NET with its community-based commerce engine, and the two companies envision that Web sites supporting .NET will be able to list relevant items up for auction on eBay through an XML interface. Mani Chandy, co-founder and chief scientist at Oakland-based iSpheres and a computer science professor at Cal Tech, said that because of Web-services standards, large companies that have big IT staffs will start moving toward the architecture. 'A lot of brick-and-mortar companies offer Web services, but they don't even know it. They may not offer them in SOAP, but they might offer them in HTML,' Chandy added. A new generation of companies, some brick-and-mortars, others dot-com successes, are growing up with the notion of Web services. Denver-based Galileo, an early partner of the .NET program, is currently working to convert its Corporate Travel Point software into a Web service by adding support for standards, such as UDDI, XML, SOAP, and the WSDL (Web Services Description Language) specification for standardization..."

• [March 19, 2001] "Untangling the Web. SOAP Uses XML as a Simple And Elegant Solution that Automates B2B Transactions." By Greg Barish. In Intelligent Enterprise Volume 4, Number 5 (March 27, 2001), pages 38-43. "What B2B really needs is an easy way to integrate the back-end systems of participating organizations. And we're not just talking about a solution that involves each business maintaining multiple interfaces to that data. That's the way things work today and, to a large extent, visual interfaces have often proved to be unwieldy solutions. IT managers want a way to consolidate their data and functionality in one system that can be accessed over the Web by real people or automatically by software agents. The Simple Object Access Protocol, better known as SOAP, is aimed squarely at this data consolidation problem. Recently approved by the World Wide Web Consortium (W3C), SOAP uses XML and HTTP to define a component interoperability standard on the Web. SOAP enables Web applications to communicate with each other in a flexible, descriptive manner while enjoying the built-in network optimization and security of an HTTP-based messaging protocol. SOAP's foundations come from attempts to establish an XML-based form of RPC as well as Microsoft's own efforts to push its DCOM technology beyond Windows. SOAP increases the utility of Web applications by defining a standard for how information should be requested by remote components and how it should be described upon delivery. The key to achieving both of these goals is the use of XML to provide names to not only the functions and parameters being requested, but to the data being returned... SOAP simply and elegantly solves the major problems with both the HTML-based and DCOM/CORBA approaches by using XML over existing HTTP technology. Use of XML yields three important benefits: (1) XML makes the data self-describing and easy to parse. 
(2) Because XML and XSL separate data from presentation, useful data is distinguished from the rendering metadata. Thus, pages used as data sources for software agents can be reused for human consumption, eliminating the need for redundant data views. (3) XML enables complicated data structures (such as lists or lists of lists) to be easily encoded using flexible serialization rules. Using XML for encoding data also represents an alternative to ANSI-based Electronic Data Interchange (EDI). While EDI has been successfully used for years, it does have its problems. For example, it is cryptic and difficult to debug. Also, it is more expensive and requires the server and client to have special software installed to handle the format. What's more, EDI over HTTP is problematic: It doesn't completely support important HTTP encryption and authentication standards, and thus secure transactions are limited or simply not possible. In contrast, SOAP keeps things simple. It's extensible, the data is self-describing, simple to debug, and it can enjoy the benefits of HTTP-based security methods. While a SOAP message requires more bandwidth than an EDI message, bandwidth has become less of a concern as the Internet itself becomes faster - particularly between businesses that can afford high-speed network access. Finally, you can deploy SOAP over a number of protocols, including HTTP. This capability is important because it allows the firewall issues to be avoided and retains the optimizations that have been built into HTTP... While SOAP messages consist of XML-compliant encoding, they can also be communicated via alternative transport mechanisms, such as RPC. Communication via RPC points back to the history of SOAP in its XML-RPC form. XML-based RPC cuts to the chase: It says, "Let's forget all this stuff about Web servers and Web clients, we just want distributed objects to be interoperable between disparate systems."
SOAP over HTTP, in contrast, is a more general form of object-to-object (or agent-to-agent) communication over the Internet. It assumes what is minimally necessary: that objects are accessible via HTTP and that the data they return is self-describing." See "Simple Object Access Protocol (SOAP)."
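
To make the point about XML naming both the requested function and its parameters more concrete, here is a minimal Python sketch that builds and re-reads a SOAP 1.1 style envelope. The `GetQuote` method, the `symbol` parameter, and the `urn:example:quotes` namespace are hypothetical and do not come from Barish's article; nothing is sent over HTTP.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
APP_NS = "urn:example:quotes"  # hypothetical application namespace


def build_request(method: str, params: dict) -> bytes:
    """Wrap a named call and its named parameters in a SOAP envelope."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{APP_NS}}}{method}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{APP_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="utf-8")


def parse_request(message: bytes):
    """Recover the method name and parameters from the envelope body."""
    envelope = ET.fromstring(message)
    body = envelope.find(f"{{{SOAP_ENV}}}Body")
    call = body[0]  # first child of Body is the call element
    method = call.tag.split("}", 1)[1]  # strip the "{namespace}" prefix
    params = {child.tag.split("}", 1)[1]: child.text for child in call}
    return method, params


if __name__ == "__main__":
    message = build_request("GetQuote", {"symbol": "IBM"})
    print(message.decode("utf-8"))
    print(parse_request(message))
```

Because the receiver reads element names rather than field positions, it can recover the call without any out-of-band layout agreement, which is the self-describing property the article contrasts with EDI.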

• [March 19, 2001] STEPml Product Identification and Classification Specification. "This STEPml specification addresses the requirements to identify and classify or categorize products, components, assemblies (ignoring their structure) and/or parts. Identification and classification are concepts assigned to a product by a particular organization. This specification describes the core identification capability upon which additional capabilities, such as product structure, are based. Those capabilities are described in other STEPml specifications and their use is dependent upon use of this specification... The structure of the STEPml markup for product identification and classification was designed based on the object model found in programming languages such as Java and on object serialization patterns. It is called the Object Serialization Early Binding (OSEB). An overview of the OSEB describes the design philosophy of this approach and the fundamental structure of the elements as well as a description of the header elements. The OSEB uses the ID/IDREF mechanism in XML to establish references between elements rather than using containment. UML object diagrams, with one extension, are used to depict the structure of the elements and attributes in these examples. Each element is represented by an instance of a class with same name as the element..." The following files supporting this STEPml specification are available. (1) the basic product identification and classification OSEB DTD; (2) a sample XML document containing the completed examples based on the simple DTD; (3) the full OSEB DTD for product identification and classification; (4) the ISO 10303-11 EXPRESS data modeling language schema upon which the DTD is based; (5) the STEP PDM Schema Usage Guide with which this STEPml specification is compatible; (6) an overview of the OSEB and the complete OSEB from the ISO Draft Technical Specification.
Items 4-6 will be most useful to reviewers who are "literate in the EXPRESS language and the STEP ISO 10303 standard." See: (1) "STEPml XML Specifications", and (2) "STEP/EXPRESS and XML".
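
The ID/IDREF approach mentioned above (references between elements instead of containment) can be sketched in Python as follows. The element and attribute names (`Organization`, `Product`, `owner`, `id`, `ref`) are invented for illustration and are not taken from the STEPml OSEB DTD. Since no DTD is loaded, the `id` and `ref` attributes are treated as identifiers by convention only.

```python
import xml.etree.ElementTree as ET

# Hypothetical data: the product points at its organization by reference
# instead of containing it.
SAMPLE = """
<data>
  <Organization id="o1"><name>Example Corp</name></Organization>
  <Product id="p1"><name>Widget</name><owner ref="o1"/></Product>
</data>
"""


def resolve_refs(xml_text: str) -> dict:
    """Map each product name to the name of the organization it references."""
    root = ET.fromstring(xml_text)
    # Index every element that carries an id attribute.
    by_id = {el.get("id"): el for el in root.iter() if el.get("id") is not None}
    resolved = {}
    for product in root.iter("Product"):
        owner = by_id[product.find("owner").get("ref")]
        resolved[product.findtext("name")] = owner.findtext("name")
    return resolved


if __name__ == "__main__":
    print(resolve_refs(SAMPLE))
```

One consequence of this design is that several products can point at the same organization element without duplicating it.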

• [March 19, 2001] "Extended Path Expressions for XML." By Murata Makoto (IBM Tokyo Research Lab/IUJ Research Institute, 1623-14, Shimotsuruma, Yamato-shi, Kanagawa-ken 242-8502, Japan; Email: mmurata@trl.ibm.co.jp). [Extended abstract] prepared for presentation at PODS (Principles of Database Systems) 2001. With 35 references. ZIP format. Abstract: "Query languages for XML often use path expressions to locate elements in XML documents. Path expressions are regular expressions such that underlying alphabets represent conditions on nodes. Path expressions represent conditions on paths from the root, but do not represent conditions on siblings, siblings of ancestors, and descendants of such siblings. In order to capture such conditions, we propose to extend underlying alphabets. Each symbol in an extended alphabet is a triplet (e1; a; e2), where 'a' is a condition on nodes, and 'e1 (e2)' is a condition on elder (resp. younger) siblings and their descendants; 'e1' and 'e2' are represented by hedge regular expressions, which are as expressive as hedge automata (hedges are ordered sequences of trees). Nodes matching such an extended path expression can be located by traversing the XML document twice. Furthermore, given an input schema and a query operation controlled by an extended path expression, it is possible to construct an output schema. This is done by identifying where in the input schema the given extended path expression is satisfied." Details: "XML has been widely recognized as one of the most important formats on the WWW. XML documents are ordered trees containing text, and thus have structures more flexible than relations of relational databases. Query languages for XML have been actively studied. Typically, operations of such query languages can be controlled by path expressions. A path expression is a regular expression such that underlying alphabets represent conditions on nodes.
For example, by specifying a path expression, we can extract figures in sections, figures in sections in sections, figures in sections in sections in sections, and so forth, where section and figure are conditions on nodes. Based on well-established theories of regular languages, a number of useful techniques (e.g., optimization) for path expressions have been developed. However, when applied to XML, path expressions do not take advantage of orderedness of XML documents. For example, path expressions cannot locate all <figure> elements whose immediately following siblings are <table> elements. On the other hand, industrial specifications such as XPath have been developed. Such specifications address orderedness of XML documents. In fact, XPath can capture the above example. However, these specifications are not driven by any formal models, but rather designed in an ad hoc manner. Lack of formal models prevents generalization of useful techniques originally developed for path expressions. As a formal framework for addressing orderedness, this paper shows a natural extension of path expressions. First, we introduce hedge regular expressions, which generate hedges (ordered sequences of ordered trees). Hedge regular expressions can be converted to hedge automata (variations of tree automata for hedges) and vice versa. Given a hedge and a hedge regular expression, we can determine which node in the hedge matches the given hedge regular expression by executing the hedge automaton. The computation time is linear to the number of nodes in hedges. Second, we introduce pointed hedge representations. They are regular expressions such that each 'symbol' is a triplet (e1, a, e2), where e1 and e2 are hedge regular expressions and a is a condition on nodes. Intuitively, e1 represents conditions on elder siblings and their descendants, while e2 represents conditions on younger siblings and their descendants.
As a special case, if every hedge regular expression in a pointed hedge representation generates all hedges, this pointed hedge representation is a path expression. Given a hedge and a pointed hedge representation, we can determine which node in the hedge matches the given pointed hedge representation. For each node, (1) we determine which of the hedge regular expressions matches the elder siblings and younger siblings, respectively, (2) we then determine which of the triplets the node matches, and (3) we finally evaluate the pointed hedge representation. Again, the computation time is linear to the number of nodes in hedges. Another goal of this work is schema transformation. Recall that query operations of relational databases construct not only relations but also schemas. For example, given input schemas (A; B) and (B; C), the join operation creates an output schema (A; B; C). Such output schemas allow further processing of output relations. It would be desirable for query languages for XML to provide such schema transformations. That is, we would like to construct output schemas from input schemas and query operations (e.g., select, delete), which utilize hedge regular expressions and pointed hedge representations. To facilitate such schema transformation, we construct match-identifying hedge automata from hedge regular expressions and pointed hedge representations. The computation of such automata assigns marked states to those nodes which match the hedge regular expressions and pointed hedge representations. Schema transformation is effected by first creating intersection hedge automata which simulate the match-identifying hedge automata and the input schemata, and then transforming the intersection hedge automata as appropriate to the query operation... In Section 2, we consider related works. We introduce hedges and hedge automata in Section 3, and then introduce hedge regular expressions in Section 4.
In Section 5, we introduce pointed hedges and pointed hedge representations. In Section 6, we define selection queries as pairs of hedge regular expressions and pointed hedge representations. In Section 7, we study how to locate nodes in hedges by evaluating pointed hedge representations. In Section 8, we construct match-identifying hedge automata, and then construct output schemas. In Section 9, we conclude and consider future works... We have assumed XML documents as hedges and have presented a formal framework for XML queries. Our selection queries are combinations of hedge regular expressions and pointed hedge representations. A hedge regular expression captures conditions on descendant nodes. To locate nodes, a hedge regular expression is first converted to a deterministic hedge automaton and then it is executed by a single depth-first traversal. Meanwhile, a pointed hedge representation captures conditions on non-descendant nodes (e.g., ancestors, siblings, siblings of ancestors, and descendants of such siblings). To locate nodes, a pointed hedge representation is first converted to triplets: (1) a deterministic hedge automaton, (2) a finite-index right-invariant equivalence of states, and (3) a string automaton over the equivalence classes. Then, this triplet is executed by two depth-first traversals. Schema transformation is effected by identifying where in an input schema the given hedge regular expression and pointed hedge representation is satisfied. Interestingly enough, as it turns out our framework exactly captures the selection queries definable by MSO, as do boolean attribute grammars and query automata. On the other hand, our framework has two advantages over MSO-driven approaches. First, conversion of MSO formulas to query automata or boolean attribute grammars requires non-elementary space, thus discouraging implementations. On the other hand, our framework employs determinization of hedge automaton, which requires exponential time. 
However, we conjecture that such determinization usually works, as does determinization of string automata. Second, (string) regular expressions have been so widely and successfully used by many users because they are very easy to understand. We hope that hedge regular expressions and pointed hedge representations will become commodities for XML in the near future. There are some interesting open issues. First, is it possible to generalize useful techniques (e.g., optimization) developed for path expressions to hedge regular expressions and pointed hedge representations? Second, we would like to introduce variables to hedge regular expressions so that query operations can use the values assigned to such variables. For this purpose, we have to study unambiguity of hedge regular expressions. An ambiguous expression may have more than one way to match a given hedge, while an unambiguous expression has at most only one such way. Variables can be safely introduced to unambiguous expressions." See "SGML/XML and Forest/Hedge Automata Theory." [cache]
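
The sibling condition used as the running example in the abstract (all figure elements whose immediately following sibling is a table) can be illustrated with a short Python sketch. This is a direct tree walk written for illustration only; it is not the hedge-automaton construction described in the paper, and the sample document and the `figures_before_tables` name are made up.

```python
import xml.etree.ElementTree as ET

# Made-up sample: f1 and f3 are immediately followed by a table, f2 is not.
SAMPLE = """
<doc>
  <section>
    <figure id="f1"/><table/>
    <figure id="f2"/><para/>
    <section><figure id="f3"/><table/></section>
  </section>
</doc>
"""


def figures_before_tables(root):
    """Return <figure> elements whose immediately following sibling is a <table>."""
    matches = []
    for parent in root.iter():  # visit every element in document order
        children = list(parent)
        # Pair each child with the sibling that directly follows it.
        for current, following in zip(children, children[1:]):
            if current.tag == "figure" and following.tag == "table":
                matches.append(current)
    return matches


if __name__ == "__main__":
    found = figures_before_tables(ET.fromstring(SAMPLE))
    print([figure.get("id") for figure in found])
```

A plain root-to-node path test sees only the ancestors of each figure, so it cannot separate `f1` from `f2` here; the check has to inspect the neighboring sibling, which is the gap the extended alphabet is meant to close.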

• [March 16, 2001] "Introduction to the Darwin Information Typing Architecture. Toward portable technical information." By Don R. Day, Michael Priestley, and Dave A. Schell. From IBM developerWorks. March 2001. "The Darwin Information Typing Architecture (DITA) is an XML-based architecture for authoring, producing, and delivering technical information. This article introduces the architecture, which sets forth a set of design principles for creating information-typed modules at a topic level, and for using that content in delivery modes such as online help and product support portals on the Web. This article serves as a roadmap to the Darwin Information Typing Architecture: what it is and how it applies to technical documentation. The article links to representative source code." See overview/discussion.

• [March 16, 2001] "Specialization in the Darwin Information Typing Architecture. Preparing topic-based DITA documents." By Michael Priestley (IBM Toronto Software Development Laboratory). From IBM developerWorks. March 2001. Adjunct to a general article on DITA "Introduction to the Darwin Information Typing Architecture." Priestley's article "shows how the 'Darwin Information Typing Architecture' is also a set of principles for extending the architecture to cover new information types as required, without breaking common processes. In other words, DITA provides the base for a hierarchy of information types that anyone can add to. New types will work with existing DITA transforms, and are defined as "deltas" relative to the existing types - reusing most of the existing design by reference." From the introduction: "This in-depth look at the XML-based Darwin Information Typing Architecture (DITA) for the production of modular documentation tells how to prepare topic-based DITA documents. The instructions cover creating new topic types and transforming between types. An appendix outlines the rules for specialization. The point of the XML-based Darwin Information Typing Architecture (DITA) is to create modular technical documents that are easy to reuse with varied display and delivery mechanisms, such as helpsets, manuals, hierarchical summaries for small-screen devices, and so on. This article explains how to put the DITA principles into practice. Specialization is the process by which authors and architects define new topic types, while maintaining compatibility with existing style sheets, transforms, and processes. The new topic types are defined as an extension, or delta, relative to an existing topic type, thereby reducing the work necessary to define and maintain the new type..." See the main bibliographic item.

• [March 16, 2001] "Towards an Open Hyperdocument System (OHS)." By Jack Park. Version 20010316 or later. "In the big picture, this paper discusses one individual's (my) view of an implementation of an Open Hyperdocument System (OHS) as first proposed by Douglas Engelbart. Persistence: This project begins with persistent XTM, my implementation of an XTM engine that drives a relational database engine. It will expand to include flat-file storage of some topic occurrences. These occurrences are saved in an XML dialect specified by a DTD in the eNotebook project discussed below, and can be rendered to web pages using XSLT as desired. Collaboration: It is intended that the OHS engine, rendered as a Linda-like server as discussed below under the project jLinda, will be capable of allowing many users to log into the server and participate in IBIS discussions in the first trials. This assumes multicasting capabilities in the Content layer, which are not yet implemented. Topic Map capability: This project takes the view that navigation of a large hyperlinked document space is of critical importance; Topic Maps, particularly, those constructed to the XTM 1.0 standard are applied to the Knowledge Organization and Navigation issues. Perhaps unique to this specific project is the proposal that the XTM technology shall serve, at once, as a kind of interlingua between Context and Content by serving as the indexing scheme into a Grove-like architecture, and as the primary navigation tool for the Context layer..." [From the posting: "Recently, I have combined jTME [topic map engine] into a much larger project, a version of an Open Hyperdocument System as proposed by Douglas Engelbart http://www.bootstrap.org (as interpreted by me). An ongoing 'weblog' on that project can be found at http://www.thinkalong.com/ohs/jpOHS.pdf. To discuss this project, particularly the jTME part of it, contact me at jackpark@thinkalong.com."] See: "(XML) Topic Maps."

• [March 16, 2001] XML Encoding of XPath: DTD. Work in progress from Wayne Steele and others. See also XML Encoding of XPath: Examples, and the XML-DEV thread. Also from Ingo Macherius: A JavaCC parser for XPath and XSLT patterns. 'Here is another XPath-JavaCC grammar. I think Paul's [JavaCC grammar of Xpath] is clearer (e.g., does not use LOOKAHEAD), while ours is more complete and Unicode aware. Maybe you want to mix them, so: just in case part 2.'

• [March 16, 2001] "Microsoft.NET." Special Issue of InternetWorld devoted to the Microsoft .NET Program. March 15, 2001. 18+ separate articles. "Microsoft.NET is big. Very big. Microsoft's evangelists and corporate communications directors have had difficulty explaining .Net to the financial and lay press. It's not easy to reduce the strategic vision of the largest software company in the world to a single sound bite. 'Where do you want to go today?' doesn't tell you much. We hope the analysis that follows will..."

• [March 16, 2001] ".NET Framework. Something for Everyone? When .Net arrives, will Java fans JUMP or run?" By Jacques Surveyor. In InternetWorld (March 15, 2001), pages 43-44. "In order to accommodate the shift to pervasive computing that uses a Web-distributed model and browser interface as the dominant mode of corporate development, Microsoft has embraced three Net technologies which they previously resisted or adopted only reluctantly: Java, fully object-oriented programming (OOP), and open, standardized XML... In almost contradictory fashion, Microsoft is thus far sticking to an open and standardized version of XML as the third pillar of its .Net strategy. XML is being embraced throughout the .Net Framework to make data and processes more interchangeable and interoperable. Working with IBM and other W3C participants, including the often combative Sun Microsystems, Microsoft has helped to define some key XML extensions, including SOAP, for invoking remote processes through XML, and deployed UDDI as a universal directory of Web services. In addition, the company has ceded its own W3C recommendations and adopted such XML standards as XML Schema (XSDL) for extended XML schema definitions of documents. This is in strong contrast to Microsoft's treatment of related W3C recommendations such as HTML, CSS, DOM, and other browser-based standards, where Microsoft Internet Explorer 5.5 now lags behind Netscape 6.0. In the XML arena, Microsoft has been fairly well behaved, vigorously proposing standard alternatives or updates but adhering closely to W3C final recommendations. Microsoft's adoption of XML in the .NET Framework and the .NET Enterprise Servers will help close an interoperability gap. Currently, Microsoft does not directly support either CORBA or Java 2 Enterprise Edition, including such common Web development technologies as Java Servlets and Enterprise JavaBeans. 
Although other independent software vendors support these technologies on Windows and other OS platforms, Microsoft will now be able to offer its own direct-connect solution based on XML and SOAP. Combining this with an easier-to-use ASP and the creation of Web Services on its own Windows 2000 platform, Microsoft will have a compelling .Net message."

• [March 16, 2001] "Got SOAP? XML as a distributed computing protocol." By David F. Carr. In InternetWorld (March 15, 2001), pages 72-74. "Microsoft is promoting SOAP as a way for developers to apply the same techniques for distributed computing on an intranet, adding capabilities to a Web site, or publishing a Web service on the intranet. But it's also careful to say that it's not abandoning DCOM, the distributed version of the Component Object Model. For one thing, too many existing applications rely on DCOM. But it's also true that for many applications that are closely tied, DCOM will provide a tighter linkage and higher performance than would be possible with SOAP. Like DCOM, CORBA and Java RMI use binary protocols. That means speedier transmission across the network and more instantaneous processing by the recipient. XML messages suck up more bandwidth and have to be run through a parser before processing. However, XML messaging fans argue that those disadvantages are negligible, given the rapidly increasing speed of parsers and CPUs. Besides, they point to the success of the Web, which is also based on relatively verbose protocols but achieved far more widespread adoption than any competing network computing technology. And because XML messages are inherently open-source, a developer who is struggling with the subtleties of an API doesn't have to rely on the published documentation: He can intercept some sample messages and study them using a variety of XML tools. A protocol that is simple and open also stands the best chance of being implemented on a wide variety of operating systems and programming languages. The place where there's a clear case for XML messaging is on the Internet, where traffic in protocols like DCOM, CORBA IIOP, and RMI is rare. For one thing, firewalls tend to block them. 
But SOAP piggybacks on other Internet protocols that are already ubiquitous, meaning HTTP primarily, but also SMTP, FTP, and secure Web protocols such as SSL and TLS..."
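
The point that SOAP messages are plain XML that can be built and inspected with ordinary XML tools can be sketched in Python using only the standard library. This is an illustrative sketch, not code from the article: the application namespace `urn:example:quotes`, the `GetQuote` method, and the helper names are hypothetical, and the message is only built and parsed locally rather than sent over HTTP. The envelope namespace is the SOAP 1.1 one.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
APP_NS = "urn:example:quotes"  # hypothetical application namespace


def build_envelope(method, params):
    """Wrap a remote call as Envelope/Body/<method> with one child per parameter."""
    ET.register_namespace("soap", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{APP_NS}}}{method}")
    for name, value in params.items():
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


def read_call(xml_text):
    """Recover the method name and parameters from a captured message."""
    envelope = ET.fromstring(xml_text)
    body = envelope.find(f"{{{SOAP_ENV}}}Body")
    call = body[0]                          # first element inside the Body
    method = call.tag.split("}", 1)[1]      # strip the {namespace} prefix
    return method, {child.tag: child.text for child in call}


message = build_envelope("GetQuote", {"symbol": "IBM"})
print(message)
print(read_call(message))
```

In a real deployment the string returned by `build_envelope` would travel as the body of an HTTP POST, which is why such traffic tends to pass firewalls that block binary protocols.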

• [March 16, 2001] ".NET Analysis. Microsoft Is Not Alone: Web Services Initiatives Elsewhere in the Industry." In InternetWorld (March 15, 2001). "Microsoft is doing such a good job of identifying itself as a leader in the Web services movement that you'd think it invented the idea of delivering services over the network... such network computing stalwarts as Sun Microsystems Inc. and Oracle Corp. were classified as laggards in a Gartner Group analysis of the Web services trend published in October. IBM was on the rise, as it joined with Microsoft and others to define emerging standards such as SOAP. The 'visionaries' on Gartner's trademark Magic Quadrant were Hewlett-Packard, Microsoft Corp., and Bowstreet, a startup rubbing elbows with the big platform vendors. As for leaders, there are none yet in the sense that none have yet demonstrated the ability to execute on the vision. Presumably, a few more products are going to have to come out of beta before that happens... [1] HP has paid particular attention to the problems of securing Web services and authenticating users, creating a protocol of its own called Session Level Security (SLS). The SOAP specifications themselves don't specify how messages should be secured, and the simplest solution is probably to send them over a Web security protocol such as SSL. [2] Sun has probably done more than anyone over the years to promote the idea of delivering services over the network, popularizing terms like 'Web tone' (from 'dial tone') to describe a telecommunications-like environment where getting computing resources is no more complicated than picking up the phone. 
Sun's Jini technology also has a lot in common with the Web services approach advocated by Microsoft, including the idea of services that are published on the network and registered in searchable directories [but, says Sun] 'It became clear to us two years ago that Jini was not the appropriate technology to deliver widescale services, so we jumped on the ebXML bandwagon.' [3] John McGee, director of Internet platform marketing at Oracle, expresses similar reservations about SOAP, while claiming that Oracle is way ahead of Microsoft in delivering on the general concept of Web services. [4] IBM is more enthusiastic about SOAP, having joined Microsoft, UserLand, and DevelopMentor in co-authoring the specification. It's also participating in the development of many related technologies, such as UDDI and the Web Services Description Language (WSDL). [5] Bowstreet, a three-year-old company that was among the first to promote the concept of Web services, created its tools for aggregating and reorganizing services before the current crop of emerging standards took shape. Bowstreet has also been active in the development of standards such as Directory Services Markup Language (DSML), Transaction Authority Markup Language (XAML), and UDDI, and it plans to turn its Businessweb.com directory of services into a UDDI registry..."

• [March 16, 2001] "The .Net Initiative: Dave Winer. The President of UserLand and SOAP Co-Creator Surveys the Changing Scene." By David F. Carr. In InternetWorld (March 15, 2001), pages 53-58. "UserLand Software Inc. President Dave Winer is one of the co-authors of SOAP, the remote procedure call (RPC) now being popularized by Microsoft Corp. He has also promoted XML-RPC, an earlier spinoff of his collaboration with Microsoft and DevelopMentor. Later, IBM and its Lotus division also got involved in the development of SOAP. Now a long list of corporate supporters are backing SOAP as the foundation of the World Wide Web Consortium's XML Protocol (XP) project. And, of course, it is the foundation for distributed computing in Microsoft.NET, largely replacing DCOM (the distributed object computing version of Microsoft's Component Object Model) and challenging technologies such as Java Remote Method Invocation (RMI) and the Object Management Group's Common Object Request Broker Architecture (CORBA). Winer -- equal parts industry gadfly and software-development guru -- originally made his mark creating software for the early Apple PCs. He created several commercial hits for Living Videotext, which later became part of Symantec Corp. He has concentrated most of his development efforts on software for organizing and publishing information. He is also a prolific writer whose essays on everything from code to politics and culture appear in many industry publications as well as his self-published newsletters and the DaveNet Web site..."

• [March 16, 2001] "The Internet World Interview: Jeffrey Richter Wintellect's co-founder on teaching .Net programming to the Microsoft workforce." By Jonathan Hill. In InternetWorld (March 15, 2001), pages 61-63. "When you go in and train Microsoft employees, they have varying skill sets and backgrounds. What are some of the hot-button items, the things that are the toughest to explain? Richter: The .NET Framework is an object-oriented programming platform, and some people don't have a strong object-oriented foundation. Visual Basic programmers, for example -- they'll have some difficulties picking up some of the concepts, such as inheritance, polymorphism, and data extraction, which are the three tenets of object-oriented programming. The platform is incredibly rich and large, so in the class we cover many topics, and it happens very quickly. I'm sure that a lot of people walk out and need to go back to documentation. They won't remember everything I say, because there's so much material. IW: Do you find that the object-oriented concepts are things that you need to go over a lot, or do you refer people? Richter: No. I give them a reading list. But object-oriented programming really started to get into favor in the early '80s, so it's over 20 years old now. I think even Visual Basic programmers who may not have worked with it have had some exposure to it. I've also had some VB programmers come into the class where they do the labs in C#, Microsoft's new programming language, and they had no problem doing that. So, in certain cases, yes, I need to review with them and show them polymorphism, what it means. But I think they're able to pick it up pretty quickly..."

• [March 16, 2001] "C#: Not Just Another Programming Language." By Jeff Prosise. In InternetWorld (March 15, 2001). "Microsoft intends to provide five language compilers for .NET: Visual Basic, C++, C#, JScript, and MSIL. Third parties are actively working on .Net compilers for about 25 other languages, including Smalltalk, Perl, Python, Eiffel, and yes, even COBOL. But the language that has garnered the most attention by far is C# ('C-Sharp'). C# has become a lightning rod of sorts for the anti-Microsoft camp and is frequently characterized, fairly or not, as Microsoft's answer to Java. In reality, C# is a relatively minor player in the .Net initiative. It's one of many languages that a developer can use to write .Net apps. It's arguably the best language as well, because it's the only one built from the ground up for .Net. But at the end of the day, arguing the merits of C# versus Java is a red herring. It's the .NET Framework -- the combination of the CLR and the FCL -- that is the essence of .Net. C# is merely the cherry on top. These points notwithstanding, C# could become one of the most popular programming languages ever if developers embrace .Net. Few C++ programmers that I know write .Net code in C++; most use C# instead. It's an easy transition, and C# code is more elegant and understandable than the equivalent code written in C++. Even a few Visual Basic developers I know are moving -- or are considering moving -- to C#. In all likelihood, the vast majority of .Net developers will do their work in either VB or C#. If .Net is in your future, then there's a good chance that C# is, too."

• [March 16, 2001] Extended DumbDown for Dublin Core metadata. From Stefan Kokkelink. Experimental. "I have set up an online demonstration of an (extended) dumb-down algorithm for Dublin Core metadata. There are several examples available, try the E[1-6] buttons. RDF documents using DC properties should be responsible for seeing that for every DC property (or subProperty) a meaningful literal value can be calculated by the algorithm described below. Documents respecting this algorithm can use any rdfs:subPropertyOf or any additional vocabularies (e.g. for structured values) they want: the algorithm ensures that these documents can be used for simple resource discovery however complex their internal structure may be. Extended DumbDown algorithm: This algorithm transforms an arbitrary RDF graph containing Dublin Core properties (or rdfs:subPropertyOf) in an RDF graph whose arcs are all given by the 15 Dublin Core elements pointing to an 'appropriate literal'..."
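
The general shape of such a dumb-down step can be sketched in Python. This is a simplified illustration under stated assumptions, not Kokkelink's algorithm, whose details are not reproduced in the posting excerpt: triples are plain tuples, literals are marked with a hypothetical `Literal` class, and the rule for choosing an 'appropriate literal' (take the node's `rdf:value`) is an assumption.

```python
DC = "http://purl.org/dc/elements/1.1/"
RDFS_SUBPROP = "http://www.w3.org/2000/01/rdf-schema#subPropertyOf"
RDF_VALUE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#value"


class Literal(str):
    """Hypothetical marker type: a string that is an RDF literal, not a resource."""


def dc_ancestor(prop, triples):
    """Follow rdfs:subPropertyOf arcs upward until a Dublin Core property is found."""
    seen, frontier = set(), [prop]
    while frontier:
        current = frontier.pop()
        if current.startswith(DC):
            return current
        if current in seen:
            continue
        seen.add(current)
        frontier.extend(o for s, p, o in triples
                        if s == current and p == RDFS_SUBPROP)
    return None


def literal_for(obj, triples):
    """Assumed rule: use the object if it is a literal, else its rdf:value literal."""
    if isinstance(obj, Literal):
        return obj
    for s, p, o in triples:
        if s == obj and p == RDF_VALUE and isinstance(o, Literal):
            return o
    return None


def dumb_down(triples):
    """Keep only arcs that reduce to a DC element pointing at a literal."""
    result = set()
    for s, p, o in triples:
        element = dc_ancestor(p, triples)
        literal = literal_for(o, triples)
        if element is not None and literal is not None:
            result.add((s, element, literal))
    return result
```

For example, an `illustrator` property declared as a sub-property of `dc:contributor` and pointing at a structured node would be reduced to a plain `dc:contributor` arc carrying that node's literal value.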

• [March 16, 2001] "Querying and Transforming RDF." By Stefan Kokkelink. "QAT basic observation: The data model of XML is a tree, while the data model of RDF is a directed labelled graph. From a data model point of view we can think of XML as a subset of RDF. On the other hand XML has a strong influence on the further development of RDF (for example XML Schema <-> RDF Schema) because it is used as serialization syntax. Applications should take into account this connection. We should provide query and transformation languages for RDF that are as far as possible extensions of existing (and proven) XML technologies. This approach automatically implies to be in sync with further XML development." See the working papers: (1) "Quick introduction to RDFPath" and (2) "Transforming RDF with RDFPath" ['The Resource Description Framework (RDF) enables the representation (and storage) of distributed information in the World Wide Web. Especially the use of various RDF schema leads to a complex and heterogeneous information space. In order to efficiently deploy RDF databases, we need simple tools to extract information from RDF and to perform transformations on RDF. This paper describes two approaches for transforming RDF using the RDF path language RDFPath. The first approach realizes transformations within an Application Programming Interface (API) and the second approach describes a declarative transformation language for RDF (analogously to XSLT for XML).'] From the 2001-03-16 posting: "After investigating the currently available techniques for querying and transforming RDF (for example see [1]) I would like to propose an alternative approach that is connected more closely to the XML development. Basically I would like to have the counterparts of XPath, XSLT and XQuery in the RDF world: RDFPath, RDFT and RQuery. 
This approach has (in my opinion) some advantages: (1) benefit from the lessons learned from XML; (2) don't reinvent the wheel: copy and paste as long as possible, extend if necessary; (3) be in sync with XML development. This approach is feasible because from a *data model* point of view XML (tree) is a subset of RDF (directed labelled graph)..." See "Resource Description Framework (RDF)."
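
The observation that an XML tree is a special case of a directed labelled graph can be made concrete with a short Python sketch. It is an illustration only and does not reproduce RDFPath syntax: the node identifiers (`n0`, `n1`, ...), the `#text` edge label, and the `follow` helper are assumptions.

```python
import xml.etree.ElementTree as ET


def tree_to_edges(root):
    """Flatten an element tree into (source, label, target) edges."""
    edges = []
    counter = [0]

    def visit(elem):
        node_id = f"n{counter[0]}"  # assumed identifier scheme
        counter[0] += 1
        if elem.text and elem.text.strip():
            edges.append((node_id, "#text", elem.text.strip()))
        for child in elem:
            edges.append((node_id, child.tag, visit(child)))
        return node_id

    return visit(root), edges


def follow(edges, start, labels):
    """Path-style query: follow a sequence of edge labels from a start node."""
    nodes = {start}
    for label in labels:
        nodes = {o for s, l, o in edges if s in nodes and l == label}
    return nodes


doc = ET.fromstring(
    "<book><author><name>A</name></author><author><name>B</name></author></book>"
)
root_id, edges = tree_to_edges(doc)
print(follow(edges, root_id, ["author", "name", "#text"]))
```

Because `follow` works on any edge list, the same query step applies unchanged to a graph that is not a tree, which is the sense in which path languages designed for XML might be extended to RDF.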

• [March 16, 2001] ".Net Gets XML Right." By Jim Rapoza. In eWEEK (March 12, 2001). "Perhaps creating a product in a new field where there are no established leaders to catch up to (or copy) is a good thing for Microsoft Corp. The company's BizTalk Server 2000 is an excellent platform for managing XML data processing among businesses and is one of the best first-version offerings eWeek Labs has seen from Microsoft. Although BizTalk Server 2000 includes a server element for handling data transfers, its real strength lies in its suite of tools, which provide powerful, intuitive interfaces for creating and transforming Extensible Markup Language files and for collaborative creation of business processes. The product is one of the most important in Microsoft's .Net initiative because XML is at the core of .Net. Despite its still less-than-perfect support for standards, we believe BizTalk Server 2000 sets an impressive standard for functionality and usability in XML processing. For these reasons, it is an eWeek Labs Analyst's Choice. BizTalk Server 2000, which shipped last month, comes in a $4,999-per-CPU standard edition that supports up to five applications and five external trading partners, and in a $24,999 enterprise edition with unlimited support for applications and trading partners. Like most .Net servers, the product runs only on Windows 2000 Advanced Server and requires SQL Server 7.0 or later. BizTalk Server also requires Microsoft's Visio 2000 charting application and its Internet Explorer 5.0 Web browser or later. One core tool in the product is BizTalk Editor, which makes it very simple for users to create schemas specific to their business needs using an intuitive, tree-based builder interface. Another useful tool in tests was BizTalk Mapper, which let us transform XML and other data documents such as electronic data interchange and text files, using a straightforward interface to map the documents into proper formats. 
BizTalk Mapper then generates an Extensible Stylesheet Language Transformations file to manage the document transformations. By default, BizTalk Server 2000 is still based on Microsoft's XML-Data Reduced schema. However, the product includes a command-line conversion utility to convert data to the World Wide Web Consortium's XSD (XML Schema Definition) standard. Although this works, we would like to have XSD support built into the tools to make the server easier to integrate with other XML data systems. The server also supports Simple Object Access Protocol, an XML-based protocol for issuing remote calls... Companies that expect XML to become the lingua franca of business data interactions will find BizTalk Server 2000 to be an excellent translator. The product provides some of the most powerful and intuitive tools available for creating, managing and distributing XML data, making it an Analyst's Choice."

• [March 16, 2001] "XML and WAP." By John Evdemon (Chief Architect, XML Solutions). January, 2001. A presentation given to the Washington Area SGML/XML Users Group. 54 slides, PDF format. "Basic Definitions: [Wireless Application Protocol (WAP), eXtensible Markup Language (XML), Wireless Markup Language (WML)]; WAP's Differentiators: [Bluetooth, DoCoMo i-mode; Combining XML with a wireless protocol standard]; The Trouble with WAP; The Wireless Future. What is WAP? WAP is a technology based on Internet technologies for use by digital phones WAP is backed by major vendors: Nokia, Ericsson, Motorola, Microsoft, IBM. WAP Forum is open for all: Over three hundred companies have joined the WAP Forum. WAP supports several wireless systems: GSM, IS-136, CDMA, PDC etc. WAP has a layered architecture: The same application can be used via several systems. WAP 2.0: Next generation of WAP will include XHTML (with backwards compatibility to WML); TCP support; Color graphics; Animation; Large file downloading; Location-smart services; Streaming media; Data synchronization with desktop PIM. Specs are being built in anticipation of Network evolution and Handheld evolution..." See: "WAP Wireless Markup Language Specification (WML)."

• [March 16, 2001] "WSDL Specification Sent to W3C." By Christopher McConnell. In ent - The Independent Newspaper for Windows NT Enterprise Computing [Online]. (March 15, 2001). "A key specification for Microsoft Corp's .NET initiative has been submitted for review to the World Wide Web Consortium (W3C). The Web Services Description Language (WSDL) provides a grammar for XML, enabling computer-to-computer transactions via the web. A number of Microsoft partners have joined in co-submitting the spec to the W3C. Companies range from database vendor Oracle Corp. to ERP giant SAP AG to purveyors of development tools such as BEA Systems Corp. and Ariba Corp. to OEMs Compaq Computer Corp. and Hewlett Packard Co. WSDL complements the Simple Object Access Protocol (SOAP) by describing the nature of a transaction through XML. With a WSDL implementation, programs can understand what types of data are transferred and how to use the data. Microsoft has aggressively pushed key .NET specifications to the W3C. It submitted SOAP for review in May, 2000, and is preparing UDDI for review..." See discussion.

• [March 16, 2001] "Can XML Succeed Where EDI Has Failed?" By Lauren Gibbons Paul. From IDGNet. March 01, 2001. "Envera provides an electronic link between chemical companies and their customers. Sounds simple, but why should the XML-based platform succeed where EDI failed? ... Clearly, the way chemical companies conduct transactions is in need of an overhaul. The question remains whether a marketplace such as Envera -- using XML as the linchpin -- can provide the answer. Last March, Mooney, Mike Giesler, then Ethyl's CIO, and two other cofounders began knocking on colleagues' doors, talking about creating an electronic hub for the chemical industry they called Envera (roughly translated from Latin, envera means "in truth"). Envera would differ from other electronic trading exchanges that were then making headlines, such as the chemical industry's CheMatch.com and the auto industry's Covisint, in that it would not attempt to match sellers with buyers. Rather, it would serve only as an electronic platform on which already-established business partners could conduct their transactions. Envera would not take a piece of each transaction that it hosted but instead would charge members an annual subscription fee of between $5,000 and $300,000, depending on company size. Mooney and Giesler got a warm reception from their peers, snagging funding from 11 companies. Things moved quickly after that. Giesler and Mooney left Ethyl in July and by the end of the summer Envera had hammered out XML document definitions for eight basic business processes in conjunction with an industry standards group. By the fall, the initial phase of Envera was up and running, with partners such as Lubrizol and Occidental Chemical beginning to conduct business online. To date, only a tiny number of transactions have taken place on Envera. Giesler expects business to jump once Envera's 40 trading partners come online this spring... 
Just because Envera has made it out of the starting gate is hardly a guarantee of its eventual success. Like all electronic trading exchanges and hubs, Envera faces enormous obstacles. For starters, it has new competition: a similar online exchange for the chemical industry dubbed Elemica. Elemica, a Philadelphia-based e-marketplace that went online in a test phase this past January, is also based on an XML platform, and it is backed by 22 of the largest chemical companies, including BASF, Dow Chemical and DuPont. With Elemica in the picture, Envera may find it harder to sign up more companies as subscribers. Whether Envera can grow beyond its initial image as an extension of Ethyl presents another challenge. The e-hub will succeed only if industry companies see it as a neutral platform that exists for the benefit of all companies. The fact that the nine Envera owners are also its users could become a problem down the road. Despite the uncertainty surrounding electronic exchanges, Envera has earned modest praise from some industry watchers... But just providing a standard language is not enough -- syntax is needed too. For XML to be truly useful requires the definition of standard documents, such as a purchase order, to be used within the industry. And once those business documents have been defined, they must be widely adopted. In the chemical industry -- as everywhere -- multiple groups with multiple agendas are pursuing multiple standards. Envera has made quick progress on its eight initial XML documents, but a potential battle looms with competitor Elemica. XML has an obvious advantage over EDI in that it leverages existing infrastructure (such as the Internet) and is therefore not expensive to adopt. And it does have some technical advantages... This time around, there are hopeful signs. Envera has agreed to share its eight initial XML document definitions for use with CIDX for use by any company in the industry. 
However, the ability of competitors to coalesce around standards was immediately tested when Elemica announced last summer that it too was working on XML document definitions for a purchase order and an order acknowledgment, among others. Representatives from Envera and Elemica gathered around the bargaining table and hammered out common definitions for the good of all. For their part, Mooney and Giesler say they'll do what's necessary to work out a common standard or arrange for Envera to map to different standards, as needed..."

• [March 16, 2001] "IBM Package Boosts Standards in WebSphere." By Ed Scannell. In InfoWorld (March 15, 2001). "Touting its ability to provide optimized delivery for Web services, IBM on Wednesday unveiled WebSphere Technology for Developers, which supports several Web standards such as the Universal Description Discovery and Integration (UDDI) specification. By supporting UDDI and the Simple Object Access Protocol (SOAP), IBM's package helps corporate users create e-business applications and services that are better able to interact with other Web-based applications. IBM officials believe Web services are spearheading a new era in e-business where the Internet will be shaped and driven by more robust applications. With the new product, IBM officials claim the company is the first to implement and fully integrate HTTPS, which combines SSL (Secure Sockets Layer) with HTTP as well as HTTP Authentication and SOAP security. The new set of capabilities includes support for digital signatures and the ability to enable end-to-end authentication, integrity, and nonrepudiation for SOAP messages. The new environment also includes Sun Microsystems' Java 2 Enterprise Edition (J2EE), which will give developers the ability to create the foundations of business-oriented applications that can operate across multiple platforms and environments. The program also offers the Web Services Description Language (WSDL), which is able to describe programs accessible over the Internet and the message formats and protocols that are used to communicate with them. IBM believes WSDL is particularly important because it allows Web services to describe their capabilities in a standard way, making it easier for them to interoperate with other Web services and development tools... Separately, IBM announced a new version of WebSphere for its z/OS and OS/390 mainframe operating systems. It also includes support for J2EE. 
The new version includes WebSphere Application Server for z/OS and OS/390 and CICS Transaction Server 2.1." See the announcement.

• [March 16, 2001] "IBM Advances Web Services Strategy." By Mary Jo Foley. In CNET News.com (March 14, 2001). "IBM announced Wednesday the next phase of its Web services game plan. Big Blue is shipping a new version of its WebSphere application server -- fortified with support for the leading Web services protocols and standards -- that it plans to make available to developers for free. As the battle for developer mind share in the Web services market is heating up, each of the major software companies is attempting to play to its strength. In IBM's case, that means its middleware Internet infrastructure software and related development tools... Giga Information Group analyst Mike Gilpin agreed with Hebner's assessment. 'IBM is really the first to (make generally available) tools like these that are needed for Web services to take off,' Gilpin said. Gilpin added that widely available tools, such as those in IBM's WebSphere Technology for Developers release, will likely take the pain and expense out of hand-coded Web services. Most existing payment, insurance and travel Web services have been built by hand from scratch, Gilpin said. WebSphere Technology for Developers includes built-in support for XML (Extensible Markup Language); UDDI (Universal Description and Discovery Integration) standard; SOAP (Simple Object Access Protocol); WSDL (Web Services Description Language); and J2EE (Java 2 Enterprise Edition) technology. XML is the new lingua franca of the Web, designed to make sharing data easier. UDDI acts like a Yellow Pages for Web services by exposing them and helping developers to locate them. SOAP is an emerging standard for distributed computing interoperability. WSDL is an XML format aimed at improving Web services messaging-interoperability technology. And J2EE is a standard technology for developing and launching enterprise applications. 
To obtain a free copy of WebSphere Technology for Developers, a developer must be 'referred' to IBM as a potential WebSphere customer by either an IBM salesperson or an IBM partner. Developers can contact IBM for a referral. IBM announced the WebSphere release in conjunction with its weeklong WebSphere 2001 trade show in Las Vegas. IBM also announced on Wednesday availability of a version of its WebSphere Internet infrastructure software that has been written to run on its eServer z900 and OS/390 mainframes..." See the announcement.

• [March 16, 2001] "Commentary: IBM takes lead in services. [Gartner Viewpoint.]" By Massimo Pezzini, Gartner Analyst. In CNET News.com (March 14, 2001). "IBM announced WebSphere Technology for Developers because it wants to keep building the credibility of its e-business middleware strategy and to assert leadership in the emerging Web service arena. No less important is catching up with Java 2 Enterprise Edition competitors. Although several application server vendors have committed to supporting Web services in their J2EE platforms, IBM's announcement on Wednesday makes it the first to deliver a real -- albeit functionally limited -- product. The announcement further validates the notion of Web services. It follows the announcement of IBM's WebSphere strategy in November and positions IBM as a serious candidate for leadership in both J2EE and Web service technology. WebSphere Technology for Developers is not production-ready, but rather a preview of WebSphere v.4, the next major update of the WebSphere Application Server family. It runs only on Windows NT and DB2 and supports Web service protocols such as SOAP, UDDI, WSDL and XML, along with related development tools. The WebSphere 4 product set also will include the zSeries run-time version--that is, WebSphere Application Server for z/OS and OS/390, also announced Wednesday--and still unannounced Unix/Windows 2000 versions likely to be available in the second quarter. The WebSphere release will allow Java developers to familiarize themselves with, and start developing applications for, WebSphere 4 and to experiment with Web service technology. IBM has trailed other vendors in support for J2EE specifications. For example, WebSphere Advanced Edition 3.5 does not support Enterprise JavaBeans 1.1. WebSphere Technology for Developers fills this gap by being the first WebSphere version officially certified by Sun Microsystems as J2EE-compliant. When available, WebSphere 4 will be, too. 
Thanks to WebSphere Technology for Developers, IBM changes from a follower into a leader. In fact, vendors such as BEA Systems, Hewlett-Packard/Bluestone Software and iPlanet will have to catch up by quickly delivering SOAP/UDDI capabilities in their application servers--or be marked as technology laggards..." See the announcement.

• [March 16, 2001] "About Multimodal ZVON." By Jiri Jirat. From the ZVON project. March 2001. Abstract: "Multimodal ZVON is a demonstration of a site powered by XML/XSLT technology. The same XML sources have been used to create several different output formats (including graphics). Moreover, a sophisticated search and a site map have been very easily implemented, since all sources are XML." Description: "The 'Multimodal view of ZVON' is a demonstration project which shows how XML with XSLT can be used to create and maintain a website. Multiauthoring is very easy, thanks to the transparency of the XML format. Now the website is completely built using XML and in the following few pages we will briefly describe the framework. The whole process consists of the following main steps: (1) Defining data layout - storage of different data types in XML files, directory and data structure definition. (2) Creating presentation layout: pictures (SVG), various text formats (HTML, XML, PDF) (3) Using secondary information (metadata) - creating specialized search and a site map." [cache]

• [March 16, 2001] "Comparing Beeyond with XML." By VU/Beeyond Staff. "Beeyond does not support XML, because there is currently no universal standard which defines what XML tags represent. Individual companies have published standards related to their proprietary technologies, but these are merely product specific APIs. In order for XML to really fulfill its promise as a generic solution for data exchange between companies and programming languages, tag definitions must be agreed upon industry wide. When that occurs, and if XML is widely accepted by developers, then we will support it in Beeyond. In the meantime, however, Beeyond provides its own solution for inter-company data exchange that gets around some of the problems that will plague XML even after standards are developed. In particular, it simplifies the process of prior agreement between companies on message definitions and allows messages and their associated applications to be updated without system language programming. ['Beeyond (from Virtual Unlimited (VU), an Internet software development company based in Veldhoven, The Netherlands.) is a system for building and running secure documents and database-backed network applications with Java user interfaces. It is powerful and inexpensive enough to be used for all kinds of applications. In addition to secure applications and documents with Java user interfaces, Beeyond contains a unique class of messages called BeeXchange. These messages allow companies to exchange data easily and securely and form the basis for Beeyond's B2B application functionality. Beeyond's strong authentication and encryption also enable BeeHive servers to create secure, virtual private networks on the Internet. Third, Beeyond uses a new application development model that allows applications to be built quickly and changed easily. A few of the features that set Beeyond apart from other products: Simplicity, Message-oriented, Scriptable applications, Database brain.']

• [March 15, 2001] "The [NEL] Newline Character." By Susan Malaika (IBM). W3C Note 14-March-2001. "The omission of [NEL], the newline character defined in Unicode 3.0, from the End-of-Line Handling section in the XML 1.0 specification causes significant difficulty when processing XML documents and DTDs in IBM mainframe systems. Problem areas include: (1) Processing XML documents or DTDs generated on OS/390 systems, with XML 1.0 compliant parsers. (2) Processing XML documents or DTDs, using native OS/390 system tools. (3) Processing XML documents or DTDs retrieved from OS/390 database or file systems, in non-OS/390 environments. XML documents that contain [NEL] characters are declared invalid or not well-formed by XML 1.0 compliant parsers. We urge the W3C to include [NEL] as a legal line ending in XML, and hence as a legal white space character, in accordance with Unicode 3.0." See also (1) submission request and (2) W3C staff comment. From the W3C staff comment by C. M. Sperberg-McQueen: "XML 1.0 specifies special handling and normalization for line-boundary character sequences, in an effort to control the complexity which results from the variety of ways in which different operating systems and software products mark line boundaries in data streams. This submission describes an unfortunate but apparently reparable consequence of a design decision taken by the then SGML WG of the W3C in the fall of 1996 in specifying this part of XML 1.0, and outlines a simple and relatively non-intrusive means of making the necessary repair in a way compatible with related work. This submission will be referred to the XML Core WG for action. 
In light of the background, which suggests that the design of this part of XML relies on what has turned out to be a false factual assumption, the XML Core WG may choose to include the suggested change (as well as a change specifying that PS and LS should be treated as space characters, and clarifying whether they should also be treated as line-separators) in XML. The XML Core Working Group has the responsibility for deciding whether to modify the line-end handling rules of XML or to leave them unmodified in any future version of XML prepared by that Working Group." [latest version URL]
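
The line-end issue described in the Note can be illustrated with a minimal Python sketch. `normalize_xml10` follows the XML 1.0 rule (CR LF and a lone CR become LF); `normalize_with_nel` is a hypothetical extension that also maps [NEL] (U+0085) to LF. Both function names and the exact extended rule are assumptions made for illustration, not text from the submission or from any XML specification.

```python
import re

NEL = "\u0085"  # Unicode NEXT LINE, the [NEL] character discussed in the Note


def normalize_xml10(text: str) -> str:
    """XML 1.0 line-end handling: CR LF and lone CR are turned into LF."""
    return re.sub("\r\n|\r", "\n", text)


def normalize_with_nel(text: str) -> str:
    """Hypothetical extended rule (assumption): NEL is also treated as a line end."""
    return re.sub("\r\n|\r|\u0085", "\n", text)


# A fragment whose lines are separated by NEL, as mainframe tools might emit it.
sample = "<a>" + NEL + "<b/>" + NEL + "</a>"

unchanged = normalize_xml10(sample)      # NEL survives the XML 1.0 rule
repaired = normalize_with_nel(sample)    # NEL becomes an ordinary LF
```

Under the XML 1.0 rule the NEL characters pass through untouched, which is why downstream tools do not see them as line breaks; the extended rule folds them into LF like the other line-end sequences.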

• [March 15, 2001] "TAXI to the Future." By Tim Bray. From XML.com. March 14, 2001. ['Tim Bray presents TAXI, a Web application architecture that utilises the power of XML to deliver a responsive user environment.'] "There's not much new about TAXI. I'll claim that if you polled the original group of a dozen or so people, led by Jon Bosak, that defined XML 1.0, you'd find out that something like TAXI was what most of us had in mind. As we all know, XML has mostly been used in backend and middleware data interchange, not in front of the user the way its designers intended and the way TAXI does it. It's long past time for the TAXI model to catch on... TAXI: Transform, Aggregate, send XML, Interact. My claim is that TAXI delivers many of the benefits, and hardly any of the problems, of the previous generations of application architecture discussed above. Let's walk through it... Transform: A lot of business logic boils down to one kind of data transformation or another: applying transactions, generating reports, updating master files. The right place to do most of this work is on the server, where you can assume a rich, high-powered computing environment. Aggregate: The next architectural principle is the aggregation of enough data from around the server to support some interaction with the user. An example would be a list of airplane flights that could be sorted and filtered. Send XML: Once you've gathered an appropriate amount of data together on the server side, you encode it in XML and send it off to the client over HTTP. There's no need to get fancy; we generate XML using printf statements in C code. If you're fortunate, there'll be a well-established XML vocabulary available that someone else invented for use in your application; but probably not, and you'll have to invent your own. Interact: Once the XML has arrived in the client, probably a Web browser, you'll need to parse it. 
Your browser probably has this built-in; it may be more convenient to compile in Expat or Xerces or one of the other excellent processors out there... Why TAXI is a Good Idea: First, it comes at the user through the browser, something that they've proved they want. Second, the application can run faster and scale bigger than traditional web applications in which the server does all the work. Third, the system is defined from the interfaces out, so nobody can lock it up, and you can switch your black-box clients and servers around with little difficulty or breakage."
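
The "Aggregate, send XML, Interact" steps quoted above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the flight records, the `flights`/`flight` vocabulary, and the `send_xml` and `interact` function names are all invented here, and the sketch uses the standard library's ElementTree rather than the printf-in-C approach the article mentions.

```python
import xml.etree.ElementTree as ET

# Hypothetical flight data "aggregated" on the server side (made-up values).
FLIGHTS = [
    {"number": "XA100", "depart": "09:30", "price": "420"},
    {"number": "XA200", "depart": "07:15", "price": "515"},
    {"number": "XA300", "depart": "13:45", "price": "380"},
]


def send_xml(flights):
    """Server side: encode the aggregated records as one XML document."""
    root = ET.Element("flights")
    for flight in flights:
        ET.SubElement(root, "flight", flight)
    return ET.tostring(root, encoding="unicode")


def interact(xml_text, sort_key, max_price=None):
    """Client side: parse once, then sort and filter with no server round trip."""
    rows = [dict(el.attrib) for el in ET.fromstring(xml_text).iter("flight")]
    if max_price is not None:
        rows = [r for r in rows if int(r["price"]) <= max_price]
    # Prices are compared numerically; other fields are compared as text.
    key = (lambda r: int(r["price"])) if sort_key == "price" else (lambda r: r[sort_key])
    return sorted(rows, key=key)


payload = send_xml(FLIGHTS)                        # what would travel over HTTP
by_time = interact(payload, "depart")              # re-sort locally
cheap = interact(payload, "price", max_price=450)  # filter locally
```

The point of the design is that the second and third calls work on data already held by the client, so sorting and filtering need no further request to the server.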

• [March 15, 2001] "EXSLT 1.0 Drafts." From Jeni Tennison. Posting to XSL-List. (1) Common - EXTENSIONS TO XSLT 1.0 (EXSLT 1.0) - COMMON. "This document describes the common set of EXSLT 1.0. EXSLT 1.0 is a set of extension elements and functions that XSLT authors may find helpful when creating stylesheets. The common set of EXSLT 1.0 are those extension elements and functions that provide a base level of common functionality that the rest of EXSLT can build on. XSLT processors are free to support any number of the extension elements and functions described in this document. However, an XSLT processor must not claim to support EXSLT 1.0 - Common unless all the extensions described within this document are implemented by the processor. An implementation of an extension element or function in an EXSLT namespace must conform to the behaviour described in this document." (2) Functions - EXTENSIONS TO XSLT 1.0 (EXSLT 1.0) - FUNCTIONS. "This document describes EXSLT 1.0 - Functions. EXSLT 1.0 is a set of extension elements and functions that XSLT authors may find helpful when creating stylesheets. EXSLT 1.0 - Functions are those extension elements and functions that allow users to define their own functions for use in expressions and patterns in XSLT." (3) Sets - EXTENSIONS TO XSLT 1.0 (EXSLT 1.0) - SETS. "This document describes EXSLT 1.0 - Sets. EXSLT 1.0 is a set of extension elements and functions that XSLT authors may find helpful when creating stylesheets. EXSLT 1.0 - Sets covers those extension elements and functions that provide facilities to do with set manipulation." (4) Math - EXTENSIONS TO XSLT 1.0 (EXSLT 1.0) - MATH. "This document describes EXSLT 1.0 - Math. EXSLT 1.0 is a set of extension elements and functions that XSLT authors may find helpful when creating stylesheets. EXSLT 1.0 - Math covers those extension elements and functions that provide facilities to do with maths." For related resources, see "Extensible Stylesheet Language (XSL/XSLT)."

• [March 15, 2001] "XML-Deviant: Extensions to XSLT." By Leigh Dodds and Jeni Tennison. From XML.com. March 14, 2001. ['Members of the XSL mailing list have started a community-based project to standardize extensions for XSLT.'] "The community has discussed alternatives to the contentious <xsl:script> element... the major concerns over xsl:script were that it would encourage scripting code, authored in Java, Javascript, VBScript and other languages, to be embedded inside XSLT stylesheets hampering usability and (potentially) interoperability. The discussion led to suggestions that XSLT extension functions might usefully be implemented in XSLT itself, rather than, or perhaps in parallel to, implementations in other languages. A number of ways of achieving this functionality were suggested, resulting in Jeni Tennison gathering together the alternatives to further focus the debate and achieve progress: 'There seems to be a reasonable amount of support for user-defined functions written in XSLT, whether to sweeten the syntax of xsl:call-template or to allow XPaths previously only dreamed about. If we're going to move ahead with this, we need to agree on a syntax for (1) declaring the functions and (2) calling the functions. In this email, I'm going to lay out the major designs that have been suggested so far so that we can discuss them and hopefully come up with some kind of resolution that's acceptable to everyone...' What are EXSLT's advantages given that XSLT already provides a user extension mechanism? First, it ensures that the extension functions are well defined in a community-drafted specification, avoiding the need for XSLT developers to rely on proprietary definitions of similar functions provided by their stylesheet engine. Also, while the implementation language for a function may vary, developers can ensure that functions will remain consistent across processors. 
Second, by providing the means to define their own functions in XSLT, stylesheet authors can create truly portable stylesheets that rely only on a conformant XSLT 1.0 processor that implements the elements defined in EXSLT - Functions... A deciding factor in the success of EXSLT will be whether it's supported by XSLT engines. The prospects are promising, particularly as 4XSLT has already adopted the proposed functions. If the developers of Xalan and other stylesheet engines adopt it, then the future looks decidedly rosy for XSLT developers." For related resources, see "Extensible Stylesheet Language (XSL/XSLT)."

• [March 15, 2001] "Transforming XML: Entities and XSLT." By Bob DuCharme. From XML.com. March 14, 2001. ['Using XML entities can be tricky -- this article covers their usage with XSLT in both input and output documents.'] "In XML, entities are named units of storage. Their names are assigned and associated with storage units in a DTD's entity declarations. These units may be internal entities, whose contents are specified as a string in the entity declaration itself, or they may be external entities, whose contents are outside of the entity declaration. Typically, this means that the external entity is a file outside of the DTD file which contains the entity declaration, but we don't say 'file' in the general case because XML and XSLT work on operating systems that don't use the concept of files. A DTD might declare an internal entity to act like a constant in a programming language. For example, if a document has many copyright notices that refer to the current year, declaring an entity cpdate to store the string '2001' and then putting the entity reference '&cpdate;' throughout the document means that updating the year value to '2002' for the whole document will only mean changing the declaration. Internal entities are especially popular to represent characters not available on computer keyboards... Because an XSLT stylesheet is an XML document, you can store and reference pieces of it using the same technique, but you'll find that the xsl:include and xsl:import instructions give you more control over how your pieces fit together... All these categories of entities are known as parsed entities because an XML parser reads them in, replaces each entity reference with the entity's contents, and parses them as part of the document. XML documents use unparsed entities, which aren't used with entity references but as the value of specially declared attributes, to incorporate non-XML entities. 
When you apply an XSLT stylesheet to a document, if entities are declared and referenced in that document, your XSLT processor won't even know about them..." For related resources, see "Extensible Stylesheet Language (XSL/XSLT)."
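
The `cpdate` example from the quoted passage can be tried with a short Python sketch. The `notice` element, its wording, and the variable names are assumptions made for illustration; the entity name and the values '2001' and '2002' come from the article. The sketch uses the standard library's ElementTree parser, which expands internal entities declared in the document's internal DTD subset.

```python
import xml.etree.ElementTree as ET

# The internal entity acts like a constant: declared once, referenced twice.
DOC = """<?xml version="1.0"?>
<!DOCTYPE notice [
  <!ENTITY cpdate "2001">
]>
<notice>Copyright &cpdate; Example Corp. Revised &cpdate;.</notice>"""

# The parser replaces each reference before the application sees the text.
text_2001 = ET.fromstring(DOC).text

# Updating the year for the whole document means changing only the declaration.
text_2002 = ET.fromstring(DOC.replace('"2001"', '"2002"')).text
```

Neither result contains the string `&cpdate;`: the application only ever receives the expanded text, which matches the article's remark that an XSLT processor will not know the entity references were there.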

• [March 15, 2001] "Microsoft schedules online appointments for .Net." By Mary Jo Foley. In CNET News.com (March 14, 2001). ['Microsoft is preparing to add online appointment scheduling to its suite of services for its upcoming software-as-a-service strategy, according to sources.'] "Microsoft's addition of WebAppoint, which allows for online scheduling for such items as car repair or dentist appointments, is a crucial element in Microsoft's ambitious software-as-a-service strategy, known as .Net. WebAppoint links consumers and companies and offers extra features, such as confirmation of appointments via phone or fax. The start-up's service was launched in the fall 1999. Microsoft is expected to advance the WebAppoint technology and initially launch it later this year as one of its services on Microsoft's bCentral small-business Web site, according to sources. Company representatives confirmed on Wednesday the purchase of WebAppoint.com. Microsoft already has two pilot projects in place where it is testing WebAppoint... A few other services have also fallen in step with .Net plans, such as Microsoft's Passport Internet authentication service and its ClearLead lead-management product. These services have the potential of being worked in to more complex Web applications and services... Further complicating the picture is the imminent arrival of Hailstorm, which is a set of .Net building-block technologies that Microsoft is expected to position as a key part of its overall .Net initiative. Hailstorm, which Microsoft is expected to unveil officially March 19, will incorporate next-generation versions of a number of Microsoft's existing services -- such as its Hotmail e-mail, MSN Messenger instant messaging, and Passport products -- and make them available to developers building XML-based Web services." 
[Netdocs: "...According to sources, Netdocs is a single, integrated application that will include a full suite of functions, including email, personal information management, document-authoring tools, digital-media management, and instant messaging. Microsoft will make Netdocs available only as a hosted service over the Internet, not as a shrink-wrapped application or software that's preloaded on the PC. Netdocs will feature a new user interface that looks nothing like the company's Internet Explorer Web browser or Windows Explorer. Instead, Netdocs is expected to offer a workspace based on Extensible Markup Language (XML), where all applications are available simultaneously. This interface is based on .Net technology that Microsoft, in the past, has referred to as the 'Universal Canvas'."]

• [March 15, 2001] "Build Stateless Components With XML. Implement stateless components to track and store data from multiple Web clients concurrently." By Mark J. Collins and David John Killian. From DevX.com. March 2001. "All but the most trivial Web applications require you to maintain state data -- client-specific information that must be preserved between successive requests. The question is: Where do you keep the state? You've probably encountered a few standard techniques to preserve state in Web-based applications. Some applications use cookies that reside on the client, some pass state information as parameters in the URL. Active Server Pages (ASPs) frequently use the Session object to store state. But none of these techniques help with server-based components, especially when the amount and complexity of the state data is high... Server-based components can be 'stateful' or 'stateless.' Stateful components must be dedicated to a single client. Stateless components, on the other hand, can support many concurrent clients. If you're developing components to run on a server and support a potentially large number of clients, consider making your components stateless. In this article, we'll show you how to create complex stateless components that track client information efficiently, improving performance. Our solution uses some of the more advanced features of COM and ATL and assumes at least some familiarity with XML and the Standard Template Library (STL). First you'll turn a stateful class into a stateless one by moving its member attributes outside the class. Then you'll implement nested interfaces using subordinate implementation classes because real-world solutions require more complex state data. Finally, you'll use an XML document to store this complex state data so it can be easily retrieved on subsequent client requests. We've pulled these ideas together into a sample Order component you might find in an e-commerce application. 
The stateless component collects summary information about the order, the customer, and a list of requested items..." ['Stateful components must be dedicated to a single client. If a particular server operation takes half a second to complete and each client calls it every five seconds, for example, you'd need 10 instances to support 10 active clients -- and each instance will be idle 90 percent of the time. Every instance uses valuable system resources such as memory or database connections, even when idle, bogging down the server without providing the throughput potential. Stateless components, on the other hand, can support many concurrent clients. A single instance of a stateless server component in this scenario could support 10 active clients with no client-perceptible performance degradation.']
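
The idea of moving state out of the component and into an XML document can be sketched as follows. The article's own solution uses COM and ATL in C++; this Python version is a swapped-in illustration, and the `OrderComponent` class, its method names, and the `order`/`items`/`item` vocabulary are all assumptions rather than the authors' design.

```python
import xml.etree.ElementTree as ET


class OrderComponent:
    """Holds no per-client members; each call receives and returns the state."""

    def new_order(self, customer: str) -> str:
        order = ET.Element("order", {"customer": customer})
        ET.SubElement(order, "items")
        return ET.tostring(order, encoding="unicode")

    def add_item(self, state: str, sku: str, qty: int) -> str:
        order = ET.fromstring(state)
        ET.SubElement(order.find("items"), "item", {"sku": sku, "qty": str(qty)})
        return ET.tostring(order, encoding="unicode")

    def item_count(self, state: str) -> int:
        return sum(int(item.get("qty")) for item in ET.fromstring(state).iter("item"))


# One instance serves interleaved requests from two clients.
component = OrderComponent()
alice = component.new_order("alice")
bob = component.new_order("bob")
alice = component.add_item(alice, "A-1", 2)
bob = component.add_item(bob, "B-7", 1)
alice = component.add_item(alice, "A-2", 3)

# The article's arithmetic: a 0.5 s operation called every 5 s keeps a
# dedicated (stateful) instance busy 10 percent of the time, idle 90 percent.
busy_fraction = 0.5 / 5.0
```

Because the XML string carries everything the component needs, where that string is kept between requests (client, database, cache) becomes a separate decision from how the component is written.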

• [March 14, 2001] "Bill Inmon Sees Advantages and Limitations of 'Fed XML'. [XML Report.]" By Rich Seeley. In Application Development Trends (March 2001). "XML is like Federal Express. According to Bill Inmon, the consultant and author known as the Father of Data Warehousing, it provides an envelope and delivers data from point A to point B. Continuing the overnight delivery analogy during a seminar at the Data Warehousing Institute World Conference Winter 2001 in Palm Springs, Inmon said that once you receive your FedEx package, the important issue is making sense of the contents inside. 'When it comes to the semantics of understanding metadata, XML doesn't do a thing,' he said. The semantics problem involves getting departments within the enterprise and business partners and vendors outside to all agree on definitions of what terms such as 'revenue' mean. 'XML solves the problem of getting data from one place to the next,' he said. 'But it doesn't begin to solve the problem of business vs. technical data, differences of opinion as to what "revenue" means. XML doesn't solve those problems. Isn't designed to solve those problems'..." [Note on XML Report: it "provides the latest news, information, and expert analysis on the state of XML tools and technologies. Produced by the editorial team behind Application Development Trends, Java Report, and The Journal of Object Oriented Programming, XML Report will provide developers and development managers with strategic information on emerging standards -- and potential pitfalls -- in the fast-growing XML marketplace."]

• [March 14, 2001] "The Real Impact of XML." By John K. Waters. In Application Development Trends Volume 8, Number 3 (March 2001), page 9. Waters summarizes a Zona report on XML, "XML: The Dash for Dot.com Interoperability." See below. "... at least one prominent observer believes 'that history may regard XML as a more important development than HTML and even the Web'."

• [March 14, 2001] "XML: The Dash for Dot.com Interoperability." Zona Research Reports Online, Issue 42 (January 2001). "When the history of Web-based ecommerce is written, XML may be regarded as a more important development than HTML in accelerating business on the Web. The reason is that XML promises to do for Web application interaction what HTML did for the human reading of Web-based documents. XML will be able to bridge the islands of information locked away in incompatible computing systems to provide a freer interchange of data between these formerly isolated systems. Through the efforts of many industry consortia, XML has found a place in industries as diverse as medicine, insurance, electronic component trading hubs, petrochemicals, forestry and finance, to name a few. The promise of XML is multifaceted and huge, but has it achieved serious acceptance in the corporate world? To answer this question, Zona Research announces the release of its latest Zona Market Report, XML: The Dash for Dot.com Interoperability. This report is packed with primary research from interviews with enterprise decision makers who are currently deploying XML based solutions or plan to do so during 2001. The report explores the state of XML deployments from the users' perspective and answers these questions, amongst others: Past Approaches Have Fallen Short: Electronic Data Interchange (EDI); Past Approaches Have Fallen Short: Extended Intranets; XML Fundamentally Changes the Speed of Business; XML Fundamentally Changes the Cost of Business; XML Fundamentally Lessens the Pain of Change; XML is Politics; XML Standards: Half Baked, Fully Baked, and Incrementally Baked; XML as a Business Process Disruptor; SOAP and UDDI: A Model for Finding and Acquiring Web Services; Primary Research: What do Users Think?; Summary: What Does It All Mean?..."

• [March 13, 2001] Codes for the Representation of Languages for Information Interchange. ANSI/NISO Z39.53-200X; ISSN:1041-5653, Revision of ANSI/NISO Z39.53-1994. A Draft American National Standard Developed by the National Information Standards Organization. Status: For Ballot February 9, 2001 - March 23, 2001. From the specification: "A standardized 3-character code to indicate language in the exchange of information is defined. Codes are given for languages, contemporary and historical. The purpose of this standard is to provide libraries, information services, and publishers a standardized code to indicate language in the exchange of information. This standard for language codes is not a prescriptive device for the definition of language and dialects but rather a list reflecting the need to distinguish recorded information by language." From the Foreword: "This standard was originally prepared by Standards Committee C, Language Codes, which was organized in 1979. Charged with 'providing a standard code for indicating languages for information interchange purposes,' the committee produced a standard based on the list of MARC language codes developed by the Library of Congress in cooperation with the National Agricultural Library and the National Library of Medicine. This code list is now published as the MARC Code List for Languages. Practical application of the MARC language codes has shown that in order to serve as an appropriate retrieval device for information, a standard list of language codes must reflect the linguistic content of the universal collection to which it is applied, with language codes assigned as needed to distinguish information in a given language or group of languages. The MARC language codes constitute such a list. 
The committee's decision to base the standard on the existing MARC list took into account these contributing factors: (a) several years' successful application of the MARC language codes resulting in many millions of bibliographic records containing the accepted MARC codes, (b) the mnemonic relationship of the MARC codes to the English language names of the languages with English being the operational language of most American libraries, information services, and publishers, and (c) the flexibility inherent in a three-character code. The MARC list may be consulted for references from alternative forms of language names, as well as for the assignments to collective codes of languages for which individual codes have not been established. This revised edition reflects a thorough review of the document and includes changes which are a result of requests and demonstrated need from users and implementors. In addition, it includes numerous changes necessary for compatibility with bibliographic language codes in ISO 639-2 (Codes for the representation of names of languages: alpha-3 code). The MARC code list is kept consistent with both ANSI/NISO Z39.53 and ISO 639-2/B." See the Z39.53-200X description and comment form. On the broader issues of language identification using ISO 639, RFC 1766, etc., see also "Language Identification and IT: Addressing Problems of Linguistic Diversity on a Global Scale," by Peter Constable and Gary Simons. Reference: "Names of Languages - ISO 639." [cache]

• [March 13, 2001] "XHTML Tags Reference." By Michael Classen. From WebReference.com. March, 2001. "XHTML is a reformulation of HTML 4 as an XML 1.0 application. The stricter nature of XML requires you to follow more rules than before when creating documents..." See: "XHTML and 'XML-Based' HTML Modules."

• [March 13, 2001] "XML CATALOGS." By [OASIS Entity Resolution Technical Committee.] Edited by Norm Walsh. Revision date: 13 March 2001. Abstract: "The requirement that all external identifiers in XML documents must provide a system identifier has unquestionably been of tremendous short-term benefit to the XML community. It has allowed a whole generation of tools to be developed without the added complexity of explicit entity management. However, the interoperability of XML documents has been impeded in several ways by the lack of entity management facilities: (1) External identifiers may require resources that are not always available. For example, a system identifier that points to a resource on another machine may be inaccessible if a network connection is not available. (2) External identifiers may require protocols that are not accessible to all of the vendors' tools on a single computer system. An external identifier that is addressed with the ftp: protocol, for example, is not accessible to a tool that does not support that protocol. (3) It is often convenient to access resources using system identifiers that point to local resources. Exchanging documents that refer to local resources with other systems is problematic at best and impossible at worst. The problems involved with sharing documents, or packages of documents, across multiple systems are large and complex. While there are many important issues involved and a complete solution is beyond the current scope, the OASIS membership agrees upon the enclosed set of conventions to address a useful subset of the complete problem. To address these issues, this specification defines an entity catalog that maps an entity's external identifier to a URI." See also the updated issues list and the TC web site.
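
The mapping idea in the abstract above can be sketched in a few lines of Python. This is only an illustration of an identifier-to-URI lookup, not the catalog format or resolution algorithm defined by the OASIS draft: the `CATALOG` table, its entries, the local file paths, and the `resolve` function are all hypothetical, and the "system identifier first" order is a simplifying assumption.

```python
from typing import Optional

# Hypothetical in-memory catalog: external identifiers mapped to local URIs.
# These entries are illustrative only and are not taken from the OASIS draft.
CATALOG = {
    "system": {
        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd":
            "file:///usr/share/dtd/xhtml1-strict.dtd",
    },
    "public": {
        "-//W3C//DTD XHTML 1.0 Strict//EN":
            "file:///usr/share/dtd/xhtml1-strict.dtd",
    },
}


def resolve(public_id: Optional[str], system_id: Optional[str]) -> Optional[str]:
    """Return a mapped URI for an external identifier.

    Assumption: the system identifier is tried first, then the public
    identifier; if neither is listed, the original system identifier is
    returned unchanged so the caller can still try to fetch it.
    """
    if system_id is not None and system_id in CATALOG["system"]:
        return CATALOG["system"][system_id]
    if public_id is not None and public_id in CATALOG["public"]:
        return CATALOG["public"][public_id]
    return system_id
```

With a table like this, a document whose system identifier points at an unreachable host can still be processed, provided its public identifier has a local entry.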

• [March 13, 2001] "XML Messaging Framework." By Timothy Dyck. In eWEEK (March 11, 2001). "Realizing the cart can't go before the horse, Microsoft Corp. has developed a comprehensive set of proposed standards about how to use XML to send and receive business-to-business messages online. The BizTalk Framework 2.0 specification, released in December, updates its 1.0 predecessor adding ways to check for reliable message delivery, and it includes information on how to use MIME (Multipurpose Internet Mail Extension) and Secure MIME to securely send BizTalk-based Extensible Markup Language messages over e-mail. HTTP delivery of messages is also described in detail. Another big change is that BizTalk Framework has been redesigned to conform to Simple Object Access Protocol 1.1 and XML Schema standards proposals. It also includes XML tags described using the older, nonstandard XML-Data Reduced format. It's possible that vendors other than Microsoft will support the BizTalk messaging framework and thus allow interoperability between Microsoft's own BizTalk Server and non-Microsoft products. It's too soon to tell if this will happen, though. BizTalk Server itself has not caught up to the XML standards that BizTalk Framework relies upon, as BizTalk Server uses XML-Data Reduced-formatted messages internally, not XML Schema (though a separate command-line tool is provided with BizTalk Server to convert XML-Data Reduced-formatted messages to an XML Schema format). The specifics of BizTalk Framework are fairly simple because they describe only the BizTalk message envelope and message characteristics. The items described are sender and receiver names, unique message identifier, time stamps indicating when a message was sent and will expire, topic, request for confirmation of message delivery, request for confirmation of message processing commitment, attachment data, and optional business-specific message information..." 
See "BizTalk Framework" and the news item "BizTalk.Org Web Site Upgraded."

• [March 13, 2001] "GXS Works on B2B Integration Issues." By Renee Boucher Ferguson. In eWEEK (March 11, 2001). "Despite promises of end-to-end solutions, many companies are still having difficulties integrating with e-marketplaces, buyers and suppliers. GE Global Exchange Services is out to change that with three initiatives it is using from its e-commerce community -- a community that boasts more than 100,000 trading partners that conduct 1 billion transactions a year. GXS, a subsidiary of General Electric Co. USA, is planning to release an Adaptor Developer Kit next quarter that simplifies the handshake necessary in back-end integration. The idea -- one that GXS competitors such as IBM have already capitalized on -- is that developers can use the kit to shortcut technology integration to the GXS platform without having to customize code. Another initiative, JMS (Java Messaging Service), promises to make it easier to transport objects among data fields within a back-end system. GXS partnered with Progress Software Corp. last year to embed JMS in Progress' SonicMQ Messaging Server and incorporate it internally with GXS' integration products. By combining SonicMQ with JMS, a standardized open architecture is added to GXS' integration products, in effect shortening the time it takes to integrate partners with Web-to-legacy and application-to-application integration. The project will be in production in the next 60 days. The company's third initiative, also scheduled for release in the next 60 days, is NBT (Network-Based Translation). That service allows GXS, of Gaithersburg, MD., to take the EDI (electronic data interchange) or XML (Extensible Markup Language) data format from companies and convert it into a standardized schema in real time. The NBT service can also translate EDI schema for offline companies. 
Lita Fulton, president of Fulton & Associates Inc., a full-service system and telecommunications technology company in Fairfax, Va., is beta testing NBT for a large government project. 'The way the process works is we have to establish a data model first and determine how we're going to house it,' Fulton said. 'For us, it's only one portion because all vendors are not all EDI. That's what made GXS valuable for us because they can do XML translations, too'."

• [March 13, 2001] "Microsoft's Ballmer Touts XML Web Standard." By Charles Cooper. From CNET News.com (March 12, 2001). "Microsoft CEO Steve Ballmer [speaking at the quadrennial meeting here of the Association for Computing Machinery] said Monday that the spread of the XML software standard will constitute the 'next revolution' in personal computing. Speaking before a gathering of scientists and technical professionals, Ballmer said the acceptance of XML (Extensible Markup Language) as the new 'lingua franca' of cyberspace would effectively clear away lingering barriers blocking companies from exchanging information over the Internet. 'This will be a much bigger deal' than Java, Ballmer said. He added that the adoption of a common approach embodied by XML will provide a foundation 'so that everyone's work can leverage and build upon' the work of others. 'With the XML revolution in full swing,' he said 'software has never been more important.' Ballmer's two-fisted stump speech was not surprising, given that XML is the linchpin of the Microsoft.Net strategy for software-as-a-service. 'The whole gist of XML relates to the way that things (on the Internet) can talk together,' Ballmer said. In a related vein, Ballmer spoke of the benefits of SOAP (Simple Object Access Protocol) in this next phase of the development of the Internet. SOAP, which is essentially a way to deliver XML payloads around the Internet, was co-developed by Microsoft in association with IBM and UserLand Software and has since been widely adopted by many leading developers." See also referenced here the online video, "Ballmer talks up XML, .Net." [alt URL]

• [March 12, 2001] "A Request for Proposals: OpenGIS Feature Geometry." From the Open GIS Consortium, OGC Technical Committee, Geometry Working Group. Request Number 12. RFP Issue date: March 2, 2001. Letter Of Intent Due Date: 10-August-2001; Submission Due Date: 10-September-2001. "The purpose of this Request for Proposals (RFP) is to obtain proposals for technologies and needed interfaces required to access and manipulate geospatial information modeled with OpenGIS Feature Geometry. The scope of this RFP includes technologies that create, query, modify, translate, access and transfer geospatial information in the form of Open GIS feature geometry objects or collections of feature geometry objects. Of special interest are open interfaces that conform to the standards of CORBA, DCOM, SQL, and Internet standards such as JAVA and XML. Description of Item: OpenGIS Feature Information Access and Encoding using XML. By 'information encoding and service request using XML' we mean an XML compliant set of rules for the creation, population, query and response to query for the interoperable handling of feature operations, attributes, geometry, and geometry collections. Proposal Guidelines and Conventions Specific to XML: There are at least two distinct ways to use XML in an OGC Feature environment. The first is as a simple encoding and data transfer mechanism. The second is as a message format for the transmittal of requests for services and for the transmittal of the responses to those requests. The submitters must address both issues in their response to this item. Requirements Specific to XML: A proposal for Open GIS Feature Access and Encoding using XML shall additionally include (1) XML SR1: An outline how the specification might be modified to take advantage of ongoing proposals to change or extend XML, such as GML. (2) XML SR2: The specification should indicate the type of XML compliance required. 
(3) XML SR3: The specification should indicate how profiles (subsets) of the base standard can be defined to allow for simplified version of the XML for applications with specific requirements of compactness or performance (4) XML SR4: It should be possible to define the current GML 2.0 as a profile of the proposed XML encoding specification. (5) XML SR5: It should be possible to define the current Catalog Implementation Specification XML messages as a profile of the proposed XML messaging specification..." See: "Geography Markup Language (GML)."

• [March 09, 2001] "Open-Source Company Dives Into Web Services SOUP." By Mary Jo Foley. In CNET News.com (March 09, 2001). "While tech kingpins such as Microsoft and Oracle have rushed to one-up each other in introducing Web-delivered software, Ximian is doing work behind the scenes to make sure Web services can run on the Linux and Unix operating systems. Ximian, an open-source software company formerly known as Helix Code, believes it can help achieve Web services compatibility by porting the Simple Object Access Protocol (SOAP) distributed-computing protocol to the Gnome user interface for Linux and Unix systems. Ximian and the Gnome project were both launched by open-source evangelist Miguel de Icaza. The goal is to allow Web-delivered software -- such as the much-touted Microsoft.Net strategy -- from different companies to work on all operating systems, from Windows to Unix and Linux. Ximian has dubbed its resulting technology 'SOUP,' not an acronym but a play on the SOAP name. SOAP, in and of itself, is an interoperability mechanism, explained Aaron Skonnard, an author and trainer with DevelopMentor, a company that trains individuals in distributed-systems technology. 'Toolkit interoperability is more of an issue than SOAP interoperability,' said Skonnard. 'As long as tools are 100 percent SOAP compliant, there's no problem, but people aren't implementing 100 percent to spec.' Web services are software applications delivered as a service over the Web. They can be standalone or integrated. They can be simple, such as automatically updated stock tickers, or more complicated, such as geographically- and device-aware travel services that could reschedule travelers on later flights before their late connection hits the ground. But the full promise of Web services won't be realized unless services developed for one software maker's environment will work with those developed using tools and software from another company. 
That's where Ximian's SOUP could come into play. Ximian is creating a tool that will allow Web services written for Linux to be compiled for SOAP. De Icaza said the compiler could be available to developers within two months. A compiler changes the software code into language a computer can understand, allowing the computer to run the program. The company also is writing some gateway software that will allow Web services that are written to comply with Gnome's Bonobo object architecture to talk to SOAP clients and servers. Ximian plans to incorporate this middleware into the Gnome 2.0 desktop and its Evolution groupware later this year, de Icaza said. Ximian is being neither helped nor hindered in its efforts by Microsoft or other SOAP backers, de Icaza said. Microsoft representatives said the company is aware of Ximian's work but declined further comment on the significance of SOUP to Microsoft.Net. They noted that a number of companies are developing tools for making Microsoft.Net available on platforms other than those sold by Microsoft..."

• [March 09, 2001] "VoiceXML and the Voice-driven Internet." By David Houlding (The Technical Resource Connection). In Dr. Dobb's Journal Volume 26, Issue 4 (April 2001), pages 88-94. ['David Houlding examines the concept of voice portals, and shows how simple design patterns -- together with XML and XSL- can be used to deliver Internet content to web browsers and wireless devices.'] "Wireless data services are growing at a phenomenal rate, driven to a large extent by the popularity of the Internet services they are delivering. These wireless-enabled Internet services are generally accessible not only by standard web browsers, but also by some mix of web phones, two-way pagers, and wireless organizers. The adoption of these modes of Internet access is being accelerated by the effects of mainstream Internet usage maturing from an initial novelty/hype phase into a ubiquitous set of services we use as common tools in everyday life. In this mode of use, how information is presented is less important than being able to get to the particular information you require easily, when and where you need it... Voice portals leverage both the most natural form of communication -- speech -- and the most pervasive and familiar communications network -- the global telephone network. This network is accessible by either standard wired or mobile cellphones users already have, together with service plans, so no additional cost needs to be incurred for users to access Internet services via voice portals. This eliminates the expense barriers that are currently limiting the penetration of wireless services into the marketplace. Phones also permit eyes- and hands-free operation, enabling Internet service usage via voice portals in situations where wireless devices will not suffice. In this article, I'll discuss the concept of voice portals and the associated architecture. 
I'll then show how simple design patterns -- together with XML and XSL -- can be used to deliver Internet content and services cost effectively not only to web browsers and various wireless devices, but also to any telephone via VoiceXML (for more information on the VoiceXML Standard, see http://www.voicexml.org/). I'll then present an implementation of this architecture that uses software that is freely available on the Internet. Finally, I'll examine key business and technical issues associated with voice-driven applications. VoiceXML is a new standard with significant industry backing. It promises to create a level playing field on which voice portals may compete for outsourcing the hosting of voice applications. This will drive down cost and improve quality of service for both application providers and their customers. From the application providers standpoint, creating voice applications using VoiceXML has the advantage that content is portable across different voice portals, delivering flexibility with respect to choosing voice portals to host voice applications. Voice portals driven by VoiceXML provide a powerful complementary new mode of access that empowers users with more options regarding when, where, and how they consume Internet services. Using speech as the most natural form of communication, the existing familiar global telephone network as the most pervasive communications network, and enabling eyes- and hands-free operation, this new mode of access promises to further accelerate the growth and maturity of Internet services into a ubiquitous set of tools we use every day." Additional resources include listings and source code. See "VoiceXML Forum."

• [March 09, 2001] "Programmer's Toolchest. SAX2: The Simple API For XML." By Eldar A. Musayev. In Dr. Dobb's Journal Volume 26, Number 2 (February 2001), pages 130-133. ['SAX, the "Simple API for XML," is an efficient and high-performance alternative to the Document Object Model.'] Additional resources include 'sax2.txt' listings and source code. "Just as Perl became the duct tape for the Web, XML is becoming the duct tape for e-business. As a universal data format, XML glues together disparate e-business systems that, in the process of conducting everyday business, need to perform hundreds of transactions per second without outages or crashes. Such systems need XML processors that provide high performance with a small footprint. That's what SAX offers. The article describes SAX, then shows how you can use it in Visual Basic applications via the Microsoft XML (MSXML) parser."
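
As a rough illustration of the event-driven style that SAX offers, here is a minimal sketch using Python's standard `xml.sax` module rather than the Visual Basic and MSXML setup the article covers. The `ElementCounter` class and `count_elements` helper are names invented for this example. The handler receives a callback for each start tag and never builds a document tree, which is where the small footprint comes from.

```python
import xml.sax


class ElementCounter(xml.sax.ContentHandler):
    """Count start tags by element name as the parser streams through the input."""

    def __init__(self):
        super().__init__()
        self.counts = {}

    def startElement(self, name, attrs):
        # Called once per opening tag; no tree is kept in memory.
        self.counts[name] = self.counts.get(name, 0) + 1


def count_elements(xml_text: str) -> dict:
    """Parse xml_text with SAX and return a mapping of element name to count."""
    handler = ElementCounter()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.counts
```

For example, `count_elements("<order><item/><item/></order>")` reports one `order` element and two `item` elements.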

• [March 09, 2001] "XML Document Production Tools." Prepared by Eric Prud'hommeaux (W3C). 2001-03-09. Pointers to spec-production DTDs, schemas, example documents, and tools. "This is a quick list of XML document production tools taken from Charles McCathieNevile and a quick poll..." Covers (1) XMLSpec-based Tools and (2) XHTML-based Tools.

• [March 09, 2001] "Representing UML in RDF." By Sergey Melnik. "A testbed converter that supports automatic translation from UML/XMI to RDFS/RDF/XML is available. The UML community developed a set of useful models for representing static and dynamic components of software-intensive systems. UML is an industry standard and serves as a modeling basis for emerging standards in other areas like OIM, CWM etc. As of today there exist a variety of UML vocabularies for describing object models, datatypes, database schemas, transformations etc. The goal of this work is to make UML 'RDF-compatible'. This allows mixing and extending UML models and the language elements of UML itself on the Web in an open manner. XMI, the current standard for encoding UML in XML by OMG, does not offer this capability. It is based upon a hard-wired DTD. For example, if a third party were to refine the concept 'Event' defined in UML statecharts into say 'ExternalEvent' and 'InternalEvent', it would not be possible to serialize the corresponding event instances in XMI." [Referenced in the 'xmlschema-dev@w3.org' list: "I'd like to support your initiative. In addition to the applications you mentioned, I see UML as well-established schema language that can be used on the Semantic Web along with RDF Schema, XML Schema, DAML-O etc. Webizing UML allows leveraging a broad spectrum of tools and existing UML schemas. A while ago I took a crack at setting up UML on top of RDF and making it interoperate with other schema languages: http://www-db.stanford.edu/~melnik/rdf/uml/." This post from Sergey was in response to a message by David Ezell on a 'Proposed UML Interest Group.'] See (1) "XML Metadata Interchange (XMI)" and (2) "Resource Description Framework (RDF)."

• [March 09, 2001] "Mapping between ASN.1 and XML." By Takeshi Imamura and Hiroshi Maruyama. Pages 57-64 (with 18 references) in Proceedings 2001 Symposium on Applications and the Internet, edited by K. Ikeda. Los Alamitos, CA: IEEE Computer Society, 2001. [SAINT 2001 Symposium on Applications and the Internet, San Diego, CA, USA, 8-12 January 2001.] "Abstract Syntax Notation One (ASN.1) is a framework for representing tree structured data. Since ASN.1 data are structured data, it should be possible to represent the same information in Extensible Markup Language (XML). The translation between ASN.1 and XML will enable us to manipulate efficient ASN.1 data in a user-friendly manner. We develop a Java library for such translation, called ASN.1/XML translator. We also confirm actual ASN.1 data were translated into expected XML documents and these documents were translated back into the original data if the data were encoded according to Distinguished Encoding Rules (DER). Moreover we discuss still existing issues and try to address them, especially support of XML Schema..." See discussion of the ASN.1/XML Translator in the IBM Security Suite: "Abstract Syntax Notation One (ASN.1) is a framework for representing tree structured data. It is widely used in communication protocols (e.g., SNMP and LDAP), security protocols (e.g., X.509), data formats (e.g., PKCS#7), and so on. ASN.1 is designed for efficiency and the data is usually packed into byte boundaries, and hence is not very readable and is hard to manipulate. Since ASN.1 data is structured data, it should be possible to represent the same information in Extensible Markup Language (XML). XML is not particularly efficient in terms of data length, but is more readable, and it has many off-the-shelf free tools (e.g., XML processors for parsing and generation, XSL processors for rendering, XML editors for authoring, and so on). 
For such reasons, the translation between ASN.1 and XML will enable us to manipulate efficient ASN.1 data in a user-friendly manner. This is a Java library for such translation. Using this library, ASN.1 can be translated into XML and vice versa..." See also: "ASN.1 Markup Language (AML)."
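
To make the idea of rendering packed ASN.1 data as readable XML concrete, here is a small Python sketch. It is not the IBM Java library described above and does not reproduce its XML vocabulary: the element names, the `UNIVERSAL_NAMES` table, and the `der_to_xml` function are assumptions made for illustration. It handles only single-byte tags, treats every tag as universal class, and decodes only INTEGER and UTF8String values; everything else is shown as hex.

```python
import xml.etree.ElementTree as ET

# Names for a few universal tag numbers; other tags fall back to "tag-N".
UNIVERSAL_NAMES = {2: "INTEGER", 4: "OCTET-STRING", 12: "UTF8String",
                   16: "SEQUENCE", 17: "SET"}


def _read_tlv(data: bytes, pos: int):
    """Read one tag-length-value triple at pos; return (tag, value, next_pos)."""
    tag = data[pos]
    length = data[pos + 1]
    pos += 2
    if length & 0x80:  # long form: low 7 bits give the number of length bytes
        n = length & 0x7F
        length = int.from_bytes(data[pos:pos + n], "big")
        pos += n
    return tag, data[pos:pos + length], pos + length


def der_to_xml(data: bytes) -> list:
    """Convert a run of DER-encoded values into a list of XML elements."""
    elements = []
    pos = 0
    while pos < len(data):
        tag, value, pos = _read_tlv(data, pos)
        number = tag & 0x1F
        elem = ET.Element(UNIVERSAL_NAMES.get(number, f"tag-{number}"))
        if tag & 0x20:  # constructed: the value is itself a run of TLVs
            elem.extend(der_to_xml(value))
        elif number == 2:
            elem.text = str(int.from_bytes(value, "big", signed=True))
        elif number == 12:
            elem.text = value.decode("utf-8")
        else:
            elem.text = value.hex()
        elements.append(elem)
    return elements
```

Feeding it the nine bytes `30 07 02 01 05 0c 02 68 69` (a SEQUENCE holding the INTEGER 5 and the UTF8String "hi") yields a single `SEQUENCE` element with `INTEGER` and `UTF8String` children, which can then be serialized with `ET.tostring`.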

• [March 09, 2001] "XML Grammars." By Jean Berstel (Institut Gaspard-Monge, Laboratoire d'informatique Université de Marne-la-Vallée, France) and Luc Boasson (Laboratoire d'informatique algorithmique: fondements et applications - LIAFA). Pages 182-191 (with 7 references) in Mathematical Foundations of Computer Science 2000 = Lecture Notes in Computer Science #1893, edited by M. Nielsen and B. Rovan. Proceedings of 25th International Symposium on Mathematical Foundations of Computer Science [MFCS 2000], Bratislava, Slovakia (28 Aug.-1 Sept. 2000). Germany: Springer-Verlag, 2000. "XML documents are described by a document type definition (DTD). An XML-grammar is a formal grammar that captures the syntactic features of a DTD. We investigate properties of this family of grammars. We show that an XML-language basically has a unique XML-grammar. We give two characterizations of languages generated by XML-grammars: one is set-theoretic, the other is by a kind of saturation property. We investigate decidability problems and prove that some properties that are undecidable for general context-free languages become decidable for XML-languages...The paper is organized as follows. The next section [2] contains the definition of XML-grammars and their relation to DTD. Section 3 contains some elementary results, and in particular the proof that there is a unique XML-grammar for each XML-language. It appears that a new concept plays an important role in XML-languages. This is the notion of surface. The surface of an opening tag a is the set of sequences of opening tags that are children of a. The surfaces of an XML-language must be regular sets, and in fact describe the XML-grammar. The characterization results are given in Section 4. They heavily rely on surfaces, but the second one also uses the syntactic concept of a context. Section 5 investigates decision problems. 
It is shown that it is decidable whether the language generated by a context-free grammar is well-formed, but it is undecidable whether there is an XML-grammar for it. On the contrary, it is decidable whether the surfaces of a context-free grammar are finite. The final section is a historical note. Indeed, several species of context-free grammars investigated in the sixties, such as parenthesis grammars or bracketed grammars are strongly related to XML-grammars. This relationship is sketched..." [cache]
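
The notion of surface quoted above can be illustrated on a single sample document. In the paper a surface is defined over a whole XML-language; the Python sketch below only collects the child-tag sequences that occur in one document, so it gives a finite sample of each surface rather than the surface itself. The `surfaces` function name is an assumption made for this example.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def surfaces(xml_text: str) -> dict:
    """Map each tag name to the set of child-tag sequences observed under it.

    Each sequence is a tuple of tag names, in document order, of the direct
    children of one occurrence of the tag.
    """
    observed = defaultdict(set)
    root = ET.fromstring(xml_text)
    for elem in root.iter():  # visits the root and every descendant
        observed[elem.tag].add(tuple(child.tag for child in elem))
    return dict(observed)
```

For the document `<a><b/><c/><a><b/></a></a>`, the tag `a` is observed with the two child sequences `(b, c, a)` and `(b)`, while `b` and `c` are observed only with the empty sequence.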

• [March 09, 2001] "Formal Properties of XML Grammars and Languages." By Jean Berstel (Institut Gaspard-Monge, Laboratoire d'informatique Université de Marne-la-Vallée, France), and Luc Boasson. Detailed version of "XML Grammars" cited above. "XML (Extensible Markup Language) is a format recommended by W3C in order to structure a document. The syntactic part of the language describes the relative position of pairs of corresponding tags. This description is by means of a document type definition (DTD). In addition to its syntactic part, each tag may also have attributes. If the attributes in the tags are ignored, a DTD appears to be a special kind of context-free grammar. The aim of this paper is to study this family of grammars. One of the consequences will be a better appraisal of the structure of XML documents. It will also illustrate the kind of limitations that exist in the power of expression of XML. Consider for instance an XML-document that consists of a sequence of paragraphs. A first group of paragraphs is being typeset in bold, a second one in italic. It is not possible to specify, by a DTD, that in a valid document there are as many paragraphs in bold as in italic. This is due to the fact that the context-free grammars corresponding to DTDs are rather restricted. As another example, assume that, in developing a DTD for mathematical documents, we require that in a (full) mathematical paper, there are as many proofs as there are statements, and moreover that proofs appear always after statements (in other words, the sequence of occurrences of statements and proofs is well-balanced). Again, there is no DTD for describing this kind of requirements. Pursuing in this direction, there is of course a strong analogy of pairs of tags in an XML document and the \begin{object} and \end{object} construction for environments in Latex. The Latex compiler merely checks that the constructs are well-formed, but there is no other structuring method. 
The main results in this paper are two characterizations of XML-languages. The first (Theorem 4.2) is set-theoretic. It shows that XML-languages are the biggest languages in some class of languages. It relies on the fact that, for each XML-language, there is only one XML-grammar that generates it. The second characterization (Theorem 4.4) is syntactic. It shows that XML-languages have a kind of 'saturation property'. As usual, these results can be used to show that some languages cannot be XML. This means in practice that, in order to achieve some features of pages, additional nonsyntactic techniques have to be used. ... Most of the XML languages encountered in practice are in fact regular. Therefore, it is interesting to investigate this case. The main result is that, contrary to the general case, it is decidable whether a regular language is XML. Moreover, XML-grammars generating regular languages will be shown to have a special form: they are sequential in the sense that its nonterminals can be ordered in such a way that the nonterminal in the lefthand side of a production is always strictly less than the nonterminals in the righthand side..." [cache]

• [March 09, 2001] "XML: The Digital Library Hammer." By Roy Tennant (Manager, eScholarship Web & Services Design, California Digital Library). In [Digital] Library Journal. March 15, 2001. "Abraham Maslow once said, 'When the only tool you own is a hammer, every problem begins to resemble a nail.' Once you understand XML and the opportunities it offers for creating and managing digital library services and collections, you will begin seeing nails everywhere. XML (Extensible Markup Language) is born of a marriage of SGML (Standard Generalized Markup Language) and the web. HTML can't do much more than describe the look of a web page, whereas SGML is too complicated and unwieldy for most applications. XML achieves much of the power of SGML without the complexity and adds web capabilities beyond HTML... XML and software If you use software such as the Cocoon publishing framework, when a user requests an XML document from your web server, the request is passed to special software. The software then applies the XML style sheet transformations to produce the HTML version that is sent to the client along with the HTML style sheet. If you don't use special software on the server for these operations, the client software (typically a web browser) must attempt to process the XML file. The latest versions of Microsoft Internet Explorer will attempt to process the file, but you're unlikely to be pleased with the result. Don't even try with Netscape. Few people know this, but any library with an integrated library system from Innovative Interfaces (with Update D) can view XML versions of catalog records. Kyle Bannerjee of Oregon State University has used this capability to provide information essential to relocating 50,000 items to a storage facility. Bannerjee also uses it to solve problems that many other libraries face, as with his program ILL ASAP (Interlibrary Loan Automatic Search and Print). 
Bannerjee says that 'XML and XSLT are the most significant developments in information management since relational databases and SQL.' Bibliographies are commonplace in libraries, whether as lists of books by a particular author or pathfinders by subject. What are bibliographic citations but a structured set of textual elements? XML is made for this..." See also 'Electronic Discussion Forum on the Use of XML in Libraries'

• [March 09, 2001] "Setting the Standard: XML on Campus." By Mike Rawlins. In Syllabus Magazine Volume 14, Number 8 (March 2001), pages 30-32. ['XML standards are on the horizon, and a serious long-term campus IT strategy should take them into account.'] See also "PostSecondary Electronic Standards Council XML Forum for Education."

• [March 08, 2001] "XML: Like The Air We Breathe?" By Martin Marshall (Zona Research). In InformationWeek (March 05, 2001), pages 47-53. "XML is poised to affect just about everything corporate IT does, from e-commerce applications to legacy data. But the pervasive changes it will bring about won't become apparent until the XML products under development hit the market later this year. IT managers expect XML to fundamentally improve the speed, cost, and flexibility of their business applications. It's also expected to alter the way they build new applications and integrate data from current systems. XML will have a profound effect on business processes, easing the task of exchanging data with trading partners. To some, XML is a business-process catalyst that will pick up where electronic data interchange and extended intranets fell short. Zona Research predicted early last year that the percentage of e-commerce transactions using XML would rise from .5 percent in early 2000 to more than 40 percent by the end of 2003. In a Zona Research Market Report, "XML: The Dash For Dot.com Interoperability," released last month, a survey of more than 200 companies indicates that IT managers expect XML to dramatically improve the adaptability of their businesses. XML is much more than a markup language; it's a fundamental mechanism for the automated exchange of data and the processes that act on that data. XML's data-transformation mechanisms go beyond operating environments, transport protocols, and the arcane barriers of the applications to present true interapplication communication. XML covers everything from data and data-transformation processes to schema, development tools, XML servers, and components. XML also takes into account business-process mechanisms, layered architectures, and vertical-industry bodies that make decisions about XML data representations and process definitions for their industries. 
XML could supplant EDI as a mechanism for transferring data between businesses and their applications. EDI has been the main way that companies exchange business forms. EDI transactions total about $750 billion per year, with about $2 billion a year spent on EDI development and deployment, according to Zona Research. EDI does the job for bidirectional interaction, but it's expensive to implement, and the embedded business rules are rigid. EDI is a point-to-point solution that must be reengineered every time a company adds a business partner. The mapping of data sets and procedures between two trading partners in an EDI environment is generally accomplished by custom coding. There's a growing movement toward converting EDI systems to XML, according to Zona Research's survey. Among the 72 percent of respondents who use EDI at their companies, seven out of eight plan to convert EDI into XML at some point. The largest group, 30 percent, will convert some of their EDI to XML this year, while 14 percent will do some conversion next year or later. They'll do it on a selective basis, however; very few will convert all of their EDI to XML by either 2001 (2 percent) or 2002 or later (4 percent). About one in eight will convert EDI to XML on an as-needed basis. XML Solutions Corp. is an early implementer in converting EDI to XML. Its XEDI product claims to be able to handle all of the ANSI X.12 EDI interfaces. With its many technical twists, it's easy to overlook the political movement behind XML. As such, it's not born of rosy optimism about global cooperation, but rather about the expedience of operating in trading communities rather than as closed systems. Each vertical industry has a major XML effort under way to define the data term definitions and schemas for industrywide exchange of data..."

• [March 08, 2001] "Jena: Implementing the RDF Model and Syntax Specification." By Brian McBride (Hewlett Packard Laboratories Bristol, UK). ['Some aspects of W3C's RDF Model and Syntax Specification require careful reading and interpretation to produce a conformant implementation. Issues have arisen around anonymous resources, reification and RDF Graphs. These and other issues are identified, discussed and an interpretation of each is proposed. Jena, an RDF API in Java based on this interpretation, is described.'] "Since the W3C's Resource Description Framework (RDF) Model and Syntax specification completed its path to W3C recommendation several implementations have been developed. These differ in some aspects of their interpretation of the specification. There has been much discussion of these issues on the RDF Interest Mailing List [refs], which so far, has not produced resolution. Inter-mixed with those discussions, have been others about changes and extensions to the specification. All this has caused confusion and uncertainty that is inhibiting the acceptance and deployment of RDF. Tool builders wish to build tools that are correct and conformant. This they cannot do, because it is not clear what it means to be correct and conformant. Similarly producers and consumers of RDF wish to produce RDF whose interpretation is well defined. Uncertainty of interpretation inhibits them from doing so. One reason for the lack of resolution is that issues are discussed individually. The issues themselves however, are interlinked. It is hard for a community discussing, say the subtleties of reification to agree when they have fundamentally different views on the nature of resources and their identification. An implementer setting out to develop an implementation of an RDF tool must have an interpretation of the specification. This paper describes the interpretation developed for Jena, an RDF API in Java.
The guiding principle for this interpretation was to implement, as far as possible, the specification as it is, without embellishment. It is documented here in the hope it will prove helpful to other developers." See "Resource Description Framework (RDF)."

• [March 08, 2001] "Building the Semantic Web." By Edd Dumbill. From XML.com. March 07, 2001. ['Tim Berners-Lee's vision of the Semantic Web is undoubtedly exciting, but its success will lie in the extent to which it solves real-world problems.'] "The range of people working under the broad umbrella of the Semantic Web come from many diverse communities, from the Web-focused to experienced researchers in the fields of artificial intelligence and knowledge representation. Ultimately the skills of all those involved will be required, and it's definitely beyond the scope of any one group to provide the expertise necessary to build the ultimate Semantic Web. For me, the key thing about the Semantic Web is the word 'Web'. It's our essential starting point, and the Web at large is the ecology in which the primordial Semantic Web must grow. I spend most of my time working with the Web, as a developer and a writer, and also in involvement with the community of developers and publishers that use the Web. So, as I approach the Semantic Web (or 'SW' from here on), I'm always asking the question 'how do we get this started?' There are many interesting and exciting possibilities in the realms of logic and proofs, but getting them running on the Web must be preceded by getting more basic machine processible content out there. The evolving form of the SW has to crawl before it can run. In this article I introduce the SW vision and explore the practical steps that we need to be taking to build it. The essential aim of the SW vision is to make Web information practically processible by a computer. Underlying this is the goal of making the Web more effective for its users. This increase in effectiveness is constituted by the automation of things that are currently difficult to do: locating content, collating and cross-relating content, drawing conclusions from information found in two or more separate sources.
In the software world we can often get so enthusiastic about the systems that we're creating that we stray from a focus on the user's requirements. One of the great things about the Web is that it's unforgiving when we ignore the user. Create a site that's hard to use and nobody will come. Create a technology for page markup that's difficult to grasp and nobody will use it. In fact, you might see the creation and implementation of the SW as a near impossible task: it's still difficult to get people to use as little metadata as the <title> tag in their web pages. Clearly, to get off the starting blocks, the SW has to offer enough in reward to make it worth people's time to learn new skills and to more carefully deploy their content on the Web..." References: see: (1) W3C Semantic Web Activity and (2) "XML and 'The Semantic Web'."

• [March 08, 2001] "Knowledge Technologies 2001: Conference Diary." By Edd Dumbill. From XML.com. March 07, 2001. ['The inaugural GCA Knowledge Technologies conference brought together members of diverse communities, all concerned with managing knowledge: from RDF and Topic Maps to AI.'] "The first ever Knowledge Technologies conference, hosted by the GCA, is taking place in Austin, Texas this week. It is attended by a mixed audience of librarians, AI experts, knowledge management technologists, and the Web community. As far as XML is concerned, this means people from the RDF, Dublin Core, and Topic Maps worlds. This article is a report from the first day of the conference. Opening keynote sessions included Doug Lenat from Cycorp. Doug has gone against the flow where artificial intelligence is concerned. Twenty years ago, when others were gung-ho for AI, Lenat was a pessimist. As disillusionment has set in over recent years, Lenat reports he is now an optimist. A lot of this good feeling comes from the work he's done with CYC (pronounced "psyche"). Lenat has been steadily feeding his system facts about the world for 15 years, and reports that it's starting to get to the stage where the system can help with its own development. CYC uses a codification of natural language into a formal logical language..." Note: see also the news item on OpenCyc.

• [March 08, 2001] "XML-Deviant: Toward an XPath API." By Leigh Dodds. From XML.com. March 07, 2001. ['Since XSLT and XPointer rely on XPath, developers are asking whether an XPath API should be created.'] "While the XML-DEV storms of the last few weeks show little sign of abating, some developers have been discussing the potential for an XPath API. Over the last few weeks the XML-Deviant has reported on a number of controversies surrounding the recent activities of the W3C, and a rise in the complexity and interdependence between specifications forming the 'XML family'. The debates have continued this week. Threads on XML-DEV have discussed a possible 'fork in the road' of XML's development. The press has reacted with articles like 'Why 90% of XML Standards Will Fail' and 'The relentless march of abstraction'. Simon St.Laurent noted that the current discussions echo the 'Simplified XML' debate that raised hackles on XML-DEV at the end of 1999. Leading to the formation of the SML-DEV mailing list, the split also led to the 'Common XML Specification' and the appearance of simple tools like Pyxie... It would be good to maintain the early interest in SAXPath in order to formulate a suitable solution to these issues. There is likely to be lots of prior art that can be mined for additional ideas. Implementations of XPath can be found in many open source XSLT engines, and thus it's likely that if an API can be agreed upon, implementations would follow very quickly."

• [March 07, 2001] "Mapping W3C Schemas to Object Schemas to Relational Schemas." By Ronald Bourret (The Open Healthcare Group). March 2001. "This paper summarizes two different mappings. The first, part of the process generally known as XML data binding, maps the W3C's XML Schemas to object schemas. The second, known as object-relational mapping, maps object schemas to relational database schemas. The two mappings can be joined (and the intermediate object schema eliminated) to create a mapping from XML Schemas to database schemas. This is not shown, but left as an exercise to the reader. Note that because individual XML Schema structures can often be mapped to multiple object structures, and because individual object structures can often be mapped to multiple database structures, there are usually multiple possible mappings from XML Schemas to database schemas. The mapping is described in terms of the data model presented in XML Schemas Part 1: Structures, rather than the XML syntax used to describe schemas. Although I might eventually add a section describing the mapping based on the XML syntax, this is currently left as a (non-trivial) exercise for the reader...The purpose of this paper is to help people write code that can automatically generate object and database schemas from XML Schemas, as well as transferring data between XML documents, objects, and databases according to mappings between them. Because the set of possible mappings from XML Schemas to object schemas is fairly large, I do not expect any software to support all possible mappings any time soon, if ever. A more reasonable strategy is for the software to pick a subset of mappings that make sense for its uses and implement those." [Introduction on XML-DEV: 'I've posted a paper mapping a (very slight) variant of the data model in W3C schemas to object schemas, and then mapping object schemas to relational schemas. 
The first part of the paper -- mapping XML schemas to object schemas -- is likely to be of most interest to people. It is undoubtedly similar to Sun's XML data binding (JSR-31) and Veo Systems work with SOX. In fact, I wrote it because neither of those specifications seems to be publicly available. The work also appears to be a superset of the mappings in Bill La Forge's Quick and Enhydra's Zeus project. Please note that the paper is rather terse and assumes you understand the general ideas behind the mapping from XML schemas / DTDs to object schemas. If not, see the presentation "Mapping DTDs to Databases", available from: http://www.rpbourret.com/xml.'] For schema description and references, see "XML Schemas."
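
The two-step chain summarized in this entry (XML Schema to object schema, then object schema to relational schema) can be pictured with a minimal Python sketch. It is not Bourret's mapping: it assumes the simplest case only, a complex type whose children are simple-typed elements, mapped to one class and then to one table. The `Book` class, the `TYPE_MAP` correspondences, and the helper names are hypothetical and chosen for illustration.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, fields


# Hypothetical object schema for a complex type with three simple-typed children.
@dataclass
class Book:
    title: str
    pages: int
    price: float


# Assumed correspondences: Python type name -> (SQL column type, text parser).
TYPE_MAP = {
    "str": ("VARCHAR(255)", str),
    "int": ("INTEGER", int),
    "float": ("DOUBLE PRECISION", float),
}


def type_name(field):
    """Name of a field's declared type, whether stored as a class or a string."""
    return field.type if isinstance(field.type, str) else field.type.__name__


def create_table(cls):
    """Object schema -> relational schema: one table per class, one column per field."""
    cols = ", ".join(f"{f.name} {TYPE_MAP[type_name(f)][0]}" for f in fields(cls))
    return f"CREATE TABLE {cls.__name__} ({cols})"


def from_xml(cls, text):
    """XML instance -> object: each child element supplies the like-named field."""
    root = ET.fromstring(text)
    values = {
        f.name: TYPE_MAP[type_name(f)][1](root.findtext(f.name)) for f in fields(cls)
    }
    return cls(**values)


if __name__ == "__main__":
    print(create_table(Book))
    print(from_xml(Book, "<Book><title>XML</title><pages>320</pages><price>29.5</price></Book>"))
```

Even this toy case involves choices (child elements as columns rather than separate tables, one fixed SQL type per simple type), which is the point the paper makes about there usually being several possible mappings.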

• [March 06, 2001] "Extending XML Schemas." By Roger L. Costello (et al.). XML-DEV post March 06, 2001. Topic: 'What is Best Practice of checking instance documents for constraints that are not expressible by XML Schemas?' "XML Schemas - Strive to be All Powerful? As XML Schemas completes version 1 and begins work on version 2, the question comes to mind: 'should XML Schemas strive in the next version to be all powerful?' Programming languages seem to have that goal - to enable a programmer to express any problem using the language. Perhaps the goal of version 2 of XML Schemas should be to provide enough flexibility that any constraint may be expressed. Alternatively, perhaps XML Schemas should just provide a core set of constraint expressing mechanisms (as it does today), and let the marketplace create a technology (technologies?) to supplement XML Schemas. Then version 2 of XML Schemas would have few changes from version 1..." For schema description and references, see "XML Schemas."

• [March 06, 2001] "Introducing DocBook." By Norman Walsh. 5 Mar 2001 or later. "I've put the slides from my "Introducing DocBook" presentation online... This material was originally presented by Norman Walsh on 7-Mar-2001 at the WinWriters Online Help Conference in Santa Clara, CA. [1] The slides were produced from a single XML source document using XSLT. The presentation of these slides uses Cascading Style Sheets; for best results, use a browser which can display CSS formatting. You can page through the slides one at a time, or use the frames view which offers a simultaneous table of contents. [1] Well, supposed to have been presented, actually. I was unable to travel to Santa Clara due to inclement weather." See "DocBook XML DTD."

• [March 06, 2001] "RFC: A Little IDL." By Dave Winer (UserLand Software). "I've been staring with incomprehension at various Interface Definition Languages (or IDLs) for XML-over-HTTP protocols, and wondering why they're so complicated. I thought it might have something to do with the kinds of languages and editing environments they're designed for. To find out where the disconnect is, I decided to define a simple interface definition language in XML that's suitable for scripting environments, and see if people find holes in its functionality, or if it's useful, or something we want to do. That's why I called this ALIDL, so no one could confuse it with the efforts of a standards body. It's little and human-readable. The goal is to have it work with scripting systems that are wired up to XML-RPC or SOAP 1.1..." [Note on XML-DEV Motivation: WSDL appears relatively difficult or impossible for (some) scripting environments to support. I wanted to start a public exploration of IDLs, so we can learn what the issues are, and the benefits, and to spark development of aggregators and directories. I also wanted to support XML-RPC so the fresh SOAP and deep XML-RPC communities get to know each other and can work with each other. Comments are requested on the XML-RPC discussion group and/or XML-RPC mail list. Pointers to both are at the bottom of the spec.']

• [March 05, 2001] "Comparing W3C XML Schemas and Document Type Definitions (DTDs). [XML Matters #7.]" By David Mertz, Ph.D. (Idempotentate, Gnosis Software, Inc.). From IBM developerWorks, XML Library. March 2001. ['Many developers expect that XML schemas will soon supplant DTDs for specifying XML document types. David Mertz is skeptical that schemas will replace DTDs, though he believes that XML schemas are an invaluable tool in a developer's arsenal. This installment of the "XML Matters" column steps up to the challenge of comparing schemas and DTDs and clarifying just what is going on in the XML schema world.'] "While there are a number of instances where W3C XML Schemas excel, there remain, nonetheless, a number of areas where DTDs are better. Developers are continually left with tough choices... Much of the point of using XML as a data representation format is the possibility of specifying structural requirements for documents: rules for exactly what types of content and subelements may occur within elements (and in what order, cardinality, etc.). In traditional SGML circles, the representation of document rules has been as DTDs -- and indeed the formal specification of the W3C XML 1.0 Recommendation explicitly provides for DTDs. However, there are some things that DTDs cannot accomplish that are fairly common constraints; the main limitation of DTDs is the poverty in their expression of data types (you can specify that an element must contain PCDATA, but not that it must contain, for example, a nonNegativeInteger). As a side matter, DTDs do not make the specification of subelement cardinality easy (you can compactly specify 'one or more' of a subelement, but specifying 'between seven and twelve' is, while possible, excessively verbose, or even outright contorted). In answer to various limitations of DTDs, some XML users have called for alternative ways of specifying document rules. 
It has always been possible to programmatically examine conditions in XML documents, but the ability to impose the more rigid standard that, 'a document not meeting a set of formal rules is invalid,' essentially, is often preferable. W3C XML Schemas are one major answer to these calls, but not the only schema option out there... At least two fundamental and conceptual wrinkles remain for any 'schemas everywhere' goal. The first issue is that the W3C XML Schema Candidate Recommendation, which just ended its review period on December 15, 2000, does not include any provision for entities; by extension, this includes parametric entities. The second issue is that despite their enhanced expressiveness, there are still many document rules that you cannot express in XML schemas (some proposals offer to utilize XSLT to enhance validation expressiveness, but other means are also possible and in use). In other words, schemas cannot quite do everything DTDs have long been able to, while on the other hand, schemas also cannot express a whole set of further rules one might wish to impose on documents. At a more pragmatic level, tools for working with XML schemas are less mature than those for working with DTDs... W3C XML Schemas let XML programmers express a new set of declarative constraints on documents for which DTDs are insufficient. For many programmers, the use of XML instance syntax in schemas also brings a greater measure of consistency to different parts of XML work; others disagree, of course. Schemas are certainly destined to grow in significance and scope as they become more familiar, and as developers enhance more tools to work with them. One way to get a jump start on schema work is to automate the conversion of existing DTDs to XML schema format. Obviously, automated conversions cannot add the new expressive capabilities of XML schemas themselves; but automation can create good templates from which to specify the specific typing constraints one wishes to impose." 
For schema description and references, see "XML Schemas."
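
Mertz's remark about cardinality ("between seven and twelve" being possible but verbose in a DTD) can be made concrete with a small Python sketch. The helper names and the `xs:` prefix are assumptions for illustration and are not taken from the article. The DTD side writes the optional occurrences as nested optional groups so that the content model stays deterministic, which XML 1.0 asks for on compatibility grounds.

```python
def dtd_between(name, low, high):
    """DTD content model allowing between `low` and `high` occurrences of `name`."""
    if not (0 <= low <= high and high >= 1):
        raise ValueError("need 0 <= low <= high and high >= 1")
    optional = ""
    # Build the optional tail from the inside out: (x, (x, (x)?)?)?
    for _ in range(high - low):
        optional = f"({name}, {optional})?" if optional else f"({name})?"
    parts = [name] * low + ([optional] if optional else [])
    return "(" + ", ".join(parts) + ")"


def xsd_between(name, low, high):
    """The same constraint as a W3C XML Schema element particle."""
    return f'<xs:element name="{name}" minOccurs="{low}" maxOccurs="{high}"/>'


if __name__ == "__main__":
    print("<!ELEMENT list " + dtd_between("item", 7, 12) + ">")
    print(xsd_between("item", 7, 12))
```

The DTD declaration has to name `item` twelve times, while the schema particle states the bounds directly.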

• [March 05, 2001] "Information Modelling using RDF. Constructs for Modular Description of Complex Systems." By Graham Klyne (Content Security Group, Baltimore Technologies). 24 pages, with 25 references. "This paper describes some experimental work for modelling complex systems with RDF. Basic RDF represents information at a very fine level of granularity. The thrust of this work is to build higher-level constructs in RDF that allow complex systems to be modelled incrementally, without necessarily having full knowledge of the detailed ontological structure of the complete system description. The constructs used draw on two central ideas: statement sets as contexts (based in part on ideas of McCarthy [Notes on Formalizing Context] and Guha [Contexts: A Formalization and Some Applications]) to stand for a composition of individual RDF statements that can be used in certain circumstances as a statement, and a system of 'proper naming' that allows entity prototypes to be described in a frame-like fashion, but over a wider scope than is afforded by class- and instance- based mechanisms... Articulated visions for the Semantic Web require that anyone must be able to say anything about anything. It is unreasonable to expect everyone to adopt exactly the same ontological structure for making statements about an entity; apart from political and perceptual differences, that approach cannot scale. This leads to my assertion that practical modelling of complex systems requires statements that can stand independently of finer ontological details. This is not a dismissal of ontological structures; work on ontological frameworks such as OIL and DAML-O is needed to underpin verification of web-based information. In due course, I would expect a theory to emerge that relates descriptions based on incomplete ontologies to more rigorously complete frameworks.
I view basic RDF as a kind of 'assembly language' for information modelling, and see this use of contexts and proper naming as a parallel to procedures and formal parameters in programming languages, used to aid the construction of complex object descriptions without adding new formal capabilities. The constructs presented here are being used in the following ongoing experimental developments: (1) A graphical tool for RDF modelling. (2) An experimental RDF-driven expert shell. We also aim to develop mechanisms for trust modelling and inference; modelling social trust structures and overcoming the brittleness of purely cryptographically based approaches to trust in e-commerce, etc. Another area for investigation is the design of mechanisms for managing non-monotonic reasoning, and other logical extensions of contexts. In messages to the RDF interest group, Dan Brickley has proposed an alternative approach to labelling anonymous RDF resources; i.e., resources whose formal URI or URI reference is unknown. The outcome of these discussions may affect the exact form of naming preferred." Note from GK: 'I've finally got around to putting on the web a note I drafted some time ago describing some thoughts about using RDF and contexts for modelling complex objects and concepts. It draws heavily on my earlier note about contexts and RDF, and adds some other thoughts about modularizing complex RDF graphs.' - Posted to the RDF interest group. See "Resource Description Framework (RDF)."

• [March 03, 2001] "BEA's 'Silversword' To Offer XML Toolkits For Web-Services Creation. Next Version of WebLogic Server Will Have Full Support for SOAP, UDDI, WSDL." By Elizabeth Montalbano. In Computer Reseller News (February 26, 2001). "BEA Systems is expected to unveil a new version of its WebLogic Server in June that has full support for XML simple messaging and description standards. The product, code-named Silversword, will provide product-level toolkits for solution providers to create Web services with Simple Object Access Protocol (SOAP), Universal Description, Discovery and Integration (UDDI) and Web Services Description Language (WSDL), says John Kiger, director of product marketing for BEA's E-Commerce Server Division. Kiger says supporting these technologies for 'the building blocks of basic Web services' is just one part of the company's Web services strategy, which BEA Chairman and CEO Bill Coleman unveiled in his keynote Monday at BEA's eWorld conference held here. Similar to an announcement made by Sun Microsystems several weeks ago, Coleman outlined a Web-services strategy based on the Java 2, Enterprise Edition (J2EE) specification and XML standards such as SOAP (recently adopted as a subset of the ebXML initiative), WSDL and UDDI. A beta version of BEA's SOAP toolkit currently is available for its WebLogic Server. Kiger says that the other aspect of BEA's Web-services strategy will be to support XML-based standards for deploying more complex, transactional-based Web services, such as the ebXML standard. Collaborate already contains support for Business Transaction Protocol (BTP), a technology BEA created and submitted to the standards consortium OASIS as a possible standard to be used in conjunction with the ebXML standard. BTP provides a standard way to define guaranteed message delivery, security and the semantics of transactions for Web services.
Louise Smith, vice president of marketing for BEA's E-Commerce Integration Division, says that Collaborate also currently supports the RosettaNet standard for interfacing with trading partners in XML..." See the announcement: "BEA Unveils Comprehensive Web Services Strategy and Support for Widest Range of Web Services Standards in the Industry. Web Services Architecture of BEA WebLogic E-Business Platform Enables Real Business-to-Business Transactions and Collaboration over the Internet."

• [March 03, 2001] "Petition to withdraw xsl:script from XSLT 1.1." By Clark C. Evans, Peter Flynn, Alexey Gokhberg, et al. See the posting of 2001-03-01. "XSLT provides an extension mechanism whereby additional functionality can be identified with a URI reference and implemented in a manner defined by a particular XSLT processor. This mechanism provides an opaque layer between the extension function's usage and its implementation -- allowing for many implementations of an extension function regardless of language or platform. This extension facility provides a rich playground where new features can be prototyped and even put into production. However, to balance this much-needed flexibility, the syntax makes it clear that such added functionality is, in fact, an 'extension' and therefore may not be portable across XSLT implementations. Success of this extension mechanism has brought about request for change by several parties. One change is the official Java and Javascript extension function binding. Although this petition does not specifically challenge this addition, some question the wisdom of this decision. An official binding could encourage wholesale importation of constructs from Java and Javascript into XSLT without thought as to how those constructs would or should be supported in XSLT proper. A second change, the addition of xsl:script, is what we challenge with this petition. As users and implementers of XSLT, we request that the W3C withdraw section 14.4, Defining Extension Functions from the current XSLT 1.1 working draft for the following reasons [...]" Note: this petition created some controversy on the XSL-List.

• [March 03, 2001] "Ostensible Markup Language. Using OML to create a little language for device name characterization." By Rich Morin. In UnixInsider (March 2001). ['The Meta Project's file-tree browser is supposed to recognize path names and supply descriptive information, but in cases like /dev/*, this can be a real challenge. Using Perl and OML, an informal variant of XML, however, Rich Morin has pieced together a solution, and in this month's Silicon Carny, he shares it with you.'] "Extensible Markup Language (XML) is, like Java, a strongly hyped language. I have even seen it presented as the way to standardize all computer-to-computer communication. Bosh. Nonetheless, XML can be a very useful addition to your bag of programming tricks. In particular, there's an informal variant of XML that's a really handy way to encode control files, intermediate data, etc. Never one to resist a pun, I call this variant Ostensible Markup Language (OML). Hype, fancy tools, and standardization aside, XML is simply a convenient format for serializing data structures. It handles hierarchical structures with ease, can be coerced into handling cross-linkages, and is very friendly to the addition of new fields. In short, it solves many of the limitations found in the traditional Unix flat file format. OML, by my informal definition, looks enough like XML to pass muster with parsers such as XML::Simple, but it may not have Document Type Definitions (DTDs), style sheets, or other niceties. It may also contain things, such as Perl regular expressions, that aren't considered kosher by normal XML standards. OML is easy for programs to generate, reasonable for humans to read (and edit, if need be), and trivial for programs to ingest. If you don't find it to be all of these things, you're probably doing something wrong! I won't pretend that this is particularly elegant, but it gets the job done in a small and reasonably simple amount of code. 
Part of the reason for this brevity lies in the expressive power of Perl. The CGI script as a whole, moreover, benefits from the convenience of Perl's many handy modules. Another benefit comes from using OML as a tool to build a little language. By creating OML-based parsing macros, complete with embedded regular expressions, I was able to encode some fairly complex notions in a very compact, yet malleable, format. To see the demo in action, start at the Meta Demo Help Page."
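
The idea behind OML, a small XML-shaped control file that carries regular expressions for characterizing device names, can be sketched briefly. Morin's implementation uses Perl and XML::Simple; the version below swaps in Python's standard library instead, and the element and attribute names (`devices`, `device`, `pattern`, `desc`) are hypothetical rather than taken from the article. Unlike some OML, this sample stays well-formed XML.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical control file: each rule pairs a regular expression with a description.
RULES = """
<devices>
  <device pattern="^/dev/tty[0-9]+$" desc="virtual console"/>
  <device pattern="^/dev/sd[a-z][0-9]*$" desc="SCSI-style disk or partition"/>
  <device pattern="^/dev/null$" desc="data sink"/>
</devices>
"""


def load_rules(text):
    """Parse the control file into (compiled regex, description) pairs, in file order."""
    root = ET.fromstring(text)
    return [(re.compile(d.get("pattern")), d.get("desc")) for d in root.iter("device")]


def describe(path, rules):
    """Return the description of the first rule matching `path`, or None."""
    for regex, desc in rules:
        if regex.search(path):
            return desc
    return None


if __name__ == "__main__":
    rules = load_rules(RULES)
    for path in ("/dev/tty3", "/dev/sda1", "/dev/unknown"):
        print(path, "->", describe(path, rules))
```

Adding a new device family means adding one line to the control file, which reflects the "friendly to the addition of new fields" property the article credits to the format.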

• [March 03, 2001] "XML Ain't What It Used To Be." By Simon St. Laurent. From XML.com. February 28, 2001. ['Current XML development at the W3C threatens to obliterate the original promise of XML by piling on too many features and obscuring what XML does best.'] "Current XML development at the W3C threatens to obliterate the original promise of XML -- a clean, cheap format for sharing information -- by piling on too many features and obscuring what XML does best. While users may demand some of those features for some applications, features for some users are turning into nightmares for others. Rather than creating modules users can apply when appropriate, the W3C is growing a jungle of specifications which intertwine, overlap, and get in the way of implementors and users. Various W3C activities seem to be converting XML documents from labeled, structured content to labeled, structured, and typed content. The primary mechanism for performing this transformation is the W3C XML Schema Definition Language, the most complex and controversial of all of the XML specifications, and the only one that's generated credible competition hosted at other organizations (RELAX through ISO, TREX through OASIS). Widespread grumbling about W3C XML Schemas is a constant feature of the XML landscape, with no sign of fading... The release of the Requirements for both XSLT 2.0 and XPath 2.0 suggest that the W3C plans to drive W3C XML Schema technologies deeply into the rest of XML. The requirements describe operations which both require a "post-schema validation infoset" (PSVI) and depend on parts of the W3C XML Schema spec, like the regular expression syntax defined in Appendix E of XML Schema: Datatypes. This interweaving of specifications has a number of consequences. First, it raises the bar yet again for developers creating XML tools. While borrowing across specifications may reduce some duplication, it also requires developers to interpret tools in new contexts. 
(As the recent XPointer draft demonstrates, there can be unexpected consequences.) Developers with existing code bases now have to teach that code about complex types. Since none of these documents offer conformant subsets, they have to be swallowed in large chunks..."

• [March 03, 2001] "XML-Deviant: Does XML Query Reinvent the Wheel?" By Leigh Dodds. From XML.com. February 28, 2001. ['XML developers contend that the overlap between XML Query and XSLT is so great that they aren't separate languages at all.'] "Debates on the XML-DEV and XSL mailing lists over the last two weeks concern the futures of XSLT, XPath, and, the latest addition to the W3C XML toolkit, XML Query. There are no signs of these debates ending this week. Discussion on XML-DEV about the design of XML Query rages on... Both sides of the debate have made convincing arguments. It's obviously desirable to factor out common features between specifications, as Evan Lenz has suggested. But having multiple tools available when tackling a job is often beneficial, which suggests that XML Query should not be dismissed out of hand. Additional lessons may also be learned from tackling similar problems from a different perspective, although to benefit in the long-term, refactoring may still be required at a later date. The common topics in the recent discussions demonstrate that the community has a number of concerns. Hopefully these can be adequately addressed if the XML Query and XSLT Working Groups further coordinate their efforts. In reality, these concerns are over early draft specifications and experience has shown that significant revisions may occur to a specification as it moves from Working Draft to Recommendation."

• [March 02, 2001] "Representing vCard Objects in RDF/XML." By Renato Iannella (IPR Systems). A submission to the World Wide Web Consortium from IPR Systems Pty Ltd. Reference: W3C Note 22-February-2001. "This note specifies a Resource Description Framework (RDF) encoding of the vCard profile defined by RFC 2426 and to provide equivalent functionality to its standard format. The motivation is to enable the common and consistent description of persons (using the existing semantics of vCard) and to encode these in RDF/XML." Details: "This note specifies a Resource Description Framework (RDF) expression that corresponds to the vCard electronic business card profile defined by RFC 2426. This specification provides equivalent functionality to the standard format defined by VCARD Version 3.0. RDF is an application of the Extensible Markup Language. Documents structured in accordance with this RDF/XML encoding may also be known as 'RDF vCard' documents. This specification is in no way intended to create a separate definition for the vCard schema. The sole purpose for this note is to define an alternative RDF/XML encoding for the format defined by VCARD. The RDF vCard does not introduce any capability not expressible in the format defined by VCARD. However, an attempt has been made to leverage the capabilities of the XML and RDF syntax to better articulate the original intent of the vCard authors. RDF uses the XML Namespace to uniquely identify the metadata schema and version. For vCard, the following URI is defined to be vCard Namespace: http://www.w3.org/2001/vcard-rdf/3.0#." [Staff comment]: "The Submission relates to the following W3C Activities: Semantic Web: (1) In the RDF Interest Group, which tracks RDF experience, applications, and deployment. (2) In the RDF Core WG, which is responsible for addressing open issues and is chartered to consider an update to the RDF Model and Syntax Recommendation. 
The submission will be brought to the attention of the RDF Interest Group." See the Submission Request and W3C Staff Comment. See also "vCard Electronic Business Card."
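
As a rough illustration of what an 'RDF vCard' document might look like, the following Python sketch builds a one-property description under the vCard namespace URI quoted above. The `FN` property name is assumed from the vCard formatted-name type, and the function name, subject URI, and person are hypothetical; the W3C Note itself is the authority on the actual property set.

```python
import xml.etree.ElementTree as ET

# Namespace URIs: the RDF syntax namespace and the vCard namespace quoted in the Note.
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
VCARD_NS = "http://www.w3.org/2001/vcard-rdf/3.0#"


def build_rdf_vcard(about: str, formatted_name: str) -> str:
    """Return a minimal RDF/XML document describing one person.

    Hypothetical helper: only the formatted name (assumed property 'FN')
    is encoded; a real RDF vCard would carry many more properties.
    """
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("vCard", VCARD_NS)

    root = ET.Element(f"{{{RDF_NS}}}RDF")
    description = ET.SubElement(
        root,
        f"{{{RDF_NS}}}Description",
        {f"{{{RDF_NS}}}about": about},
    )
    fn = ET.SubElement(description, f"{{{VCARD_NS}}}FN")
    fn.text = formatted_name
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # The subject URI and the name are made up for the example.
    print(build_rdf_vcard("http://example.org/staff/jane", "Jane Example"))
```

The namespace prefix is only a serialization detail; consumers are expected to match on the namespace URI.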

• [March 02, 2001] "Synchronized Multimedia Integration Language (SMIL 2.0) Specification." W3C Working Draft 01-March-2001. Edited by Jeff Ayars (RealNetworks); Dick Bulterman (Oratrix); Aaron Cohen (Intel); Ken Day (Macromedia) et al. Latest version URL: http://www.w3.org/TR/smil20. "This document specifies the second version of the Synchronized Multimedia Integration Language (SMIL, pronounced 'smile'). SMIL 2.0 has the following two design goals: (1) Define an XML-based language that allows authors to write interactive multimedia presentations. Using SMIL 2.0, an author can describe the temporal behavior of a multimedia presentation, associate hyperlinks with media objects and describe the layout of the presentation on a screen. (2) Allow reusing of SMIL syntax and semantics in other XML-based languages, in particular those who need to represent timing and synchronization. For example, SMIL 2.0 components are used for integrating timing into XHTML and into SVG." "SMIL 2.0 is defined as a set of markup modules, which define the semantics and an XML syntax for certain areas of SMIL functionality. SMIL 2.0 deprecates a small amount of SMIL 1.0 syntax in favor of more DOM friendly syntax. Most notable is the change from hyphenated attribute names to mixed case (camel case) attribute names, e.g., clipBegin is introduced in favor of clip-begin. The SMIL 2.0 modules do not require support for these SMIL 1.0 attributes so that integration applications are not burdened with them. SMIL document players, those applications that support playback of "application/smil" documents (or however we denote SMIL documents vs. integration documents) must support the deprecated SMIL 1.0 attribute names as well as the new SMIL 2.0 names." [cache]
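
The attribute renaming described in this entry (clip-begin becoming clipBegin) can be sketched in Python as below. This is only an illustration: the mechanical hyphen-to-mixed-case rule, the function names, and the choice to let the new-style name win when both forms appear are assumptions, and the SMIL 2.0 draft remains the authority on which attributes are actually deprecated.

```python
import re


def hyphen_to_camel(name: str) -> str:
    """Map a hyphenated attribute name to mixed case, e.g. clip-begin -> clipBegin."""
    return re.sub(r"-([a-z])", lambda match: match.group(1).upper(), name)


def normalize_attributes(attrs: dict) -> dict:
    """Return a copy of attrs keyed by mixed-case names.

    Assumption for this sketch: when both the hyphenated and the
    mixed-case form are present, the mixed-case value is kept.
    """
    normalized = {}
    # First pass: hyphenated names, converted to their mixed-case form.
    for name, value in attrs.items():
        if "-" in name:
            normalized[hyphen_to_camel(name)] = value
    # Second pass: names without hyphens override any converted value.
    for name, value in attrs.items():
        if "-" not in name:
            normalized[name] = value
    return normalized


if __name__ == "__main__":
    print(normalize_attributes({"src": "clip.mpg", "clip-begin": "2s"}))
```

A player that must accept both spellings could run incoming attributes through such a normalization step before interpreting them.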

• [March 02, 2001] "Microsoft Releases XML Kit, Specification." By Margret Johnston. In InfoWorld (March 02, 2001). "Microsoft on Friday released a beta version of its XML for Analysis software development kit and an updated XML for Analysis protocol specification, giving developers tools needed to write XML-based applications aimed at spurring the deployment of sophisticated analytical databases across multiple platforms. XML for Analysis is a new online analytical processing protocol that enables the transfer of information between analytical databases and client applications, regardless of the language used to write the application, Microsoft said in a release. It leverages not only the open Internet standard XML but also SOAP (Simple Object Access Protocol) and HTTP. The new protocol is designed to standardize the data access interaction between a client application and an analytical data provider such as OLAP (online analytical processing) and data mining. More than 50 industry-leading vendors contributed to XML for Analysis, which Microsoft described as a vendor- and platform-independent extension to its OLEDB (object linking and embedding database) for OLAP and OLEDB for Data Mining protocols. With the release of XML for Analysis, developers are able to add analytic capabilities to any client for any device or platform using any major programming language, Microsoft said." See the announcement: "Microsoft Delivers First XML-Based Protocol for Cross-Platform Analytics."

• [March 02, 2001] "Mapping the XTM Syntax to the XTM Conceptual Model." By Daniel Rivers-Moore. Posted to the XTM mailing list. 2001-03-02. "Attached is my work in progress towards a formal expression (in UML) of the mapping from the XTM Conceptual Model to the XTM Interchange Syntax. This is intended as a suggestion of an approach and a start towards a mapping, not as a completed piece of work... The diagrams used in this section are 'class diagrams', using the conventions of the Unified Modelling Language (UML). In a class diagram, each rectangle represents a class of objects (a kind of thing that can exist), and the words in the rectangle are the name of that class. The lines and arrows between the rectangles represent relationships that exist or can exist between instances of those classes (individual things of those kinds). In an object diagram, each rectangle represents an individual object, and the words in the rectangle are the name of the individual, followed by a colon, followed by the name of the class of which it is an instance. The lines between the rectangles represent relationships that exist between those individual objects..." See (1) TopicMaps.Org, (2) XTM Document Web site, and (3) "(XML) Topic Maps."

• [March 02, 2001] "EXSLT 1.0 - Common, Sets and Math." Posting from Jeni Tennison to the XSL-List. March 02, 2001. "Thanks to those of you that commented on the last EXSLT draft. I've put up a new draft for user-defined functions and a couple of handy extension functions at: http://www.jenitennison.com/xslt/exslt/common/. There's a list of changes to the last draft there, but also of interest is that I've created a couple more documents at: http://www.jenitennison.com/xslt/exslt/sets/ and http://www.jenitennison.com/xslt/exslt/math/ that hold some extension functions. These are intended to be a starting point for a number of groups of standard (built-in) functions. The most important issues for developing these functions are (a) whether there are other sets of functions that we should define and (b) what functions we should have in them. These documents are just a starting point - please post any comments and suggestions here..." For related resources, see "Extensible Stylesheet Language (XSL/XSLT)."

### February 2001

• [February 28, 2001] "The Upper Cyc Ontology in XTM." Edited by Murray Altheim (Sun Microsystems). Reference: Sun Microsystems Technical Report 27-February-2001. "This Technical Report documents research and development of an XML Topic Map (XTM) representation of the Upper Cyc Ontology, including a distribution of five XTM topic maps based on features of the ontology. This Technical Report plus any associated software and/or documentation may be submitted to TopicMaps.Org with the goal of promoting XML Topic Maps (XTM) as a suitable ontological framework, as well as a source of XTM Published Subject Indicators (PSIs). This Technical Report is a Sun Microsystems Working Draft, intended for review and comment by interested parties. It is a very preliminary 'work in progress' release, currently has no formal status, and its publication should not be construed as endorsement by either Sun Microsystems, Inc. or any other body." See the discussion and "(XML) Topic Maps."

• [February 28, 2001] Geography Markup Language (GML) 2.0. Edited by Simon Cox (CSIRO Exploration & Mining), Adrian Cuthbert (SpotOn MOBILE), Ron Lake (Galdos Systems, Inc.), and Richard Martell (Galdos Systems, Inc.). OGC Document Number: 01-029. February 20, 2001. Also in PDF and in .ZIP format. The Open GIS Consortium, supporting Geospatial and Information Technology Industries with open standards specifications, has now released Geography Markup Language (GML) 2.0 with a complete W3C XML Schema notation. Abstract: "The Geography Markup Language (GML) is an XML encoding for the transport and storage of geographic information, including both the spatial and non-spatial properties of geographic features. This specification defines the XML Schema syntax, mechanisms, and conventions that (1) Provide an open, vendor-neutral framework for the definition of geospatial application schemas and objects; (2) Allow profiles that support proper subsets of GML framework descriptive capabilities; (3) Support the description of geospatial application schemas for specialized domains and information communities; (4) Enable the creation and maintenance of linked geographic application schemas and datasets; (5) Support the storage and transport of application schemas and data sets; (6) Increase the ability of organizations to share geographic application schemas and the information they describe. Implementers may decide to store geographic application schemas and information in GML, or they may decide to convert from some other storage format on demand and use GML only for schema and data transport." See "Geography Markup Language (GML)." [cache HTML/ZIP; cache, PDF]

• [February 28, 2001] "XMML: Standards-Compliant Transport Of Geoscientific Data Online In The Exploration And Mining Industry." By CSIRO Exploration and Mining (Dr Simon Cox, PO Box 437, Nedlands WA 6009) and Fractal Graphics p/l (Dr Nick Archibald). 25 pages. 2000-02-22. "We propose to develop the eXploration and Mining Markup Language XMML, a web-compatible XML based exploration and mining data transfer format. This will use a sophisticated geology domain model built on the ISO geographic standards, OpenGIS Consortium implementations, and World Wide Web Consortium encoding recommendations. Because the geology model is built merely as a 'schema' on top of a generic geospatial infrastructure, it will be compatible with both generic (e.g., GIS, CAD, DBMS, spreadsheet, web-browser) and specialised (geology modelling, mechanics and fluid-flow, resource estimation, mine-planning etc) software for analysis, modelling, visualisation and transfer. The system will be capable of describing rich 3D geology, including boreholes, geophysics and analytical data, so that data can easily be exchanged between software applications, between offices, and between explorers, contractors, data-managers and regulators on a transactional basis. The self-describing plain-text form of XML documents also makes them ideal for archival purposes, overcoming the problem of loss of data because of software incompatibilities." See: "Exploration and Mining Markup Language (XMML)." [cache]

• [February 27, 2001] "Working with XML: The Java API for XML Parsing (JAXP) Tutorial." By Eric Armstrong. [Updated: "Remember that all the package names have changed! So none of the examples will work, for the moment. However, most of the information is still applicable."] "This tutorial covers the following topics: (1) Part I: Understanding XML and the Java XML APIs explains the basics of XML and gives you a guide to the acronyms associated with it. It also provides an overview of the Java XML APIs you can use to manipulate XML-based data. To focus on XML with a minimum of programming, follow The XML Thread, below. (2) Part II: Serial Access with the Simple API for XML (SAX) tells you how to read an XML file sequentially, and walks you through the callbacks the parser makes to event-handling methods you supply. (3) Part III: XML and the Document Object Model (DOM) explains the structure of DOM, shows how to use it in a JTree, and shows how to create a hierarchy of objects from an XML document so you can randomly access it and modify its contents. This is also the API you use to write an XML file after creating a tree of objects in memory. (4) Additional Information contains a description of the character encoding schemes used in the Java platform and pointers to any other information that is relevant to, but outside the scope of, this tutorial..." See "Java API for XML Parsing (JAXP)."
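
The contrast the tutorial draws between serial access (SAX callbacks) and tree access (DOM) can be illustrated with a small sketch. It uses Python's standard `xml.sax` and `xml.dom.minidom` modules as stand-ins, not the JAXP API the tutorial teaches; the sample document, the `item` element, and all function names are invented for the example.

```python
import xml.sax
from xml.dom import minidom

# Made-up sample document for the sketch.
DOC = "<order><item>pen</item><item>ink</item></order>"


class ItemHandler(xml.sax.ContentHandler):
    """SAX style: the parser calls these methods as it reads the input in order."""

    def __init__(self):
        super().__init__()
        self.items = []
        self._in_item = False
        self._text = []

    def startElement(self, name, attrs):
        if name == "item":
            self._in_item = True
            self._text = []

    def characters(self, content):
        # Text may arrive in several chunks, so collect the pieces.
        if self._in_item:
            self._text.append(content)

    def endElement(self, name):
        if name == "item":
            self.items.append("".join(self._text))
            self._in_item = False


def items_via_sax(text: str) -> list:
    """Collect item texts through sequential callbacks."""
    handler = ItemHandler()
    xml.sax.parseString(text.encode("utf-8"), handler)
    return handler.items


def items_via_dom(text: str) -> list:
    """DOM style: build the whole tree first, then access nodes in any order."""
    dom = minidom.parseString(text)
    return [node.firstChild.data for node in dom.getElementsByTagName("item")]


def add_item_and_serialize(text: str, value: str) -> str:
    """Modify the in-memory tree and write it back out as XML."""
    dom = minidom.parseString(text)
    new_item = dom.createElement("item")
    new_item.appendChild(dom.createTextNode(value))
    dom.documentElement.appendChild(new_item)
    return dom.documentElement.toxml()


if __name__ == "__main__":
    print(items_via_sax(DOC))
    print(items_via_dom(DOC))
    print(add_item_and_serialize(DOC, "nib"))
```

The SAX handler never holds the whole document in memory, while the DOM functions can revisit and modify any node, which is the trade-off Parts II and III of the tutorial describe.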

• [February 27, 2001] "IBM Beefs Up Content Manager." By Barbara Darrow. In InternetWeek (February 26, 2001). "Managing information in its myriad forms has become a huge business -- estimated to hit the $10 billion mark by 2004, according to one researcher. With enhancements to its Content Manager software, IBM Corp. said it handles more information types than anyone else. Content Manager Version 7.1 adds new XML interfaces, the ability to handle Xerox metacode format, and integrates tightly with Siebel Systems' Call Center application, the company said. It also supports MPEG-2, Hot Media, and QuickTime streaming formats... The new software, available for Windows NT and Windows 2000, as well as AIX, will be unveiled Monday at IBM's Partnerworld Conference in Atlanta. Analysts said IBM has done a good job fleshing out its offering, and ensuring that it will interoperate with various third-party products from Vignette, Documentum, and Interwoven. Still, reliance on multiple vendors to fill an application void makes some corporations nervous. The proliferation of data types and distribution vehicles -- print, web etc. has made management increasingly complex. Currently, IBM partners with a variety of vendors to fill gaps in its own lineup. Wittle said the goal of Content Manager is to work well with a bevy of third party offerings. Pricing for Content Manager Version 7.1 starts at $15,000 per server plus $2,000 per concurrent user..."

• [February 27, 2001] "User-Defined Extension Functions in XSLT." By Jeni Tennison. February, 2001. [A draft document that summarises recent public discussions on user-defined extension functions written in XSLT; informed and inspired by discussions on XSL-List with David Carlisle, Joe English, Clark C. Evans, Dave Gomboc, Yevgeniy (Eugene) Kaganovich, Mike Kay, Steve Muench, Miloslav Nic, Francis Norton, Dimitre Novatchev, Uche Ogbuji, and David Rosenborg.] "This document describes a method for defining user extension functions using XSLT in XSLT 1.0. XPath contains a number of functions that allow you to perform manipulation of strings, numbers, node sets and so on. While these cover basic functionality, there are often situations where stylesheet authors either need to or want to do more within an XPath. Most XSLT applications offer a range of extension functions. However, using only implementation's extension functions limits the stylesheet author to those thought of and implemented by a particular vendor. It also means that the stylesheet itself is limited to that vendor. Allowing users to define their own extension functions enables them to create the functions that they need for their particular application and enhances the portability of their stylesheets. Stylesheet authors need to have a way of defining their own functions. These definitions may be in any programming language, but it is likely that different XSLT processors will support different languages. The one language that all XSLT processors support is XSLT. It therefore makes sense to allow stylesheet authors to define extension functions using XSLT - the implementation may not be as efficient as it would be in, say, Java, but at least it can be supported across platforms and implementations, and limits the number of languages that stylesheet authors have to learn... This document is a first draft for review by the implementers of XSLT processors and the XSLT stylesheet authors. It is based on discussions on XSL-List. Comments on this document should be sent to XSL-List..." For related resources, see "Extensible Stylesheet Language (XSL/XSLT)." [cache]

• [February 27, 2001] "FXPath - Functional XPath." By David Rosenborg (Pantor Engineering AB). February 27, 2001. ['A comment on the document "User-Defined Extension Functions in XSLT" (called EXSL here), written by Jeni Tennison.'] "The purpose of this document is to outline an alternative approach to writing extension functions in XSLT. The EXSL document and this document result from a recent discussion on the XSL mailing list (xsl-list@lists.mulberrytech.com). The EXSL document is in large an excellent compilation and presentation of the ideas and issues discussed on the XSL list. However, the EXSL document presents one of two rather different approaches on how to implement the extension functions. This document tries to present the other. The EXSL approach is to retrofit some XSLT instructions so that they can deal with all types in XPath, notably node sets. This document wants to show that there is a more natural way to accomplish the same result: write extension functions in XPath to deal with XPath types. Since XPath 1.0 lacks some vital constructs to do this, this document presents a superset, called Functional XPath (FXPath), that makes this possible in a convenient way. This document has a much narrower scope than the EXSL specification. It is concentrated around how to actually define the extension functions. The issues on calling functions, defining sets of common extension functions etc are well covered in EXSL and are not handled here. However, the set of example functions are reimplemented here, in FXPath, to enable a side by side comparison..."

• [February 27, 2001] "The Relentless March of Computer Abstraction." By Frank Willison (O'Reilly editor in chief). From the O'Reilly Network. February 23, 2001. "I spent February 21-23 at XML DevCon in London. It's been a great opportunity for me to immerse myself in developments of this exciting technology. The two major directions of XML, based on what I've learned at this conference, seem to be: (1) Abstraction (2) Metadata. I'll write a section on each of these big ideas. Then I'll have comments on some interesting ideas or controversies that came out of the sessions I attended. 
Then I'll explain to you how, by attending XML DevCon, I've learned how the world is going to end..."

• [February 26, 2001] "Budding B2B Standard Faces Big Problems. [CPExchange] Specification for Sharing Consumer Data Has No Users, Faces FTC Review." By Patrick Thibodeau. In ComputerWorld (February 12, 2001). "A data standard created to act as a high-tech lubricant for the exchange of customer information is facing problems, including a just-announced review by the Federal Trade Commission (FTC) and, perhaps more important, a lack of big end-user acceptance so far. The Customer Profile Exchange standard, or CPExchange, offers companies a way around numerous data types and the custom-designed interfaces needed to translate them. If the standard doesn't take off, the process may not improve, proponents say. 'At this point, we do data exchanges that are disastrous. Everybody speaks a different language, everybody has ways of pushing information -- from text files to XML. It is very, very nasty,' said Henri Asseily, chief technology officer at Los Angeles-based BizRate.com, a company that provides customer-generated ratings of e-commerce sites and one of 70 companies that is a member of the CPExchange Network. The first version of CPExchange was published in October, but so far, no company has adopted it. Most of its backers are vendors, with IBM being the largest. Only a few major end-user companies were involved in the standard's development, and two of those companies have apparently distanced themselves from this effort: First Union Corp. in Charlotte, N.C., and Charles Schwab & Co. in San Francisco. Both companies say they have no plans to implement the standard. Asseily said he believes the standard can solve the data exchange problems, but the 127-page specification is 'so complicated that it's very, very difficult for companies to make heads or tails of it.' 
The FTC announced earlier this month that it will hold a workshop on March 13 on the potential data privacy issues raised by company-to-company exchanges of customer information, prompted in part by a letter from Sen. Richard C. Shelby (R-Ala.). Shelby claims that the CPExchange technology gives companies a vastly improved ability to share and exploit personal information in pursuit of profit.'... A major selling point for proponents of the CPExchange is the standard's ability to incorporate an individual's privacy preferences. For instance, a company that needs to transmit consumer data to a supplier could attach privacy restrictions that set limits on the use of the data, such as third-party sharing." See "CPExchange Network."

• [February 26, 2001] "Where XML Specifications are Clicking." By Valerie Rice. In eWEEK (February 25, 2001). "If you like an argument, you'll love XML. Extensible Markup Language is touted as the ultimate solution to every industry's dreams of business-to-business process efficiency. But first, of course, there has to be agreement on the schemata and tags that go into industry-specific XML standards... Some XML initiatives in industries such as insurance and health care were able to get there quickly because they'd already shed meeting-room blood over EDI (electronic data interchange) standards. Other initiatives -- such as RosettaNet, the high-tech industry's XML standard for manufacturing -- started early and were given a boost by the sudden rise of e-commerce. Here are a few vertical industry initiatives that have made remarkable progress and the lessons they can teach the rest: (1) Health care: health care companies early on recognized the B2B benefits that XML could bring, said Liora Alschuler, an independent XML consultant who works with HL7, the health care standards body. 
HL7 is made up of health care providers, vendors, consultants and other groups with an interest in using XML to share clinical, patient, financial and other information online. HL7's first area of focus was what Alschuler called the 'XMLification of EDI'... (2) Financial services: Among the industries driving XML standards, financial companies -- banks, credit card companies and so forth -- are making a lot of progress, said Wes Rishel, an analyst at Gartner Group Inc., in Stamford, Conn. Unlike in other industries, however, not all the work is being done by standards organizations. Vendors such as VeriSign Inc., of Mountain View, Calif., and large institutions such as Visa International are playing active roles in driving the standards. That could be one of the reasons that the financial services industry's standard for digital signatures, the so-called XKMS -- for XML Key Management Specification -- was pulled together so quickly... (3) Insurance: In October 1998, the insurance industry started a project to create XML for life insurance information and processes. The industry followed it nine months later with a property and casualty effort. And so far it's all been fairly quick and painless... The ACORD XML spec for life insurance and the separate spec for property and casualty allow insurance companies to handle inquiries and quotes, as well as submit new business and process claims online... (4) High-tech manufacturing: RosettaNet is arguably the mother of all industry-specific XML efforts, with 300 consortium members and standards covering the gamut of products, from raw materials to electronic components. All told, the RosettaNet initiative identified more than 100 processes and created standard XML definitions and tags to support them. Standards exist today covering everything from inventory management to order management and design wins..." See also Chart: The Case for XML.

• [February 26, 2001] "Why 90 Percent of XML Standards Will Fail." By John R. 
Rymer (President and founder of Upstream Consulting). In eWEEK (February 26, 2001). "Those who are making XML standards are reliving the mistakes of past standards bodies. I can see what's coming and it is a whole lot less than any of us would like or need. I think 90 percent of the current activities will not produce meaningful technology. In my view, that's failure. Pardon my skepticism, but I've lived through too many can't-miss, can't-live-without-it standards efforts. There was the gargantuan effort to create an alternative to TCP/IP by the International Standards Organization (ISO), the tortured efforts to standardize the Unix operating system, the Open Software Foundation's DCE debacle, and the gun-to-the-head tactics of the Object Management Group (OMG). Of these, only the OMG's CORBA can be called a commercial success. Each of these efforts suffered from one or two mistakes that doomed it to failure... Mistake #1: Nonalignment; Mistake #2: Over-promise; Mistake #3: Overdo it; Mistake #4: Overreach... Pardon me for being cranky about this, but the net effect of XML standards has been to slow adoption of XML products and technology. There's too much noise, too much hype, too many promises--too much risk. Shouldn't we know better by now?"

• [February 24, 2001] "Interfacing With XML." By [Editor] Ajit Sagar (VerticalNet Solutions). In XML-Journal (March 2001). "A couple of weeks ago I participated in several technical meetings to define the next phase of the architecture of our current products. As usual, any initiatives for a new architecture include requirement considerations for open APIs, platform independence, and loose coupling between components as the basic criteria for the design of the platform components. Our architecture is based on J2EE and XML. The APIs that are exposed by the infrastructure can be categorized into the programmatic APIs that are exposed through object methods and structural APIs. 
J2EE offers the available programmatic (method-call based) APIs as a programmatic interface. XML offers an effective way of exposing structural APIs. It also provides an elegant mechanism for achieving configuration for the deployment of applications. XML offers an effective way of bridging data transfers between decoupled components or applications. The hierarchical nature of XML documents allows for the exchange of data, which retains object-style relationships such as aggregation and inheritance. Subsequently, XML offers the ability to create flexible and extensible APIs that enable applications to expose their functional capabilities. At the same time, XSLT and XPATH provide processing capabilities such as a search based on pattern matching, and the ability to match data based on matching algorithms. These functional capabilities manifest themselves as XML-based APIs that are universally understood by disparate applications. After all, XML expresses data in a string format, which is human-readable..."

• [February 24, 2001] "Microsoft Commits To XML." By Wei Meng Lee (The Centre for Computer Studies, Ngee Ann Polytechnic, Singapore). In XML-Journal (March 2001). "Beginning with Internet Explorer 4.0 (IE 4), Microsoft has provided users of its operating system with a unique way of viewing XML documents. If you're running IE5, you already have the Microsoft XML parser (MSXML) installed. The MSXML parser has come a long way, beginning with version 2.0 (IE5) up to the latest version, 3.0. Depending on the software and operating system, you most likely have MSXML 2.0 (IE5) or 2.5 (Windows 2000) on your system. Since January 2000, Microsoft has showed its commitment to XML by releasing a new XML parser every other month (preview release). That early release was version 2.6, renamed version 3.0 last March. Each preview release contains improvements in performance as well as support for the W3C XML 1.0 specifications. 
The long-awaited production version of MSXML 3.0 was finally released last November. In this article I'll discuss some of its features, and, specifically, show you how to get started using it. In subsequent installments I'll go into greater detail on each of its components. This article covers the following: (1) Installing MSXML 3.0 on your system; (2) Using XSLT and XPath; (3) Using the Internet Explorer tools to validate XML documents... Conclusion: With MSXML 3.0, Microsoft has once again proved its commitment to the XML technologies. This isn't surprising since XML is the foundation technology for many of Microsoft's future products. In this article I've tried to avoid bogging you down with all the technical jargon related to MSXML. In a forthcoming article I'll describe how the MSXML DOM can be manipulated programmatically. I'll rewrite the XSLT stylesheet with ASP and DOM and show you how they can be used to achieve the same purpose."

• [February 24, 2001] "Designing An Attribute Search Engine For B2B Negotiations." By Stephen Rao and Mary Xing. In XML-Journal (March 2001). "More and more companies are building B2B systems to conduct business on the Internet. These systems are different from catalog-based B2C Web sites. Among other things, B2B systems usually need to provide stronger negotiating capabilities. XML documents are flexible and self-explanatory and are now the preferred solution for B2B information exchanges. A good application of the technology is to use XML files as workflow documents to convey the attributes in trading negotiations. We found that while XML documents are flexible and easy to understand, searching information from such plain text files is difficult. A B2B trading system needs both flexible negotiations and convenient query capabilities. We designed an Attribute Search Engine (ASE) for XML trading negotiations using Java EJBs and a relational database (RDBMS). 
It's based on the generic concepts we distilled from the use of attributes and enables powerful data searches for XML attribute documents. The resultant system has the strength of both sides and delivers the functions needed in a practical B2B system. The attribute concepts we established are generic. The engine can be applied many other places...In e-commerce negotiations the parties need the flexibility to use various attributes in a workflow document to describe their commodities and terms. We conclude that implementing those documents with XML text files is better than the conventional RDBMS tables. First, relational tables have a finite number of predefined data entities. In negotiations many new things may come up dynamically. It's impractical to predefine them. Second, relational tables usually require high data integrity for transactions as they tend to have tight data constraints among one another. Negotiating documents should be loosely composed with fewer data restrictions. Third, with normalized data, the content of one document is often scattered in a number of tables in RDBMS. The danger of unintentionally changing historic data is greater. XML is effective in modeling document-oriented trading processes because the negotiating parties can add conditions/attributes at will to the same self-explanatory document with greater flexibility. Buyers and suppliers may pass around an XML document in a negotiation until a deal is reached. After the deal they can simply archive the XML as a single file, hence the information is retained independent of other variables... Conclusion Searching and analyzing B2B trading details are important to a successful e-commerce provider and its users. While XML offers users flexible attribute negotiations, an ASE makes information search and analysis easy. They work together nicely to provide the capabilities desired in a practical B2B trading system." 
• [February 24, 2001] "Converting Your Client/Server Applications To The Internet." By Victor Rasputnis (CTI) and Anatole Tartakovsky (CTI). In XML-Journal (March 2001). "IT projects closely follow the path of technology. For example, the number of Java/XML/HTML projects is increasing, replacing PowerBuilder or VisualBasic systems developed just a few years ago. And developers are asking themselves the question: Do I have to write the same app from scratch? Again? Integrating existing systems developed in previous millennium environments with the Internet is a costly and difficult task. The other approach would be to "convert" existing applications to native Internet technologies. Sound complex? It's not. Legacy systems contain rich metadata, although in a proprietary format such as PowerBuilder's PBL or VisualBasic's FRM files. All graphic controls - list boxes, buttons, labels, and so on - show up in the metadata with all their positions, sizes, colors, and other attributes. Database queries allow reconstruction of the original SQL select statements or stored procedure calls. Code scripting of the events is also available. Suppose we learn how to read the metadata and put it into XML format. What can we do with it? We can generate systems for the Internet by automatically converting existing legacy code. In this article we outline the design of the 'magic wand' that converts client/server programs into a Java/XML/XSL solution. In particular, we'll demonstrate how you can leverage investments in all your PowerBuilder DataWindows, migrating them to sites residing on J2EE application servers... We demonstrated the working approach to migrating client/server applications into the J2EE environment. It enables the automatic magic-wand conversion of databound legacy control to cutting-edge Internet technologies. The cornerstone of the solution - code generation from XML metadata - extends it far beyond the conversion process.
Indeed, where the metadata is coming from is irrelevant. Combined with a proper graphic design tool, this solution may become a full-scale IDE for creating Internet applications. This proposed approach puts to work the Model/View/Controller paradigm, enforcing strict separation of the data model (XML) from presentation (XSL). In our opinion that alone should bring developers' productivity back to the level of RAD tools. In addition, the XSL-based approach to code generation provides limitless possibilities for end-user customization. The authors maintain a free online implementation of the conversion software at www.xmlsp.com." • [February 23, 2001] "The XML Meta-Architecture [and What the XML Application Interface Looks Like]." Presentation slides. By Henry S. Thompson (HCRC Language Technology Group University of Edinburgh; World Wide Web Consortium). Keynote presentation at XML DevCon Europe in London, England. February 21, 2001. Conclusion: "(1) Think about things in terms of Infosets and Infoset pipelines: Modular, Powerful, Scalable. (2) Use XML Schema and its type system to facilitate mapping: Unmarshalling is easy; Marshalling takes a little longer." Abstract: "The XML technology core has grown rapidly since the announcement of the XML 1.0 itself just over three years ago. First there was XML Namespaces, then DOM and XPath and XSLT, now XLink/X-Pointer/XBase, XML Infoset and XML Schema are (nearly) here, before long we'll have XSL-FO, XML Query and XML Protocols. I believe that XML Infoset is fundamental to understanding the relationship between all these parts. In this talk I'll present my take on an emerging perspective on the meta-architecture of XML, where each XML technology can be understood as defining a class of infoset transducers. On this account an XML application is a pipeline of infoset processing composed of such transducers, for example parser->schema processor->linker->schema processor->query processor.
I'll suggest how I think this vision will impact on standards development, and conclude by looking at the XML/Application interface from this perspective." Also online: the Technetcast from DDJ. [cache] • [February 23, 2001] IAS XBRL Taxonomy Draft. Presented by David Prather. February 20, 2001. Context: At the first global meeting of XBRL.org in London, the XBRL member organization International Accounting Standards Committee (IASC) announced a "draft taxonomy of XBRL for Financial Statements to members of XBRL.org for review." The IASC taxonomy is an XML-based specification for the 'Commercial and Industrial' sector that allows users and suppliers of financial information to exchange financial statements across all software and technologies, including the Internet. The draft/beta taxonomy is available as an XML schema and in Microsoft Access database format. Some details are documented in a Meeting Presentation: "IAS-XBRL PowerPoint presentation from the February 20, 2001 International XBRL meeting in London." By Kurt Ramin (IASC, London), Ian Wright (PricewaterhouseCoopers, London), David Prather (IASC, London), and Bruce Mackenzie (Deloitte & Touche, London). The IAS XBRL Taxonomy Draft was presented by David Prather (IAS XBRL Project Manager, IASC). "IAS XBRL Approach: October [yielded] a way forwards: produce 'trees' to identify the elements; work shared by all of Big 5. In November [PIs] agreed to key principles: elements for items in IAS standards; structure should be based on the minimum formats in IAS 1 (B/S +I/S), IAS 7; general ledger closing balances [B/S items in balance sheet section etc]; all cash movements in cash flow section; all other items in the notes; detail as required or recommended by IAS cross references to IAS paragraphs; used trees to confirm complete.
Expected key benefits: IAS users are familiar with IAS standards; IAS standards define or explain many of the elements; IAS is translated into 13 languages; elements are directly linked to paragraphs in standard so assist users to use the correct item..." See details. • [February 23, 2001] "Content Management Moves Ahead." By Stephanie Sanborn. In InfoWorld Volume 23, Issue 8 (February 19, 2001), page 38. "XML, the Internet, and global collaboration are all changing the still-evolving industry Content Management's roots may lie in document management, but its future will likely lie on the Web and beyond as its evolution pushes the concept of what content is and how it can be used for e-business. The Web gave content management and the life cycle of content itself a boost as companies began to realize that although running business on the Web has many benefits, it also requires making content useful and relevant online. Companies are finding a need to collaborate around content, and that often means bringing together users and content from different parts of the globe... As you get much richer in your applications and provide more content, more inventory, and a broader set of services to a broader set of people you can reach through the Web, the whole problem of managing that content becomes much greater because you have much more of it and you need to describe it much more effectively, explains Robert Perry, a senior analyst at The Yankee Group in Boston... NextPage plans to capitalize on distributed content by adding peer-to-peer technology from its acquisition of netLens to content management. Incorporating netLens' Peer Space product into NextPage's NXT3 content platform products will create a 'virtual space where all connections are able to be established and the communication can happen,' along with alerts to notify users when changes are made to content they are interested in, says Darren Lee, NextPage's vice president of marketing and product strategy. 
'I need a way to connect repositories together and provide integrated access for an end-user across all of that information, not just giving them a view to the Web site. Inherent in that is that [the content] is distributed,' Lee says. 'And therefore p-to-p as an architecture is a perfect fit. It's more about information finding you than you finding information.' Another technology sure to play a big role in the future of content management is XML, which 'is starting to become the lingua franca of business, and it's starting to become customized based on the industry you're in,' Zarghamee says, noting that XML's capability of describing context is invaluable to content management. 'That becomes very powerful, and systems can start exchanging content and take action on the content. So you can truly get into e-business networks and dynamic trading partners. Those ideas have been around, but there was really no technology that enabled it [until XML].' To Interwoven's Ruck, 'XML is like Java in the sense that it's going to be a pervasive technology that's going to be adopted throughout product lines' and is particularly important for areas such as b-to-b, content syndication, and wireless, where content will be deployed 'over multiple customer touch points.' 'There are a lot of different distribution destinations now, different channels to support, often in different countries, that all need the same brand,' Perry explains. 'That's what content management can really help with: creating a single blueprint and pushing it out'..." • [February 23, 2001] "IBM, Microsoft Settle E-Commerce Standards Dispute." By Siobhan Kennedy. In InfoWorld (February 23, 2001). "A group backed by International Business Machines Corp., the world's largest computer hardware company, agreed this week to adopt an electronic-commerce standard being developed by software giant Microsoft Corp., settling a high-stakes dispute that has been rumbling for more than a year. 
By bringing the incompatible standards together, the two sides are seeking to provide companies with a common format for doing business over the Internet, a market expected to explode in the next few years. AMR Research in Boston predicted the market for business-to-business transactions will skyrocket to $5.7 trillion by 2004 from $581 million in 2001 as more and more companies use the Web to buy products and services. 'If you don't have a standard way of communicating, then people will create lots of different ways of doing it,' Bob Sutor, IBM's director of e-business standards strategy, told Reuters on Thursday. 'And that will create big interoperability problems.' That is exactly what has happened so far. With IBM pushing one standard and Microsoft another, the result has been a sometimes bitter war of words between the two. IBM has dismissed the rival effort as lightweight and too Microsoft-centric, and Microsoft has criticized the IBM group for taking too long to get its standard out the door... OASIS' standard, called ebXML (for electronic business XML), is a series of specifications that define how businesses should communicate with each other in buying and selling goods over the Internet. XML (Extensible Markup Language) is a popular Web standard that businesses use to exchange information with each other online... John Montgomery, lead product manager for Microsoft's .Net framework, said Oasis' decision to adopt SOAP is a clear validation of the approach both Microsoft and the World Wide Web Consortium has taken with XML standardization. 'Microsoft has consistently said that the (consortium) is where XML standardization should occur,' he added. Sutor, who is also vice chairman of the ebXML group, said the OASIS members will continue to develop the ebXML standard, an overarching effort that includes a lot more work than the small part that overlapped with Microsoft's SOAP." See the ebXML announcement.
• [February 23, 2001] "Standards Groups Reach E-Business Accord." By Wylie Wong. In CNET News.com (February 22, 2001). ['A brewing controversy over e-business standards may have been averted Thursday after one Web standards group agreed to support the work of another.'] "The World Wide Web Consortium (W3C), the gatekeeper of many Internet standards, and OASIS, a group of technology companies backed by the United Nations, have been developing competing technologies to allow businesses to link over the Internet and conduct e-commerce. OASIS on Thursday announced it is ceasing its effort to build a communications protocol for e-commerce business communications in favor of a competing specification under development by the W3C. The W3C recently began building an XML-based protocol based on technology developed by Microsoft, IBM and others, called the Simple Object Access Protocol (SOAP). At issue is the need to build an XML-based communications protocol that serves as a common format for businesses to swap information with each other. XML (Extensible Markup Language) is a popular Web standard for businesses to exchange information with each other via the Web. The result is one uniform standard for exchanging XML messages and less confusion among software developers on what standard to use in the future, said Bob Sutor, IBM's program director for e-business standards strategy. Companies conducting business over the Web need a common format to send information to one another, much like the post office has a standard way for people to send mail, Sutor said. People are required to write addresses and place stamps on the same places on an envelope so the post office knows where to send mail. Until now, OASIS and the W3C differed in their definitions of that format. Before OASIS' support of SOAP, both OASIS and the W3C had said they would create connectors so that the two differing communications protocols could communicate. Now, that work will be unnecessary. 
'We don't need unnecessary duplication,' Sutor said of the competing efforts. 'It means that software that businesses use can be simpler because they will have fewer specifications for messaging. People can devote more time on creating really new software to make their businesses better, rather than being mired down in the details of supporting yet another messaging protocol.' OASIS, which includes IBM, Sun Microsystems, BEA Systems and others, has been working with a United Nations organization to develop a blueprint for businesses in different industries to use XML. The EBXML effort is aimed at allowing companies that use older data-exchange technology, called Electronic Data Interchange, or EDI, to start using more flexible and potentially cheaper XML-based software over the Internet." See the ebXML announcement. • [February 22, 2001] "XMLTrans: a Java-based XML Transformation Language for Structured Data." By Derek Walker, Dominique Petitpierre, and Susan Armstrong (ISSCO, University of Geneva, 40 blvd. du Pont d'Arve CH-1201 Geneva 4, Switzerland). Abstract: "The recently completed MLIS DicoPro project addressed the need for a uniform, platform-independent interface for accessing multiple dictionaries and other lexical resources via the Internet/intranets. Lexical data supplied by dictionary publishers for DicoPro was supplied in a variety of SGML formats. In order to transform this data to a convenient standard format (HTML), a high level transformation language was developed. This language is simple to use, yet powerful enough to perform complex transformations not capable with other standard transformation tools. XMLTrans provides rooted/recursive transductions, similar to transducers used for natural language translation. XMLTrans is written in standard Java and is available to the general public... 
The goal of DicoPro [April 1998 to Sept 1999] was the development of a uniform, cross-platform client-server tool to enable translators and other language professionals connected to an intranet to consult dictionaries and related lexical data from multiple sources. Dictionary data was supplied by participating dictionary publishers in a variety of proprietary formats. One important DicoPro module was a transformation language capable of standardizing the variety of lexical data. The language needed to be easy enough for a nonprogrammer to master, yet powerful enough to perform all the necessary transformations to achieve the desired output. We initially looked at available SGML transformation tools, XML transformation tools, and finally decided to develop our own. We began to examine available XML transduction resources. The budding standard at the time that our project began, XSL, was still not mature enough to rely on as a core for the language. In addition XSL does not provide for rooted, recursive transductions needed to convert the complex data structures found in DicoPro's lexical data. Edinburgh's Language Technology Group has produced a number of useful SGML/XML manipulation tools. Unfortunately none of these matched our specific needs. For instance, sgmltrans does not permit matching of complex expressions involving elements, text, and attributes... Given the large number of XML APIs developed for Java, this seemed to be a promising venue. The API model which best suited our needs was the Document Object Model (DOM) with an underlying SAX parser. This provides the core of the XMLTrans parser. The transducer was designed for the processing of large XML files, keeping only the minimum necessary part of the document in memory at all times. In effect, XMLTrans processes lexical entries from a dictionary that are independent of each other and that have a few basic formats.
It takes as input a well-formed XML file and a file containing transformation rules and gives as output the application of the rules on the input file. [In this paper] We begin with a simple example to illustrate the kinds of transformations performed by XMLTrans. Then we introduce the language concepts and structure of XMLTrans rules and rule files... The XMLTrans transducer was used to successfully convert all the lexical data for the DicoPro project. There were 3 bilingual dictionaries and one monolingual dictionary totalling 140 Mb in total (average size of 20 MB), each requiring its own rule file (and sometimes a rule file for each language pair direction). Original SGML files were preprocessed to provide XMLTrans with pure well-formed XML input. Inputs were in a variety of XML formats, and the output was HTML. Rule files had an average of 178 rules, and processing time per dictionary was approximately 1 hour... The code is portable and should be runnable on any platform for which a Java runtime environment exists. A free version of XMLTrans can be downloaded from http://issco-www.unige.ch/projects/dicopro_public/index.html. See other details. [cache] • [February 22, 2001] "K2/Kleisli and GUS: Experiments in Integrated Access to Genomic Data." By Susan Davidson, Jonathan Crabtree, Brian Brunk, Jonathan Schug, Val Tannen, Chris Overton, and Chris Stoeckert. In IBM Systems Journal, March 2001. 23 pages, with 76 references. "The integration of heterogeneous data sources and software systems is a major issue in the biomedical community and several approaches have been explored: linking databases, 'on-the-fly' integration through views, and integration through warehousing.
In this paper we report on our experiences with two systems that were developed at the University of Pennsylvania: an integration system called K2, which has primarily been used to provide views over multiple external data sources and software systems; and a data warehouse called GUS which downloads, cleans, integrates and annotates data from multiple external data sources. Although the view and warehouse approaches each have their advantages, there is no clear 'winner'. Therefore, users must consider how the data is to be used, what the performance guarantees must be, and how much programmer time and expertise is available to choose the best strategy for a particular application. Our experiences also point to some practical tips on how updates should be published by the community, and how XML can be used to facilitate the processing of updates in a warehousing environment... Conclusions: Both the K2/Kleisli view and GUS warehouse strategies have proven useful for genomic applications within the Center for Bioinformatics. Kleisli was used for some time to implement several web-based, parameterized queries that were written for specific user groups. Users could query views that integrated many important on-line data sources (such as GenBank, GSDB, dbEST, GDB, SRS-indexed databases, KEGG and EcoCyc) and application programs (such as BLAST) by supplying values for parameters; the data sources and application programs were then accessed on demand to provide answers to the parameterized queries... it is now up to individual data source owners or third parties to modify their data sources or to provide wrappers to their data sources so that they conform to these specifications. However, we believe that the standardization of all genomic data sources is an unrealistic goal given their diversity, autonomy, and rapid schema changes. This is evidenced by the fact that interest in CORBA seems to have waned over the past year and to have been superseded by XML.
As a universal data exchange format, XML may well supplant existing formats such as EMBL and ASN.1 for biological data, and as such will simplify some of the lower-level driver technology that is part of K2/Kleisli and other view integration systems. There is an abundance of freely available parsers and other software for XML, and heavy industry backing of XML. The question is whether it will do more than function as an exchange format. It may, in fact, become a basis for view integration systems by using one of the query languages developed for semistructured data or XML. However, before it becomes a good basis for an integration system we believe that several things must happen: (1) Some kind of schema technology must be developed for XML. DTDs function as only a rough schema for XML. For example, there are no base types other than PCDATA (so the integer 198 cannot be distinguished from the string "198"), no ability to specify keys or other constraints on the schema, and the reliance on order makes representing tuples (in which the order of attributes is unimportant) tricky. The recent XMLSchema proposal addresses many of these problems by providing a means for defining the structure, content and semantics of XML documents. (2) An agreement must be reached on the use of terms, or there must be a mechanism to map between terms. The discussions in this paper have sidestepped one of the most difficult parts of data and software integration: semantic integration. Semantic integration focuses on the development of shared ontologies between domains of users, and on the resolution of naming conflicts (synonyms and homonyms). In the TAMBIS project, although Kleisli was used for the low-level (syntactic) integration, a major effort of the project was to develop an ontology through which researchers could navigate to find information of interest.
The view layer K2MDL in K2 aids in semantic integration by providing a means for mapping between concepts in different databases, and has proven extremely useful in the integration projects for SmithKline Beecham. For XML to be useful in data integration, either the usage of tag labels must be uniform across the community, or a semantic layer must be available. (3) A standard for XML storage must be adopted. Several storage techniques are currently being explored based on relational and object-oriented technologies; new technologies are also being considered. However, there is no agreement on what is best. Warehouse developers must currently therefore provide their own mapping layer to store the result of an integration query. These issues are of current interest in the database research community, and within the near future we expect to see preliminary solutions." [cache] • [February 22, 2001] "XML on the Move." By Edd Dumbill. [Trip Report.] February 21, 2001 "On the first day of XML DevCon Europe in London, England, speakers highlighted the growth of XML in its three years of existence. Henry Thompson from the University of Edinburgh (and zealous editor of the W3C's XML Schema specification) noted in his opening keynote that XML had grown from one specification to a family of technologies. He focused on the emerging centrality of the XML Infoset and XML Schema. David Orchard of Jamcracker taught a session on web services, XML, and UDDI. Despite XML's growth in the area of program-to-program communication, there's still much to build..." • [February 22, 2001] "Keys for XML." By Peter Buneman, Susan Davidson, Wenfei Fan, Carmem Hara, and Wang-Chiew Tan. Presentation prepared for WWW10 (2001). With 21 references. "We discuss the definition of keys for XML documents, paying particular attention to the concept of a relative key, which is commonly used in hierarchically structured documents and scientific databases...
If XML documents are to do double duty as databases, then we shall need keys for them. In fact, a cursory examination of existing DTDs reveals a number of cases in which some element or attribute is specified -- in comments -- as a 'unique identifier'. Moreover a number of scientific databases, which are typically stored in some special-purpose hierarchical data format which is ripe for conversion to XML, have a well-organized hierarchical key structure. Various forms of key specification for XML are to be found in the XML standard, XML Data, and XML Schema. Through the use of ID attributes in a DTD, one can uniquely identify an element within an XML document. However, it is not clear that ID attributes are intended to be used as keys rather than internal 'pointers'. For example, ID attributes are not scoped. In contrast to keys, they are unique within the entire document rather than among a designated set of elements. As a result, one cannot, for example, allow a student (element) and a person (element) to use the same SSN as an ID. Moreover using ID attributes as keys means that we are limiting ourselves to unary keys and, of course, to using attributes rather than elements. Finally, one can specify at most one ID attribute for an element type, while in practice one may want more than one key. XML Data introduces a notion of keys explicitly. However, its keys can only be specified in types and moreover, can only be defined for element types rather than for certain collections of elements. XML Schema has a more elaborate proposal, which is the starting point of this paper. The proposal extends the key specification of XML Data by allowing one to specify keys in terms of XPath expressions. There are a number of technical problems in connection with XPath. XPath is a relatively complex language in which one can not only move down the document tree, but also sideways or upwards, not to mention that predicates and functions can be embedded as well.
The main problem with XPath is that questions about equivalence or inclusion of XPath expressions are, as far as the authors are aware, unresolved; and these issues are important if we want to reason about keys as we do in relational databases. Yet until we know how to determine the equivalence of XPath expressions, there is no general method of saying whether two such specifications are equivalent. Another technical issue is value equality. XML Schema restricts equality to text, but the authors have encountered cases in which keys are not so restricted. A more detailed discussion can be found in section 7.1. However, the main reason for writing this note is that none of the existing key proposals address the issue of hierarchical keys, which appear to be ubiquitous in hierarchically structured databases, especially in scientific data formats. A top-level key may be used to identify components of a document, and within each component a secondary key is used to identify sub-components, and so on. Moreover, the authors believe that the use of keys for citing parts of a document is sufficiently important that it is appropriate to consider key specification independently of other proposals for constraining the structure of XML documents." A related paper: "Reasoning About Keys for XML." University of Pennsylvania. Technical Report MS-CIS-00-26, 2000; local copy. Also in PDF and Postscript. [cache] • [February 22, 2001] "[Draft/experimental] Formal Specification of the RDF Syntax." By Brian McBride (HPLabs). February 22, 2001. "[This is] an experiment in doing a formal specification of the RDF Syntax. The goal is to more formally define the triples that any given RDF XML represents. The idea is to annotate the grammar with attributes. Each production in the grammar takes attributes as arguments and can return attributes as a result. A production emitTriple, which always succeeds but has the side effect of emitting a triple is introduced. 
There is a trivial transformation from the annotated grammar to an equivalent XSLT transform, thus in effect enabling an executable specification." ["There was a suggestion on the list last summer that transformation grammars could be used to formally specify the translation of RDF XML to triples. A quick search around the web revealed that such grammars were proposed in natural language processing, but I didn't find anything immediately useful. I liked the idea of formal definition of the transformation. I also thought it would be good to have an 'executable' specification, i.e. one that we could execute against test cases and would spit out the triples. My first thought was to use XSLT to define the transform, but that turned out pretty unreadable. My second attempt has been to produce an attribute grammar for RDF. It turns out that an attribute grammar works reasonably well, and there is a simple, possibly automatable, way of turning the attribute grammar into an XSLT transform. The attribute grammar can be found in the header comment, and the XSLT is executable, though probably buggy."] See "Resource Description Framework (RDF)." • [February 22, 2001] "Getting the Tags In: Vendors Grapple with XML-Authoring, Editing and Cleanup. [XML Authoring: Reviewing the Latest Tools.]" By Liora Alschuler. In The Seybold Report on Internet Publishing Volume 5, Number 6 (February 2001), pages 1, 5-10. [NB. A summary does not do justice to this detailed, informative review. -rcc] Publishers of all types are seeking ways to encode their content in XML, and vendors are responding with a variety of specialized tools that run the gamut, from structured authoring to post-production data conversion. We survey the options and expand on three products that work with Microsoft Word, the leading text-editing tool in publishing today.
Since the first structured editing applications surfaced some 15 years ago, the technology for adding tags to documents has struggled to earn its place alongside WYSIWYG word-processing applications. Structured authoring still hasn't reached the mainstream, but the rise in XML use on the Web and increasing demand for cross-media source files are gradually reshaping this market. Those who want to implement XML-authoring can choose from a mixed bag of specialized tools, ranging from structured input to post-production conversion. Our overview starts with updates on XML editing tools from the leading established players: Arbortext and SoftQuad. We then review the two top Microsoft Word plug-ins for creating valid XML as you type in the popular word processor: HyperVision's WorX SE and I4I's S4. Lastly, reviewing the options for post-production conversion, we take our first look at an exciting Word plug-in designed for manuscript production editors: Inera's eXtyles... The field of vendors offering XML word processors has shrunk to -- and a handful of straggling wannabes. Both of the veteran leaders reported healthy sales growth last year... WorX SE adds administration tools: WorX SE is a Microsoft Word plug-in introduced in spring 2000. The plug-in application preserves the interface and functionality of Word while adding real-time, interactive structure conversion and feedback. Users see dynamic document structure displayed in a graphic tree, as in a native structured editor, indicating which parts are valid according to either an XML DTD or an XDR schema... I4I (Infrastructures for Information) has released a product based on its long-standing development toolkit, S4/Text, which the company calls a 'tagless editor for XML.' Like WorX SE, S4/Text is a Word add-on that operates on keystroke and mouse capture through the Word API. In real time, it validates input against an XML DTD and guides users so that they create valid XML using the Word user interface... 
Inera aids production editors with eXtyles: In contrast with structured-authoring tools, eXtyles from Inera Corporation is not a writing tool, but an editorial and production tool designed to clean up and apply presentation and editorial styles to Word documents. EXtyles does three types of processing: it inserts publication-specific meta-data, cleans up noise such as double line-endings, hard hyphenations and misspellings, and exports tagged XML files. EXtyles can export well-formed XML documents or XML documents that have been validated against a DTD... [Conclusion:] This market is morphing rather than exploding. Buyers are demanding easy-to-use interfaces for creating reusable content, pushing even the traditional structured editor vendors to accommodate the Word interface, while Microsoft, unable to make or buy a satisfactory bolt-on to Word, licenses XMetaL. The new Word add-ons are demanding serious evaluation, but, for now at least, the perception that ultimately data conversion and WYSIWYG interfaces don't scale and don't satisfy users in mature applications leaves this market searching for solutions that don't yet exist." Related: see also the article by David Mertz which provides an "up-to-date review of a half-dozen leading XML editors." • [February 22, 2001] "NewsML Lays Groundwork for Next-Generation News Systems. [Report from the Edge.]" By Aimee Beck and Luke Cavanagh. In The Seybold Report on Internet Publishing Volume 5, Number 6 (February 2001), pages 15-18. ['NewsML Sets Stage For Future News Systems. Passed by the IPTC in October, NewsML -- the XML-based header for multipart, multimedia news feeds -- is a dramatic step forward from the wire service standards that have been in place for decades. 
We outline the new standard (and its text counterpart, NITF), find out how users are gearing up to support them and gauge vendor readiness.'] "Last fall, wire services and other leading news publishers approved NewsML, an XML-based standard for transmitting news. Featuring generic markup in both header and text content and support for multimedia news feeds, NewsML paves the way for new generations of editorial systems, ones designed for digital media. Major players in the news wire industry spent more than eight years developing the News Industry Text Format (NITF), a successor to the venerable ANPA wire-story format. Originally developed in SGML and then reshaped as XML about 18 months ago, NITF was on its way to being adopted when Reuters took the idea a step further and conceived NewsML. NewsML completed a short trip through the standards gauntlet and was ratified by the IPTC this past October. With the consensus-demanding standardization process finally over, it's time to assess the implications of NewsML: Who will it affect, and how soon? What benefits will it bring? What vendors will support it? What influence will the spec have outside of the news industry? With these questions in mind, we set out to see if NewsML was gaining any traction in the field. Aside from early adopters, most notably Reuters, we found few examples of NewsML projects underway. However, change is afoot. Awareness of and demand for XML is rising rapidly in the news industry, and vendors are responding. The transition to an XML-enabled editorial environment will take years to permeate the entire global news industry, but we believe its impact ultimately will be akin to the change that the AP Leafdesk brought to U.S. newspapers in the early 1990s. Once papers had a digital wire-photo receiver, electronic pagination suddenly changed from pipedream to obtainable goal. 
In the same way, once papers get their hands on a system that can process XML-encoded multimedia feeds, they'll be a giant step closer to delivering well-coded files to their webmasters and in a much better position to deliver their own multimedia news coverage to consumers... [Conclusion:] The utility of XML, both in the header and the text itself, has become apparent, not only to newspapers but to a wide array of Web publishers and media companies handling news. Reuters has seeded the market with an open-source toolkit to encourage its customers to adopt NewsML, but, for NewsML to take off, the rest of the newswires and system vendors will have to support it as well. The time is right for that to happen. The blessing of NewsML by the IPTC sets a global standard for how news will be delivered in the future. With an approved spec in hand, vendors have the blueprint they need to begin implementations. Few Web content-management systems currently feature built-in support for wire services, but adding such support would mesh with their need to support XML text and variable metadata. Newspaper editorial system suppliers that have been dragging their heels on the XML issue must now catch up, or risk losing out to those that anticipated this change." See "NewsML and IPTC2000." • [February 22, 2001] "Journeying to the XML Promised Land. [Letter from the Editor.]" By Mark E. Walter, Jr. In The Seybold Report on Internet Publishing Volume 5, Number 6 (February 2001), page 2. "The Extensible Markup Language (XML) has been a lightning rod in publishing since its debut at the annual SGML conference in the fall of 1996. It galvanized the entire industry, and within five years, the promise that the Web community would adopt a simplified form of SGML more readily than the old standard has been fulfilled. All publishers have to do to reap the riches of the Promised Land is convert to an XML-based process. 
Why then, are so many firms wandering the desert in search of such a process? Is it the inability to change people, or the lack of decent tools? Wherever blame lies, one thing is certain: the tools are changing. That's why in this month's cover story we look at the options for authoring and editing XML documents. Those options include not only established vendors, such as Arbortext and SoftQuad, but also a handful of lesser-known firms that we believe our readers will want to check out. Another factor that will drive XML implementation will be its adoption by coalitions of key vendors or users. In the news business, for example, the ratification of NewsML by the wire services will be the first major update to wire-copy headers in decades, a change that will impact virtually every newspaper-system supplier and wire-service customer -- from newspapers to radio stations to Web portals..." • [February 22, 2001] "Ecosystems' The Environment: Product Development Interface For XML Content Management." By Mark Walter. In The Seybold Report on Internet Publishing Volume 5, Number 6 (February 2001), pages 11-14. ['Cool product-development interface for XML content management. An application layer on top of Astoria/Eclipse sets a new user interface benchmark for XML-based collaborative editing, production and new product development in reference, book, corporate and education applications.'] The Ecosystems Environment is an application layer that sits on top of Chrystal Software's Astoria or Eclipse SGML/XML-aware document-management systems that have Web-based access for participants. While Astoria provides the basic library facilities common to content-management systems -- check-in/check-out for collaborative authoring, versioning and so forth -- it lacks a user interface for many publishing-specific functions. The Environment provides this user interface. The defining feature of The Environment is 'LiveOutline,' a tool for building new documents and products. 
LiveOutline manages the document assembly and modification process by creating a complex web of managed elements. LiveOutline can then be used independent of the source SGML/XML content to update, change or compare the evolution of the content by tracking 16 different states for each SGML/XML element... The Environment provides a built-in SGML/XML viewer that shows you the content if you want, as you browse the repository, without having to export it to a word processor. If you are in the Visual Difference feature, the viewer will display the structural and content changes redlined. That's important because in a collaborative XML-editing environment documents may be shredded to a level of granularity that makes it more difficult to know exactly which part of the document to check out, or if it has changed or been modified since the last time you referenced the document... The Environment furnishes an essential user interface, and, after seeing it several times, we view it as an essential component for any Astoria customer, and -- if Ecosystems is able to hook it up to other repositories -- for users of other high-end XML-aware content-management systems as well. While it may not be the product that brings XML-based content management to the masses, The Environment sets a new user interface standard among high-end, component-based, content-management systems in reference and textbook editorial settings..." • [February 22, 2001] "XQuery: Reinventing the Wheel?" By Evan Lenz (XYZFind Corp.). February 2001. "There is a tremendous amount of overlap in the functionality provided by XQuery, the newly drafted XML query language published by the W3C, and that provided by XSLT. The recommendation of two separate languages, one for XML query and one for XML transformations, if they don't have some sort of common base, may cause confusion as to which language should be used for various applications. 
Despite certain limitations, XSLT as it currently stands may function well as an XML query language. In any case, the development of an XML query language should be informed by XSLT... The proliferation of XML as a data interchange format and document format is creating new problems and opportunities in the field of information retrieval. While much of the world's information is housed in relational database management systems, not all information is able to fit within the confines of the relational data model. XML's hierarchical structure provides a unified format for data-centric information, document-centric information, and information that blurs the distinction between data and documents. Accordingly, a data model for XML could provide a unified way of viewing information, whether that information is actually stored as XML or not. Access to, extraction from, and manipulation of this information together comprise the problem of an XML query language. This paper explores some issues, advantages, and disadvantages of using XSLT as a query language for XML. It attempts to show that the basic building blocks of an XML query language can be found in XSLT, by way of an introduction to and comparison with XQuery, the newly drafted XML query language published by the W3C XML Query Working Group. This paper is not a proposal for a specific implementation. [Conclusion:] In the long run, the XML Query Working Group is probably doing the right thing in first formally defining the semantics of the query language. To attain the sophistication of query optimization that we currently have with SQL, an XML query language's underlying mathematics must be well understood. But these semantics should not be developed in a vacuum. However well understood a particular set of semantics is, we will not truly understand which set of semantics is useful in an XML query language until people have built real applications involving XML query. 
This is the reason why XSLT should be seriously addressed: it is the most widely used and implemented XML query language yet." Note 'This paper is adapted from what I'll be presenting on 'XSLT as a query language' at XSLT-UK.' See also the related posting on XQuery. On XSLT-UK: see the events listing. Related references in "XML and Query Languages." • [February 22, 2001] "XML-Deviant: Time to Refactor XML?" By Leigh Dodds. From XML.com. February 21, 2001. ['The growing interdependency between XML specifications is causing concern among XML developers -- is this just a case of sensible reuse, or are we creating a dangerously tangled web of standards?'] "The W3C has been particularly busy over the last few weeks, releasing a flurry of new Working Drafts. While welcoming this progress, some members of XML-DEV have expressed concern over the new direction that these specifications have taken. Intertwined Specifications: A succession of new Working Drafts have appeared on the W3C Technical Reports page. The list includes requirements documents for XSLT 2.0, XPath 2.0 and XML Query, a data model and an algebraic description for XML Query, and a resurrection of the XML Fragment Interchange specification. The most striking aspect of these specifications is not their sudden appearance but, rather, their mutual interdependence: (1) XSLT 2.0 must support XML Schema datatypes; (2) XPath 2.0 must support the regular expressions defined in XML Schema datatypes, as well as the XML Schema datatypes; (3) XML Query and XPath 2.0 will share a common data model; (4) XML Query may itself use XML Fragments; (5) XML Query must support XML Schema datatypes; (6) Both XPath and XML Query must be modeled around the Infoset, and particularly the 'Post Schema Validation Infoset'; (7) XML Schema itself depends on XPath to define constraints. As this list shows, dependence on the XML Schema datatypes and the Post Schema Validation Infoset are particularly prominent. 
This has produced a few furrowed brows on the XML-DEV mailing list... Refactoring and iteration have become common features in many development methodologies. Extreme Programming is an example. Acknowledging that it's hard to get things right the first time, and allowing changes in requirements, is fundamental to complex development processes, including the XML standards process that many are keen to see take shape." • [February 22, 2001] "Corporate Users Cool Toward XML for Supply Chains." By Michael Meehan. In ComputerWorld (February 19, 2001). "XML may be the future technology underpinning of online business-to-business trading, but many companies are in no hurry to get there. At the EC Forum here, a number of companies with large electronic data interchange (EDI) systems acknowledged that they're only in the investigation phase for using XML tags in their electronic purchases and sales. Many attendees at the conference echoed that hesitance about XML, noting that established corporations for the most part already have working supply chains. Amy Hedrick, senior e-business integration analyst at AMR Research Inc. in Boston, said companies aren't going to abandon 15 years of EDI development to move to a system reliant upon XML, especially since there are no widely used standards for the data-tagging language and more than 100 variants of it. XML has also seen slow adoption in certain markets. Chris Maxwell, an e-commerce systems manager at Dallas-based Pepsico Inc., said the food and beverage world is still rooted in EDI transactions. General Electric's Global eXchange Services (GXS) division hosted today's event. Last year, GXS took the step from being an EDI partner with 100,000 companies toward creating an XML-based electronic public marketplace. GXS CEO Harvey Seegers said the migration has been slow, and he expects that it will continue to be slow. He estimated that about 1% of the transactions GXS facilitated last year were of the browser-based XML variety. 
GXS plans to support both established EDI networks and upstart XML initiatives -- and its executives remain split as to when XML will prove a solid return on investment for businesses with legacy systems and defined supply chains..." • [February 21, 2001] "A Practical Comparison of XSLT and ASP.NET." By Chris Lovett. From Microsoft MSDN, 'Extreme XML' Column. February 19, 2001. ['Columnist Chris Lovett uses MSDN Online's table of contents to compare XSLT and ASP.NET, complete with pros and cons for each approach.'] "People are using XML to manage the publishing process for large, complex Web sites. Examples include an interview with Mike Moore on how www.microsoft.com has used XML to manage its complex needs and 'Streamlining Your Web Site Using XML', a high-level overview of how companies such as Dell use XML to streamline their entire publishing process. The questions I am getting from a lot of customers are: Should they dump XML/XSL and go write a bunch of C# ASP.NET code instead? If they have already heavily invested in XSLT, does ASP.NET provide any value to them? Is there some middle ground where they can get the best of both worlds? If so, what are the strengths and weaknesses of each technology and when should people use them? I will drill in on a specific example so I can compare and contrast XSLT versus ASP.NET. The example is the MSDN TOC. MSDN found that XML was ideal for managing its large table of contents (TOC). The contents of this TOC come from hundreds of groups around the company. The XML format provided a way to glue together disparate back-end processes that would have been much harder to change. XML/XSL also made it possible to reach different browsers on different platforms. Given that developers are finding that XML is the best way to manage the back-end data that goes into a Web site, let's take a look at how you take this XML data and turn it into HTML. First I will look at the ASP.NET solution... So which solution performs better? 
On my machine the XSLT version gets 33 requests per second (using MSXML 3.0). The C# version gets about 120 requests per second. A preliminary test of a .NET Beta 2 version of XslTransform does about 47 requests per second. So clearly the C# code is faster. This is understandable, given that the C# code is hand-optimized XmlNavigator code that minimizes the number of XPath evaluations. However, XSLT can also be performed on Internet Explorer 5.x clients, although this particular style sheet requires MSXML 3.0, which will not be integrated until a future version of Internet Explorer. When XSLT is offloaded to the client, the server is then just publishing static HTML pages. These HTML pages still have to fetch the XSLT style sheet, but this gets cached on the client side. Internet Information Server 5.0 can do around 1,000 static HTML pages per second on my machine, depending on their size... There is no clear winner. There are pros and cons to each solution. Developers will have to write application-specific code anyway (such as my XmlMenuResolver class), so I could certainly see the argument for staying in a C#-only environment and saving on the XSLT training costs. On the other hand, if customers have already invested heavily in XSLT, as www.microsoft.com has, and they have clear business value already derived from that, then integrating the XSLT solution into an ASP.NET environment as I have shown here can provide the best of both worlds." • [February 21, 2001] "Microsoft Windows XP: What's in It for Developers?" By Kyle Marsh, Dave Massy, and John Boylan. MSDN Library. February 2001. "This article explores some of the features of Microsoft Windows XP and looks at the effect these changes have on software developed for Windows. The discussion focuses on the new Windows XP visuals and ComCtl32, side-by-side component sharing, and fast user switching... 
With Windows XP, there's an infrastructure to support assemblies and isolated applications (both COM+ and Win32). A code change should not be required to get at side-by-side assemblies from Win32 applications. Applications can use the latest system assemblies without global impact. In short, isolated applications are valuable because they are more reliable. They are built and shipped with all needed components and are not affected by changes that other applications make. Isolated applications use a manifest, which is an XML file containing information that self-describes an assembly or an application. All binding and activation metadata, such as COM classes, interfaces, and type libraries, is now stored in the manifest, rather than the registry. There are two types of manifest files: applications manifests, which describe isolated applications, and assembly manifests, which describe individual assemblies..." • [February 21, 2001] "Chemical Giant Embraces XML For Direct Links To Suppliers. [E-Business Applications.]" By Michael Alexander. In InternetWeek (February 19, 2001), pages 23, 26. "Eastman Chemical Company has completed a yearlong project and met its goal of setting up system-to-system connections to 15 of its customers and suppliers. A key to the effort has been an emerging XML-based standard, called eStandard, which has been developed by the chemical industry. 'What is really interesting to us about XML is not only does it enable more robust intersystem connections but XML also can be used to paint browser screens, update databases, send data to printers and other capabilities,' said Bill Graham, integrated direct program lead at Eastman. 'With one protocol for data descriptions, we have multiple avenues to reach multiple trading partners.' For example, Eastman, with annual revenue of $4.6 billion, is evaluating building an extranet based on XML for dozens of its small suppliers, Graham said. 
At least 80 percent of transactions between chemical companies are conducted under contract, and putting transactions online saves time and money. Eastman said it expects that by 2002 company-to-company links will account for half of its e-business revenue, with the other half divided between extranets and B2B marketplaces. Eastman used B2B integration modules from webMethods' enterprise application integration software based on XML to link its SAP R3 application to the ERP systems of its 15 trading partners. WebMethods tools are used for secure enveloping, message delivery and other functions necessary to guarantee secure connections. The chemical company has a minority stake in webMethods, as well as investments in online marketplaces OneChem, e-Chemical and Shipchem. Eastman also has set up trading links with five customers through OneChem and Envera exchanges. Though Eastman favors setting up direct-system links, it has contracts with Koch Chemical, Vulcan Chemical and other customers that use those exchanges, Graham said. 'In this case, we connect once and gain direct transactional access to both suppliers,' he added. The XML pilot program, which Eastman concluded in January, was designed to prove the feasibility of setting up secure company-to-company links using XML and to work out interoperability issues in each partner's infrastructure and business processes. The time it takes to process purchase orders fell from a week and a half to a matter of minutes or seconds, Graham said." See "Eastman Chemical Company and webMethods Successfully Launch Business-To-Business Integration (B2BI) Solution."

• [February 21, 2001] "Sun Eyes New Auction Application. XML technology makes it easier to put excess inventory up for bid." By Ted Kemp. In InternetWeek (February 19, 2001), pages 23, 26. "As it enters its second year of selling products on popular auction sites, Sun Microsystems is mulling an upgrade to the service it uses to put items up for bid. Sun began selling workstations, enterprise servers, workgroup servers and software in December 1999 on such auction sites as eBay. Auctions help to clear older inventory and provide an outlet for products lacking state-of-the-art technology in terms of, say, microprocessors or DVD systems. Sun is considering migrating to a service that would manage an XML connection from Sun's product database to one or more online auction sites, and a second XML link from the auctions' transaction engines back to a Sun checkout page or fulfillment system. The application -- a sort of auction middleware -- is hosted by GoTo Auctions, a unit of search engine GoTo.com. Sun now uses a free service that handles such simple tasks as image-hosting; the more complex GoTo app is built around spidering technology, which locates the pages on auction sites that let sellers enter product data. It then fills out virtual forms and registration pages with product and pricing data that the seller enters into the system through a secure Web interface. A still more complex option would crunch historical selling data, product by product, and give Sun guidance on the best ways to spread goods across auctions. The new system can work with several auction sites, though Rublowsky said Sun sells almost exclusively on eBay. GoTo's fee ranges from 3 percent to 10 percent of gross sales, depending on the complexity of the app that the client selects and the products put up for bid."

• [February 21, 2001] "Relearn Old Lessons Before Embracing XML. [At Your Service.]" By Julie Gable. In Imaging and Document Solutions Volume 10, Number 3 (March 2001), page 27. "XML's strengths for enabling business-to-business e-commerce often eclipse its advantages for internal content management. The Gartner Group, Stamford, CT, says XML's strength lies in 'the process of integrating digitized data of multiple types in multiple formats and from multiple sources so that users can access a cohesive set of relevant information about a topic.' Users in knowledge-producing organizations recognized the potential of XML early on. In 1999, a University of Michigan study on the feasibility of publishing dissertations electronically in XML estimated that it would cost about $67 per document to convert to XML vs. about $2 per document to convert to Adobe PDF. Yet the study recommended conversion to XML. Why? XML allows the same content to be customized for specific audiences and presented in different ways, including screen display, print, Braille and so forth. Documents in XML are modular in nature, so users can execute searches across specifically tagged sections rather than entire documents, resulting in more relevant search results. XML is also an excellent archival format for preserving documents over the long term because of its ability to render content regardless of platform, without relying on specific application software or hardware that is subject to obsolescence. If XML is the new ASCII, why haven't document management vendors flocked to provide XML product sets? The answer may lie in what the document management industry has already learned from prior experience: the procedural infrastructure is often the hardest part of implementation in the internal content realm, not the technology. Consider the following examples..."

• [February 21, 2001] "XML: Business Beacon or Tower of Babel? [Open Platform.]" By David Weinberger. In Imaging and Document Solutions Volume 10, Number 2 (February 2001), page 55. "XML is on the verge of plummeting down the celebrity curve. We already hear refrains such as: 'You know, it doesn't do everything we said it was going to do,' and 'just doing something in XML doesn't mean it's really open.' XML's weaknesses are rooted in its very being. There should be no illusions about it. Yet illusions there are, brought about by the media's need to generate headlines and the vendors' need to differentiate their wares in an undifferentiated market. The most overinflated expectation for XML comes from the media touting it as a standard. In fact, it's a standard for writing standards. XML is like an alphabet and a grammar: Now that we agree on the letters and that sentences will consist of nouns and verbs, we can begin to create different languages. So, XML by its very nature can be a beacon, or - if an industry is excessively greedy - it can be a Tower of Babel, breeding competing standards that don't know how to talk with one another. Inevitably, both have happened. XLink is one of many examples of beacon-hood. Web-based forms are an excellent example of Babel-onia. The forms example is illustrative of the venal forces that work against what XML offers. A tiny company, PureEdge (formerly UWI.com), made a name for itself early on by proposing an XML standard, XFDL, for encoding forms - anything from a purchase order to a mortgage application. This is a good use of XML because the essence of a form is the data it's capturing, and XML is quite strong on data-capture. There are lots of considerations when designing an XML standard for forms, including capturing acceptable entry ranges and allowing for the conditional display of fields (e.g., if the house you're buying costs more than $500,000, you may have to fill in some blanks for additional insurance coverage). 
PureEdge's XML design was well thought out and seemed to be at least a good start..." • [February 21, 2001] "XML Enables Dynamic Technical Publishing." By Lowell Rapaport. In Imaging and Document Solutions Volume 10, Number 2 (February 2001), page 14. "XyEnterprise has developed Content@XML. Content@XML supports XML authoring environments such as Arbortext Epic, SoftQuad's XMetal and HyperWorx, an XML authoring add-on for Microsoft Word. XyEnterprise has also developed XML Professional Publisher, an XML composition engine for creating PDF files from XML content. The PDF files can be printed or delivered electronically... Content@XML retains the strengths XyEnterprise developed for paper document management over the years, providing a production environment incorporating workflow management, collaboration and integrated security. "This is a system that can compete with products like Documentum, but without the high-end deployment costs," Parsons claims. Future plans for Content@XML include improving access security for use with the Internet. XyEnterprise expects to continue serving the legal and financial markets as well as its core base of industrial publishers such as Tweddle Litho... Content@XML combines XML components in a comprehensive data management and workflow application. It supports a number of XML editing applications and manages XML/SGML tagged data in a project-centric workflow environment." • [February 21, 2001] "Small Suppliers: Weak Retail Link." By Ted Kemp. In InternetWeek (February 19, 2001), pages 1, 77. "Retailers are rethinking how to coax their smallest suppliers onto the Web, applying a mixture of training and inexpensive technology. They're finding the Web is a better fit than proprietary EDI links, but it's still not a quick supply chain fix. For decades, retailers have strived to connect with even their smallest suppliers using EDI, sometimes resorting to brute coercion. 
Today, the benefits of electronic links are becoming more obvious to small companies as open Internet technology makes such connectivity simpler and less expensive. But many of those same companies are still in no rush to automate. Sears later this year will test XML connectivity with suppliers of all sizes because it's more flexible and easy to use than EDI -- exactly what small suppliers need. XML data tags can be written in plain English, allowing Web pages to function like database records. EDI uses more arcane communications methods that are harder to learn and implement for enterprise-to-enterprise communications. It also requires a far bigger financial commitment. This is the latest in an ongoing series of initiatives at Sears to make electronic communications easier for suppliers. Today, the giant retailer gives its 3,000 small and midsize suppliers several billing and ordering options that it manages through vendor SPS Commerce. Suppliers without PCs or Internet access can fax invoices to SPS, which reformats them into EDI and passes them along to Sears. The reverse operation takes place with purchase orders...From the suppliers' perspective, a simple lack of know-how is keeping many small firms from linking to their big retail customers via XML or EDI, and small and midsize suppliers are finding that such links often require them to revamp their entire businesses. For example, most retailers want visibility into inventory levels and the flow of goods within supplier operations, but many small suppliers lack inventory management systems that can provide that data in any form. 'Fundamental business processes need to be reconfigured, and small companies usually aren't geared to do that,' said Deepinder Sahni, vice president of AMI Partners, a research firm that specializes in small and midsize business issues. Some 44 percent of U.S. 
businesses with fewer than 100 employees don't even have Internet access in their offices, and an additional 37 percent don't operate a Web site, he noted. Retail experts agree that Web links between retailers and suppliers speed up the order process..." • [February 21, 2001] "IM, XML Will Work Together To Unleash B2B Transactions." By Jamie Lewis [The Burton Group]. In InternetWeek (February 13, 2001) "Although it was first used by teenagers for AOL gab sessions, instant messaging (IM) is becoming a valuable tool for communicating within and between organizations. And its role is poised to expand further. Some companies have started using AOL's services, and both Microsoft and Lotus have moved to integrate IM functions into Exchange and Domino. But there's another role emerging for IM services that may have a profound effect on how enterprises enable distributed computing across the Internet. In short, message-oriented communications mechanisms like IM may well provide the 'software backplane' that many applications and services use to communicate in B2B transactions as well as in consumer-oriented services. As XML assumes its role as the standard syntax for encoding business transactions and communications, message-oriented protocols will enable not only application-to-application but also person-to-person communications, providing the XML-oriented pipe for routing business information online... As it matures, XML will provide that common ground. With XML, the requirement to configure distributed applications in advance before they can communicate is much lower. XML allows applications to exchange objects (or 'documents' in XML parlance) whose intended receive-side processing is self-explanatory. Ideally, a 'self-describing' XML-based B2B message would contain all the content and context that two dissimilar endpoints need to exchange the message. 
Senders and recipients of XML-based B2B messages would be free to process them as they wish, without being tightly bound to each other's programmatic interfaces. Combining XML's power with a message-oriented approach to application communications holds a great deal of promise. In that light, IM services take on the function of message brokers and routers. Such brokers can enable asynchronous conversations across platforms and support dynamic, message-oriented 'publish-and-subscribe' models for application-to-application communications. This doesn't mean AOL IM will become the foundation for all B2B application communications. But it does mean a new generation of products and services that looks an awful lot like IM systems will emerge to serve these needs. Jabber, for example, is an open-source IM client based on XML and managed by Jabber.org. At its core, Jabber consists of several components, including what is in essence an XML router, and other services such as presence management (which allows a communicating party to find out if another party is online). With the right security model, integration with directory services and other key functions, Jabber (or other systems like it) may well become the foundation for a message-oriented communications infrastructure that moves XML messages between applications..." ['The core of Jabber (www.jabber.org) is a vibrant community of developers working at the intersection of XML, presence, and real-time messaging. This community is building a set of common technologies for further development, including servers, clients, libraries, services, and applications. Jabber is fully based on XML, so it provides an extensible architecture for creating the next generation of services and applications on the Internet. The benefits of using Jabber include presence management, transparent interoperability, and real-time routing of structured information.'] See: "Jabber XML Protocol." 
• [February 20, 2001] "Canonical XTM: A canonical serialization format for XML topic maps. Version 0.1." By Lars Marius Garshol, with contributions by Geir Ove Grønmo. Posted to XTM Mailing list 2001-02-20. ['I've now written up a proposal for a Canonical XTM specification, which is appended here. It is submitted for the consideration of topicmaps.org, in the hope that it may be useful. It has already been implemented and is now used internally by Ontopia for testing purposes.'] "This specification describes a serialization format for XML topic maps which has the property that all logically equivalent topic maps have the exact same byte-by-byte representation in this format. This can be used to test the conformance of XTM processors. The specification describes the serialization of a topic map into an output document, but does not concern itself with where that topic map came from. It is NOT a goal to ensure that the canonical topic map can be successfully read into an XTM processor, but merely to confirm that all processing defined by the XTM 1.0 specification has been performed correctly. The topic map must before serialization be processed into consistent topic map, as defined by XTM 1.0. When applying canonicalization to XTM documents no string normalization such as Unicode canonical decomposition must be performed..." See: "(XML) Topic Maps." • [February 20, 2001] "Sun for ONE." By Charles Babcock. In Interactive Week (February 12, 2001). "Sun Microsystems last week announced its Open Net Environment, a software strategy that plays up its Java and Internet integration capabilities. While the software contains few new elements, it maneuvers Sun into a more competitive stance versus Microsoft as a developer platform. 'This announcement may appear boring but it has real significance,' said Frank Gillett, an analyst at Forrester Research. 'It marks the beginning of a new battle over Web services.' 
Sun said its Open Net Environment (ONE) approach allows developers to build 'smart services' -- software code that can recognize a customer visiting a Web site and interact with the customer in ways that match what he or she is trying to do, said Greg Papadopoulos, chief technology officer at Sun. The growing ONE software set -- which includes the Solaris operating system, iPlanet application and integration servers, as well as the Market Maker e-commerce applications and Webtop user interface -- represents an integrated product set for developers, said Scott McNealy, chief executive of Sun. Both Sun and Microsoft are emphasizing the use of eXtensible Markup Language in their product lines. One of the additions to Sun's lineup was support for Simple Object Access Protocol, a Microsoft-sponsored standard for XML-based instructions that can connect dissimilar computer systems. SOAP is under review as a standard by the World Wide Web Consortium. Also, Sun now supports Universal Description, Discovery and Integration, sponsored by Ariba, IBM and Microsoft, as a standard for an XML-based registry of online services. The Sun platform supports Java language, iPlanet servers, ONE Webtop interface and XML language. Strengths: Java has caught on as an enterprise and Web application language. Many network services and XML are built into iPlanet servers. Weaknesses: With the defection of Microsoft, Java is not everywhere -- it's not on the Windows desktop. Integration of Sun's Forte development tools is still to come in some respects. Sun's Star Office applications, on which Webtop is based, are not pervasive." See also the Sun Microsystems white paper. • [February 20, 2001] "NotifyMe Networks Launches With Alert Service." By Mindy Charski. In Interactive Week (February 12, 2001). "A company launching today, called NotifyMe Networks, enables businesses to send 'actionable' alerts to their employees and customers through devices including the telephone, PC and pager.
While many enterprises have implemented messaging systems that can send instant alerts to employees and customers, NotifyMe is among the first to give recipients the opportunity to respond. The company's code is based on XML, so no proprietary software is necessary. Clients are charged maintenance fees and pay per minute or per alert, which Chief Executive Chuck Dietrick said amounts to 'a matter of cents.' NextJet is among the company's first clients. The package delivery service will use the NotifyMe alerts to keep up with changing airline schedules and dispatch couriers to their destinations. NotifyMe expects its technology to be used primarily within companies and between businesses, but there are consumer-oriented applications as well. CNET Network's CNET Auctions, for instance, will make the service available to bidders who wish to be notified through the telephone when they've been outbid on a product. The person can raise the bid by keying the new price into the telephone. CNET will not charge the customer for the service, which could lead to higher revenue for the site as bids increase and strengthens customer loyalty, Dietrick said. EnvoyWorldWide offers a similar alerting product, which now enables recipients to answer multiple-choice questions through a touch tone phone or keyboard." • [February 20, 2001] "Dynamically Generated XSL Revisited. [XML@Large.]" By Michael Floyd. In WebTechniques Volume 6, Issue 03 (March 2001), pages 66-69. ['You could write over 100 style sheets, or you could let some transformations do all the work. Michael Floyd gives you the short story.'] "In the January installment of this column, I demonstrated how to dynamically generate XSL style-sheet transformations, which can then be applied to XML documents. In that column, I assumed that the developer has intrinsic knowledge of the structure and organization, or "schema," of the data being transformed. 
That knowledge is important because style-sheet transformations often use simple step patterns (or even full-blown XPath expressions) to locate a given element or attribute in the document tree, then use <xsl:value-of> to retrieve the item's content. So, XSL style sheets are highly reliant on a document's structure. By moving from statically created style sheets to dynamically generated transformations, you shift responsibility from the style-sheet author to the DOM developer. However, if you can generalize the process, you can realize significant benefits from generating your XSL dynamically. The key to generalizing this process lies in the schema. If you have a formal schema, such as a DTD or XML Schema document, you should be able to discover enough about the organization and structure to generate a reasonable XSL style-sheet document. This month, I'll examine that process and discuss how far you can take it. I wrote this article with the assumption that you, the developer, are familiar with XML Data Reduced (XDR) schemas, the XSL Transformation language (XSLT), and the Document Object Model (DOM)... Style sheets are used to associate meaning with markup elements, preventing us from completely generalizing style-sheet transformations. You can't render a <bold> element unless you know how to generate the appropriate transformation. There are a few ways to solve this problem. One is to let the user associate meaning interactively. You might create a tool that scans the schema, presents a list of all elements (and appropriate attributes), and lets an end user assign a property or behavior. An even simpler method is to create a mapping in your code between markup elements and their transformations. Either way, by moving from statically created style sheets to dynamically generated transformations you can solve the problem of propagating style sheets, and reduce maintenance of them, while generalizing the overall process." 
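The "simpler method" mentioned at the end of the excerpt, a mapping in code between markup elements and their transformations, can be sketched roughly as follows. This is an illustration only, not the column's own code: the element names in `ELEMENT_MAP`, the HTML targets, and the function name `build_stylesheet` are all assumptions, and the sketch skips the schema-scanning step the article describes. It uses Python's standard `xml.etree.ElementTree` to emit one `xsl:template` per mapped element.

```python
import xml.etree.ElementTree as ET

XSL_NS = "http://www.w3.org/1999/XSL/Transform"

# Hypothetical mapping: source element name -> HTML element to emit.
# In the approach the article describes, the keys would be discovered
# by scanning a schema; here they are simply hard-coded.
ELEMENT_MAP = {"bold": "b", "emphasis": "i", "para": "p"}


def build_stylesheet(element_map):
    """Return an XSLT 1.0 stylesheet (as a string) with one template per entry."""
    ET.register_namespace("xsl", XSL_NS)
    root = ET.Element(f"{{{XSL_NS}}}stylesheet", {"version": "1.0"})
    for source_name, html_name in sorted(element_map.items()):
        # <xsl:template match="bold"><b><xsl:apply-templates/></b></xsl:template>
        template = ET.SubElement(
            root, f"{{{XSL_NS}}}template", {"match": source_name}
        )
        wrapper = ET.SubElement(template, html_name)
        ET.SubElement(wrapper, f"{{{XSL_NS}}}apply-templates")
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_stylesheet(ELEMENT_MAP))
```

The generated stylesheet would still need to be handed to an XSLT processor; the point of the sketch is only that the templates are derived from data rather than written by hand.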
For related resources, see "Extensible Stylesheet Language (XSL/XSLT)." • [February 20, 2001] "An Application Server with No Strings Attached. [Product Review: Enhydra 3.5 for Wired and Wireless Devices.]" By Michiel de Bruijn. In WebTechniques Volume 6, Issue 03 (March 2001), pages 70-71. [Review of Enhydra 3.5 for Wired and Wireless Devices, Lutris Technologies.] "When the prerelease version of the latest incarnation of the Enhydra application server hit my desk, two questions came to mind: What's the "wireless" moniker doing in its name, and -- because, after all, Enhydra is an open-source project -- why do developers have to pay for it? [...] Wireless sounds cool, but on closer examination it isn't very impressive. Because these markup languages tend to be based on either HTML or XML, the actual functionality required to support them is minimal. Enhydra's feature list also mentions multiple markup language support. And true enough, it supports the functional equivalent of selecting different document object models (DOMs) to serve up XML-like data over TCP/IP. While this is useful in some situations, it's hardly exciting. A very helpful non-J2EE Web development feature is the innovative XML support through Enhydra's XML Compiler (XMLC). Because it outputs dynamically recompilable Java classes based on your HTML/XML documents instead of regular JavaScript code, XMLC already significantly enhances the performance of your Web apps. However, version 2.0 takes things even further with its 'lazy' DOM parser. The lazy DOM uses a read-only template DOM that's shared across all application instances -- where data is copied into a specific instance only when it's required. This works well if only particular nodes are accessed and the instance doesn't traverse the entire document. Other items of interest include the Presentation, Session, and Database Managers -- collectively powering what Lutris calls SuperServlets. 
These make it easy to associate Java code with URLs, keep session state (with or without using cookies), and access JDBC-compliant databases. And, as I've mentioned before, documentation for all of the tools included with Enhydra is excellent. Without making you wade through too much verbiage, it does a good job of explaining both application server basics and more advanced topics. Since Enhydra is an open-source product, you might have some misgivings about the availability and quality of support. After all, not everyone is interested in just taking the source code and fixing problems themselves. Lutris' boxed version solves that problem quite nicely: It comes with all the amenities you'd expect from a commercial vendor, including a list of supported software, technical support (pay-as-you-go after a number of free incidents), training, and even consultancy. The bottom line is that whether you go for the free download (which may lack some documentation) or the nonfree -- but still affordable -- packaged version, Enhydra offers excellent value. Even if you've been disappointed by a higher-priced solution in the past, Enhydra just might work for you." • [February 20, 2001] "XML meets semantics: The reality. [Thinking XML, #1.]" By Uche Ogbuji (CEO and principal consultant, Fourthought, Inc.). From IBM developerWorks, XML library. February 19, 2001. ['This discussion of XML and semantics kicks off a column by Uche Ogbuji on knowledge management aspects of XML, including metadata, semantics, Resource Description Framework (RDF), Topic Maps, and autonomous agents. Approaching the topic from a practical perspective, the column aims to reach programmers rather than philosophers.'] "This new column, 'Thinking XML', will cover the intersection of XML and knowledge architecture (KA).
Knowledge architecture sounds like something tossed out by a jargon bot, but it's really just an umbrella term for some very useful technologies that are emerging now that XML is entering its adolescence. Metadata management, semantic transparency, and autonomous agents are hardly concepts unique to XML, but the promise of XML to unify the syntax of structured and semistructured data helps turn the next-to-impossible into the feasible. The key feature that will distinguish this column from much of the discussion of such topics is that I'll address programmers, not philosophers. I'll focus on development tools and techniques that allow developers to use XML to better collect and navigate the knowledge latent in data, whether in corporate databases or on the Web itself. This sounds quite grandiose, but the column installments will really be an incremental procession, never leaving common sense too far behind. This first column and the next set the scene, so they will diverge a bit from my ground rule of "lots of code, little philosophy." These first two columns cover the semantics of XML and related vocabularies. I'll discuss only initiatives with existing work products for the developer to take a look at, but I won't be presenting a lot of hands-on code and techniques just yet." See: (1) "XML and 'The Semantic Web'"; (2) "Conceptual Modeling and Markup Languages." • [February 20, 2001] "Practical XML with Linux, Part 3. XML database tools for Linux. Hierarchical, relational, and object databases." By Uche Ogbuji (CEO and principal consultant, Fourthought, Inc.). In LinuxWorld (February 2001). ['Your stash of XML documents is probably growing exponentially. Uche Ogbuji provides an overview of database types, then surveys the wide range of tools available for storing and managing XML data stores.'] "There are almost as many uses of XML as there are XML users, but there are only two ways of looking at how XML documents are organized. 
XML's roots lie in SGML, which was originally conceived as a way of structuring documents for machine preparation and human consumption. XML has inherited much of that bias toward documents, and is often used for presentation-oriented publishing (POP). Examples include books, slide presentations, and company Websites. POP formats tend to have elements and text that flow in a flexible and free-form manner. XML has also gained popularity as the basis for data formats suitable for exchange between computer programs: consumed by machines but able to be inspected by humans. This is known as messaging-oriented middleware (MOM) because of its role in the infrastructure of applications. Examples include serialized objects, automated purchase orders, and Mozilla bookmark files. MOM formats tend to be highly regular, with elements making up well-defined fields with content according to strict data typing. MOM and POP formats often impose different needs on XML databases, based on the differences in usage patterns and format. We will decide whether certain Linux database technologies are more appropriate for MOM or POP documents. There are many ways of structuring databases. The relational model, used by well-known DBMSs like PostgreSQL and Oracle, is probably the most popular for new systems, but there are many other approaches. Databases can be: Hash-based systems; Hierarchical databases; Relational and object/relational databases; Object databases; Multi-dimensional databases ; Semistructured databases... support the notion that it is impractical to have a rigid schema for data that models the real world, given the fluidity of the real world. Many of its concepts are a natural fit for XML and related technologies like the Resource Description Framework (RDF). There is a growing body of work on how to effectively manage XML data in hierarchical, relational, and object databases." See: "XML and Databases." • [February 20, 2001] "[XSLT Tutorial. Part 1]." By Henning Behme. 
From iX/Raven - iX - Magazin für professionelle Informationstechnik. February 19, 2001. "In order to present XML documents or data to the user in an attractive way in browser, mobile phone or PDF format, the original data must first be converted to the necessary formats. This is the purpose of XSLT as part of the style component of XML... The tutorial begins with the basics and finishes by trying out AxKit (v 1.2) for serving XML sources dynamically." The three-part tutorial series is also available in German. See details. • [February 20, 2001] "XML Standards Reference. [EXPLORING XML.]" By Michael Classen. From WebReference.com. February, 2001. "XML standards are defined at breathtaking speed these days. It is also difficult to keep up with the various versions of those standards. This short list focuses on the XML applications that should be of particular interest to webmasters and Web developers. It is not meant to be objective or exhaustive...Try these annotated links to XML standards, recommendations, and resources..." • [February 20, 2001] " Vignette's Bill Daniel tells where enterprise content management is headed." By Martin LaMonica and Tom Yager. In InfoWorld (February 19, 2001). ['As web publishing rushed onto the world scene, Vignette was an early leader in developing content management systems with personalization. Now the company has expanded its product base to be an e-business platform, addressing content management as well as integration and data analysis. That's only natural, says Bill Daniel, Vignette's senior vice president of products. Content management products are evolving from a soup-to-nuts suite to specialized applications that run on top of application servers. InfoWorld Executive Editor Martin LaMonica and East Coast Technical Director Tom Yager talked recently with Daniel about where enterprise content management is headed.'] "... 
let me give you a bold statement: I don't think there will be a discrete content management market in the future. We have said there are three broad sets of functionality that customer-driven applications require: communication, collaboration, and comprehension. And those relate specifically to content, content management, and delivery capabilities. [These are] integration capabilities and analysis capabilities. Those need to be features and functions within an application suite. I don't think any of those, over time, will completely stand alone as big markets. They will be in everything. [...] In the case of Vignette, we have a whole set of APIs so you can access the functionality. In addition to that, our content management systems utilize relational databases from Oracle and other vendors. So that we have, essentially, a very open way to get at the content, to get at the meta data. You can do it programmatically through the APIs or you can do it just through database connectivity. For the state of the art, it's all over the map. Some of our competitors have no APIs. There's no programmatic way to interface to their system. All you can do is have their systems pump information out and you can pick it up. And so that's the two extremes: from very open to literally proprietary. What you're going to see over time is that XML is going to provide the interchange, so that I can create an XML document and serve it to you. And you can unpack it and put it in your content management system. But what people are going to increasingly want is APIs that allow them to drive that information movement programmatically, as opposed to by a human." • [February 19, 2001] "UDDI4J: Matchmaking for Web services. Interacting with a UDDI Server." By Doug Tidwell (Web Services Evangelist, IBM). From IBM developerWorks, XML library. January 2001. ['UDDI4J is an open-source registry implementation from IBM.
Follow Doug Tidwell as he shows how to build applications that can make use of a UDDI registry.'] "As part of its continued commitment to Web services, IBM has released UDDI4J, an open-source Java implementation of the Universal Description, Discovery, and Integration protocol (UDDI). In this article, we'll discuss the basics of UDDI, the Java API to UDDI, and how you can use this technology to start building, testing, and deploying your own Web services. The central idea behind the Web services revolution is that the Web will be populated with an assortment of small pieces of code, all of which can be published, found, and invoked across the Web. One key technology for the service-based Web is SOAP, the Simple Object Access Protocol. Based on XML, SOAP allows an application to interact with remote applications. That's all well and good, but how do we find those applications in the first place? That's where UDDI comes in. UDDI provides three basic functions, popularly known as publish, find, and bind: (1) Publish: How the provider of a Web service registers itself. (2) Find: How an application finds a particular Web service. (3) Bind: How an application connects to, and interacts with, a Web service after it's been found. A UDDI registry contains three kinds of information, described in terms of telephone directories: (1) White pages: Information such as the name, address, telephone number, and other contact information of a given business. (2) Yellow pages: Information that categorizes businesses. This is based on existing (non-electronic) standards. (3) Green pages: Technical information about the Web services provided by a given business... In this article I will show how to write basic code for manipulating objects in a UDDI registry. You can test your applications with IBM's Test registry or download the UDDI registry server software available with the Web Services Toolkit.
UDDI4J contains an implementation of the client side of UDDI (everything your application needs to publish, find, and bind a Web service). It also includes the source of the code, the complete JavaDoc documentation, and three sample applications. I will go over the UDDIProxy class, the most important class in the package, and cover the three sample applications... IBM's release of UDDI4J gives Web services developers a complete and robust implementation of the client side of UDDI. With this code, you can interact with any UDDI registry, giving your applications complete access to the world of Web services." Article also in PDF format. See: "Universal Description, Discovery, and Integration (UDDI)." [cache] • [February 19, 2001] "The B-To-B Integration Equation. Tighter IT links among partners can equal savings in a cooling economy." By Alorie Gilbert. In InformationWeek Issue 824 (February 12, 2001), pages 41-54. "Inside Osram Sylvania, a multimillion-dollar enterprise resource planning system from SAP delivers accurate, critical data in real time to those who need it. [So] Osram Sylvania has set up a new system for sharing business data with its trading partners. The system uses XML, a flexible way to create common information formats and share both format and data via the Web, intranets, and other means. The technology should let the $2 billion lighting-products unit of Osram GmbH exchange the same up-to-the-minute data with partners that's available internally. It also may let Osram Sylvania finally realize a greater return on its ERP investment... As manufacturers scramble to respond to changing market conditions and coordinate production plans with partners, business-to-business-integration initiatives are among those IT projects being spared--in some cases, even gaining higher priority.
Companies are adopting emerging technologies such as XML and E-marketplaces as alternatives to proprietary EDI networks or to processes that still require fax, mail, and phone calls... Aviall Inc., a $550 million aircraft-parts distributor in Dallas, ditched its mainframe system and invested $30 million in a new technology infrastructure, reasoning that new technologies would enable it to provide better service and to sell more--thus making Aviall a more attractive channel partner. Part of Aviall's new technology infrastructure is an integration package from New Era of Networks (Neon) Inc. Aviall is using the product to integrate its internal applications with Aviall.com, and Neon is the key to ensuring that data on the site is updated in real time instead of batch mode. Aviall also plans to use the Neon product to exchange XML messages with suppliers that request system-to-system integration with the company... While many companies bank on business-to-business integration to save them money over time, it's not clear that achieving supply-chain efficiencies through XML tools will be cheaper or less complex than using EDI, at least at first. Osram Sylvania has installed Microsoft BizTalk Server, an XML translation and routing tool that takes data from Osram's SAP system, translates it to XML, and routes it to suppliers. Though Osram buys from more than 10,000 suppliers, it has integrated with only one vendor via BizTalk Server since turning on the system in October. Osram has invested three months and more than $10,000 in the project so far, but is having trouble finding business partners that can communicate via XML and are capable of delivering the kind of real-time data it's looking for. One issue, Laghaeian says, is that companies must 'really be on top of their game'--having made similar investments in ERP applications that provide real-time, accurate data--but many smaller suppliers aren't.
Another trouble spot is a hesitation about XML among some businesses, which suspect it's little more than a reincarnation of the EDI concept. Still, Laghaeian predicts XML will eventually provide an integration model far superior to EDI, one that's driven by real-time business events rather than arbitrary batch transmissions. The company plans to integrate with 50 to 100 of its top suppliers this year via XML... Even $5.3 billion chemical manufacturer Eastman Chemical Co. is moving slowly to integrate with suppliers and customers via XML. Using an XML integration tool from webMethods Inc. (in which Eastman is an investor), it takes an average of eight weeks to add a trading partner to the system. So far, the company is using the application with eight suppliers and customers and has used it to integrate its two online marketplaces, OneChem and Envera. Eastman hopes to add 30 to 50 suppliers and customers to its XML hub this year, but most of its business-to-business transactions are still done via value-added network (VAN)-based EDI. The company expects XML eventually will enable real-time exchange of information about sales forecasts, inventory, and production schedules, helping to reduce inventory..."

• [February 19, 2001] "Iona Technologies Acquires Netfish." By Antone Gonsalves. In CMP TechWeb News (February 15, 2001). "Iona Technologies Inc. announced Thursday the $270 million acquisition of Netfish Technologies Inc., adding to Iona's e-commerce platform the ability to perform more complex interactions between business applications over the Internet. Iona competes with Java application server products from BEA Systems Inc., IBM, and iPlanet, the alliance between Sun Microsystems Inc., and the Netscape division of AOL Time Warner Inc. Netfish, Santa Clara, Calif., competed with WebMethods Inc., which sells technology that leverages extensible markup language for integrating business applications across the Web. Both WebMethods and Netfish allow corporations to set business processes, such as sending a purchase order from a customer to accounting, warehouse and manufacturing applications, as well as e-mail notifications to business managers. Netfish also gives Iona integration capabilities with EDI, or electronic data interchange, an older and expensive form of electronic communication for business transactions, such as orders, confirmations and invoices, between organizations. The acquisition of Netfish helps Iona shed its image as a provider of CORBA-based technology used for application integration in older client-server environments... Since January, Iona has made three other acquisitions to build out its business application platform, including Object Oriented Concepts Inc. and technology from Software AG Inc. and Suplicity, which was founded by NEC Corp." See also the announcement. • [February 19, 2001] "AOL/Netscape, HP, Others Join OASIS." By [Staff.] In CMP TechWeb News (February 12, 2001). "XML-as-a-standard gets a boost Monday when eighteen (18) more companies sign onto OASIS, the international consortium pushing the Extensible Markup Language.
Joining the 180 companies already on board are: America Online/Netscape, Aventail, Access360, Hewlett-Packard, Jamcracker, Deutsche Post AG, and the U.S. Defense Information Services Agency, among others. OASIS' goal is to build and maintain XML schemas for business transactions, customer information quality, entity resolution, directory services, registries, and repositories." See the press release for details. • [February 17, 2001] "Sun Open Net Environment (Sun ONE) Software Architecture. An Open Architecture for Interoperable, Smart Web Services." Sun Microsystems white paper. February 2001 (ca., no date given). "Application software is being broken down into its constituent parts -- into smaller, more modular application components or services. These application services make use of infrastructure software that has also been decomposed into discrete system services. All of these discrete services can be deployed across any number of physical machines that are connected to the Internet. This modular service approach gives businesses great flexibility in system design. By reassembling a few services into a new configuration, a business can create a new business service... 
Web Services: 'Because Web services communicate over standard Web protocols using XML interfaces and XML messages, all Web services environments can interoperate'; Web Services Technologies: 'The leading candidates include UDDI, WSDL, SOAP, and ebXML'; Making Web Services Smarter: 'What's needed is a new set of standards, an XML framework, to represent contextual information; what's also needed is an open architecture that defines how services use this information and assures service interoperability'; Core Standards and Technologies: 'At its essence, the Sun ONE software architecture is based on XML, Java technology, and LDAP'; Sun ONE Software Architecture: 'The first step involves creating the discrete services, which the Sun ONE software architecture refers to as micro services; the second step involves assembling the micro services into composite services, or macro services. Developers create micro services using integrated development environments, code generators, XML editors, and authoring tools.' Standards Backplane: 'Smart delivery supports a variety of clients using a number of device-specific presentation formats, including HTML, XHTML, WML, and VoiceXML'; Smart Process: 'The Transaction Authority Markup Language (XAML) provides an alternate method to choreograph business services; the XAML initiative is sponsored by Bowstreet, HP, IBM, Oracle, and Sun Microsystems. 
XAML defines a set of XML message formats and interaction models, and enables the coordination and processing of business-level transactions that span multiple parties across the Internet'; Web Services Developer Model: 'The Java API for XML Messaging (JAXM) provides a native Java interface to XML messaging systems such as ebXML MS, W3C XP, and SOAP; the Java API for XML Registries (JAXR) provides an interface to XML registries and repositories such as the ebXML registry/repository and the UDDI Business Registry'; Summary: 'The Web services technologies that are available today are still rudimentary, but that fact shouldn't stop developers from venturing into this exciting new territory. Web services represent the next generation of software. This architecture provides a guideline to help developers put the myriad XML standards, technologies, and initiatives in perspective. The developer model provides a foundation for development efforts, indicating which technologies and APIs should be used for each facet of a Web service. Sun will continue to provide tools, technologies, specifications, and advice to promote the Web services model of computing.'" Available in PDF format. See also the main announcement. [cache] • [February 17, 2001] "Keeping Dublin Core Simple. Cross-Domain Discovery or Resource Description?" By Carl Lagoze (Cornell University). In D-Lib Magazine [ISSN: 1082-9873] Volume 7 Number 1 (January 2001). With 33 references. "Multiple views -- different types of metadata associated with a Web resource -- can facilitate a 'drill-down' search paradigm, whereby people start their searches at a high level and later narrow their focus using domain-specific search categories. 
For example, the Mona Lisa may be viewed from the perspective of non-specialized searchers, with categories that are valid across domains (who painted it and when?); in the context of a museum (when and how was it acquired?); in the geo-spatial context of a walking tour using mobile devices (where is it in the gallery?); and in a legal framework (who owns the rights to its reproduction?). Multiple descriptive views imply a modular approach to metadata. Modularity is the basis of metadata architectures such as the Resource Description Framework (RDF), which permit different communities of expertise to associate and maintain multiple metadata packages for Web resources. As noted elsewhere, static association of multiple metadata packages with resources is but one way of achieving modularity. Another method is to computationally derive order-making views customized to the current needs of a client. This paper examines the evolution and scope of the Dublin Core from this perspective of metadata modularization. Dublin Core began in 1995 with a specific goal and scope -- as an easy-to-create and maintain descriptive format to facilitate cross-domain resource discovery on the Web. Over the years, this goal of 'simple metadata for coarse-granularity discovery' came to mix with another goal -- that of community and domain-specific resource description and its attendant complexity. A notion of 'qualified Dublin Core' evolved whereby the model for simple resource discovery -- a set of simple metadata elements in a flat, document-centric model -- would form the basis of more complex descriptions by treating the values of its elements as entities with properties ('component elements') in their own right. At the time of writing, the Dublin Core Metadata Initiative (DCMI) has clarified its commitment to the simple approach. 
The qualification principles [Dublin Core Qualifiers] announced in early 2000 support the use of DC elements as the basis for simple statements about resources, rather than as the foundation for more descriptive clauses. This paper takes a critical look at some of the issues that led up to this renewed commitment to simplicity... Metadata is expensive to create -- especially the more complex varieties -- and the benefits need to be weighed against the costs. The development of a well-scoped qualification model has defined an important niche for the Dublin Core in the larger metadata ecology. It is important to publicize this more prudent approach within the broader community, some of which has been confused over the past few years by mixed messages about Dublin Core and its scope. Equally important for the DCMI is the completion of the supporting documentation -- user guides, encoding guides, etc. -- needed to make the Dublin Core deployable with commonly available web tools. The completion of these tasks will allow the DCMI to free itself from an exclusive focus on the fifteen elements and explore, with partner communities, the roles and interaction of multiple metadata schemes in the Internet Commons..." See: "Dublin Core Metadata Initiative (DCMI)." • [February 17, 2001] "The Open Archives Initiative Protocol for Metadata Harvesting." Edited by Herbert Van de Sompel (Cornell University - Computer Science) and Carl Lagoze (Cornell University - Computer Science). Protocol Version 1.0. Document Version 2001-01-21. "The goal of the Open Archives Initiative Protocol for Metadata Harvesting is to supply and promote an application-independent interoperability framework that can be used by a variety of communities who are engaged in publishing content on the Web. The OAI protocol described in this document permits metadata harvesting. 
The result is an interoperability framework with two classes of participants: (1) Data Providers administer systems that support the OAI protocol as a means of exposing metadata about the content in their systems; (2) Service Providers issue OAI protocol requests to the systems of data providers and use the returned metadata as a basis for building value-added services. A repository is a network accessible server to which OAI protocol requests, embedded in HTTP, can be submitted. The OAI protocol provides access to metadata from OAI-compliant repositories. This metadata is output in the form of a record. A record is the result of a protocol request issued to the repository to disseminate metadata from an item. A record is an XML-encoded byte stream that is returned by a repository in response to an OAI protocol request for metadata from an item in that repository. Appendix 1 supplies 'Sample XML Schema for metadata formats': Each metadata format that is included in records disseminated by the OAI protocol is identified within the repository by a metadata prefix and across multiple repositories by the URL of a metadata schema. The metadata schema is an XML schema that may be used as a test of conformance of the metadata included in the record. XML Schemas for three metadata formats are provided: (1) An XML Schema for the mandatory unqualified Dublin Core metadata format; (2) An XML Schema for the RFC1807 metadata format; (3) An XML Schema to represent MARC21 records in an XML format. Appendix 2 supplies 'Sample XML Schemas for the description part of a reply to Identify request': The response to an Identify request may contain a list of description containers, which provide an extensible mechanism for communities to describe their repositories. Each description container must be accompanied by the URL of an XML schema, which provides the semantics of the container. XML Schemas for two examples of description containers are provided. 
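As a rough illustration of the exchange described above, a service provider's request is an ordinary HTTP GET whose query string names an OAI verb and its arguments, and the reply carries an XML-encoded record. The sketch below only builds such a URL and reads fields from a record. The repository base URL, the item identifier, and the sample record are hypothetical; the verb and argument names (GetRecord, identifier, metadataPrefix, oai_dc) should be checked against the protocol document; and the record shown is a simplified stand-in, not the protocol's exact response format.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Hypothetical repository base URL; a real data provider publishes its own.
BASE_URL = "http://repository.example.org/oai"

# Namespace commonly used for unqualified Dublin Core elements.
DC_NS = "{http://purl.org/dc/elements/1.1/}"


def build_request(verb, **arguments):
    """Return the HTTP GET URL for an OAI protocol request."""
    query = {"verb": verb}
    query.update(arguments)
    return BASE_URL + "?" + urlencode(query)


# Simplified stand-in for a returned record (assumption: not the real schema).
SAMPLE_RECORD = """<record>
  <header><identifier>oai:example:item-1</identifier></header>
  <metadata>
    <dc xmlns="http://purl.org/dc/elements/1.1/">
      <title>Sample item</title>
      <creator>A. Author</creator>
    </dc>
  </metadata>
</record>"""


def dublin_core_fields(record_xml):
    """Collect the leaf Dublin Core elements found in a record."""
    root = ET.fromstring(record_xml)
    return {
        el.tag[len(DC_NS):]: el.text
        for el in root.iter()
        if el.tag.startswith(DC_NS) and len(el) == 0  # skip the wrapper element
    }


request_url = build_request(
    "GetRecord", identifier="oai:example:item-1", metadataPrefix="oai_dc"
)
fields = dublin_core_fields(SAMPLE_RECORD)
```

A service provider would fetch `request_url` over HTTP and hand the response body to a parser such as `dublin_core_fields`; the network step is left out here so the sketch stays self-contained.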
See also the XML Schema for the Response Format [source] and related schemas. See "Open Archives Metadata Set (OAMS)." [cache] • [February 17, 2001] "Tobacco War: Inside the California Battles." By Stanton Glantz and Edith Balbach. XML monograph. [Cited here as an example of XML publishing.] The eScholarship Project in the California Digital Library has designed "a set of tags developed by the Text Encoding Initiative to mark up its monographic publications in XML. They serve those publications to users through the Cocoon Web Publishing Framework." • [February 16, 2001] "Rescuing XSLT from Niche Status. A Gentle Introduction to XSLT through HTML Templates." By David Jacobs (Mitre). [February, 2001.] Via Roger L. Costello. "XSLT is one of the most exciting technologies to come out of the XML family. Unfortunately, its incredible power and associated complexity can be overwhelming to new users preventing many from experimenting with it or causing them to quickly give up in disgust. In fact, unless the method of teaching and the common style of use for XSLT is radically changed to make it more accessible, XSLT will be relegated to niche status... we can see why embedded web scripting languages like Active Server Pages (ASPs), Cold Fusion, PHP and Java Server Pages (JSPs) are so popular. They all leverage a user's knowledge of HTML. They also allow the minimum amount of scripting to be added to accomplish the dynamic feature a developer is looking for. This has allowed numerous web developers to start off with very small projects and then through continuous enhancement and learning, find themselves using the full power of a complex programming language. Furthermore, because of the very incremental nature of that learning the developer was never scared off... So how do we solve this problem and help deliver XSLT's promise to the masses? 
For XSLT to be successful it must be presented and used in a way that adopts those attributes discussed earlier (reuse of knowledge, fast start, and gradualism). This tutorial will attempt to ease XSLT's introduction by focusing on these attributes. First, it is only going to focus on the generation of HTML documents and users who are familiar with HTML. If your goal is to immediately start transforming one XML document into another XML document this tutorial is not for you. The second is to reframe the problem so the XSLT solutions programmers write are more naturally extensible and intuitive. Instead of trying to translate an XML source document into an HTML presentation document, the programmer should see the XML document as a complex data structure with XSLT providing powerful tools for extracting that information into their HTML documents. This allows us to leverage the experience most people have with using an HTML templating language (e.g. ASP, PHP, JSP, Cold Fusion, Web Macro, etc). These templating languages are all based on the basic premise that HTML comes first and all enhancements are then embedded in special tags. Thus, the problem is reframed as: how do I create XSL-enhanced HTML documents? With some caveats, this tutorial will show how XSLT can be used in this same way. The benefit of this approach is it allows the quick use of many of XSLT's powerful functions while letting you learn its more esoteric capabilities as the need arises. In addition the resulting XSLT files are more intuitive and maintainable..." For related resources, see "Extensible Stylesheet Language (XSL/XSLT)." • [February 16, 2001] "Ruby Annotation." Edited by Marcin Sawicki, Michel Suignard, Masayasu Ishikawa, Martin Dürst, and Tex Texin. W3C Working Draft 16-February-2001. Updates the previous version http://www.w3.org/TR/1999/WD-ruby-19991217. Also available in .ZIP archive format. 
Abstract: "Ruby are short runs of text alongside the base text, typically used in East Asian documents to indicate pronunciation or to provide a short annotation. This specification defines markup for ruby, in the form of an XHTML module." Details: "Ruby text is used to provide a short annotation of the associated base text. It is most often used to provide a pronunciation guide. Ruby annotations are used frequently in Japan in many kinds of publications, including books and magazines. Ruby is also used in China, especially in schoolbooks. Ruby annotation is usually presented alongside the base text, using a smaller typeface. The name 'ruby' in fact originated from the name of the 5.5pt font size in British printing, which is about half the 10pt font size commonly used for normal text... This document is a W3C Working Draft produced in preparation for moving to Candidate Recommendation. This document has been produced as part of the W3C Internationalization Activity by the Internationalization Working Group with the help of the Internationalization Interest Group (I18N IG). The I18N WG expects to ask the W3C Director to advance this document to Candidate Recommendation in the near future. The I18N WG and the editors will make any adjustments to the notation in case such adjustments should become necessary as a consequence of changes to XHTML 1.1. Comments in languages other than English, in particular Japanese, are also welcome. More general public discussion of Ruby Annotation takes place on the 'www-international' mailing list... The only change in the actual markup since the Last Call publication of this document was to change the content model for simple ruby from (rb, rp?, rt, rp?) to (rb, (rt | (rp, rt, rp))) to allow two or zero rp elements, but not a single one. However, the document was editorially reorganized and rewritten substantially to take into account the many editorial comments received during Last Call. 
We therefore also invite further comments on presentation, wording, and examples..." • [February 16, 2001] "CSS3 Module: Ruby." Edited by Michel Suignard (Microsoft). W3C Working Draft 16-February-2001. Also available as a ZIP archive. The document "proposes a set of CSS properties associated with the 'Ruby' elements. It is a working draft of the CSS Working Group which is part of the W3C Style activity. It contains a proposal for features to be included in CSS level 3... Ruby is the commonly used name for a run of text that appears in the immediate vicinity of another run of text, referred to as the 'base', and serves as an annotation or a pronunciation guide associated with that run of text. Ruby, as used in Japanese, is described in JIS X-4051 [Line composition rules for Japanese documents, JIS X 4051-1995, Japanese Standards Association, 1995]. The ruby structure and the HTML markup to represent it is described in the Ruby specification ... This CSS ruby model is based on the XHTML Ruby Annotation module proposal, in which the structure of a ruby closely parallels the visual layout of the ruby element. In this model, a ruby consists of one or more base elements associated with one or more annotation elements. The CSS model does not require that the document language include elements that correspond to each of these components. For document languages (such as XML applications) that do not have pre-defined ruby elements, authors must map document language elements to ruby elements; this is done with the 'display' property..." See "W3C Cascading Style Sheets. Level 3." • [February 16, 2001] "URISpace 1.0." By Mark Nottingham (Akamai Technologies). W3C Note 15-February-2001. February 15, 2001. Abstract: "URISpace provides a flexible mechanism for assigning metadata to a group of resources, based on the namespace described by their URIs." 
Detail: "This NOTE identifies problems in the application of metadata across groups of resources, and proposes one method of addressing them. It is intended to bring such problems to the attention of the community, and to foster discussion on appropriate solutions to them... URISpace provides a framework that an application can use to assign arbitrary metadata to an entity based on its URI namespace, optionally using additional external selection criteria. This is done by building a tree of XML elements, called selectors, to describe the URI namespace and contain the metadata. This document describes the elements that allow selection into the namespace, demonstrates how the tree should be structured and used, illustrates how metadata might be represented, and defines ways to extend and optimize functionality. It is intended as a framework for other applications to adapt as their needs require... URISpace is designed to be a standard, clear and extendable mechanism for assigning metadata (such as XML-serialized RDF) to resources based on the namespace that URIs describe, as well as optional external criteria. Besides those mentioned, possible applications include an alternate form [WREC-KP] of proxy auto-configuration [PROXYCONF] that uses XML instead of a scripting language; a standard Web server configuration file format, configuration files for HTTP surrogates and content delivery networks, assigning metadata, and any other circumstance where there is a need to declare an arbitrary collection of resources. See the end of this document for examples. Major design goals for URISpace are to: (1) facilitate assignment of metadata through selection of URIs for a wide variety of existing and future applications; (2) allow flexible selection, so that groups of URIs can be accurately and precisely described; (3) enable efficient description of URI namespaces; (4) define a format that is easy for users to manipulate and understand; (5) define a format that is extensible." 
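The selector-tree idea can be sketched loosely in code. The following is not URISpace syntax: the `Selector` class, its fields, and the sample metadata are hypothetical, only path-segment matching is modeled, and the rule that deeper selectors override shallower ones is an assumption made for this sketch. It shows metadata being collected as a URI's path is matched down a tree.

```python
from dataclasses import dataclass, field
from urllib.parse import urlsplit


@dataclass
class Selector:
    """Hypothetical tree node: matches one path segment and carries metadata."""
    segment: str
    metadata: dict = field(default_factory=dict)
    children: list = field(default_factory=list)


def metadata_for(uri, root):
    """Walk the selector tree along the URI path, merging metadata on the way."""
    segments = [s for s in urlsplit(uri).path.split("/") if s]
    collected = dict(root.metadata)
    node = root
    for seg in segments:
        match = next((c for c in node.children if c.segment == seg), None)
        if match is None:
            break  # no deeper selector applies; keep what was collected so far
        collected.update(match.metadata)  # assumption: deeper values win
        node = match
    return collected


# Illustrative tree: site-wide defaults, refined for two subtrees.
tree = Selector("", {"cache": "1h"}, [
    Selector("images", {"cache": "7d"}, [
        Selector("logos", {"owner": "design"}),
    ]),
    Selector("private", {"cache": "none"}),
])

logo_meta = metadata_for("http://www.example.com/images/logos/a.png", tree)
docs_meta = metadata_for("http://www.example.com/docs/readme.html", tree)
```

Keeping the metadata on the tree nodes rather than on individual resources is what lets one declaration cover a whole group of URIs, which is the problem the Note sets out to address.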
See also the submission request and the W3C staff comment by Dan Connolly. • [February 16, 2001] "The Design and Implementation of the Redland RDF Application Framework." By Dave Beckett (Institute for Learning and Research Technology, University of Bristol). Accepted for presentation at the WWW10 Conference in Hong Kong in May, 2001. HTML, PostScript and PDF versions are available. Abstract: "Resource Description Framework (RDF) is a general description technology that can be applied to many application domains. Redland is a software library for RDF which implements a flexible framework that complements this power and provides a high-level interface allowing instances of the RDF model to be stored, queried and manipulated. Redland implements the model concepts using an object-based API and provides several of the classes as modules which can be added, removed or replaced to provide different functionality or application-specific optimisations. The framework also provides a core technology for developing new RDF applications that can experiment with implementation techniques, APIs and representation issues." The running code is on http://www.redland.opensource.ac.uk/; see also the Redland home page. See "Resource Description Framework (RDF)." [cache] • [February 15, 2001] "Perfect for Each Other. XML and PKI Together May Boost Trust in Online Marketplaces." By Michelle Nichols. In Intelligent Enterprise Volume 4, Number 3 (February 16, 2001), pages 10-11. "Some marriages seem meant to be: Extensible Markup Language (XML) is becoming the lingua franca of e-commerce; digital signatures offer companies a faster, cheaper method of conducting secure online transactions. VeriSign Inc., Microsoft, and WebMethods Inc. are the matchmakers bringing these two technologies together. 
These companies all have a stake in raising the level of trust in online transactions: Mountain View, Calif.-based VeriSign plays a significant role in the growing acceptance of digital signatures; WebMethods, based in Fairfax, Va., helps companies set up business-to-business (B2B) marketplaces; and Microsoft's .Net architecture and BizTalk server are largely targeted at online marketplaces. For these marketplaces to mature and become more popular, businesses must be confident that transactions are legally enforceable and verifiable. Digital signatures, which have the backing of federal law, can verify identities on both sides of the transaction and the content of the transaction itself. But adopting a public key infrastructure (PKI) framework, the basis of many digital signature technologies, is not simple and can be expensive. But help may be on the way: Microsoft, VeriSign, and WebMethods recently introduced the XML key management specification (XKMS), which they believe will simplify integrating digital signatures and data encryption with Web applications. They also hope to speed development of applications using these technologies by making XKMS publicly available and submitting the specification to Web standards bodies for consideration as an open Internet standard. The companies assert that the XKMS spec, along with the recently drafted XML digital signature standards and the emerging XML encryption standard, can provide an open framework for interoperability across applications. (Microsoft plans to include XKMS in its .Net architecture.) XKMS is also compatible with the emerging standards for Web services description language (WSDL) and simple object access protocol (SOAP)..." See (1) "XML Key Management Specification (XKMS)" and (2) the announcement for an XKMS interest group meeting. • [February 15, 2001] "Functional Programming and XML." By Bijan Parsia. From XML.com. February 14, 2001. 
['Current XML programming practice is dominated heavily by object-oriented techniques, but are we missing out on new and innovative ways of handling XML? Find out in our whistle-stop tour of functional programming and XML.'] "As is all too common in the programming world, much of the XML community has identified itself and all its works with object oriented programming (OOP). While I'm a fan of OOP, it's clear to me that even the best OOP-for-XML techniques aren't a panacea, and, moreover, there is an awful lot of ad hoc 'objectification' which tends merely to make our lives more difficult and our programs less clear. This short-sightedness has two negative consequences: it tends to limit the techniques and tools we use, and it tends to restrict our understanding. For example, although the Document Object Model (DOM) satisfies few and inspires fewer, its status as a standard tends to inhibit (though, fortunately, not to suppress) exploration into alternative models and practices. The debate tends to revolve around 'fixing' DOM, which is cast in terms of getting a better object model. While a better object model would be nice, it's important to keep in mind that XML is neither historically nor inherently object-oriented. Thinking otherwise may lead you to perceive or anticipate a better fit than you actually get. One cure for intellectual myopia is to go exploring. In this article, I provide a beginner's travel guide to the interesting and instructive land of functional programming (FP) and XML... As the name implies, functional programming is 'function-oriented' programming (though C doesn't really count). FP languages allow you to manipulate functions in arbitrary ways: building them, combining them, passing them around, storing them in data structures, and so on. And it's key that FP functions are (at least, ideally) side-effect free. 
FP's declarative nature makes it much easier to deal with things like concurrency, data structure transformations, and program verification, validation, and analysis. Functions are to FP what objects are to OOP. While there are many ways to classify functional languages according to technical features and formal properties, such divisions tend to be more useful to the FP experienced, degenerating into mere buzzwords for the novice. I have a rough and ready way of dividing FP languages which seems to capture how they feel to the casual FP dilettante: (1) Lispy languages, especially Scheme (and, somewhat atypically, Rebol and Dylan). (2) Type-obsessed languages, such as ML, OCaml, Clean, and Haskell. (3) Prolog-derived languages, such as Erlang, Mercury, and Oz... The rest of this article highlights some of the interesting features of three FP systems for processing XML: XMLambda (an XML-specific FP language), HaXML (XML facilities for Haskell), and the new XML support in Erlang... Ideas from the functional programming world have always percolated into mainstream practice, but we seem to be reaching a point where many FP techniques and tools are poised for wholesale -- or at least retail -- acceptance. For example, James Clark's recently proposed Trex validating language for XML..." • [February 15, 2001] "Perl and XML: High-Performance XML Parsing With SAX." By Kip Hampton. From XML.com. February 14, 2001. ['Manipulating XML documents in Perl using DOM or XPath can hit a performance barrier with large documents -- the answer is to use SAX.'] "The problem: The XML documents you have to parse are getting too large to load the entire document tree into memory; performance is suffering. The solution: use SAX. SAX (Simple API for XML) is an event-driven model for processing XML. Most XML processing models (for example: DOM and XPath) build an internal, tree-shaped representation of the XML document. 
The developer then uses that model's API (getElementsByTagName in the case of the DOM or findnodes using XPath, for example) to access the contents of the document tree. The SAX model is quite different. Rather than building a complete representation of the document, a SAX parser fires off a series of events as it reads the document from beginning to end. Those events are passed to event handlers, which provide access to the contents of the document... There are three classes of event handlers: DTDHandlers, for accessing the contents of XML Document-Type Definitions; ErrorHandlers, for low-level access to parsing errors; and, by far the most often used, DocumentHandlers, for accessing the contents of the document. For clarity's sake, I'll only cover DocumentHandler events... XML::Parser::PerlSAX offers a complete SAX1 API but, as you may be aware, SAX2 is now considered the standard. If you're wondering about SAX2 support for Perl, you should know that Ken MacLeod, author of XML::Parser::PerlSAX, as well as other top-notch XML Perl modules, has announced full SAX2 support for Perl using his excellent Orchard project. Orchard provides a lightning-fast element/property model upon which developers can easily implement a wide range of XML APIs (or, for that matter, any node-based property set, not just XML). In addition to SAX2, the 2.0 beta release of Matt Sergeant's XML::XPath is also built upon Orchard and the performance gains are quite astonishing. If you are serious about high-performance XML processing in Perl, I strongly encourage you to visit the Orchard project for more information." See: "XML and Perl." • [February 15, 2001] "XML-Deviant: XSLT Extensions Revisited." By Leigh Dodds. From XML.com. February 14, 2001. 
['The first Working Draft of XSLT 1.1, though attempting to address the portability of stylesheets that use extension functions, has failed to please everyone in the XSLT developer community.'] "Early last year the XML Deviant reported on concerns expressed among the XSLT development community about portability of XSLT stylesheets. And despite publication of the XSLT 1.1 Working Draft which attempts to address these issues, some developers are still far from happy... In March 2000 the Deviant summarized a debate that ranged over several mailing lists. Its subject was concerns about lack of portability of stylesheets among the major XSLT engines. While the XSLT specification defined a language-neutral extension mechanism for extension functions, implementors were tied to the proprietary APIs provided by a particular XSLT engine. This meant that functions, and hence stylesheets, were not portable. (See "Unifying XSLT Extensions" for additional background.) At the time the consensus was that a standardized language binding was required, along with implementations of common extension functions. Shortly thereafter the XSL Working Group announced that it was aware of these concerns and was planning to address them in a revision of the XSLT specification..." For related resources, see "Extensible Stylesheet Language (XSL/XSLT)." • [February 15, 2001] "Millau: an encoding format for efficient representation and exchange of XML over the Web." By Marc Girardot (Institut Eurécom Sophia Antipolis, France) and Neel Sundaresan (IBM Almaden Research Center San Jose, California, USA). Presented at WWW9. "XML is poised to take the World Wide Web to the next level of innovation. XML data, large or small, with or without associated schema, will be exchanged between increasing number of applications running on diverse devices. Efficient storage and transportation of such data is an important issue. 
We have designed a system called Millau for efficient encoding and streaming of XML structures. In this paper we describe the Millau algorithms for compression of XML structures and data. Millau compression algorithms, in addition to separating structure and text for compression, take advantage of the associated schema (if available) in compressing the structure. Millau also defines a programming model corresponding to XML DOM and SAX for XML APIs for Millau streams of XML documents. Our experiments have shown significant performance gains of our algorithms and APIs. We describe some of these results in this paper. We also describe some applications of XML-based remote procedure calls and client-server applications based on Millau that take advantage of the compression and streaming technology defined by the system... The Millau encoding format is an extension of the WAP Binary XML format. The WBXML (Wireless Application Protocol Binary XML) Content Format Specification defines a compact binary representation of XML. This format is designed to reduce the transmission size of XML documents with no loss of functionality or semantic information. For example, WBXML preserves the element structure of XML, allowing a browser to skip unknown elements or attributes. More specifically, the WBXML content encodes the tag names and the attributes names and values with tokens (a token is a single byte). In WBXML format, tokens are split into a set of overlapping 'code spaces'. The meaning of a particular token is dependent on the context in which it is used. There are two classifications of tokens: global tokens and application tokens. Global tokens are assigned a fixed set of codes in all contexts and are unambiguous in all situations. Global codes are used to encode inline data (e.g., strings, entities, opaque data, etc.) and to encode a variety of miscellaneous control functions. 
Application tokens have a context-dependent meaning and are split into two overlapping 'code spaces', the 'tag code space' and the 'attribute code space'..." See: "XML and Compression." • [February 14, 2001] "Soapbox: Why I'm using SOAP. One developer tells why he's feeling sold on SOAP." By Benoît Marchal (Software Engineer, Pineapplesoft). IBM DeveloperWorks XML Library. February 2001. ['In the XML zone's new opinion department, Benoît Marchal steps up on the soapbox to tell why SOAP is winning him over. SOAP's selling point is its simplicity, Marchal says. Because the new protocol builds on familiar technologies, in particular the Web server and XML, it's relatively easy for developers to design and deploy SOAP servers. Learn when it's best to consider and use SOAP.'] "SOAP, the Simple Object Access Protocol, is a new protocol engineered by IBM, Microsoft, Userland, and DevelopMentor to support remote procedure calls (and other sophisticated requests) over HTTP. SOAP draws from two distinct environments. Built on HTTP and XML, SOAP aims to be as simple as the Web. Yet it targets object-oriented remote procedure calls borrowed from CORBA and DCOM. I think that the major benefit of adopting SOAP is that it builds on a Web server. Therefore, to understand SOAP, one needs to start with Web servers. Modern Web servers -- and in particular application servers like WebSphere, WebLogic, or ColdFusion -- are powerful development platforms. They are optimized to process requests efficiently. SOAP is an attempt to turn these Web servers into object servers. By object servers I mean the middle-tier servers in a three-tier architecture. SOAP supports object servers this way by adding a thin XML layer over HTTP... I started using SOAP and I realized that its major benefit is that it builds on the Web. Granted, SOAP is more limited than CORBA and DCOM. 
For example, it offers limited support for object-oriented concepts like inheritance, and it lacks transaction management (as offered by MTS with DCOM or OTS with CORBA). However, what it lacks in power, SOAP more than compensates for in its simplicity. For example, since SOAP uses HTTP, SOAP servers are Web servers. Most businesses have significant experience in deploying Web servers or developing Web applications. With SOAP, they can leverage that experience for object servers. I can think of only two drawbacks to SOAP but, depending on the project, they can be significant. First, SOAP is not an official standard yet. The W3C has launched its own Protocol Activity, and there is no guarantee that the result will be compatible with SOAP. Second, SOAP is new, and it lacks some of the tools common for CORBA or DCOM. In particular, there is no transaction management with SOAP. This is not an inherent limitation of SOAP; I'm sure transaction managers will eventually appear on the market, but they are not available yet." See "Simple Object Access Protocol (SOAP)." • [February 14, 2001] "ebXML Moves Closer to Completion." By Tom Sullivan. In InfoWorld (February 12, 2001). "[Vancouver, British Columbia.] Bringing the pending technology closer to a final specification, ebXML (e-business XML) standards bodies kicked off a week's worth of meetings and working groups here Monday. ebXML is a specification for XML-based global infrastructure for e-business transactions, being driven by the Organization for the Advancement of Structured Information Standards (OASIS) and the United Nations' Center for Trade Facilitation and E-business (UN/CEFACT). The working groups at the conference aim to further a number of ebXML pieces, including the Technical Architecture, Registry Information Model, and Registry Services specification, all of which currently are out for review. 
Working groups this week will also develop core competencies of ebXML, such as the Implementation Model methodology, as well as messaging and business processes. The goal is to have the final specification ready to be voted on by April 23. ebXML for the last 15 months has been driven by OASIS and UN/CEFACT, under a relationship intended to last 18 months... Bob Sular, vice chairman of the ebXML organization, pointed to the industry movement toward Web services, and the efforts of Microsoft, Sun, and Hewlett-Packard specifically, as catalysts of change in the industry that the standards bodies need to consider in the future." See: "Electronic Business XML Initiative (ebXML)." • [February 14, 2001] "Universal Description, Discovery, and Integration (UDDI) and the U.S. Federal Government." By Eliot Christian (USGS). Presented to the CIO Council XML Working Group. February 14, 2001. "Registry operator assigns a unique key to each registered business and service; operators synchronize updates among all UDDI registries. Citizens, businesses, agencies, search engines, and software apps query the registry to discover services. Agencies, businesses, and standards organizations register different types of services. Agencies register themselves and their services. Citizen, business, or agency uses service data to facilitate interaction..." See: "Universal Description, Discovery, and Integration (UDDI)." • [February 14, 2001] "MSL: A Model for W3C XML Schema." By Allen Brown [Microsoft, allenbr@microsoft.com], Matthew Fuchs [Commerce One, matthew.fuchs@commerceone.com], Jonathan Robie [Software AG, jonathan.robie@SoftwareAG-USA.com], and Philip Wadler [Avaya Labs, wadler@avaya.com]. November 2000. 22 pages, with 13 references. Accepted for WWW10, Hong Kong, May 2001. Abstract: "MSL (Model Schema Language) is an attempt to formalize some of the core ideas in XML Schema. The benefits of a formal description is that it is both concise and precise. 
MSL has already proved helpful in work on the design of XML Query. We expect that similar techniques can be used to extend MSL to include most or all of XML Schema." Referenced in Philip Wadler's XML resources. See also the sources of the MSL paper (mv to msl-www.tar, 'tar' format). See discussion. [cache PDF, cache sources] • [February 13, 2001] "Using XML and Java technology to Develop Web Services: Keeping Migration Paths Open. [XML at Sun.]" From Sun Microsystems. February 06, 2001. "One of the greatest challenges facing developers today is how to take advantage of current Web service models while still leaving open smooth migration paths to the next generation of open, loosely coupled, context-aware, smart Web services. XML and Java technology are ideally suited to provide this smooth migration path -- XML as the open and ubiquitous language of data exchange, and Java as the industry-wide standard for scalable, cross-platform service delivery...The current Web services model requires building protocols, interfaces and products that support instantaneous combining and recombining of components through: (1) Discovery -- Discover other relevant services by means of ebXML Registry and Repository and/or UDDI. (2) Creation -- Create XML content, and develop business logic code. (3) Transformation -- Map to existing XML, Java or SQL applications. (4) Building -- Wrap components with XML message interfaces, using ebXML Message Service or SOAP. (5) Deploying -- Transmit services to service deployment machines. (6) Testing -- Test applications as they run locally or remotely. (7) Publishing -- Advertise the service to an ebXML or UDDI registry. XML: Lingua Franca of Web Services. The language of data exchange at the heart of almost all Web service components, protocols, and APIs today is XML. 
Unlike ASCII or HTML, XML is structured and hierarchical, which means it can be represented as a tree structure or data model appropriate to applications, but it also uses plain text for data representation, which makes it easy for both humans and search engines to read. XML is also platform-neutral, and as such ideally matched with Java technology for developing Web-based applications. Java technology provides a portable, platform-independent software environment, and XML provides portable, platform-independent data..." • [February 09, 2001] "JavaTalk: DOM and the Java API for XML." By John Wetherill. In SunServer Magazine Volume 15, Number 2 (February 2001), page 6. "The Java API for XML Processing (JAXP) encompasses three distinct Java APIs designed to process XML content using Java technology. These include SAX (Simple API for XML), DOM (Document Object Model), and XSLT (Extensible Stylesheet Language Transformations). Last month we looked at SAX, which serially parses an XML document. This column will examine the details of DOM and contrast it with SAX. DOM was defined by the W3C (unlike SAX which was created by participants in the XML-DEV mailing list), and has evolved into a standard. It is comparatively easy to use and provides the developer with a tree structure representing XML content. This structure can be manipulated directly by the application and is ideal for interactive applications that need to access and modify XML content directly. DOM has no specific ties to the Java language as it was defined to be platform and language neutral. Several DOM bindings currently exist, including Java, ECMAScript (a standardized JavaScript), and a language-neutral IDL based binding. This article will focus on the Java binding. Because the tree-structure representing the entire XML content is maintained in-memory, DOM-based applications can be CPU- and memory- intensive. 
Generally SAX, with its serial call-back mechanism for XML parsing, is more suitable for performance-sensitive server applications. DOM, on the other hand, is more appropriate for interactive applications which need to access the entire XML content at once. JAXP is an abstract layer on top of DOM, defining a pluggable architecture for XML-compliant parsers. The specific XML parser that is used can change without requiring any modifications to the application source code. JAXP provides a standard way to interpret, manipulate and save XML content, as well as a mechanism to translate XML and apply style sheets. Currently the specification for JAXP is part of the JCP, and was due to be finalized and ship early this year. A useful distinction to make between DOM and SAX centers on the lifecycle of a program using each. The lifecycle of a SAX-based application is equal to that of the parsing process. Here the application will receive callbacks during parsing, and will typically exit when parsing completes. By contrast, the lifecycle of a DOM-based application begins when parsing is complete. The DOM application will receive a node for the tree representing the entire document. Given this information, the application can traverse the document tree, then write, modify and store the XML content... DOM and SAX are two complementary APIs provided by the Java API for XML processing. You can explore these technologies further by downloading Sun's implementation from http://java.sun.com/xml and building applications by following the examples presented in the tutorial available at the same location. Note that Sun's DOM implementation uses SAX to parse the original XML, however there is no dependency on SAX, and because of its pluggable design any XML parser can be used by the implementation to interpret the XML." See: "Java API for XML Parsing (JAXP)." • [February 09, 2001] "XML, Java Open The Vault to Legacy Data." By Amatzia Ben-Artzi. 
In SunServer Magazine Volume 15, Number 2 (February 2001), pages 14-15, 23. ['This case study follows the challenge of taking advantage of legacy data as part of the larger challenge of Web content management using an innovative blend of XML and Java technologies.'] "The challenge of taking advantage of legacy data is part of the larger challenge of Web content management. Companies need to maintain a site's value by managing and delivering dynamic and targeted content potentially from any source, including legacy systems, to the site as well as other media devices. This means they need to come up with a procedure for centrally organizing, controlling, finding, sharing and storing Web content at each stage of the content lifecycle. The goal is to create rich content once, keep it looking the same across all formats and use it for multiple objectives, including e-commerce, CRM, personalization and one-to-one marketing... An innovative blend of XML and Java technologies can provide complete content lifecycle management that embraces even legacy data and systems. Sun recently selected the TrueView eCommerce System from NetPost for this purpose. With TrueView, XML serves as a content management platform that enables interactive transactions with critical e-business content including marketing data, yellow pages, catalogs, classified ads, electronic bill payment and commercial and consumer publications. A company wishing to use its legacy data for e-business needs to take a four-step approach: (1) Convert all unusable legacy data (text and graphics) to a normalized XML database using automated tools and minimal manual intervention. This database may also house newly created XML content. The conversion process requires more than a superficial re-formatting; it requires the XML to capture both the deep meaning of the content plus the layout and style. (2) Deliver important content easily and rapidly to multiple devices. 
TrueView uses XML to enable 'smart' presentation of content, using embedded structural information to format it in a way that is meaningful to the reader. (3) Use a single, consistent data source to maintain a consistent branding look and feel across different media devices, reinforcing brand equity and customer loyalty. This brings in new revenue streams and additional revenue from loyal customers by providing relevant content to them on the media device of their choice. (4) Embed eCommerce applications as an integral part of rich online content. Storing content in an open, standard XML format as objects in a relational database management system provides the foundation for an enterprise e-business application. In the approach used by TrueView, documents created with most commercial applications such as Microsoft Word, Quark Xpress, PDF and EPS are converted to XML format at import. This approach to conversion captures two key aspects of content: both the deep meaning, or raw data, and the layout and typography. Sometimes, in converting data from a legacy database, some of the deep meaning that's necessary for correct formatting is missing. For example, a major Bell company is using a sophisticated automated process for converting EPS-format yellow pages data into XML. When elements are ambiguous, the solution allows manual intervention to resolve the issue. Most users find they can move 90 percent of their data into the XML format without any intervention. Once the legacy data has been stored in the XML database, it is accessible and indexed for searching. Business rules in the database control access and presentation of content for each media device..." • [February 09, 2001] "The Hook: A Minimal Validation Language of One Element Based on Partial Ordering." By Rick Jelliffe. 2001/02/07. "The Hook validation language is a thought experiment in minimalism in XML schema languages. 
The purpose of such a minimal language would be to provide useful but ultra-terse success/fail validation for basic incoming QA, especially of datagrams. It is like a checksum for a schema. The validation it performs can be characterized as "Does this element have a feasible name, ancestry, previous-siblings and contents?", there being some tradeoff between the how fully the later criteria are tested. Let us start with the following technical criteria: (1) Smaller than DTD: if it is downloaded from a server as a separate file, it should be downloadable in the first packet group, so less than 512 (the minimum MTU) -100 (for MIME header) =412 bytes; (2) Implementable by a streaming processor; (3) No forward references; (4) No pathological schemas as far as blowouts; (5) An efficient implementation should be possible; (6) Suitable for coarse validation of document for some significant issues; (7) The schema should be namespace-aware; (8) The minimal schema should only require 1 element or perhaps fit in a PI; (9) The datatype should be expressible using XML Schemas regular expressions or simple space-separated tokens; (10) The schema paradigm is the (partial) ordering of elements against the information kept during stream processing... A Hook schema is an element containing a list of element names, some of which may be grouped by square brackets. This list represents a certain ordering of the names and validation consists of checking conformity to this ordering. The DTD for the language is [7 lines]... Hook seems to suit languages that have large flat bottoms, languages specific requirements early on in each content model, languages with specific elements that do not re-occur in different contexts with different priorities, languages with attributes that are not vital or will be checked by other mechanisms. Hook would seem useful as a coarse-grained but ultra-terse validation language. 
If we say that validation is to catch errors that are most likely to happen, the most likely errors are spelling errors, children in the wrong order, and required parents: Hook gets or catches most. How much would this help an interactive editor? It would know which elements can start, but for new documents it would present too many choices: however if editing existing documents it would cull the available list pretty well, because it would know what the current level was. It would know empty elements... Joe English has posted interesting material regarding formalisms for Hook, algorithm for implementing and other material..." For schema description and references, see "XML Schemas." • [February 09, 2001] "An Open Response to Microsoft. [How to .COM, Reality Check/]". By Sun Microsystems. February 05, 2001. "Sometimes, life's little ironies are just too funny for words. Consider an e-mail sent to reporters by Chuck Humble of Waggener-Edstrom, the long-time public relations agency for Microsoft. In his missive, Chuck (in a not so humble fashion) crows about Microsoft's .NET (perhaps it should be .NOT?) strategy for delivering 'Web services' over the Internet, and lobs what he's hoping are 15 tough questions for Sun to answer about our software strategy." [OpEd. Hmmm... I think Dave Winer's title would constitute an improvement here: "Sun and Microsoft sittin in a tree. A-R-G-U-I-N-G." See Scripting News, Wednesday, February 07, 2001.] • [February 09, 2001] "B-to-B Pitfalls. [Special News Report.]" By Mark Leon. In InfoWorld Volume 23, Issue 6 (February 05, 2001), pages 36-37. With sidebar: "EDI vs. XML - Still Undecided." "The Business-to-Business automation story is no longer getting the rave reviews that were standard fare just one year ago. Professionals and analysts are realizing that, although the promise may still be there, the b-to-b tale is no different from the technology and business stories that preceded it. 
There are no out-of-the-box solutions, no instant panaceas to be found in the Internet, XML, or any other server products now available. None of this is news to Frank Campagnoni, CTO of General Electric's Global Exchange Services (GXS) division, which focuses on b-to-b exchanges for GE customers. Last spring, GE separated GXS from its General Electric Information Systems division for the express purpose of supercharging its next-generation b-to-b offerings. 'It is absolutely true that b-to-b is far more challenging than most people wanted to admit early last year,' Campagnoni says. 'But we always knew this. We still believe there is huge opportunity here, but there was a lot of vapor around b-to-b. Now it is subsiding.' According to Geoffrey Bock, an analyst at the Patricia Seybold Group, in Boston, this vapor has consisted of a general confusion surrounding b-to-b. 'We are in a Gertrude Stein dilemma here: We don't know the answer because we don't know what the question is,' he says... 'The human-to-machine interaction paradigm works well in the b-to-c [business-to-consumer] space,' GE's Campagnoni says. 'The payoff of a human sitting in front of a browser, however, is very limited for b-to-b applications.' This is why BevAccess helps its larger retail customers install and integrate software that talks both to the retailer's POS system and BevAccess' servers, which Sanders says can take as little as three days. For big customers, such as restaurant chains Bennigans or TGI Fridays, the process will probably take longer because they are custom jobs requiring integration with corporate software at company headquarters, adds Sanders. One of the critical components of b-to-b integration is deciding which data format to use between systems. Choices range from ASCII files, which Sanders calls the 'lowest common denominator,' to high-level formats such as EDI (electronic data interchange) or XML. 
EDI remains one of the most popular formats, despite XML's anointed status as a worthy successor (see related article, above). For example, Sanders says his retail package customers tend to prefer EDI; BevAccess plugs into GE's substantial existing EDI infrastructure to process all of its EDI transactions. But XML hasn't fallen out of favor: Both Sanders and Campagnoni still believe that because of its flexibility, XML will become the first choice for formatting e-commerce data. GE, for example, made a substantial commitment to b-to-b systems built around XML and the Internet when it created the GXS division last year to fully exploit Internet technologies such as XML to support GE's b-to-b networks. But Campagnoni says that progress is slow because 'XML is creeping in, but it remains largely invisible.' This invisibility is due to the widespread deployment of older EDI systems and the fact that XML is not necessarily easier to implement. 'If you want to exchange XML documents with another company, you have to both agree, not just on XML, but also on a particular flavor of XML,' Campagnoni says. 'We still don't have the universal standards for XML that make this easy to support.' GXS has partnered with Edifecs, a b-to-b software vendor in Bellevue, Wash., to deal with these problems. GXS will use Edifecs' CommerceDesk platform to accelerate the process of linking two or more trading partners together. The trading partners may use different versions of XML and/or EDI, but CommerceDesk will translate between languages to speed up transactions. Too good to be true? It was both unrealistic and all too human to believe that the Internet, combined with XML, would quickly democratize b-to-b marketplaces, making it possible for any business with a browser to participate. Even the optimists have a word of caution when it comes to b-to-b automation: If there is not a definite, demonstrable ROI, cost and integration issues won't really matter..." 
• [February 09, 2001] "ebXML Technical Architecture Specification." Version 1.0.2. By: ebXML Technical Architecture Project Team. Edited by Brian Eisenberg (DataChannel) and Duane Nickull (XML Global Technologies). 5-February-2001. "This document is a final draft for the eBusiness community. Distribution of this document is unlimited. This document will go through the formal Quality Review Process as defined by the ebXML Requirements Document. This document describes the underlying architecture for ebXML. It provides a high level overview of ebXML and describes the relationships, interactions, and basic functionality of ebXML. It should be used as a roadmap to learn: (1) what ebXML is, (2) what problems ebXML solves, and (3) core ebXML functionality and architecture. Other documents provide detailed definitions of the components of ebXML and of their inter-relationship. They include ebXML specifications on the following topics: Requirements, Business Process and Information Meta Model, Core Components, Registry and Repository. Trading Partner Information. Messaging Services. These specifications are available for download at http://www.ebxml.org." [cache] See: "Electronic Business XML Initiative (ebXML)." • [February 09, 2001] "Vertical XML Standards and ebXML." By Ron Kleinman (Chief Vertical Evangelist, Sun Microsystems, Inc.). February 2001 [ca.; no date given] "This paper summarizes several vertical XML standard efforts (in retail, hospitality, travel, and education) and examines the requirements that drove the creation of three distinct layers of XML infrastructure for each of them. Each layer is analyzed in some detail, and a set of design alternatives are identified and explored. Finally, the paper evaluates the applicability of the SOAP (Simple Object Access Protocol) and ebXML TR&P (Transport, Routing and Packaging) standards to vertical XML standard construction. The paper is presented as a set of answers to several critical XML design questions... 
[What exactly is a Vertical XML Standard?] The promise of XML is being realized in a large number of industries such as retail, hospitality, travel, education, and finance. Each of these industries already has or is currently developing an XML standard specific to its own unique needs. However, all of these standards share several common goals. (1) Interoperability. At the minimum, such a standard defines the set of XML documents exchanged between two communicating applications, whether they are located within separate business partners, or co-located within the same corporate enterprise. The overarching goal is for a given application to interoperate with its partner independent of the platform (hardware, software, middleware) on which the partner might be deployed. (2) Plug and Play Components. Vertical XML standards efforts are primarily driven by end users, although the bulk of the work is often performed by the software vendors and system suppliers who sell to them. The effort interests end users because once an application is specified in terms of the data it sends and receives, it is theoretically possible to replace it with a different application that also meets this specification. A vertical XML standard, therefore, has the potential to turn every industry-specific software package into a commodity, and to break any existing vendor 'locks' at both the system AND application level -- obviously a mixed blessing for established suppliers. (4) Compliance Testing. In order to ensure such interoperability (and often to supply a revenue stream to the standards organization), some sort of compliance test procedure is often constructed. This involves creation of a test harness and a series of test procedures that are generated from use cases defined by the various subcommittees within the standard body..." • [February 09, 2001] "XML: State of the Union." By Bill Trippe and David Guenette. 
In The Gilbane Report on Open Information & Document Systems Volume 8, Number 10 (December 2000/January 2001), pages 1-10. "The annual XML conference produced by the GCA is still the largest single XML event, even though there are now a lot of other well-attended developer-oriented XML conferences. XML 2000 was also the most well rounded U.S. conference this year in terms of attendees. During the opening keynote I asked for a show of hands and it looked like almost 40% of the approximately 3000 attendees were not developers. Nothing gets implemented without developers, but business and IT managers do have something to say about the projects that will get funded and staffed. In any case, the attendee mix make it the best event of the year for getting your hands around both what is hot, and what is actually being done with XML. Because of the event's history and the fact that the majority of early XML implementations focused on web publishing and content management, XML 2000 was also a great place to look for the latest content-oriented application approaches, tools, and experience. Bill and David produced a Gilbane Report show daily at the conference and were all over the show floor, in the conference sessions, and in the hallways and social events to pick up the latest news and buzz. Topic Maps, Schemas, XSLT, XML content management, the Semantic Web, and continuing efforts to get XML from Microsoft Word were some of the areas they found getting a lot of attention... The Gilbane Report and AIIM (The Association for Information and Image Management - www.aiim.org) sponsored a special interest day on the use of XML in Enterprise Content Management. This was a good opportunity to see how large organizations are using XML, away from the hype of any particular vendors. 
Following opening remarks from Frank Gilbane and AIIM President John Mancini, and some tutorials, case studies were presented on how XML is used at Underwriters Laboratory, the Pratt-Whitney division of United Technologies, and the National Library of Medicine. While each organization had different challenges, and had arrived at different solutions, there were some interesting commonalities. The net effect of the ECM Special Interest Day case study presentations was at once both sobering and exciting. Sobering in that the many advances being talked about elsewhere at the conferences didn't immediately much figure into the work at hand, and certainly not as an easy panacea to the hard work required to change the work processes of the various departments represented. But the real-world nature of the work being reported provided a kind of refreshing relief, too, since the case studies quantified concrete and significant gains for each of the enterprises, and brought home the message that XML really does provide results today, even as the various XML tools and processes gain power and sophistication... The conference confirmed many of the trends that we have been seeing over the last year, and suggested a few emerging ones. (1) XML is now entrenched as the de facto standard for data interchange in a growing number of Web-related industries. Even areas such as EDI, which had been viewed as somewhat impenetrable, are adopting XML approaches. (2) Schemas will overtake DTDs, and have the potential to become a central tool in Web development. (3) Whether Topic Maps are the answer, the time is now for taxonomies to be applied to web content. It makes all the sense in the world. (4) The 'Word' problem ['Holy Grail: Word-to-XML'] needs to be solved. The products demonstrated at the conference are an interesting second generation of software to be applied to this problem; the next generation needs to be even easier to use and lower in cost. 
In general, there is a lot of development activity, as well as excitement about the kinds of solutions that can be built around XML. The large and growing number of XML developers and users who are familiar with what XML can do are bringing new demands, new perspectives, and new skills. The notion of schema-centric development, for example, is a new development, and one that uniquely comes from the concerns and ideas these new users have. It is a welcome trend in a growing and dynamic industry..." • [February 09, 2001] Document Object Model (DOM) Level 3 Content Models and Load and Save Specification. Version 1.0. Revised version. Reference: W3C Working Draft 09-February-2001. Edited by Ben Chang, Oracle; Andy Heninger, IBM; Joe Kesselman, IBM; Rezaur Rahman, Intel Corporation. "This specification defines the Document Object Model Content Models and Load and Save Level 3, a platform- and language-neutral interface that allows programs and scripts to dynamically access and update the content, structure and style of documents. The Document Object Model Content Models and Load and Save Level 3 builds on the Document Object Model Core Level 3." Also as single HTML document and as PDF. See related references in "W3C Document Object Model (DOM)." • [February 09, 2001] "XML.gov Goes Live." By Susan M. Menke. In Government Computer News (February 05, 2001). "The Chief Information Officers Council last month brought its prototype site for Extensible Markup Language users online at xml.gov. The site will serve as a clearinghouse for federal XML activities as well as a collaborative workspace. XML tags make digital documents easier to search and repurpose than Hypertext Markup Language tagging does. The council asked federal CIOs to register their agencies' XML initiatives on a questionnaire at the site. 
It particularly encouraged reporting projects in human resources and records management, directory services, electronic forms, financial and performance reporting, and energy and environmental matters. The council's XML Working Group plans a second generation of the site that will integrate a registry of government-oriented XML data elements and document type definitions into paperwork reduction processes. The working group's next meeting is Feb. 14. More information appears at http://xml.gov/agendas.cfm..." See: "US Federal CIO Council XML Working Group." • [February 09, 2001] "Air2Web Enables Wireless ERP, CRM, and SCM Via XML." By [Intelligent CRM Staff]. In Intelligent CRM (February 06, 2001). "Air2Web Inc. has upgraded its Mobile Internet Platform to version 2.0, which includes business objects and components that developers can use to connect their ERP, CRM, and SCM systems directly to Air2Web's wireless platform. Air2Web has also established the DevCenter, a supported, Web-based development community, and a global partner program for integrators, OEMs, and application solution providers. Air2Web designed its hosted platform to help businesses create, deploy, and deliver ERP, CRM, and SCM applications to their customers and employees using any digital wireless device. This includes Short Message Service (SMS) and Wireless Application Protocol (WAP) phones, pagers, and handheld computing devices such as Palm Inc. PalmPilots. Air2Web's XML request/response-based platform integrates wireless capabilities with enterprise applications. According to Air2Web, an application can incorporate blended media interactions including data, speech recognition, and text-to-speech conversions through the XML interface. The platform's Blended Media Engine provides interactivity compatible with device capability and user needs. The company also said that applications based on the Air2Web platform will be compatible with all the wireless protocols, networks, and devices." 
From the company description: "The adapters manage specific functionality as it relates to wireless device technology. Included in this component are adapters designed to process 1-way and 2-way SMS devices, wireless PDA's, web-enabled phones, and digital phones. Voice and Audio functionality including Speech Recognition, Voice Print Biometrics, DTMF, IVR, Text-to-Speech, and Streaming Media are also built into the Interface Adapters component. As new wireless technologies emerge, new adapters can be designed and implemented as needed into this component giving plug-n-play functionality with each technology without affecting other areas of the platform. The Interface Adapters component also manages Enterprise Application Connectors built by Air2Web Development Partners that connect leading enterprise applications in CRM, Sales Force Automation, database, and back office systems directly to the Mobile Internet Platform... The M-commerce Engine provides secure transactions through the use of PKI for wireless devices. The M-commerce Engine processes requests through SSL connections to the customer servers using XML or OFX for financial institutions... The Billing subsystem provides usage and transaction metrics for internal Air2Web use and the external Reporting facility. Logs of all XML requests and responses as well as messaging, call times, incoming phone numbers, audio processing time, and mCommerce transactions are generated for accurate usage measurement for automated billing..." • [February 09, 2001] "Baan Portal Applications Unveiled." By [InfoWorld Staff]. In InfoWorld Volume 23, Issue 6 (February 05, 2001), page 5. "New products include iBaan OpenWorld, designed to allow companies to integrate third-party and legacy applications into the Web using graphical business object mapping and XML-based gateway technology..." 
See the announcement: "Baan Extends Enterprise Integration With Launch Of New iBaan XML-Based Technology Framework": "...Baan, the global provider of enterprise business solutions, has launched iBaan OpenWorld 2.0 as the embedded integration technology for its new generation iBaan suite of Internet-enabled value web collaboration solutions. The new iBaan OpenWorld universal integration framework extends Baan's powerful enterprise integration capability, and uses new XML-based iBaan OpenWorld technology framework to deliver the highest levels of flexibility and integration for Baan customers. Using iBaan OpenWorld, customers can seamlessly integrate their Baan applications with third party solutions and legacy systems, accelerate time to market for new business processes and ensure that they maximize the potential of their B2B business initiatives. iBaan OpenWorld features a number of key components, including iBaan OpenWorld Broker, iBaan OpenWorld Adapter, iBaan OpenWorld Studio, iBaan OpenWorld Gateway and iBaan OpenWorld Domains that deliver Business Object Interfaces (BOIs). iBaan OpenWorld is available both as embedded integration technology for iBaan solutions and as a standalone solution. iBaan OpenWorld features a new graphical business object mapping capability that allows the drag-and-drop creation and automatic generation of data mapping between applications. Another major feature is the iBaan OpenWorld Gateway component that provides generic access to BOIs through a dynamic XML interface that provides customers with the facility to integrate Web-based applications and data." • [February 09, 2001] "Baan Unveils iBaan Portal Management Applications." By George A. Chidi. In InfoWorld (January 30, 2001). 
"Dutch ERP (Enterprise Resource Planning) software maker Baan unveiled a suite of Internet-based management applications and services Monday, its first offering since British software maker Invensys bought a controlling stake in the embattled company last year. The Web-enabled applications are based around the iBaan Portal, an online access point for employees to view relevant information sources, applications, and business processes. New products include iBaan OpenWorld, designed to allow companies to integrate third-party and legacy applications into the Web using graphical business object mapping and XML-based gateway technology; iBaan Collaboration, which allows businesses to reconfigure applications on the fly as relationships change; iBaan Webtop, a thin client for iBaanERP 5 that provides customers with a Web interface using the Webtop navigation framework and integrating Baan functionality within a standard Web browser; and iBaan Solutions, a package of e-business applications and services for online customer interaction, business-to-business Web selling, and secure Web-based buying and procurement..." • [February 09, 2001] "Tutorial: Adventures with OpenOffice and XML." By Matt Sergeant. From XML.com. February 07, 2001. ['The lack of a competitive open source XML editor has long been lamented by the XML developer community. The move towards an XML file format in the OpenOffice word processor looks set to change that. In our main feature this week, "Adventures with OpenOffice and XML," Matt Sergeant explores the new XML file format and shows how transformations can be used to integrate OpenOffice into an XML content management solution.'] "At the Open Source conference in Monterey last year, Sun announced their plans to release the current source code for Star Office, renamed OpenOffice. In October they followed up on their plans, releasing both the source code and binaries for OpenOffice build 605. 
One of the features added since Star Office 5.2 was the ability to save files as XML. In addition to being open source, saving as XML makes OpenOffice truly open. Aside from being open source, XML's self-documenting nature allows us to dive into the document format without having to dive into C++. And more significantly, XML allows us to use simple, free tools to manipulate the documents themselves. In this article we will examine the structure of the format. We will not go into great detail, as Sun has already done so in a 400 page specification. Instead we will focus on using the XML to generate something of potential interest to web developers and content editors. It's important to note that OpenOffice isn't ready to be an every day word processor. Components like printing and spell checking were removed in the migration to open source because Sun didn't own them. I expect they will be added back by the open source community as time goes by. When Sun releases Star Office 6 I expect they will include the proprietary spell checker and print engine again. Also worth noting is that OpenOffice is relatively unstable at the moment. I experienced several crashes and other serious problems while working on this article. Thanks to Daniel Vogelheim of Sun for helping me through those troubles... the server side of XML processing is competitive with, if not better than, proprietary products, the client-editor side of things was a long way off. OpenOffice's XML format changes everything. Now you really can edit a richly formatted document in a WYSIWYG word processor and publish it directly to the Web. That's a huge step in the right direction for the open source community. 
Other ideas that could be implemented include (1) convert a presentation file to Sun's XML slide format and then to SVG using their toolkit; (2) use stylesheets to generate OpenOffice's XML format from XML formats like DocBook or XHTML (or the output from the transformation above) to create a form of round-trip editing; (3) use stylesheets to generate XHTML directly, rather than an interim format. Doubtless there are many more possibilities." For other details and references, see "StarOffice XML File Format." • [February 09, 2001] "Next Generation Internet: The 'Fourth Tier' Is Born. Solving the problem of disparate content types." By Reza B'Far (eBuilt, Inc.). In Computer Technology Review Volume 21, Number 1 (January 2001), pages 16-18. "Web content began as static HTML pages and evolved to include client-side scripting, proprietary content technologies, and application programming interfaces. HTML has remained the basis of all Web content -- until now. We are about to witness the revolutionary move of content from HTML to XML (Extensible Markup Language). XML is a set of rules for defining a document using tags in a self-described vendor- and platform-neutral manner. XML has numerous advantages over HTML. It is easily transformable and can describe any type of content. HTML is a rendered presentation of data for a specific set of clients (namely HTML-based browsers), while XML can be data, its presentation, or a combination of both. Metaphorically speaking, HTML is a picture of a 3D object (Data, Presentation, and Flow Logic) while XML is the 3D object itself. Viewing an HTML object from a different perspective will produce a fuzzy picture at best because the object's entire data set is unavailable. Cell phones, PDAs, or embedded devices may have problems with HTML, which often has extraneous or missing data. 
Content in XML can be transformed into a wide range of other content (like voice based content) and made available to a wider range of devices (like digital cell phones). XML content can be rendered in one way for cell phones (like WML for WAP) and in another way for PC-based browsers (like XHTML)..." • [February 09, 2001] "The Politics of Schemas: Part 2." By Kendall Grant Clark. From XML.com. February 07, 2001. ['Having established in the first half of this essay that schemas are essentially political, this second installment examines the relevance of this to the XML community, and avenues for further consideration.'] "You may find yourself agreeing that schemas are political but wondering, nevertheless, what it has to do with XML practitioners or with XML itself. XML is, however, a universal data format. If we take the universal claims made about XML seriously, professional schema-makers must ask whether some interests and views of contested concepts might be excluded, perhaps systematically, from schema-making and from schemas; whether such exclusion is socially beneficial or harmful; and, if harmful, what should be done about it. From the early days of XML's development there's been talk about vendor neutrality, interoperability, and universality. Such talk was part of SGML's appeal since the mid-70s and rightly so. Today that talk fails regularly to take account of politics. XML advocacy often ignores the fact that schemas may be vendor neutral but cannot be interest neutral; that schemas may be universally accessible but formalize a strongly contested understanding of a vital part of the world; or that schemas may distort or impede some people's interactions with the world in ways they find inequitable or inappropriate. XML schemas are often placed in the public domain and available for anyone's royalty-free use (subject obviously to uncommon levels of knowledge and expertise) -- a state of affairs clearly preferable to proprietary alternatives. 
But is it enough? What good does it do that one can use, even modify a de facto standard schema, royalty free, when the schema reflects interests inimical to one's own, formalizes an understanding of the world one strongly contests, and is used in a widely deployed, vital Semantic Web application that has no serious competitor? What good does it do to modify the schema to reflect one's own interests and understandings if doing so renders it unusable? [...] Political schemas may limit what we notice, what we can say or think about what we notice, and to whom we can say it, especially inasmuch as we use machines to mediate parts of the world to us. The Semantic Web vision means, if anything at all, creating software systems that mediate the world to some of us in useful and, one hopes, fair, just, and good ways. What XML technologists say and think and do about the politics of schemas, the Semantic Web, and the social benefits of the technology they create will go a long way to determining the Web's future, and maybe something of society's future too. I hope I at least have said enough to encourage the wide-ranging and free conversation it is the responsibility of XML technologists, along with others, to have." For schema description and references, see "XML Schemas." • [February 09, 2001] "Transforming XML: Setting and Using Variables and Parameters." By Bob DuCharme. From XML.com. February 07, 2001. ['This article shows how variables and parameters can be used in XSLT stylesheets to substitute values into templates.'] "A variable in XSLT has more in common with a variable in algebra than with a variable in a typical programming language. It's a name that represents a value and, within a particular application of a template, it will never represent any other value -- it can't be reset using anything described in the XSLT Recommendation. (Some XSLT processors offer a special extension function to allow the resetting of variables.) 
XSLT variables actually have a lot more in common with constants in many programming languages and are used for a similar purpose. If you use the same value multiple times in your stylesheet, and there's a possibility that you'll have to change them all to a different value, it's better to assign that value to a variable and use references to the variable instead. Then, if you need to change the value when re-using the stylesheet, you only change the value assigned in the creation of that variable..." DuCharme has presented several other articles in this column 'Transforming XML'. For related resources, see "Extensible Stylesheet Language (XSL/XSLT)." • [February 09, 2001] "XML-Deviant: Schemarama." By Leigh Dodds. From XML.com. February 07, 2001. ['For the past two weeks XML-DEV has seen fascinating exchanges between three inventors of alternative XML schema proposals.'] "During the last week, XML-DEV has been the scene of a series of interesting and innovative discussions concerning schemas in general and also specific schema languages. The XML-Deviant provides a round-up. Grammars Versus Rules: Most schema languages rely on regular grammars for specifying schema constraints, a fundamental paradigm in the design of these languages. The one exception is Schematron, produced by Rick Jelliffe. Schematron throws out the regular grammar approach, replacing it with a rule-based system that uses XPath expressions to define assertions that are applied to documents... A unique feature of Schematron is its user-centric approach, allowing useful feedback messages to be associated with each assertion. This allows individual patterns in a schema to be documented, giving very direct feedback to users. Indeed a recent comparison of six schema languages highlights how far Schematron differs in its design. At times the discussion strayed into comparisons of several schema languages. 
Rick Jelliffe provided his interpretation of the different approaches behind TREX, RELAX, XML Schemas and Schematron: 'Underlying Murata-san's RELAX seems to be that we should start from desirable properties that web documents need: lightweightedness, functioning even if the schema goes offline (hence no PSVI) and modularity. I think underneath James Clark's TREX is that we can support plurality if we have a powerful-enough low-level schema language into which others can be well translated. I think underlying W3C XML Schemas is that a certain level of features and monolithicity is appropriate (though perhaps regrettable) because of the need to support a comprehensive set of tasks and to make sure that there are no subset processors (validity should always mean validity); however the processors are monolithic but the schemas are fragmented by namespace. Underlying Schematron is that we need to model the strong (cohesive) directed relationships in a dataset and ignore the weak ones, that constraints vary through a document's life cycle, and that lists of natural language propositions can be clearer than grammars.' [...] James Clark's summary of the advantages of TREX over W3C XML Schemas is also worth reading in its entirety. TREX, like Schematron, is a very simple yet powerful schema language..." For schema description and references, see "XML Schemas." • [February 09, 2001] "Requirements for and Evaluation of RMI Protocols for Scientific Computing." By Madhusudhan Govindaraju, Aleksander Slominski, Venkatesh Choppella, Randall Bramley, and Dennis Gannon (Department of Computer Science, Indiana University, Bloomington, IN). From the SoapTeam: Extreme Computing Lab. 2000-08-17. "Distributed software component architectures provide a promising approach to the problem of building large scale, scientific Grid applications. 
Communication in these component architectures is based on Remote Method Invocation (RMI) protocols that allow one software component to invoke the functionality of another. Examples include Java remote method invocation (Java RMI) and the new Simple Object Access Protocol (SOAP). SOAP has the advantage that many programming languages and component frameworks can support it. This paper describes experiments showing that SOAP by itself is not efficient enough for large scale scientific applications. However, when it is embedded in a multi-protocol RMI framework, SOAP can be effectively used as a universal control protocol, that can be swapped out by faster, more special purpose protocols when large data transfer speeds are needed." See the group's SoapRMI: "SOAP RMI is our implementation of RMI based on nanoSOAP, our implementation of a simple SOAPv1.0 serialization and deserialization mechanism. SOAP RMI uses an XML-Schema specification of the server interface to generate the associated stubs and skeletons. A remote object reference is an HTTP URL along with information that uniquely identifies the instance. The stubs and skeletons do not directly interact with the SOAP implementation, but instead use a communication object which is an abstraction that helps hide the underlying implementation of SOAP. This design is useful as it allows run-time insertion of different SOAP implementations." See "Simple Object Access Protocol (SOAP)." • [February 08, 2001] "Object Database Management Systems." From Barry & Associates, Inc. "Object Database Management Systems (ODBMSs) are designed to work well with object programming languages such as C++ and Java. These articles provide a background on ODBMSs and their use..." [Announcement: "Object Database Technology Overview Available on the Internet. Barry & Associates, Inc. today announced publication of an extensive overview of object database management system (ODBMS) technology on the Internet.
The overview includes more than 70 ODBMS articles featuring examples, definitions, and commentary. The articles are available for no cost. Doug Barry, who has been involved with ODBMS technology since 1987, prepared the articles. Mr. Barry said, "Object databases are useful in many types of architectures. Embedded systems, financial systems, web sites using XML for B2B applications or online catalogs, telecommunications, airline reservations, and very large database applications are just a few examples where ODBMSs are being used today. ODBMSs are designed to work well with object programming languages such as Java and C++. These articles provide a background on ODBMSs and their use." Since 1992, Barry & Associates has provided facts about database products and their use in advanced applications. They particularly focus on database product comparison and selection by providing publications and services that accelerate the decision-making process..." See also "XML and Databases." • [February 08, 2001] "Retail Group Promotes XML Trading Standard." By Marcia MacLeod. From ComputerWeekly.com (February 08, 2001). "An international version of the Internet data language XML standard is being developed to aid the growth of e-commerce in the retail sector. The Global Commerce Initiative (GBI), which comprises retailers, consumer goods manufacturers, barcode bodies the EAN and the UCC, and the US Voluntary Integrated Chain Store Group, has been working on the standard for EBXML for several months. A standard XML schema for retail would enable firms to share data and improve supply chain efficiency. The standard for four messages - order, invoice, delivery note, and master data and party alignment - is due to be published next month. Master data and party alignment latter matches data about the product with that relating to the buyer or seller... With the standards smaller suppliers could carry out e-commerce transactions with multiple retailers using the same messages. 
However, this will only happen, if software houses pick up the standard, said Peter Jordan [chairman of GBI and director of European systems at Kraft Foods]. GBI will begin talking to major software companies as soon as the standard is deliverable. As well as working on the EBXML standard, GBI is looking at data catalogues and how products are identified. It wants data in all catalogues to be interoperable, wherever the catalogue is produced. For example, the EAN/UCC code would be the same in a German catalogue as in a UK one, as would other data, such as that used to describe height and weight of products... GBI is supported by the four main retail trading exchanges - GNX, WWRE, Transora and CPGMarket - that have been set up. GNX, which is possibly the most advanced, now has about 30 early adopters, as well as the seven equity partners: Sainsbury's, Carrefour and PPR from France, Germany's Metro, Sears Roebuck and Kroger from the US, and Australia's Coles Myer." For additional/background detail, see (1) "Uniform Code Council and EAN International Report Rapid Progress of Global XML Pilots - Global Commerce Initiative Stands Behind Effort for B2B e-Commerce Standard", and (2) "Industry Issues Global Commerce Internet Protocol." • [February 07, 2001] "XML Meets 'IVR With An LCD'. Will WAP and VoiceXML partner up?" By Robert Richardson. In Computer Telephony Volume 9, Issue 2 (February 2001), pages 92-97. ['Feature article on two disjoint paradigms for phone-based m-commerce.'] "Wireless may be the future, but we can't help but notice that when it comes to convergence, wireless is mostly clueless... Converged wireless is coming, no doubt, whenever 3G (that is, fast-data-enabled) wireless comes. Or at least it'll look converged - it's still an open question whether digital voice will ever wind up in the same sorts of packets as data when both are beamed out to handsets.
In any case, you'll be able to talk via your Bluetooth headset while scrolling through your daily appointments on the handset... The problem in a nutshell: If we have a device that allows more than one mode of interface, it stands to reason that we ought to be able to use the input modes interchangeably. Or, as a minimum, the underlying protocols enabling those interfaces ought to make it possible for developers to support multi-modal interfaces by explicitly hand-coding options for different kinds of input and output into the same application (see diagram). Right now, the two protocols figuring most visibly in the wireless arena -- WAP and VoiceXML -- don't provide hooks to each other. Still, that's likely to change, and it's a good thing, too, because multi-modal seems like a nearly no-brainer way to make mobile applications a lot more appealing to mobile consumers. In this article, we take a look both at what WAP and VoiceXML don't do right now, how they're likely to learn how to live happily with each other, and at how at least one savvy vendor is already working to deliver some pretty sexy near-convergence scenarios... Why is it that WAP can't handle phone calls better, given that its whole raison d'etre is to make a cell phone more usable? The most obvious answer is that it's a young standard, still in transition (an obvious fact that plenty of critics have been far too quick to overlook when dishing out anti-WAP broadsides). As things stand, WAP conveys its 'web pages' to mobile handsets using WML (wireless markup language)... the current rendition of WAP knows how to do with regard to voice calls is to initiate them. A user can make a menu selection from a WML page (or card, in WAP parlance) and a special, built-in telephony interface (on a WML card, access to this interface is simply via a URL that begins with 'wtai://' rather than 'html://') drops the current WAP phone call and dials up the new number...
VoiceXML, too, shares some of the same 'early-days' shortcomings of WAP. In some respects, VoiceXML is slightly better prepared for a multi-modal world. For one thing, VoiceXML supports a tag that will initiate a call and provide rudimentary monitoring of call progress. Unlike WAP, this kind of call can be initiated either as a bridged or a blind transfer. If blind, it's no different than the WAP call to a voice number. If bridged, however, the new phone line is conferenced into the existing call. A bridge transfer assumes that the call will terminate within a preset time limit and that control will transfer back to the current VoiceXML page (and, in fact, voice options from that page are still in operation within the call). The VoiceXML server never hangs up, so the context of the call isn't lost. The fly in this ointment is that a bridged call, by virtue of the fact that the call that's already in progress is a voice call, can't handle data packets. So you won't be updating your WAP deck with a bridged call... the World Wide Web Consortium (W3C) has taken at least two steps that are sure to have an impact on future multi-modality. First, the group officially adopted XHTML Basic as a W3C recommendation. This puts the specification on track for IETF adoption and general use across the Internet. A key feature of XHTML Basic is its cross-device usability. It's designed to work on cell phones, PDAs, pagers, and WebTV, in addition to the traditional PC-with-a-VGA-screen. Second, the W3C's working group held a session in conjunction with the WAP Forum at a recent meeting in Hong Kong to discuss precisely the problem of making WAP and VoiceXML aware of each other. The upshot was a decision to form a multi-modal working group. Interested parties presenting at the Hong Kong workshop included Nuance, Philips, NTT DoCoMo, IBM, NEC, PipeBeach, and OpenWave. 
The WAP Forum, a technical consortium representing manufacturers and service providers for over 95% of the handsets in the global wireless market, is already taking steps toward interoperability with other XML-based protocols..." See: (1) "VoiceXML Forum" and (2) "WAP Wireless Markup Language Specification (WML)." • [February 07, 2001] "What is VoiceXML?" By Kenneth Rehor. In VoiceXML Review Volume 1, Issue 1 (January 2001). ['If you are new to VoiceXML, this overview article will serve as an excellent starting point. For those of you who have already been authoring VoiceXML applications with one of the software developer kits, platforms, and/or on-line developer "web studios" available from various vendors, this article goes beyond the syntactical elements of the language and describes the typical reference architecture in which the VoiceXML interpreter resides.'] "VoiceXML is a language for creating voice-user interfaces, particularly for the telephone. It uses speech recognition and touchtone (DTMF keypad) for input, and pre-recorded audio and text-to-speech synthesis (TTS) for output. It is based on the Worldwide Web Consortium's (W3C's) Extensible Markup Language (XML), and leverages the web paradigm for application development and deployment. By having a common language, application developers, platform vendors, and tool providers all can benefit from code portability and reuse. With VoiceXML, speech recognition application development is greatly simplified by using familiar web infrastructure, including tools and Web servers. Instead of using a PC with a Web browser, any telephone can access VoiceXML applications via a VoiceXML 'interpreter' (also known as a 'browser') running on a telephony server. Whereas HTML is commonly used for creating graphical Web applications, VoiceXML can be used for voice-enabled Web applications.
There are two schools of thought regarding the use of VoiceXML: (1) As a way to voice-enable a Web site, or (2) As an open-architecture solution for building next-generation interactive voice response telephone services. One popular type of application is the voice portal, a telephone service where callers dial a phone number to retrieve information such as stock quotes, sports scores, and weather reports. Voice portals have received considerable attention lately, and demonstrate the power of speech recognition-based telephone services. These, however, are certainly not the only application for VoiceXML. Other application areas, including voice-enabled intranets and contact centers, notification services, and innovative telephony services, can all be built with VoiceXML. By separating application logic (running on a standard Web server) from the voice dialogs (running on a telephony server), VoiceXML and the voice-enabled Web allow for a new business model for telephony applications known as the Voice Service Provider. This permits developers to build phone services without having to buy or run equipment..." See related description and references in "VoiceXML Forum." • [February 07, 2001] "Open Dialog: Activities of the VoiceXML Forum and W3C." By Gerald M. Karam. In VoiceXML Review Volume 1, Issue 1 (January 2001). ['Even if you're already involved in VoiceXML technology, perhaps you'd like to know a bit more about the origins of the language. This article provides insightful background on the VoiceXML Forum, the Forum's working relationship with W3C, and how to get involved in both arenas.'] "With the launch of the VoiceXML Forum in March of 1999, and the release of the VoiceXML 1.0 specification in March 2000, there has been a surge of activity in the speech and telephony industry around the VoiceXML concept, products and services.
In conjunction with these events, the VoiceXML community has been progressing the language further and improving the business environment in which VoiceXML exists. Most notable are the efforts of the VoiceXML Forum and the World Wide Web Consortium (W3C) Voice Browser Working Group (VBWG). Figure 1 below provides a brief history lesson in how all the participants work together. [...] the VoiceXML Forum and W3C felt it would be mutually beneficial to have a working relationship with regard to VoiceXML activities. Consequently, in September 2000, the two organizations and their constituents began formal negotiations on a memorandum of understanding that would define the ways in which collaboration would take place. We're hoping to have this memorandum approved in January 2001. At this time, the language work takes place within the W3C VBWG, chaired by Jim Larson of Intel Corp., and in its various subgroups. The specific work developing what is expected to become VoiceXML 2.0 is taking place in the Dialog Language Sub-Working Group chaired by Scott McGlashan of PipeBeach. The development of other markup languages (e.g., for speech grammars and speech synthesis) is handled in other subgroups. The work takes place through email, teleconferences and in face-to-face meetings that occur every couple of months. The VoiceXML Forum has activities in marketing (chaired by Carl Clouse, Motorola), conformance (chaired by the author), and education (where the author is acting chair). Participation in these committees is limited to VoiceXML Promoter and Sponsor members. If you would like to get involved, please contact the VoiceXML Forum office (membership@voicexml.org). For the full range of VoiceXML Forum activities, please check out the Web site at http://www.voicexml.org.
As you can see, VoiceXML is heating up, and with the wide range of industrial support behind the VoiceXML Forum and the W3C VBWG, the best intellectual and corporate resources are collaborating to make VoiceXML a driving force in the telephony and speech application world..." See related description and references in "VoiceXML Forum." • [February 06, 2001] "XML.gov - Call for Participation and Enhancement Suggestions." Letter from Enterprise Interoperability and Emerging Information Technology Committee (EIEITC) to CIO. January 19, 2001. "Following a period of prototyping, the xml.gov site has now been activated at http://xml.gov. Please make your staff aware of the site and encourage them to become active participants in helping to develop and improve its design and content. The longer-term objective is not only to provide a comprehensive and authoritative reference for government-related XML activities but also a collaborative work space to support those activities... The XML Working Group will be issuing an RFP for the second generation, embracing the principles outlined in Raines' Rules. One of the features contemplated for future implementation is an ISO/IEC Standard 11179 compliant registry/repository of inherently governmental XML data elements, DTDs, and schemas. If such a repository is established, it may be used on a pilot basis to register the data elements (fields) represented on Standard and Optional Forms (SFs & OFs), with the longer-term objective of integrating the registry into the information burden reduction process mandated by the Paperwork Reduction Act... Questions, comments, and especially enhancement suggestions should be conveyed to the Working Group's co-chairs, Owen Ambur of the Department of the Interior (Owen_Ambur@ios.doi.gov) and/or Marion Royal of GSA's Office of Governmentwide Policy (marion.royal@gsa.gov)." See: "US Federal CIO Council XML Working Group." • [February 05, 2001] "BEA Sends B-to-B XML Specification to OASIS." 
By Tom Sullivan. In InfoWorld Volume 23, Issue 6 (February 05, 2001), page 29. "In a bid to provide coordination for Web services and a model for defining and managing business-to-business interactions, BEA Systems submitted its proprietary business transaction protocol (BTP) technology to a standards body this week. BEA also formed a technical committee to shepherd BTP through the open standards process. 'The protocol is responsible for managing the life cycle of a transaction and the events around such transactions,' said Rocky Stewart, CTO of San Jose, Calif.-based BEA and chair of the Organization for the Advancement of Structured Information Standards (OASIS), a consortium for developing XML standards for e-business. BTP allows complex XML message exchanges to be tracked and managed as loosely coupled 'conversations' among businesses. BTP is but the latest specification to emerge around XML. Fairfax, Va.-based webMethods, for example, in November announced the XML Key Management Specification (XKMS), which enables the integration of digital signatures and data encryption into e-commerce applications. The idea behind such specifications, said Kimberly Knickle, an analyst at AMR Research in Boston, is to support XML with a variety of functions so as to guarantee that it works well. 'All these things are focused on allowing more open communication,' said Kevin Costello, a consultant at Arthur Andersen, in Chicago. 'It's one of the reasons that XML is a better answer than EDI [electronic data interchange]; it's more open and easier to translate.' Although several vendors, including IBM, Oracle, Sun, Hewlett-Packard, and Bowstreet have begun work on another similar specification, XAML (Transaction Authority Markup Language), BEA, rather than waiting, submitted the technology to OASIS. '[The other companies] didn't have their spec written yet, and we have a product already, so we decided we'd turn it over to the standards body,' BEA's Stewart said. 
The technology behind the specification is currently being used as the eXtended Open Collaboration Protocol (XOCP) for BEA's WebLogic Collaborate..." See (1) the announcement for the OASIS Business Transactions Technical Committee, and (2) the BEA announcement: "BEA Leads New OASIS Technical Committee to Develop Open Industry Standards For Business Transaction Management. Sun Microsystems, Interwoven and Bowstreet Join OASIS Committee to Help Facilitate Deployment of Business-to-Business E-Marketplaces Worldwide." • [February 05, 2001] "The Universal Translator. New uses of XML facilitate enterprise integration of applications and devices." By George Lawton. In Knowledge Management Magazine (February 2001). "The eXtensible Markup Language has roots in content management, but developers working on e-commerce applications have watered those roots, and the language has grown so sturdy that it now supports many back-end enterprise applications. Today, XML is used to manage everything from power stations to supply chains, and its ability to translate data on the fly makes it useful for wireless applications as well. These applications are being facilitated by the adoption of new protocols and extensions of XML, such as the eXtensible Stylesheet Language (XSL), which helps in transforming documents from one format to another without programmers having prior knowledge about the different information systems that must communicate. XSL has a user-definable style sheet that can dynamically translate XML documents into any format desired. Originally conceived as a way of formatting data for different Web browsers, now this capability is applied to new Internet platforms and different kinds of users. For example, XSL makes it possible to strip out only the most important information for display on a cell phone. It can be set to present a personalized view of information that takes into account an employee's job function... 
Most Internet applications have been designed for the dimensions of PC screens. XML promises to open up data-intensive applications to even the smallest interfaces and displays, such as audio-based systems, cell phones and personal digital assistants (PDAs). For example, automobile dashboards have perhaps the most limited display space of all. OnStar, a subsidiary of General Motors Corp., has developed a system that uses XML to format information. This system allows consumers to get personalized information such as news, sports scores, stock quotes and e-mail in their vehicle. It is based on an XML mapping architecture developed by ObjectSpace Inc. of Dallas, which allows OnStar to automatically transform content from various providers to an audio format so drivers don't have to take their eyes off the road. Application service providers (ASPs) also are exploring XML as a way to port applications to other mobile platforms. For example, PurchasePro Inc. of Las Vegas is using XML to facilitate communications with PCs, cell phones and PDAs. The company provides a payment authorization service for customers such as America Online Inc., computer vendor Gateway Inc., Hilton Hotels Corp. and casino operator MGM Mirage Inc. This service, according to Jim Jensen, director of wireless strategy at PurchasePro, enables managers to view and approve purchase orders from a handheld device while on the road instead of having to be available to sign paper documents. The ability for XML to translate between applications also leads companies to hope they can integrate their applications in a more dynamic manner. For example, Sharp Electronics Corp. has begun using XML to improve the integration of its business processes both internally and with those of its partners for business-to-business (B2B) applications. Don Levallee, director of strategic business operations at Sharp in Camas, Wash., says he expects this new architecture to help to reduce operations costs and add new functions. 
Sharp's approach features a modular architecture that breaks business processes into different components, such as a product database, a customer relationship management (CRM) system and an enterprise relationship management (ERM) system. The company is using software from Extricity Inc. of Belmont, Calif., to manage the flow of XML messages between its applications and its partners'. To do so, says Bruce Elgort, Sharp's senior e-commerce developer, his group creates charts that define the flow of XML messages, which reduces the amount of coding required to tie the applications together." • [February 05, 2001] "XML Schema Slowly Matures. XML Schema can't fix everything by itself, but it fills a gaping hole in the XML group of technologies and specifications." By Don Kiely (Third Sector Technologies). In XML Magazine Volume 2, Number 1 (February/March 2001). ['XML's Document Type Definition provides a means of defining XML structure. DTDs are well supported in the software industry, but they come with a substantial set of problems, too. Can this marriage be saved? Find out how XML Schema may save structured XML.'] "Now that the XML Schema specification is a W3C candidate recommendation, it is entering a period of its life when the standards committee thinks that all the basic parts are there and working. There are still a few kinks to work out, but people are encouraged to start building proof-of-concept tools and applications. With any luck, it will hit full maturity sometime later this year, and we'll have full benefit of all its features. But what's the big deal about those features? The XML 1.0 recommendation has a means of defining XML structure built into it, the Document Type Definition, or DTD. DTDs are well supported in the software industry because of their origins with SGML, and XML is a derivative language of SGML. There are lots of DTDs out there doing lots of good work, and lots of people understand them well. 
There is a wealth of books, journal articles, and Web resources that provide plenty of information about them. The problems with DTDs are several. DTDs are a decidedly non-XML syntax that is hard to learn, they have no sense of data types and only the loosest limits on some structural constraints, they have a closed architecture, they lack support for namespaces, and generally they do not adhere even to the most trivial of the goals built into the design of XML. Probably the biggest issue driving the XML Schema specification is the lack of data types in DTDs. Even if you ship me an XML document that declares itself to be strictly compliant with a DTD, the XML data can have almost random data as element content and attribute values, even if the element name clearly suggests an integer or floating point number, for example. Very messy, and very unlike XML... The listings and code that accompany this article use sample XML data from the XML Schema Part 0: Primer candidate recommendation document for purchase order data... In general, you'll want to validate XML data that is shared between different applications, particularly if you don't have control over both applications. This way the application that is consuming the XML data doesn't need to have extensive data-checking code to make sure that the data is usable and in the structure it expects. Depending on the validation structure you use, you still may need to do some programmatic error checking. For example, when using a DTD for validation, you'll still need to check that content and attribute values can be converted from their string representation to the type of data you are expecting, such as currency or date values. On the other hand, if you have control over both ends of the data sharing, such as two applications you wrote or between two components in a single application, you may be able to forego validation and save the processing cycles. 
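As an illustration only (this is not one of the article's listings), the point that DTD validation still leaves type checks to the consuming application can be sketched in Python. The element names echo the purchase-order sample in the XML Schema Primer, but the fragment itself, the `CONVERTERS` table, and the `check_item` helper are hypothetical.

```python
import xml.etree.ElementTree as ET
from datetime import date
from decimal import Decimal, InvalidOperation

# Hypothetical fragment: structurally acceptable to a DTD, which cannot
# express that quantity is an integer or that shipDate is a date.
ITEM_XML = """
<item partNum="872-AA">
  <quantity>one</quantity>
  <USPrice>148.95</USPrice>
  <shipDate>1999-05-21</shipDate>
</item>
"""

# Conversions the consuming application has to apply by hand.
CONVERTERS = {
    "quantity": int,
    "USPrice": Decimal,
    "shipDate": date.fromisoformat,
}


def check_item(xml_text):
    """Return (element name, raw text) pairs whose content fails conversion."""
    item = ET.fromstring(xml_text)
    failures = []
    for name, convert in CONVERTERS.items():
        raw = (item.findtext(name) or "").strip()
        try:
            convert(raw)
        except (ValueError, InvalidOperation):
            failures.append((name, raw))
    return failures


print(check_item(ITEM_XML))
```

With a schema language that carries data types, a check of this kind could move out of application code and into the validator.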
It really boils down to yet another of many design and architecture decisions necessary for software development. XML Schema, unlike XML itself, is unlikely to cure the world of all its ills. But it fills a gaping hole in the XML group of technologies and specifications and can achieve full status as a W3C recommendation none too soon." For schema description and references, see "XML Schemas." • [February 05, 2001] "At the Heart of the Open Source Movement. As a key enabler of innovation, XML emerges as the lingua franca of next-generation Web services. [COVER FEATURE.]" By Stuart J. Johnston. In XML Magazine Volume 2, Number 1 (February/March 2001). ['XML is flexible, so it's no surprise that it is emerging as the lingua franca of open source software. See how vendors are using XML technology to develop next-generation Web services tools.'] "Three years ago, Jeremie Miller had a brainstorm. Instant messaging was catching on with Internet users like wildfire, but there was just one problem. Every provider of instant messaging had a proprietary format that it guarded jealously. Users of America Online's Instant Messenger couldn't instantly communicate with users of competing messaging from Microsoft, Yahoo, and others, because there was no common infrastructure. (Even today, what infrastructure exists is still tightly controlled.) Two other events happened at almost the same time: the open source movement gained credibility, even spreading into corporate America. And the importance of XML started to grow because of its flexibility and self-descriptive characteristics. Then Miller had his brainstorm. He wrote one of the first XML parsers using JavaScript. Microsoft's planned .Net notification service will merge instant messaging with other types of messaging, including e-mail, fax, and voice. But Miller has even bigger plans. 
The company he founded, Denver-based Jabber.com and its related Jabber.org site, has produced an open source instant messaging infrastructure of the same name built entirely on XML, including the transport technology. 'I wanted to convert the messaging and presence formats into a common language, and it was a natural fit for XML,' says Miller. As far as using XML to build the transport mechanisms, 'HTTP is an excellent way of moving objects between servers [but] is not useful for instant messaging,' Miller says. Fundamentally, however, just as Microsoft plans to use XML as the key enabling technology in its grand .Net vision, so do the open source and open standards communities see XML as the key ingredient in their own Web services visions... Many in the open source community do not see delivering software for a fee over the Internet as part of their Web services vision. Rather, they tend to see the delivery of information as the valuable income-producing component of the future Web. Meanwhile, both camps see consulting services as a money-making proposition in the e-future. And while there are fundamental differences between what the two groups mean by the term 'Web services,' what is common to everyone's vision is the universal delivery of information... [Covered in the article: Jabber: An Instant Messaging Infrastructure; Red Hat Network: XML Delivers/Manages Software; GNOME: an Interface for All; Sun Also Rises; Apache and the United Nations (ebXML); Hybrids.] • [February 05, 2001] "Bootstrapping. Using Python and XML-RPC to create a practical, real-world XML application: writing scripts that create a new story on the server. [COLLABORATION.]" By Dave Winer (UserLand Software). In XML Magazine Volume 2, Number 1 (February/March 2001). ['To get the benefits of using XML, developers have to work together, supporting implementors working in different environments, even supporting competitors. Why bother? 
Columnist Dave Winer says, because it's fun.'] "When engineers build a suspension bridge, first they draw a thin cable across a body of water. Then they use that cable to hoist a larger one. Then they use both cables to pull a third, and eventually create a thick cable of intertwined wires that can support a road that hundreds of cars and trucks can drive over at the same time. That's a bootstrap. First you take a step you know is on the path, learn from it, and use it to lift up the next level. And unlike the designer of a suspension bridge, software developers must be more flexible, because the pace of innovation in our art is so rapid. We don't know exactly what next year's trucks will look like, how much they weigh, or how many wheels they have. I'm interested in a community of developers working together to create desktop tools and server apps that work together. That's why I got involved with XML-RPC and its successor, SOAP. As I've said in previous columns, XML is the common format, but to get the benefits of using XML we have to work with each other. This means supporting implementors working in different environments, even competitors. That's how markets grow, and developer to developer, that's why it's fun -- not only do I get to impress you with my software, but I also get to enjoy other people's creations. I've created a Web application called Manila, that runs on Windows 2000 on servers that my company runs at Exodus in Santa Clara, CA. You can create a free Manila Web site and use it to publish your thoughts and links to articles you find interesting. That's great, but there's a hidden feature in every Manila site: It's scriptable over the Internet via SOAP and XML-RPC. If you've wondered when someone would put up a practical real-world XML application, the wait is over. 
In the rest of this column, I'll show you how to write Python scripts that use XML-RPC to create a new story on the server, and copy the content of a message to a file on my local hard drive. I chose XML-RPC because it's more mature, and offers more choices today than SOAP does. You can see a list of XML-RPC implementations here: http://www.xmlrpc.com. So if your preference is Unix, or Mac, or Windows; Python, Java, Tcl, Perl, or even my own Frontier, you can find compatible implementations and start building your own distributed applications with XML. By this time next year, surely SOAP will be as mature and as broadly supported. We needed a thin cable, XML-RPC to pull up SOAP, and then to pull up new applications, and then build systems. We're going to learn a lot in the next few years, and if XML achieves its promise, we'll share much of what we learn, and that'll be a bootstrap, and that'll be fun." See: "XML-RPC." • [February 05, 2001] ".Net has XML on its Menu. Use C# and the .Net framework's XML-friendly features to create dynamic XML hierarchical menus. [INTEGRATION.]" By Dan Wahlin. In XML Magazine Volume 2, Number 1 (February/March 2001). ['Microsoft's .Net framework allows you to work with XML documents in a variety of ways. Columnist Dan Wahlin talks about how, with the help of C#, you can leverage this asset to create a handy hierarchical menu.'] "It's no secret that XML can be used in a variety of ways, ranging from exchanging data in complex business-to-business (B2B) transactions to providing structure for application configuration files. As XML continues to gain software support, it's a safe bet that XML's utility will continue to increase. One such application leverages XML to provide end users with a more satisfying Web experience by creating a hierarchical menu system similar to the Windows Start menu. Let me show you how it works. 
By using C#, XML, and Microsoft's .Net framework on the server side, this application generates a DHTML structure that IE4 or higher can manipulate and dynamically display on the client side. Because you can quickly access XML on the server and because XML has the ability to describe hierarchical relationships, it proves to be an excellent choice for marking up parent/child menu data. In addition to learning how XML can be used to create a menu application, I'll also introduce you to the .Net framework's main XML classes, which are located in the System.Xml assembly. Working with XML in C# files requires that you reference a specific namespace. A namespace in the .Net platform is used as an organizational system for program components and is important in resolving naming conflicts (much like XML namespaces). This XML-based menu system is created using the System.Xml namespace located in the System.Xml assembly. If you're not familiar with assemblies, the .Net SDK defines them like this: An assembly is a collection of types and resources that are built to work together and form a logical unit of functionality, a 'logical' dll. An assembly takes several physical files such as interfaces, classes, resource files, and so on, and creates metadata (referred to as a manifest) about how the files work together. The assembly can also contain information about versioning and security. One of the many nice things about assemblies is that they can be used in ASP.NET applications without adding a Class Identifier (CLSID) to the registry using regsvr32.exe. This makes updating assemblies as easy as copying the appropriate assembly to the bin directory of an ASP.NET application. Let's take a closer look at the classes that are found within the System.Xml namespace and assembly..." • [February 05, 2001] "SOAP/XML Must Mature Quickly." By Jim Fawcette. In XML Magazine Volume 2, Number 1 (February/March 2001). 
['Corporate interest in XML and SOAP has never been higher, but the danger of fragmentation has never been greater. What needs to happen before the enterprise accepts the new technologies? And to ensure interoperability? Find out what the co-founder and Chairman of Iona Technologies thinks.'] "As chairman of Iona Technologies, Dr. Christopher J. Horn runs a company whose existence depends on helping corporations tie together many technologies, from legacy systems to the bleeding edge of object-oriented programming. This makes Horn well qualified to put the emerging landscape in perspective -- from XML and SOAP, to Java 2 Enterprise Edition, from wireless to peer-to-peer networking. FTP, Inc. President James Fawcette traveled to Iona's Dublin, Ireland, headquarters to interview Horn about these technologies and their importance in the corporate middleware market. Listen to Part I and Part II of the full audio stream of this interview... It is in the early days; nevertheless, we see widespread corporate interest in employing SOAP for Internet and extranet use rather than for intranets -- more for applications that are outward looking from the corporation. XML and SOAP are more verbose than binary technologies such as CORBA or Enterprise Java Beans (EJB), where one gets a stronger, better-performance coupling... The basic technology is being extended to handle issues such as transactions and security. Until enterprise capability has been demonstrated in SOAP, there will be resistance to widespread use. It is coming, though. This is much like 1993 when CORBA was an emerging technology. A few pioneers were willing to adopt it, and that opened the floodgates in 1995, '96, and '97. We can expect widespread adoption of XML and SOAP in 12-18 months... In the CORBA specification, security levels vary from straightforward SSL to full-scale authentication, including the ability to handle repudiation of access rights. 
Similarly, there are sophisticated transaction models around full two-phase commit. For SOAP, in most applications, a simpler, lighter-weight, high-performance model will suffice. But some less widely used applications require more. Quality of service requirements also have different levels, from low-band sync and rollover to synchronous and asynchronous messaging, publish-and-subscribe, and notification. These are just different communications paradigms underlying presentation syntax of something like SOAP. You could say that this doesn't really matter in an XML-based system like SOAP, but in practice, we feel that there are things you can do to make capabilities such as publish-and-subscribe easier to use for the developer..." See "Simple Object Access Protocol (SOAP)." • [February 05, 2001] "Jabber: Smooth Talking for Short Messages. Neither SMS nor instant messaging can easily implement real-time, immediate, cross-enterprise communication. Can XML surmount this tower of babel? [PRESENTATION.]" By Jeff Jurvis (Rainier Technology, Inc). In XML Magazine Volume 2, Number 1 (February/March 2001). ['Although short messaging services and instant messaging are available on mobile and other devices, neither technology is perfect for implementing real-time, cross-enterprise communication. Columnist Jeff Jurvis discusses how XML could give you a fast-talking solution.'] "Short messaging services (SMS) on mobile phones and pagers are wildly popular in Europe and Japan, but have had little impact on North American users of mobile devices. This is due in part to the wireless communications infrastructures in Japan and Europe, which have been supporting reliable and fast SMS for a few years. The problem in the United States is the usual one: incompatible carrier infrastructure and a lack of interoperability standards make any attempt at a short messaging service insufficient. Is short messaging doomed to fail in North America? 
Not if you consider the close relative of SMS, instant messaging (IM). IM is huge on services like America Online and the Microsoft Network. The usage patterns are very similar -- IM and SMS messages are usually shorter than 100 characters with no attachments or other need for persistence. Both services are available on mobile devices, and software for both can be integrated into enterprise messaging strategies... The good folks of the Jabber Project in the open source community are building royalty-free, noncommercial, distributed instant messaging software that uses XML to connect IM users. Unlike AOL IM or any of the other proprietary services, Jabber does not require messages to run through a central server controlled by someone outside your company. You host the Jabber server and have greater control over security and availability. The Jabber architecture is distributed, very much like Internet e-mail. Each Jabber server is independent, but can talk to any other Jabber server on the Internet. Jabber clients use simple TCP sockets to exchange XML documents with Jabber servers. The Jabber server listens for client connections and provides message delivery, message storage (store and forward), buddy list storage, and user authentication. Developers can add components called transports to the Jabber server that serve as gateways to other IM services such as AOL IM, ICQ, MSN Messenger, Yahoo Instant Messenger, IRC, and even e-mail and telnet. Whether or not a message is intended for another Jabber client or a foreign client such as AOL, the communication always conforms to the Jabber XML specifications. The XML Hook The three main XML elements in the Jabber architecture are iq (info/query), message, and presence. Info/query is used to authenticate users, manage rosters (buddy lists), get time and version information, and for other general queries. For example, in this info/query, this Jabber XML code is sent by a client to authenticate a user... 
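The article's own listing is elided here. As a rough sketch only, an info/query authentication request might be assembled as follows in Python. The `jabber:iq:auth` namespace and the `username`, `password`, and `resource` child elements are assumptions drawn from the early Jabber protocol rather than from this article, and `build_auth_iq` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Assumed namespace for the authentication query (not taken from the article).
AUTH_NS = "jabber:iq:auth"


def build_auth_iq(username, password, resource, request_id="auth_1"):
    """Serialize an <iq/> element that asks the server to authenticate a user."""
    iq = ET.Element("iq", {"type": "set", "id": request_id})
    # Writing xmlns as a plain attribute keeps the output free of ns0: prefixes.
    query = ET.SubElement(iq, "query", {"xmlns": AUTH_NS})
    for tag, text in (
        ("username", username),
        ("password", password),
        ("resource", resource),
    ):
        ET.SubElement(query, tag).text = text
    return ET.tostring(iq, encoding="unicode")


print(build_auth_iq("alice", "secret", "laptop"))
```

A client would write a fragment like this to its open TCP socket; `message` and `presence` elements travel over the same stream.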
Jabber is a simple, protocol-agnostic framework for routing XML-based instant messages to any device that can access a TCP socket -- everything from the smallest Web-enabled cell phone to desktop and server computers. Even more intriguing, there is nothing about Jabber that makes it stop with IM. The Jabber Project has experimented with transporting SOAP calls and other XML namespaces over the architecture." See: "Jabber XML Protocol." • [February 05, 2001] "Pushing the Boundaries of the Web." By Kurt Cagle. In XML Magazine Volume 2, Number 1 (February/March 2001). ['Take advantage of the powerful Web design potential offered by extending your XHTML with XSLT. Kurt discusses the advantages of using XSLT libraries and XSLT-based transformations in your Web page code.'] "Using XSLT to extend XHTML means that you can view complex interfaces in terms of dynamic components and change the look, feel, and data content of your pages. Web page design is hard. There, I've said it. Many developers scoffed at the hordes of HTML "programmers" who began working with HTML as far back as 1993 -- these baby programmers weren't working with real languages like C++. This same crew thought that Visual Basic programmers were baby programmers and denigrated the Java programmers who emerged in 1995-1996 as 'not real developers,' either. Perhaps a little elitism was at work. The stigma attached to HTML coders has remained, although these folks are attempting to do something impressive. They are attempting to write code that can produce similar (and similarly functioning) Web pages and work on six or seven browsers that span four major operating systems -- and produce a prodigious amount of content. Any good Web publisher probably has a set of extensive ASP, JSP, or Perl scripts that handle the generation of much of the continuity material -- navigation menus, top stories, boilerplate code, and so forth. 
The irony here is that most of the code that is produced to do this is written by programmers who are used to writing code in Visual Basic or Java. However, the cost for getting these programmers to write these routines is significant. Neither VB nor Java is especially geared toward the manipulation of strings, so a lot of effort is expended in writing ASP code, in particular where the code processing is intermingled with HTML (or that encodes the HTML within quoted strings, which is even less legible). This significantly limits the ability of programmers to create generalized routines. XSLT was supposed to change this scenario, but in many cases, XSLT code is just as thoroughly intertwined with the output code as was ASP. The fact that the embedded and embedding content are in XML gives you some advantages, but ultimately it would be nice to be able to have a simple HTML-like document where you could just create your own general tags..." For related resources, see "Extensible Stylesheet Language (XSL/XSLT)." • [February 05, 2001] "Build Centralized Apps with VB and XML. Client/server is no longer enough; remote users need Internet access to constantly updated information." By A. Russell Jones. In XML Magazine Volume 2, Number 1 (February/March 2001). ['Developers have been writing client/server applications for some time, but to control access and usage and minimize updates -- as well as serve remote employees -- you'll need to modify these applications so they'll work over intranets and the Internet. Learn how to create VB applications that work over HTTP with XML and the XMLHTTPRequest object.'] "The era of the standalone application installed on a single PC is rapidly drawing to a close. Most business applications today need to share data. Visual Basic programmers have been writing client-server applications for quite some time, but most of these applications work over a private network. 
However, there are several reasons to modify these applications so they'll work over intranets and the Internet. What are these reasons? First, remote employees -- the numbers of which increase daily -- need access to company data. Second, by centralizing the data for your application, you can monitor and control access and usage. Third, using the techniques in this article, you can maintain and update global application settings by retrieving them from a central location at application startup, which helps minimize application updates to the desktop. Fourth, by performing database access from a Web server rather than the remote client, you can avoid sending database login and password information over the network. Finally (with Internet Explorer), by retrieving data in the background, you can avoid redrawing the entire page to alter part of the contents. The key to creating VB applications that work over HTTP is XML and the XMLHTTPRequest object. This object is part of Microsoft's XML parser (msxml.dll). The XMLHTTPRequest object lets you make GET and POST requests by HTTP to a remote server. Code running on the remote server (I'll use an ASP page, but the application could be any server-side active content delivery mechanism) accepts the request, interprets the contents, and returns data or an error message to the calling application. You may recognize this as similar to the description of SOAP -- and it is -- but I won't use SOAP here because it complicates the code. Nevertheless, it is important to understand that the ideas behind the techniques used here are the same as SOAP, but less complex..." • [February 05, 2001] "Customize XML Data with SQL Server. Simplify app development with SQL Server 2000's query and data-manipulation features." By Dan Wahlin. In XML Magazine Volume 2, Number 1 (February/March 2001). ['SQL Server 2000 provides a rich set of features dedicated to working with XML data. 
Discover how easy queries and data manipulation can be.'] "In my previous article on leveraging SQL Server 2000's XML features, I demonstrated several different queries that you can use to return data structured as XML. These include URL queries, XML template queries, and XPath queries. These types of queries execute using HTTP, which eliminates layers of code normally required in applications that connect and return results from a database. In this article, I'm going to show additional ways that you can query and manipulate SQL Server 2000 data using FOR XML EXPLICIT queries, OPENXML, and Updategrams. Last issue, my article showed how to retrieve data structured as XML by using the FOR XML keywords. Although using these keywords in a query returns XML data, you don't have full control over how the XML is structured. FOR XML EXPLICIT queries offer you the power to completely customize the structure of the returned XML data. These queries are best understood by thinking in terms of hierarchical relationships amongst XML elements... While the different queries shown up to this point are certainly very useful, they don't allow for updating or deleting data using XML. To fill this void, Updategrams have been introduced. Updategrams are currently in their first beta release so some of this information may change in the future. Updategrams work by presenting a before and after view of data that is marked-up using XML syntax. If data in the before view is different than in the after view, an update is performed to the appropriate fields. For example, if the before view of a record shows a value of John Doe for the ContactName field but the after view shows a value of Dan Wahlin, the Updategram will know that the ContactName field needs to be updated since the values have changed. If the before view contains data, but the after view is empty, the appropriate row will then be deleted. To understand this more fully, let's look at how Updategrams are structured. 
Listing 6 shows the general framework used to construct an Updategram... There are other more advanced aspects of Updategrams that I have not covered in this article. For example, multiple updates, inserts, or deletes can be performed in a single Updategram, and updates involving multiple tables can be performed with the aid of mapping schemas. Following their official release, Updategrams promise to provide additional power to simplify application development. SQL Server 2000 provides a rich set of useful features dedicated to working with data in the form of XML. Through leveraging these features, layers of code can be eliminated in applications. Direct access to the database can be accomplished via URL queries, templates, XPath queries, and through OPENXML and Updategrams. When used with XSL, this presents an efficient solution that can simplify your application development." • [February 05, 2001] "Readers Respond: XML Makes EDI Technology Even Stronger." By Neel Bhowmik (WebXL Systems). In InternetWeek (January 29, 2001). "Regarding Ken Vollmer's column 'Don't Believe The Hype: EDI And XML Are Just Perfect Together' -- I agree that EDI and XML complement each other. EDI remains viable for most organizations because they find that it works reasonably well. They won't want to replace it with a new and yet unproven technology. Why, then, should an organization look to use XML? To answer this question, let's first look at some of the problems inherent in EDI. One of these is the lack of connectivity between different EDI networks because of multiple EDI standards. For instance, if a supplier that runs on X12 wishes to do business with an automobile manufacturer that uses EDIFACT as its EDI standard, it has to switch over to EDIFACT. This option is not at all feasible from the supplier's perspective. Another problem is the prohibitive cost of setting up an EDI network. 
Doing business with large EDI-networked organizations would put smaller suppliers at their mercy--requiring those suppliers to incur the expense of setting up EDI networks themselves and, in some cases, carrying out transactions by fax and phone. XML can address both issues. As for establishing connectivity between networks following different EDI standards, XML allows data from one network to flow onto the other. Concerning the disparities between larger organizations and small suppliers, a supplier can use an XML-based infrastructure to communicate with a bigger organization's EDI network. While XML can't cure every supply chain problem, it complements EDI by providing a more seamless level of interactivity within components of a supply chain..." • [February 05, 2001] "Make way for the 'browserless Web'." By John Cox and Ellen Messmer. In Network World (January 29, 2001). "Say goodbye to the Web browser . . . at least for business-to-business commerce. Sure, the browser has been the very embodiment of the Web - a standardized way to let people view information formatted in HTML. But over the past 18 months companies have started creating ways to let Web applications interact automatically, either reducing or eliminating the need for a human working with a browser. The goal is to let one company's business processes, such as purchasing, interact directly with those of another, such as ordering. In other words, the 'browserless Web' is on its way. Automating interactions is needed because of the soaring volume of Web transactions, according to Melody Huang, chief architect with Keane, an IT services company in Boston. In large-scale, business-to-business interactions, 'time is of the essence,' she says. 'You can't afford to have a clerk re-enter the data through a Web browser.' 
'I've seen figures that say 60% of the time, the guy on the phone taking your order gets it wrong,' says Dan Connolly, XML activity lead with the World Wide Web Consortium, where he's working on the semantic Web - standards for facilitating computer-to-computer interactions. 'Computers never get the order wrong,' he says. Exchanging information via documents formatted in XML is a fundamental part of the browserless Web. The potential impact can be seen in early form in e-commerce coalitions such as RosettaNet, which unites sellers and buyers of electronic components. A new report from Zona Research, 'The Dash for Dot.com Interoperability,' cites several RosettaNet examples. Arrow Electronics, for instance, says it reduced turnaround time to customers from 'next day' to 'same day.' Lucent claims that disseminating technical information via XML has cut component selection time in half... [Sun's] Phipps and [Netscape's] Marc Andreessen say many basic components are in place to start building and using such company APIs. XML is the basis for describing and sharing data, and application logic can be written as Java or ActiveX components. Regardless of the object model used by these components, they can be called over the Web via Simple Object Access Protocol (SOAP). An alternative mechanism, called ebXML, is being positioned for complex transactions. An emerging standard called Universal Description, Discovery and Integration - akin to a Web directory service - will let applications identify other Web-based services they need and then connect with them..." • [February 03, 2001] "Archives and Photographs: the 'European Visual Archive' Project (EVA)." By René van Horik (Researcher and Project Manager, Netherlands Institute for Scientific Information Services - NIWI). In Cultivate Interactive Issue 3 (January 29, 2001). 
['An article on the EVA project which details how they used Dublin Core for their description elements and XML for data exchange.'] "The EVA project aims to investigate issues relevant to enhancing access to historical photographic collections. These issues include: copyright issues, selection procedures, user surveys, digitization techniques, description standards, pricing policy and digital information management systems. Based on the outcomes of this research a Web-based information system is being developed: the EVA system. This system contains descriptions and digital images that belong to the photographic holdings of two City archives: the London Metropolitan Archives and the City archives of Antwerp. The EVA project has two main audiences: Image producers and image consumers. Based on the outcomes of the project an archive will be able to digitize and document its photographs in a well thought-out way. The low threshold for collections to join the EVA system provides them with a tool to get in contact with a huge potential of image consumers. These users can search the image descriptions, view reference images and order images for specific use. The purpose of this article is to report on the main outcomes of the studies carried out within the framework of the project and to describe the starting points on which the EVA system is based... Based on the results of the preparatory studies the content providers of the project, the City archives of Antwerp and London Metropolitan Archives, each started the process of selecting photographs by creating 10.000 digital master files. These digital images had to be 'rich' enough to serve as the basis for derivative images that are published online in the EVA system... Digitizing historical photographs is more than just putting photographic prints on a scanner. 
A lot of information associated with the creation of digital images is relevant for (future) use, access, update and maintenance of the images and the relation with the original prints. This information (or data) about data is called metadata. It turned out that within the 'universe of discourse' of the EVA project several metadata schemes are of potential importance. This is because roughly speaking the EVA project covers three related 'things': firstly, the historical photograph as a physical medium, secondly, the digital surrogate that is based on the photograph and thirdly, that which is visible on the photograph and the processed digital image. For the sake of abstraction these three 'things' together (the photograph format, the digital image and the visible scene or content) are called an 'EVA visual object', abbreviated as EVO... For the implementation of the data exchange between the local archive information systems and the central EVA system the project decided to use the XML standard. This is an application independent data structure. For each description of a photograph a separate XML file is created. An XML document contains special instructions called tags, which usually enclose identifiable parts of the document. The elements that are allowed are specified in a DTD (Document Type Definition). The DTD used by the EVA system is called EVOlite DTD. In this way self-describing documentation units are created. Two examples of XML formatted descriptions [are provided]. The creation of the XML files in principle is the responsibility of the archives. Within the project software and procedures were developed to assist them in the creation of output in XML format. In the future probably more and more information systems will facilitate the creation of data in XML format and it will become easier to manage data consistency between a local archive management system and the Web-based access system. 
Just like with the images the XML files are sent via FTP to the server of the EVA system. The archives can independently add, change and delete descriptions... Individual archives create XML files, extracted from the local archive information system. The XML files are sent by the standard Internet protocol FTP to the server of the EVA system. Archives can add, delete and replace XML files independently... The EVA system aims at two types of usage: End-users interested in access to a catalogue of images and descriptions, and users interested in the results of the EVA project and the model of the EVA system. Based on the information on the Web site an archive employee should be able to evaluate the relevance of the project results for the conversion and dissemination of its own collection. The XML formatted descriptions are automatically converted to the database on which the EVA system is based. Periodically the database is refreshed with new information that is sent to the server by the archives with the help of the FTP protocol. The interface between the database and the end-user consists of several Web pages. The input fields are based on the database that contains information that originates from the XML formatted files provided by the archives." See: "European Visual Archive Project (EVA)." • [February 03, 2001] "Application Profiles, Or How to Mix and Match Metadata Schemas." By Makx Dekkers. In Cultivate Interactive Issue 3 (January 29, 2001). With 22 references. ['Makx Dekkers of PricewaterhouseCoopers, Luxembourg describes some recent developments in the area of application profiles and how application profiles are being used, based on experiences in the SCHEMAS project.'] "If you want to define a metadata schema for your electronic resources, you may want to base your work on what others have done. 
Until some time ago, everybody who needed to define a metadata element set (or schema) to be used for a particular project or collection of resources, invented their own solution. It is becoming apparent that this approach, re-inventing the wheel so to speak, is not the optimal way of working. It is now becoming accepted that it is a good starting point to base the definition of a local schema on work that other people have done. To support this, the SCHEMAS project aims to build an information service where schemas developed in many places around the world can be found. For a start, this information service will begin to solve one of the major problems encountered by metadata schema designers: the difficulty to find out what has happened elsewhere. However, finding out about existing schemas is only a first step towards the ultimate goal: harmonising usage and converging on formal or de-facto standards. As has been identified by the SCHEMAS project from its inception, any particular project or product has specific requirements that cannot be fully met by standards 'straight from the box'. Almost all practical implementations will have to mix and match elements from various schemas and have a potential need to define additional elements of their own. This mechanism of mixing and matching and defining private elements results in what is now called an application profile. Baker, in a 'strawman proposal' to the Dublin Core Registry Working Group defines application profiles as entities that declare which elements from which namespaces underlie the local schema used in a particular application or project. In his view, application profiles 're-use' semantics from namespaces and repackage them for a particular purpose... In his 'strawman proposal' and in subsequent discussions, Baker laid out a number of functional requirements for application profiles. 
These requirements fall into four categories: (1) Definition of entity classes in the data model that underlies the application, identifying the type or types of resources the application profile schema applies to, e.g. people, Web pages, books, image galleries; (2) Formal declarations of elements and their semantics used by the application, including rules for their usage, e.g. declaring which elements are mandatory, optional, repeatable, which element combinations are allowed or mandated and what allowable formats for the values of elements are; (3) Expression of controlled vocabularies for the value of elements, e.g. specifying which controlled vocabulary, classification scheme or thesaurus may be used as values for a particular element or restricting the allowable values for a particular element to an enumerated set; (4) Human readable information about the application and usage guidance..." See: "Dublin Core Metadata Initiative (DCMI)." • [February 03, 2001] "Application Profiles: Mixing and Matching Metadata Schemas." By Rachel Heery and Manjula Patel (UK Office for Library and Information Networking [UKOLN], University of Bath, UK). In Ariadne [ISSN: 1361-3200] Issue 25 (September 2000). With 17 references. ['Rachel Heery and Manjula Patel introduce the 'application profile' as a type of metadata schema.'] "This paper introduces application profiles as a type of metadata schema. We use application profiles as a way of making sense of the differing relationship that implementors and namespace managers have towards metadata schema, and the different ways they use and develop schema. The idea of application profiles grew out of UKOLN's work on the DESIRE project, and since then has proved so helpful to us in our discussions of schemas and registries that we want to throw it out for wider discussion in the run-up to the DC8 Workshop in Ottawa in October. 
We define application profiles as schemas which consist of data elements drawn from one or more namespaces, combined together by implementors, and optimised for a particular local application. The experience of implementors is critical to effective metadata management, and this paper tries to look at the way the Dublin Core Metadata Element Set (and other metadata standards) are used in the real world. Our involvement within the DESIRE project reinforced what is common knowledge: implementors use standard metadata schemas in a pragmatic way. This is not new, to re-work Diane Hillmann's maxim 'there are no metadata police', implementors will bend and fit metadata schemas for their own purposes. This happened (still happens) in the days of MARC where individual implementations introduce their own 'local' fields by using the XX9 convention for tag labelling. But the pace has changed. The rapid evolution of Rich Site Summary (RSS) has shown how quickly a simple schema evolves in the internet metadata schema life cycle. The Warwick Framework gave an early model for the way metadata might be aggregated in 'packages' in order to combine different element sets relating to one resource. The work on application profiles is motivated by the same imperative as the Warwick Framework, that is to provide a context for Dublin Core (DC). We need this context in order to agree on how Dublin Core can be combined with other metadata element sets. The Warwick Framework provided a container architecture for metadata 'packages' containing different metadata element sets. Application profiles allow for an 'unbundling' of Warwick Framework packages into the individual elements of the profile with an overall structure provided externally by namespace schema declarations. 
The Resource Description Framework (RDF) syntax has provided the enabling technology for the combination of individual elements from a variety of differing schemas, thus allowing implementors to choose which elements are best fit for their purpose... Taking existing implementation of metadata schema one recognises that rarely is the complete 'standard schema' used. Implementors identify particular elements in existing schemas which are useful, typically a sub-set of an existing standard. Then they might add a variety of local extensions to the standard for their own specific requirements, they refine existing definitions in order to tailor elements to a specific purpose, and they may want to combine elements from more than one standard. The implementor will formulate 'local' rules for content whether these are mandatory use of particular encoding rules (structure of names, dates) or use of particular controlled vocabularies such as classification schemes, permitted values. We see application profiles as part of an architecture for metadata schema which would include namespaces, application profiles and namespace translations. This architecture could be shared by both standards makers and implementors. This architecture reflects the way implementors construct their schemas in practice as well as allowing for the varied structures of existing metadata schemas. We believe that establishing a common approach to sharing information between implementations and standards makers will promote inter-working between systems. It will allow communities to access and re-use existing schemas. And by taking a common approach to the way schemas are constructed we can work towards shared metadata creation tools and shared metadata registries." • [February 03, 2001] "RDF Representation of Wordnet Available for Download." By Sergey Melnik and Stefan Decker. Posting 2001-02-02. "An RDF-Representation of Wordnet 1.6 (a Lexical Database for English) is available for download. 
WordNet is an on-line lexical reference system whose design is inspired by current psycholinguistic theories of human lexical memory. English nouns, verbs, adjectives and adverbs are organized into synonym sets, each representing one underlying lexical concept. Different relations link the synonym sets. Please read the license before downloading the RDF representation of Wordnet. (1) Nouns (10MB), (2) Glossary (15MB), (3) SimilarTo-Definitions (2MB), (4) Hyponym Definitions (8MB). A Schema (Ontology) [KB] defining the terms used to represent the RDF version of Wordnet is available at: http://www.semanticweb.org/library/wordnet/wordnet-20000620.rdfs." See "Resource Description Framework (RDF)." • [February 03, 2001] "How Would You Like That Served?" By Didier Martin. From XML.com. January 31, 2001. ['Our intrepid explorer of specifications, Didier Martin, investigates CC/PP, an RDF application for describing and exchanging device capabilities.'] "XML is often used for structured documents like XHTML, rendering or transformation languages like XSLT and XSLFO, and as the basis for extensible network protocols like SOAP. A less well known example of the use of XML in network protocol design is Composite Capabilities/Preference Profiles (CC/PP). The potential impact of CC/PP on the next generation of the Web is substantial. One important characteristic of the next generation Web is the proliferation of types of client access device, often by the same user. Some people access email with a desktop computer, a laptop computer, and a handheld, as well as a WAP-enabled cell phone. Each of these devices has a different set of capabilities. To help Web servers keep pace with heterogeneous clients, the CC/PP protocol provides a way for clients and servers to exchange information about the client's capabilities and preferences... RDF is seen by some as a cure for all ills. Others see it as too complex to be useful. 
In fact RDF is a kind of record containing properties about a web resource, which is why it was chosen by the CC/PP working group to describe device capabilities. In this context, a device equipped with a browser is considered a web resource, one which has properties like screen size, rendering capabilities, etc. Details about CC/PP are publicly available, including details about the protocol itself. The existing CC/PP spec was submitted to the W3C as a NOTE and forms the basis of the CC/PP group's ongoing efforts. I won't describe the full protocol, but I will present a simple example to illustrate how a client tells a server about its capabilities. CC/PP is an extension of the HTTP protocol. A client obtains a resource from a server with a GET request in which the client includes the device capabilities and user preferences..." • [February 03, 2001] "The Politics of Schemas: Part 1." By Kendall Grant Clark. From XML.com. January 31, 2001. ['As the world is codified one schema at a time, what are the consequences and implications? This first half of a two-part essay examines why schemas are essentially political.'] "In the first part of this two-part essay, I examine the ways in which the Semantic Web may be political by focusing on the ways in which schemas may be political. But first I need to lay out some background knowledge, describe how I construe some familiar terms, and how I define some other, perhaps unfamiliar, terms. Institutions and individuals create schemas in working groups of programmers, analysts, domain specialists and others. The schemas they create are usually formalized in a meta-language like DTD or the W3C's XML Schema Definition Language. Some schemas become de facto or de jure standards. The widespread use of a standard empowers those who control it; the more it is used, the less reason to develop or to use an alternative. Schema making requires the skill and expertise of many expert practitioners and is costly. 
Schemas are often created by institutions whose sole purpose is financial profit. Institutions create private schemas to use internally, within the institution itself; and they create public schemas to use externally. The number of public schemas increases daily. Schemas are being created that will structure exchange between people, between people and institutions, and between institutions, including all levels of government, corporations, and non-governmental organizations." • [February 03, 2001] "XML Q&A: Entities: Handling Special Content." By John E. Simpson. From XML.com. January 31, 2001. ['This month's XML Q&A column tackles the issues of including "special characters" and non-XML content in your XML documents.'] "If the specific tools you're using include a fully XML-compliant parser, then the answer is yes, the ampersand will be seen as a special character because it is a special character. At the lowest levels an XML parser is just a program that reads through an XML document a character at a time and analyzes it in one way or another, then behaves accordingly. It knows that it's got to process some content differently than other content. What distinguishes these special cases is the presence of such characters as "&" and "<". They act as flags to the parser; they delimit the document's actual content, alerting the parser to the fact that it must do something at this point other than simply pass the adjacent content to some downstream application... [How can I insert multimedia files into XML documents?] The bad news is that you can't. XML documents contain text and text only. However, you can insert references to multimedia files in your documents. Of course, these references must also be text -- commonly in the form of a Uniform Resource Identifier (URI)..." • [February 03, 2001] "[Review of] Applied XML Solutions, by Benoît Marchal." By Dianne Kennedy. In GCA XML Files Issue 26 (January 2001). "Applied XML Solutions is a 'solutions book.' 
This means that the book is designed to teach you how to solve common problems encountered when developing typical XML applications. The idea here is to organize information around a specific solution. To quote the author, 'the main problem for developers is not a lack of information but too much of it!' This book does not attempt to teach the basics of XML and XML syntax such as DTDs, elements, attributes, etc. Nor does it focus on vocabularies or the wealth of related technologies such as XSLT, DOM, XHTML, etc. Rather the book is organized in 'projects' and just the information that is required for that project is presented... If you know the basics of XML, have a basic understanding of Java, and are ready to apply those basics in a real-life situation, this is a good book for you. The project approach of the book is quite unique and most useful. The author tackles tricky, unexpected situations that face an XML developer. This book provides those new to XML development with the tips and tricks of an expert!" • [February 03, 2001] "XML Standards News. W3C Kicks off 2 New Activities for 2001." By Dianne Kennedy. In GCA XML Files Issue 26 (January 2001). "By the end of January 2001, two significant new W3C Activities were launched. These activities include the Device Independence Activity and the XML Encryption Activity. The goal of the Device Independence Activity is to promote single authoring for the Web for all access devices from desktop PCs to in-car computers, TV, digital cameras, and mobile phones... The goal of the XML Encryption Activity is to specify the necessary data model, syntax, and processing to encrypt XML content. Encryption transforms plain text-data into confidential, cipher-text data. This renders the data into a form that can be safely stored or transmitted. Only the intended recipients, with the matching decryption method can restore the data to its original form..." 
• [February 02, 2001] "Human-Robot Interface Using Agents Communicating In An XML-Based Markup Language." By Maxim Makatchev (Department of Manufacturing Engineering and Engineering Management, City University of Hong Kong) and S. K. Tso (Centre for Intelligent Design Automation and Manufacturing, City University of Hong Kong). Pages 270-275 (with 25 references) in Proceedings of the Ninth IEEE International Workshop on Robot and Human Interactive Communication [IEEE RO-MAN 2000, Osaka, Japan, September 27-29, 2000]. Piscataway, NJ, USA: IEEE Computer Society, 2000. Abstract: "The paper concerns an agent-based human-robot interface via the Internet. A user client and an embedded software are viewed as agents with limited computational and communication resources. To facilitate the communication between the real-time embedded agents and the user interface agents via the communication channel of uncertain quality the proxy agent is proposed as a mediator. The functions assigned to the proxy agent target reduction of inter-agent communication load and minimization of computational resources taken by the embedded agents and user interface agents for communication-related tasks. An XML-based language, RoboML, is designed to serve as a common language for robot programming, agent communication and knowledge representation. The human-robot interface software prototype is developed for an autonomous guided vehicle to evaluate the proposed techniques." See: "Robotic Markup Language (RoboML)." • [February 02, 2001] "Modeling the UDDI Schema with UML." By David Carlson (CTO, Ontogenics Corp., Boulder, Colorado). A white paper and example of XML modeling from XMLModeling.com. (February 2001). 12 pages. "Complex XML vocabulary definitions are often easier to comprehend and discuss with others when they are expressed graphically. 
Although existing tools for editing schemas provide some assistance in this regard (e.g., Extensibility XML Authority) they are generally limited to a strict hierarchical view of the vocabulary structure. More complex structures are often represented in schemas using a combination of containment and link references (including ID/IDREF, simple href attributes, and more flexible extended XLink attributes). These more object-oriented models of schema definition are more easily represented using UML class diagrams. The Unified Modeling Language (UML) is the most recent evolution of graphical notations for object-oriented analysis and design, plus it includes the necessary metamodel that defines the modeling language itself. UML has experienced rapid growth and adoption in the last two years, due in part to its vendor-independent specification as a standard by the Object Management Group (OMG). This UML standard and metamodel are the basis for a new wave of modeling tools by many vendors. For more information and references to these topics, see the Web portal at XMLModeling.com, which was created for the purpose of communicating information about the intersection of XML and UML technologies. The OMG has also adopted a standard interchange format for serializing models and exchanging them between UML tools, called the XML Metadata Interchange (XMI) specification. (XMI is actually broader in scope than this, but its use with UML is most prevalent.) Many UML modeling tools now support import/export using the XMI format, so it's now possible to get an XML document that contains a complete UML model definition. This capability is the foundation for the remainder of this white paper. Using an XMI file exported from any UML tool, I have written an XSLT transformation that generates an XML Schema document from the UML class model definition. 
As an example that demonstrates the benefits of modeling XML vocabularies with UML, I have reverse-engineered a substantial part of the UDDI specification. The current UDDI specification (as of January 2001) includes an XML Schema definition that is based on an old version of the XML Schema draft, well before the out-dated April 7th 2000 draft. In contrast, the UDDI schemas described here are compliant with the XML Schema Candidate Recommendation, dated October 24, 2000. The reverse engineering was accomplished manually, by reading the UDDI specification and creating a UML model in Rational Rose..." On XMI, see "Object Management Group (OMG) and XML Metadata Interchange Format (XMI)"; on modeling, see "Conceptual Modeling and Markup Languages." [cache] • [February 02, 2001] "The Design and Performance Evaluation of Alternative XML Storage Strategies." By Feng Tian, David J. DeWitt, Jianjun Chen, and Chun Zhang (Department of Computer Science, University of Wisconsin-Madison, 1210 West Dayton Street, Madison, WI, 53706; Phone: (608)2626622; email: {ftian, dewitt, jchen, czhang}@cs.wisc.edu). 19 pages, with 18 references. "XML is an emerging Internet standard for data representation and exchange. When used in conjunction with a DTD (Document Type Definition), XML permits the execution of a rich collection of queries using a query language such as XML-QL. This paper describes five different approaches for storing XML documents including two strategies that use the OS file system, one that uses a relational database system, two that use an object-oriented storage manager. We implemented each of the five approaches and evaluated its performance using a number of different XML-QL queries in conjunction with XML data from the DBLP web site. We conclude with some insights gained from these experiments and a number of recommendations on the advantages and disadvantages of each of the five approaches... 
[Conclusion:] This paper explores several different strategies for storing XML documents: in the file system, in a relational database system, and in an object storage manager and evaluated the performance of each strategy using a set of selection, join, and update queries. Our results clearly indicate that storing the XML documents as ASCII files is not a viable option for a query intensive environment. The results of our experiments also show that using an object manager to store XML documents provides the best overall performance, while using a relational database system has the worst performance. Generally the relational database system is about half or less as fast due to the overhead of the relational database layer above the storage manager. For certain applications this performance difference may be tolerable, especially since this approach makes it easy to support queries that span both XML documents and existing operational data sets stored in a relational DBMS. On the other hand, if one wants to build an XML database system with performance as the primary goal using an object manager is clearly the best." Rick Jelliffe: "Note that they are concerned with querying databases for branches, and so their comments on file-system storage don't apply to where your data is already sitting in nice, friendly, small but coarse-grained documents (i.e., when there is no parsing involved, just raw retrieval by service over the WWW.)" Related publications include (1) "On the Use of a Relational Database Management System for XML Information Retrieval" [by Chun Zhang, Qiong Luo, David DeWitt, Jeffrey Naughton, Feng Tian, February 2000], and (2) "Relational Databases for Querying XML Documents: Limitations and Opportunities" [by Jayavel Shanmugasundaram, H. Gang, Kristin Tufte, Chun Zhang, David DeWitt, and Jeffrey F. Naughton, Proceedings of the 1999 VLDB Conference, September 1999]. See also "XML and Query Languages" and "XML and Databases."
[cache] • [February 02, 2001] "The Niagara Internet Query System." By Jeffrey Naughton, David DeWitt, David Maier, et al. Computer Sciences Department University of Wisconsin-Madison, with Computer Science Department, Oregon Graduate Institute. Submitted for publication. 22 pages, with 22 references. [February 2001.] "Many projections envision a future in which the Internet is populated with a vast number of Web-accessible XML files -- a 'World-Wide Database'. Recently, there has been a great deal of research into XML query languages to enable the execution of database-style queries over these XML files. However, merely being an XML query-processing engine does not render a system suitable for querying the Internet. A truly useful system must provide mechanisms to (a) find the XML files that are relevant to a given query, and (b) deal with remote data sources that either provide unpredictable data access and transfer rates, or are infinite streams, or both. The Niagara Internet Query System was designed from the bottom-up to provide these mechanisms. It finds relevant XML documents by using a novel collaboration between the Niagara XML-QL query processor and the Niagara 'text-in-context' XML search engine. To handle infinite streams and data sources with unpredictable rates, it supports a 'get partial' operation on blocking operators in order to produce partial query results, and inserts synchronization packets at critical points in the operator tree to guarantee the consistency of (partial) results... In our view, a query system for web-accessible Internet data should have the following characteristics. First, the query itself need not have to specify the XML files that should be consulted for its answer. This flexibility is a radical departure from the way current database systems work; in SQL terminology, it amounts to supporting a 'FROM *' construct in the query language. 
However, we think this is essential because a truly useful system will allow the user to pose a query, and get an answer if the query is answerable from any combination of XML files anywhere in the Internet. Secondly, a useful query system cannot assume that all the streams of data feeding its operators progress at the same speed, or even that all of the data streams feeding its operators will terminate. XML files fetched from some sites may come much more slowly than files fetched from other sites; furthermore, some of these 'files' may actually be streams (consider a stock ticker news feed). These possibilities mean that the system must not 'hang' waiting for a slow site, and that users must be able to request the result 'so far' without having to wait for all the input streams to terminate, even in the presence of inherently blocking operations such as average, sum, nest and negation. Once again, this is a radical departure from the way conventional DBMS operate. The Niagara Internet Query System is designed to have these characteristics. To support the 'From *' clause, Niagara uses a novel collaboration between its XML query processor and its 'text-in-context' XML search engine. When the XML query processor receives a query, expressed in a modified version of the XML-QL query language (in XML-QL terminology, we have an 'IN *' construct), it first parses the query to extract a search engine query. The search engine query is expressed in the Search Engine Query Language (SEQL, pronounced 'seek-ul'), which supports Boolean combinations of predicates that specify containment relationships between XML elements and their contents. The SEQL query extracted from the XML-QL query is passed to the Niagara Search Engine. The Niagara Search Engine is an inverted list system optimized for evaluating SEQL. It works by crawling the web off-line, and building an index of all the XML files it encounters. 
The Search Engine evaluates the SEQL query utilizing the index and returns to the XML-QL query engine a list of URLs for the XML files that could possibly contribute answers to the original XML-QL query... The Niagara Internet Query System is designed to enable users to pose XML queries over the Internet. It differs radically from traditional database systems in (a) how it decides which files to use as input, and (b) how it handles input sources that have unpredictable performance or may be infinite streams or both. We have completed the prototypes described in this paper and made them available from our web site. A great deal of future work remains. We are in the process of translating the prototypes from Java to C++. Our experience with the Java versions have convinced us that we cannot get the performance we desire without making the change. We are also investigating building parallel versions of the search engine and query engine, in order to handle very large data volumes and large numbers of queries." The Niagara Internet Query System is public domain software that can be found at http://www-db.cs.wisc.edu/niagara/. See "XML and Query Languages" • [February 02, 2001] "The Status of Schemas." By Steve Gillmor and Sean Gallagher. In XML Magazine (February/March 2001). ['The XML Schemas specification, now one step closer to finalization, will enhance XML document exchange on the Web'] "The W3C XML Schema specification has advanced to candidate recommendation status after several years of effort. Editorial director Sean Gallagher and editor in chief Steve Gillmor talked with IBM E-Business Standards & Technology Lead David Fallside, IBM representative to the W3C Schema Working Group and the W3C Advisory Committee and chair of the XML Protocol Group, and Lotus Distinguished Engineer Noah Mendelsohn, IBM and Lotus representative to the W3C Schemas Working Group and the W3C Advisory Committee. 
Mendelsohn: [an overview of the significance of the XML Schema specification as it reaches CR] 'XML provides a standard means of interchanging data and documents, especially on the World Wide Web. XML Schemas is a standard way of interchanging descriptions of those documents, and that's important not only because it gives you a way to validate that the document you receive is in some sense correct -- that it meets at least the minimum standards for format and content -- but Schema descriptions will be extremely important in supporting a variety of tools. There will be Schema-aware editing systems that use the Schema to help create a better editing experience. There will be tools for mapping XML into various database systems that will use the information in the Schema to find out what needs to be mapped, or will produce Schemas to expose to the world the kind of information that they're making available. Schemas will be key for building XML queries -- before you know what you're querying, you have to know what it looks like. Schemas as a whole are a core foundation technology that moves XML forward for all of these applications.' Fallside: 'Candidate Recommendation [CR] is the stage where the W3C solicits implementation experience from other W3C members and the world at large on the XML Schema specification. Achieving candidate recommendation status is a way for the W3C to say that we believe that this is a stable draft of the specification; it's stable to the extent that people should be comfortable building implementations using this specification, and we would like your feedback based on those implementations.'..." For schema description and references, see "XML Schemas."
• [February 02, 2001] "OASIS Starts On XML Spec For Business Transactions." By Tom Sullivan. In InfoWorld (February 01, 2001). "OASIS (Organization for the Advancement of Structured Information Standards) on Wednesday formed a technical committee to develop a specification for XML message interfaces that will support the coordination and processing of Web services from different organizations. Dubbed the OASIS Business Transactions Technical Committee, the group initially plans to base its work on the Business Transaction Protocol (BTP) specification, submitted to the consortium by San Jose, Calif.-based BEA Systems. 'The protocol is responsible for managing the lifecycle of a transaction,' said Rocky Stewart, CTO of BEA and chairman of the OASIS committee. BTP, for instance, allows complex XML message exchanges to be tracked and managed as loosely coupled 'conversations' between businesses, according to a statement. Boston-based OASIS said the goal is to develop a protocol that works with existing business messaging standards, specifically ebXML (electronic business XML) and RosettaNet. Stewart said OASIS also plans to extend support to BizTalk, Microsoft's data integration server. Stewart added that the goals for XAML and BTP overlap. XAML is Transaction Authority Markup Language, an emerging standard for coordinating and processing multiparty transactions led by Bowstreet, Hewlett-Packard, IBM, Oracle, and Sun..." See also (1) "Transaction Authority Markup Language (XAML)", and (2) the announcement of 2001-01-18 for the formation of the OASIS Business Transactions Technical Committee.
• [February 02, 2001] "Trademark Office Produces Its First XML Documents." By Wilson Jackson. From Newsbytes.com. (January 31, 2001). "For 30 years, the Patent and Trademark Office has been looking for a way to make patent documents easier to convert and search. It will begin a transition to Extensible Markup Language publishing this year.
PTO already accepts electronic patent applications in XML and will begin issuing most documents, including patent grant copies, in Standard Generalized Markup Language this year. Early next year, SGML will give way to the more widely supported XML. PTO in 1999 considered going directly to XML, but the standard was not then mature enough, said Dave Abbott, vice president in charge of technology and development for Reed Technology and Information Services Inc., the contractor that converts electronic documents for PTO. With the older SGML as an interim step, "transition to XML will be fairly painless," Abbott said. Reed Technology will begin producing patent applications in XML next month as the result of a treaty that reconciles U.S. patent practices with those of other nations. Rather than remaining confidential until patents are granted, the applications will be available for public comment 18 months after filing, as they are in many countries. Reed Technology will produce the new documents using XMetaL 2.0, an XML conversion tool from SoftQuad Software Ltd. of Toronto. Reed has been converting PTO documents since 1970. The company, a member of the Reed Elsevier PLC group, scans and converts about 20,000 patent files per week at its main facility in Horsham, Pa., and satellite facilities in Alexandria, Va...."

### January 2001

• [January 31, 2001] ITML Distributed Session Management Specification Working Draft. Edited by David Orchard, David Tarrell, and Gilbert Pilz. Last Updated: 01/30/01. Version: 0.5 [ITML Session Management.doc] See also cache; (unofficial, .zip) See notes/caveats in "Jamcracker submits ITML Session Management for review by OASIS Security Services TC" (David Orchard, 2001-01-31).
• [January 31, 2001] Ray Fergerson and members of the Protégé Project at Stanford Medical Informatics (SMI) have announced the release of Protégé-2000 Version 1.5 which supports RDF and RDF Schema. "Protégé-2000 is open-source software and is written in Java.
It provides an integrated knowledge-base editing environment and an extensible architecture for the creation of customized knowledge-based tools... The Protégé-2000 tool allows the user to construct a domain ontology, customize knowledge-acquisition forms, and enter domain knowledge; it provides a platform which can be extended with graphical widgets for tables, diagrams, animation components to access other knowledge-based systems embedded applications; it constitutes a library which other applications can use to access and display knowledge bases... Protégé-2000 is a knowledge-based systems development environment that can be used to edit RDF and RDF Schema (and other languages). We have enhanced our RDF support in this release in a number of ways. We have eliminated the need to use RDF specific metaclasses inside of Protégé and automatically convert Protégé concepts to the closest equivalent RDF concept. This gives a much more natural way of working with RDF in the Protégé environment. For those modeling elements in Protégé that do not have a corresponding RDF mapping (such as a restriction that a property be single-valued), we give users the option of either (1) discarding this information on save to produce very clean RDF, or (2) keeping the information as additional Protégé specific properties. We are also using Sergey's latest RDF API implementation which is quite a bit faster for large projects. A simple guide to how to work with RDF in Protégé-2000 is available from the 'RDF Support' link in our User's Guide. A more detailed look at RDF and Protégé-2000 is also available online. The new RDF backend has been designed specifically to be extensible for translation to other RDF based languages such as OIL and DAML. We are exploring the possibility of supporting these languages at a later date. If you would be particularly interested in such support, please send us a message to let us know."
• [January 31, 2001] "Reuters Announces New XML Initiative."
By Cristina McEachern. In Wall Street and Technology [Online] (January 26, 2001). ['Reuters Takes the Next Step in Simplifying the Information Process with MarketsML. The new XML initiative aims to bring together various XML standards for a more efficient end user experience.'] "Taking the next step in its company-wide Internet strategy, Reuters is embarking on a broad reaching XML initiative aimed at bringing together its various financial market related XML projects. Reuters is calling it MarketsML and it is supposed to act as the 'glue' that bonds all of its markup language projects together. In other words, MarketsML will provide an XML-based architectural framework to create interoperability between external and internal Reuters XML standards. Mark Hunt, director of eBusiness capabilities at Reuters, expects that some specification for MarketsML could be available within six months... Reuters is currently participating in various other XML initiatives such as NewsML, FpML (Financial Products Markup Language), XBRL (eXtensible Business Reporting Language) and IRML (Industry Research Markup Language). MarketsML looks to bring together these other XML protocols for a more integrated and efficient end user experience... 'At the moment, Reuters customers have risk management systems, trading systems, market data systems and they're all vertically aligned,' explains Hunt. 'XML breaks down those barriers to get genuine collaboration at a fairly intimate level.' MarketsML will bring together standardized XML formatted information from the various Reuters data outlets so users can access all of the information in a seamless format... MarketsML also aims to lower costs associated with gathering and distributing data for Reuters. 'Reuters collects and distributes data in a huge number of different formats to thousands of different end points and we spend a lot of money and resources doing that,' says Hunt. 'But with XML and the Internet it becomes much cheaper.' 
Hunt adds that the MarketsML initiative will be constantly changing with the various industry standards. Reuters also plans to leverage its technology partners, including TibCo, which recently purchased an XML tools provider, in order to provide the MarketsML framework." See: "MarketsML Initiative."
• [January 31, 2001] "XPath. [Chapter 9 in XML in a Nutshell.]" Excerpt from the book XML in a Nutshell. A Desktop Quick Reference. By Elliotte Rusty Harold and W. Scott Means. O'Reilly, January 2001. ISBN: 0-596-00058-8. "XPath is a non-XML language used to identify particular parts of XML documents. XPath lets you write expressions that refer to the document's first person element, the seventh child element of the third person element, the ID attribute of the first person element whose contents are the string 'Fred Jones,' all xml-stylesheet processing instructions in the document's prolog, and so forth. XPath indicates nodes by position, relative position, type, content, and several other criteria. XSLT uses XPath expressions to match and select particular elements in the input document for copying into the output document or further processing. XPointer uses XPath expressions to identify the particular point in or part of an XML document that an XLink links to. XPath expressions can also represent numbers, strings, or Booleans, so XSLT stylesheets carry out simple arithmetic for numbering and cross-referencing figures, tables, and equations. String manipulation in XPath lets XSLT perform tasks like making the title of a chapter uppercase in a headline, but mixed case in a reference in the body text..." Note publisher's blurb for the book: "XML in a Nutshell is just what serious XML developers need in order to take full advantage of XML's incredible potential: a comprehensive, easy-to-access desktop reference to the fundamental rules that all XML documents and authors must adhere to.
This book details the grammar that specifies where tags may be placed, what they must look like, which element names are legal, how attributes attach to elements, and much more..." • [January 31, 2001] Security Services, Jamcracker Inc. presented draft documents for an Information Technology Markup Language (ITML). The two draft documents are said to "contain much of Jamcracker's collective thoughts on session management and assertions associated therewith..." The Information Technology Markup Language "is a set of specifications of protocols, message formats and best practices in the ASP and ASP aggregation market to provide seamless integration of partners and business processes. It is based on open standards, particularly XML and HTTP. It also uses emerging standards, particularly SOAP and XML Schema." The ITML Message And Protocol Specification Working Draft [11/22/00. Version: 0.8], as a technical specification intended for developers and architects, "provides a framework for specific interactions to occur between Jamcracker and an ASP. An example interaction is a User Provisioning request. Each set of interactions is known as an ITML Best Practice. This document is a companion document to each Best Practice specification. This specification describes the following key decisions: (1) XML Schema is the type specification language, (2) message format is SOAP, (3) a SOAP error structure, (4) a set of protocol errors, (5) encoding rules for graphs of data, (6) encoding rules for methods, (7) namespaces standards in messages, (8) multi-part message encoding, and (9) HTTP Binding including Authentication." The ITML Distributed Session Management Specification Working Draft [01/30/01. Version: 0.5] "provides a framework for specific interactions around session management to occur between Jamcracker and a partner. A typical use of this is for the synchronization of a single sign-on assertion. 
It addresses the needs of single sign-on, single sign-off, single time-out, and single maintain session. This specification describes the following key decisions: (1) ITML Message and Protocol is the transmission format, (2) Jamcracker pulls state changes from ASPs, (3) Jamcracker can pull from many, (4) Times of actions on state changes are deltas from last pull, not absolute, and (5) The entire session for a given user is sent for each request." On these draft documents, see the URLs with notes/caveats in "Jamcracker submits ITML Session Management for review by OASIS Security Services TC" (David Orchard, 2001-01-31). • [January 31, 2001] "Reuters Integrates Financial Data Using XML." By [Staff]. From ComputerWeekly.com (January 30, 2001). "Financial information company Reuters is to create an overarching XML framework that will create interoperability between external and internal Reuters XML standards. The initiative is the latest step in the technology strategy needed to provide platform-independent financial information to its customers... Mark Hunt, director of XML strategy at Reuters, said, "At present, to access information traders must examine five or six vertical systems. Using XML allows traders to cut across these vertical systems to access news and video feeds, validate business, trade faster and settle accounts within 24 hours." The use of XML is expected to enable the company to integrate data from its systems at a cheaper cost than was previously possible..." See: "MarketsML Initiative." • [January 31, 2001] ADL Sharable Courseware Object Reference Model. Editor: Philip Dodds. Version 1.0. January 31, 2000. Comments to: secretariat@adlnet.org. 219 pages. Chapter 5 presents "Course Structure Format." CSF is "An Extensible Markup Language (XML)-based representation of a course structure that can be used to define all of the course elements, structure, and external references necessary to move a course from one LMS environment to another." 
• [January 31, 2001] XML Schema FAQ. By Francis Norton. 2001-01-31. See the supporting documents on the www.SchemaValid.com web site. The source for the FAQ document is XML conforming to a FAQ Schema; it has been formatted into HTML with FAQ Stylesheet. "[XML] DTDs have several limitations, one of which is the fact that they are not written in standard XML data syntax. This means, for instance, that while it is quite possible to write an XSLT transform to document an XML Schema, there are far fewer tools to process DTDs. XML Schema also offers several features which are urgently required for data processing applications, such as a sophisticated set of basic data types including dates, numbers and strings with facets that can be superimposed - including regular expressions and minimum and maximum ranges and lengths..." [cache from new URL]
• [January 30, 2001] "XML Structures for Existing Databases. Eleven rules for moving a relational database to XML." By Kevin Williams and nine other database developers (Professional XML Databases authors). From IBM developerWorks, XML library. January 2001. ['Learn how to convert an existing database into XML data in XML Structures for Existing Databases, a preview chapter from Wrox's Professional XML Databases.'] "This book chapter, excerpted from the just-published Wrox Press book Professional XML Databases, offers clear, authoritative guidance for how to deal with an existing database that you need to move to XML, from modeling the tables and keys to dealing with orphaned elements. The chapter provides an overview of the issues involved and details 11 rules for creating XML data structures for data in a relational database. The article includes suggestions for creating data structures that can be processed rapidly. Used with the permission of the publisher. In this chapter, we will examine some approaches for taking an existing relational database and moving it to XML. With much of our business data stored in relational databases, there are going to be a number of reasons why we might want to expose that data as XML: (1) Sharing business data with other systems. (2) Interoperability with incompatible systems. (3) Exposing legacy data to applications that use XML. (4) Business-to-business transactions. (5) Object persistence using XML. (6) Content syndication. Relational databases are a mature technology, which, as they have evolved, have enabled users to model complex relationships between data that they need to store.
In this chapter, we will see how to model some of the complex data structures that are stored in relational databases in XML documents. To do this, we will be looking at some database structures, and then creating content models using XML DTDs. We will also show some sample content for the data in XML to illustrate this. In the process, we will come up with a set of guidelines that will prove helpful when creating XML models for relational data... Summary: In this chapter, we've seen some guidelines for the creation of XML structures to hold data from existing relational databases. We've seen that this isn't an exact science, and that many of the decisions we will make while creating XML structures will entirely depend on the kinds of information we wish to represent in our documents. If there's one point in particular we should come away with from this chapter, it's that we need to try to represent relationships in our XML documents with containment as much as possible. XML is designed around the concept of containment -- the DOM and XSLT treat XML documents as trees, while SAX and SAX-based parsers treat them as a sequence of branch begin and end events and leaf events. The more pointing relationships we use, the more complicated the navigation of your document will be, and the more of a performance hit our processor will take -- especially if we are using SAX or a SAX-based parser. We must bear in mind as we create these structures that there are usually many XML structures that may be used to represent the same relational database data. The techniques described in this chapter should allow us to optimize our documents for rapid processing and minimum document size. Using the techniques discussed in this chapter, and the next, we should be able to easily move information between our relational database and XML documents. Here are the eleven rules we have defined for the development of XML structures from relational database structures..." 
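The containment guideline in the chapter summary above can be illustrated with a minimal Python sketch. It is not taken from the book: the table rows, the element names (`Customers`, `Customer`, `Order`, `Data`), and the helper functions are all hypothetical, and the standard-library `xml.etree.ElementTree` module stands in for whatever processor a reader actually uses. The sketch builds the same customer/order data twice, once with nested elements and once with ID/IDREF-style pointer attributes, and suggests why the pointing form needs an extra lookup step to answer the same question.

```python
import xml.etree.ElementTree as ET

# Hypothetical relational rows, for illustration only:
# CUSTOMERS holds (customer_id, name); ORDERS holds (order_id, customer_id, total).
CUSTOMERS = [(1, "Acme"), (2, "Globex")]
ORDERS = [(10, 1, "25.00"), (11, 1, "40.00"), (12, 2, "15.50")]


def build_contained(customers, orders):
    """Represent the customer-to-order foreign key by nesting Order inside Customer."""
    root = ET.Element("Customers")
    for cust_id, name in customers:
        cust = ET.SubElement(root, "Customer", {"name": name})
        for order_id, owner_id, total in orders:
            if owner_id == cust_id:
                ET.SubElement(cust, "Order", {"id": str(order_id), "total": total})
    return root


def build_pointing(customers, orders):
    """Represent the same key with ID/IDREF-style attributes instead of nesting."""
    root = ET.Element("Data")
    for cust_id, name in customers:
        ET.SubElement(root, "Customer", {"id": f"c{cust_id}", "name": name})
    for order_id, owner_id, total in orders:
        ET.SubElement(
            root,
            "Order",
            {"id": f"o{order_id}", "customer": f"c{owner_id}", "total": total},
        )
    return root


def orders_for_contained(root, name):
    # Containment: a single path expression walks straight down the tree.
    return [o.get("id") for o in root.findall(f"./Customer[@name='{name}']/Order")]


def orders_for_pointing(root, name):
    # Pointing: find the customer's ID first, then scan every Order for a match.
    cust = root.find(f"./Customer[@name='{name}']")
    if cust is None:
        return []
    cust_ref = cust.get("id")
    return [o.get("id") for o in root.findall("./Order") if o.get("customer") == cust_ref]


if __name__ == "__main__":
    contained = build_contained(CUSTOMERS, ORDERS)
    pointing = build_pointing(CUSTOMERS, ORDERS)
    print(ET.tostring(contained, encoding="unicode"))
    print(orders_for_contained(contained, "Acme"))
    print(orders_for_pointing(pointing, "Acme"))
```

In this toy case both forms return the same orders, but the pointing version has to resolve a reference and filter the full list of `Order` elements, which is the kind of extra navigation the chapter summary warns about for stream-oriented processing.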
• [January 30, 2001] "Understanding RDF: The Resource Description Framework in Context." By Dan Brickley. 1999/2001. "This presentation of RDF attempts to provide a high level overview of W3C's Resource Description Framework, while grounding the discussion in enough technical detail to highlight RDF's novel and interesting characteristics. For those who have not heard of RDF before, RDF is a relatively new framework for metadata interoperability devised by the World Wide Web Consortium, home of HTML and other Web standards. This overview does not assume any special knowledge of RDF, but does assume some basic familiarity with concepts such as 'metadata', 'URI', 'XML'." [Draft 'online notes from "Understanding RDF" overview' - "XHTMLisation based on notes from a (hastily assembled) talk I gave in Luxemburg a couple of years ago; not yet quite integrating the (GIF image; ugh) content from my original overhead slides. I've lofty ambitions to tidy this up further and extend to reflect recent apps and the 'semantic web' side of things. Time will tell if that happens, so here's the raw doc in its unpolished state. Feedback / suggestions welcomed..."] • [January 29, 2001] Meeting Minutes. Federal CIO Council XML Working Group, Meeting Minutes. January 17, 2001. American Institute of Architects (Board Room). [cache] • [January 29, 2001] "Technology of the Year: XML. XML enlivens e-biz transactions." By Tom Yager. In InfoWorld (January 29, 2001). ['Thanks to support from the open-source community and a slew of new tools, 2000 was the year XML blossomed.'] "XML gained considerable ground in 2000 and, as the reigning media darling, also had plenty of attention from the press. But unlike other heavily hyped but underused technologies, businesses actually put XML through its paces: The markup language found its way into a wide range of commercial solutions for everything from database management to e-commerce personalization. 
Recent updates to development tools, especially from the open-source community, are also feeding the drive to make everything XML-enabled. Sometimes fame is well-deserved. Before XML took hold, it was expensive and risky to move data between applications. Costly middleware adapters converted data from one proprietary format to another, but you couldn't upgrade your applications or choose a different solution unless a matching adapter were available. The common alternative was to develop application-specific data interchange solutions. But that's a tedious, error-prone exercise. ... XML is adaptable. You can change applications, operating systems, programming languages, database managers, and data layouts, and your XML files will still be readable without much recoding effort. Fifth, XML is standardized. It can't be patented, its use requires no license, and no corporation can make it incompatible with other applications. And finally, XML presents plain, human-readable text. It can be edited in any text editor, modified by any application that can write text files, and stored in any database. Despite those advantages, XML has its detractors. One of its more likable qualities, the fact that it's represented in text files, raises performance issues. If you want to find data in an XML document, you have to scan it line by line -- an intensive operation that can bog down enterprise applications. However, two recent advances help compensate for this shortcoming. First, memory is less expensive than ever, so many companies are routinely packing their servers with enormous amounts of RAM. A modern server with a lot of memory and several lightning-fast CPUs has no trouble scanning XML. In addition, document indexing systems are being used to make vast libraries of XML data-searchable. 
Although a database manager can only perform rapid searches on a limited category of data, indexed XML documents can quickly return a detailed list of matches for practically any search phrase. Still, there is no end in sight for the XML onslaught. In the coming year, the language will be near ubiquitous in enterprise applications, and emerging XML-related standards will continue to take hold. 2001 will also see the emergence of many industry-specific XML vocabularies, which will ease data exchange between vertical applications and business partners. In short, XML isn't going away any time soon, and companies that don't jump aboard may find themselves with a lot of catching up to do..." • [January 29, 2001] "Three to e-tango to a business tune. XML, personalization, and content management made e-commerce happen last year." By James R. Borck. In InfoWorld (January 29, 2001). "Our [InfoWorld] Technology winners for e-commerce represent advances that came of age in 2000, helping to advance e-business despite the numerous twists and turns in last year's long overdue market revaluation. The three technologies were born from the necessities of a maturing e-business model, which shapes and drives e-commerce. It is important to note that the capabilities of each technology winner also advanced the potential of the next. XML benefited content management which in turn benefited the fluidity of personalization engines, enabling businesses to eke out a competitive advantage in a cluttered e-commerce market. XML has become the defining standard for streamlining data interchange. Infusing the e-business industry with adroitness and extensibility for cross-platform communication and partnering, XML has offered a shortcut for data exchange that would have otherwise involved expensive, often hand-tooled, development efforts. Although 2000 was not the first year XML was on the scene, last year saw a mad dash as vendors scrambled to ensure XML's prominence in their architecture. 
By mid-year, most vendors were delivering XML support in their offerings, allowing even small and midsize businesses to begin capitalizing on its mass-market availability. The importance of XML will continue in products such as Microsoft's BizTalk Server and budding service-based initiatives such as UDDI (Universal Description Discovery and Integration) and e-speak, which rely on XML-enabled auto-discovery of e-services. Driven by intense market saturation and e-business competition, many businesses in 2000 chose personalization technology to help close sales to online customers... Personalization helped to enable prioritization of product listings and presentation of alternative up-and cross-sell items on-the-fly, which in turn helped companies better anticipate buying habits, thereby bolstering the bottom line... Hand in hand with the movement in personalization come improvements in content management. The ability to serve personalized, real-time data, services, and information to partners and customers demands capping the runaway information glut that inhibits efficient data organization and workflow. The use of content management systems has enabled companies to build highly organized information delivery mechanisms that can be maintained by low-cost, data entry clerks." • [January 27, 2001] "The Net Plumber. [Technology Report.]" By William Jackson. In Government Computer News Volume 20, Number 2 (January 22, 2001), pages 27, 30. [Read it and weep...] "The problem [with the Web and its 19 degrees of separation] is that up to 10 percent of the embedded links are broken at any given time. A recent survey by Jupiter Media Metrix of New York of 28 federal Web sites and 53 state and local sites found that 84 percent had broken links. Links break when uniform resource locators change, and the average life span of a URL is short: just 44 days. 'Every six weeks, on average, every link gets broken,' Jeannin said. 
Jeannin founded LinkGuard to fix all those broken links on the Web... By March, Jeannin expects to complete the first step in an ambitious project to map every link on the Web. It will take the form of a 40T distributed database, and it will let LinkGuard go beyond fixing outbound links to repair inbound links, too. The database, called LinkMap, will reside on PowerVault 650F and 630F Fibre Channel storage devices from Dell Computer Corp., accessed through Dell PowerEdge 6450 enterprise servers. Dell recently contracted to supply at least 1,000 terabytes -- a petabyte -- of storage for the Navy-Marine Corps Intranet..." • [January 27, 2001] "Enabling Electronic Business with ebXML." From the ebXML Initiative. White paper. December 2000. "The vision of ebXML is to create a single global electronic marketplace where enterprises of any size and in any geographical location can meet and conduct business with each other through the exchange of XML based messages. ebXML enables anyone, anywhere, to do business with anyone else over the internet. ebXML is a set of specifications that together enable a modular, yet complete electronic business framework. If the Internet is the information highway for electronic business, then ebXML can be thought of as providing the on ramps, off ramps, and the rules of the road. The ebXML architecture provides: (1) A way to define business processes and their associated messages and content. (2) A way to register and discover business process sequences with related message exchanges. (3) A way to define company profiles. (4) A way to define trading partner agreements. (5) A uniform message transport layer. The ebXML initiative is designed for electronic interoperability, allowing businesses to find each other, agree to become trading partners and conduct business. All of these operations can be performed automatically, minimizing, and in most cases completely eliminating the need for human intervention. 
This streamlines electronic business through a low cost, open, standard mechanism. ebXML is global in support, scope and implementation. It is a joint initiative of the United Nations (UN/CEFACT) and OASIS, developed with global participation for global usage. Membership in ebXML is open to anyone and the initiative enjoys broad industry support with over 75 member companies, and in excess of 2,000 participants drawn from over 30 countries. Participants represent major vendors and users in the IT industry and leading vertical and horizontal industry associations. ebXML is evolutionary, not revolutionary. It is based on Internet technologies using proven, public standards such as: HTTP, TCP/IP, mime, smtp, ftp, UML, and XML. The use of public standards yields a simple and inexpensive solution that is open and vendor-neutral. ebXML can be implemented and deployed on just about any computing platform and programming language. Electronic commerce is not a new concept. For the past 25 years, companies have been exchanging information with each other electronically, based on Electronic Data Interchange (EDI) standards. Unfortunately, EDI currently requires significant technical expertise, and deploys tightly coupled, inflexible architectures. While it is possible to deploy EDI applications on public networks, they are most often deployed on expensive dedicated networks to conduct business with each other. As a result, EDI adoption has been limited to primarily large enterprises and selected trading partners, which represents a small fraction of the world's business entities. By leveraging the efforts of technical and business experts, and applying today's best practices, ebXML aims to remove these obstacles. This opens the possibility of low cost electronic business to virtually anyone with a computer and an application that is capable of reading and writing ebXML messages. 
Businesses of all sizes will adopt ebXML for reasons of lower development cost, flexibility, and ease of use..." • [January 26, 2001] "XML: The Search For Standards." By Clive Davidson. In Risk Magazine. August 2000. "Extensible Markup Language (XML) can offer the best route to the consistent data formatting standards desperately needed for straight-through processing and enterprise-wide risk management. But the industry still lacks a comprehensive and compatible set of specifications. There is currently a proliferation of initiatives to create subsets of XML standards for things such as over-the-counter derivatives and risk data. At the same time, various organisations, including Microsoft and the United Nations, are attempting to establish frameworks for co-ordinating standards across industry sectors. So while on one level progress towards XML-based standards for straight-through processing (STP) and risk appear slow, there is considerable activity towards establishing the groundwork on which these standards must be based... while several institutions and their software suppliers have seen the potential of XML to solve internal problems and have already deployed it in proprietary ways within their technology infrastructures, they are also supporting initiatives to create industry-wide standards. Among the first to seize upon XML for their internal use were software suppliers California-based Integral Development, and New York-based SunGard Trading and Risk Systems and JP Morgan. Integral and SunGard created sets of XML specifications for their own systems, called FinXML and Network Trade Model respectively, while JP Morgan got together with PricewaterhouseCoopers to sketch out XML specifications for OTC derivatives, which they called FpML. Integral made FinXML freely available, and has used it in building its CFOweb.com financial portal, while SunGard has contributed NTM to Microsoft's DNAfs architecture for financial services (Risk August 1999, page S19). 
Meanwhile, other XML initiatives have appeared. The consortium of broker-dealers and investment firms developing the FIX protocol for sharing pre-deal information has been working to incorporate XML into a new version of its protocol called FIXML. New York-based data and technology supplier Bridge Information Systems created MDML, a market data specification for XML, while Toronto-based risk management systems supplier Algorithmics has begun working on an XML specification for risk data, called RDML... But while the industry generally recognises the need for diverse and complementary efforts at this early stage of the standards-making process, some people in the industry felt there was a need to pull the various strands of activity together. In February, representatives from the FIX organisation, the Industry Standardisation for Institutional Trade Communication-International Operations Association (ISITC-IOA) and the American National Standards Institute hosted a meeting that included representatives from all the vertical and horizontal XML initiatives, as well as many from traditional standards-making bodies such as the International Organisation of Standardisation (ISO), the World-Wide Web Consortium (W3C -- the originator of XML), and financial industry service providers and consortia such as the communications network and messaging organisation Swift, and the Global Straight-Through Processing Association (GSTPA). The purpose of the meeting was 'to share information and identify commonality in terms of approach, framework and agreed principles, as well as potential points of convergence', with the overall aim of promoting 'interoperability' between the emerging XML specifications. 
Following presentations on ebXML, FpML, FIX, the GSTPA, the ISO and the ISITC-IOA, the meeting identified three areas that required co-ordinated attention -- modelling (of the business requirements to be captured in the standards), semantics and schemas (the syntax of the individual standards), and interoperability (the ability of the standards to work together). There was a further meeting in June, where the participants established the ISO XML Advisory Group. The participants chose ISO as the umbrella organisation because of its long experience in international standardisation efforts, and its established role in the financial industry -- the securities industry has adopted its ISO 15022 data dictionary as the standard for messaging in confirmation, clearing and settlement. John Goeller, director at Salomon Smith Barney and chair of the new group, says its aims are to 'serve as the co-ordination point for XML initiatives, provide a framework for standardised use of XML in the securities industry, provide XML recommendations for the ISO 15022 data dictionary, and co-ordinate with ebXML'." [This article originally appeared in the Technology Risk supplement to the August 2000 issue of Risk magazine published by Risk Waters Group Ltd.]
• [January 25, 2001] "Tutorials: Using XSL Formatting Objects, Part 2." By J. David Eisenberg. From XML.com. January 24, 2001. ['The second part of our XSL Formatting Objects tutorial explains how to use lists and tables in documents.'] "This article is the second part of our series on using XSL Formatting Objects. You should read the first article before proceeding with this one. Having tackled the cover and contents page in the previous article, we're now ready to put the main content into the Spanish handbook..."
• [January 25, 2001] "Tutorials: What is RDF?" By Tim Bray and Dan Brickley. From XML.com. January 24, 2001. 
['An introduction to the W3C's Resource Description Format, a standard for exchanging metadata, and a key technology for the W3C's "Semantic Web".'] "This article was first published as "RDF and Metadata" on XML.com in June 1998. It has been updated by ILRT's Dan Brickley, chair of the W3C's RDF Interest Group, to reflect the growing use of RDF and updates to the specification since 1998. RDF stands for Resource Description Framework. RDF is built for the Web, but let's leave the Web behind for now and think about how we find things in the real world. Scenario 1: The Library. You're in a library to find books on raising donkeys as pets. In most libraries these days you'd use the computer lookup system, basically an electronic version of the old card file. This system allows you to list books by author, title, subject, and so on. The list includes the date, author, title, and lots of other useful information, including (most important of all) where each book is..." • [January 25, 2001] "Microsoft Seeks Vertical Markets With BizTalk." By Antone Gonsalves. In TechWeb News (January 24, 2001). "Microsoft, focusing on ease of use as a major strength of its BizTalk integration server, plans to release by early next quarter pre-built kits that help configure the product for a particular vertical market. The BizTalk Server 2000, which started shipping earlier this month, includes an orchestration designer tool that enables a user to create a visual representation of a business process, which could include moving a purchase order to ERP systems and requesting approval by business managers through e-mail. The server executes the business process. To quicken the design process, Microsoft Corp. will offer vertical kits that include pre-built adapters, business processes, and XML-based document frameworks for specific vertical industries, said David Wascha, BizTalk product manager. 
Wascha said users will be able to easily import the kits' components into BizTalk and access them through the orchestration designer. He said the first kits will target a couple of markets and will be available by early next quarter, though he would not identify the markets... In addition, WebMethods and others who have been in the market longer than Microsoft have had more time to build adapters for their products to various ERP and legacy systems. Wascha said Microsoft, Redmond, Wash., plans to let third-party companies build adapters for BizTalk, which ships with a software development kit for building connectors..." • [January 25, 2001] "A business blueprint for standards. SWIFTStandards: An Update." From SWIFT. "The development of standards is still a highly technical business. But it is also a business issue. That synergy is being embodied in the new SWIFTStandards that are being developed for the XML environment. As Martine de Weirdt, senior manager, standards, at SWIFT, reminded delegates, 'we don't build standards for the sake of doing so, but for your business.' De Weirdt traced the development of the new approach to standards from Sibos at Helsinki and Munich. This has been a response to the emergence of a diverse user community and the fact that proprietary standards are reaching their technical limits. The ultimate goal is still 'one standard' at the end of the day, but the emphasis in the interim will be on interoperability. There was reassurance for all the delegates present from Klaus-Dieter Naujok, senior business developer at NextERA and chair of the ebXML joint initiative of UN/CEFACT and Oasis. 'SWIFT is state-of-the-art and it has always shared our vision of the future,' said Naujok. 'What it is doing is not exotic and CEFACT would not have been possible without SWIFT.' Naujok pointed out that the telecoms, automobile and retail industries, for example, are all moving in the same direction. 
And, he said, 'if you're part of it now, the cost may not be so great as it would be if you had to bear the costs of migration in the future.' With so many industries and standards bodies working together 'there is no danger of being isolated.' ebXML aims to create a framework for different industries to work with an interoperable form of XML. Its 10 project teams are working within an 18-month timeframe to establish requirements, broaden awareness, and develop architecture, core competencies, registries and a business process. Microsoft - one of the few big technology players not involved in ebXML - is now talking to the organisation. Business information modelling lies at the heart of the development of SWIFTStandards. In the case of the GSTPA, for instance, business information modelling is being used to develop a model for the post-trade, pre-settlement environment. In the e-trust area, an interactive message is being developed to query trust certificates. Frank Vandamme from SWIFT standards reiterated the point that 'you don't need to be a modelling expert to work on standards development. We need to capture the information from your business experience.' Unified Modelling Language (UML) is then used to produce a model which is put to the users. The other crucial ingredient in future standards will be XML. Naujok emphasised that XML is not a universal panacea to all syntax problems. What it does offer is the ability to create flexible document structures. Vandamme described in more detail how swiftML is evolving - along with a number of other XML formats, such as FIXML, FpML, Bolero XML and STPML. The important point is that SWIFT will not replace all FIX syntax messages with XML syntax - especially not, as in the payments area, where FIN works very well. swiftML does have all the benefits of XML and it will be possible to create a lot of tools around it, especially as XML is becoming the standard in e-business..." 
• [January 25, 2001] "Microsoft Announces Java User Migration Path to Microsoft .NET. JUMP to .NET Facilitates Java Developers' Transition to Web Services On the Multi-Language .NET Platform." - "Microsoft Corp. today announced the Java User Migration Path to Microsoft .NET (JUMP to .NET), a set of independently-developed technologies and service offerings to enable programmers to preserve, enhance and migrate Java language projects onto the Microsoft .NET Platform. JUMP to .NET enables Microsoft Visual J++ customers and other programmers who use the Java language to take advantage of existing skills and code investments while fully exploiting the Microsoft platform today and into the future. JUMP to .NET provides the easiest transition for Java developers into the world of XML-based Web services and dramatically improves the interoperability of the Java language with software written in a variety of other programming languages. 'With JUMP to .NET, the Java language joins over twenty other programming languages from Microsoft and third party vendors supporting the .NET Platform,' said Sanjay Parthasarathy, vice president of platform strategy at Microsoft. 'The principle of integration is fundamental to Microsoft .NET. JUMP to .NET further underscores our commitment to interoperability and choice of programming language for building Web services.' JUMP to .NET gives customers a number of paths for migrating their Java language investments to the .NET Platform. Existing applications developed with Visual J++ can be easily modified to execute on the .NET Platform, interoperate with other .NET languages and applications and incorporate new .NET functionality. Further, developers familiar with the Java language syntax can use it to create new .NET applications or migrate existing code entirely to the C# language. 
JUMP to .NET consists of three sets of tools and a service offering: Interoperability support -- a set of tools that enables many existing applications built with Visual J++ to be easily and mechanically modified to work with the .NET Platform. Once modified, applications can be easily extended to take advantage of new .NET functionality, such as native Web Services support. Programming tools support -- a set of tools, hostable in the Visual Studio.NET integrated development environment, that allows developers to use the Java language syntax to directly target the .NET Platform. Automated conversion from Java source code to C# -- a tool that automatically converts existing Java source code into C#, migrating both language syntax and library calls. Any code that cannot be converted is flagged within the Visual Studio.NET integrated development environment to help developers easily find and quickly address any remaining conversion issues. Migration services -- paid consulting services offered by Microsoft to apply the JUMP to .NET technologies to specific customer projects. Microsoft will also support other consulting and integration organizations that wish to provide service offerings using JUMP to .NET..." • [January 25, 2001] "The More the Merrier." By JP Morgenthal. In Intelligent Enterprise Volume 4, Number 2 (January 30, 2001), pages 46-47. ['The need for pervasive supply chain automation is fueling various e-procurement scenarios -- but which is right for you?'] "If you review procurement automation over the past 20 years, a noticeable pattern emerges: Attempts to automate procurement are only as successful as the number of sellers or suppliers that can participate in the automated transaction. Even today, many large companies can only claim 20 to 30 percent automation across their supply chain. The costly hurdles involved with participating prevent a fully automated supply chain. 
However, an industry movement to lower trade barriers so more suppliers can participate in e-procurement is emerging. In many cases, the buyers themselves are leading this effort as they attempt to reach greater levels of productivity within their own organizations. Central to this effort is the ubiquity of the Internet and the widespread availability of tools and products that support extensible markup language (XML) and XML-based e-business vocabularies. The rise of these two intertwined technologies results in a low-cost transactional network that any computer with a modem can access. In this column, I will examine the new e-procurement environment and its requirements, architectures, benefits, and pitfalls... the Internet and XML are two key technologies that provide low-cost alternatives to expensive EDI systems. However, groups like RosettaNet and ebXML are creating the open standards that leverage these technologies, which enable affordable software and self-implemented solutions. However, as more suppliers participate in e-procurement transactions, buyers must increasingly manage hundreds of additional relationships. Therefore, they have a great need for software that will help them manage their trading network. With the transition from a homogenous EDI format to a potentially unlimited number of XML vocabularies and various delivery transports (HTTP, FTP, or SMTP), the software solution must manage a significant amount of additional data for each individual relationship. Additionally, e-procurement of direct materials is increasingly different from EDI. E-procurement transactions will soon be multipoint transactions. In contrast to traditional EDI, a single transaction will now be distributed to multiple parties simultaneously. For example, a purchase order will still be issued to the supplier, but now it will also be delivered to the trucking company to arrange transportation and to the bank for funds to pay for the transaction." 
• [January 25, 2001] "Inside B2B." By Jeanette Burriesci. In Intelligent Enterprise Volume 4, Number 2 (January 30, 2001), pages 9-10. ['Tilion is among the first entries in the B2B analytics derby, but can it finish what it started?'] "The way things usually go, transactional systems under construction today will soon be complemented by business intelligence (BI) services. According to Hurwitz Group analyst Philip Russom, the Internet-enabled supply chain is likely to be particularly fertile ground... Tilion will capture XML-wrapped data from various sources (such as a participating corporation, a public Internet exchange platform, or a private EDI hub), transport it via the Internet to the Tilion Intelligence Center, aggregate the data appropriately, perform analysis, and generate reports. Analytics can include fill rate performance, advance ship notices, and variances between items ordered vs. items shipped. Reports are rendered in HTML from the XML source. XML, the ASP model, the emergence of B2B trading exchanges, and the knowledge that these exchanges would eventually seek analytic tools were the magic combination that inspired the Tilion concept. But at first, it looked as if Stone may have been too prescient for his own good. According to Russom, when the company formed in January 2000, Tilion's revenue model was based on transaction volume and focused on public exchange hubs as its prime customers or partners. However, transaction volume is still very low on public exchanges. According to Russom, 'Public trade exchanges for B2B are going to ramp up; in two to three years, there will be impressive volumes of sales going on there. But today, it's just too early. That's a dilemma for Tilion.' Tilion executives, knowing that indefinite rounds of VC financing are no longer a certainty, rethought their strategy and decided to target private EDI networks as well as open B2B exchanges. 
One of Tilion's first customers, which Stone would not identify by name but described as "the most valuable company in the world" (in terms of market capitalization), is an EDI-based hub. The company does not perform EDI-to-XML transformations for customers. Rather, it works with partners such as Netfish Technologies Inc., XML Solutions Corp., and Mercator Software Inc. to put data into the right package. Russom says that writing the parsing instructions to transform a set of EDI document types to a given XML format takes about a day..." • [January 25, 2001] "IT and the NOW Economy. XML technologies can provide more options and flexibility in enterprise messaging." By Michael Hudson and Craig Miller (Blueprint Technologies). In Intelligent Enterprise Volume 4, Number 2 (January 30, 2001), pages 24-29. "...No single standard for exchanging data exists, and most of the solutions you've considered purchasing are really batch-oriented systems that lack the direct interactivity you require. One system may directly invoke the functionality of an application on another in what is known as a remote procedure call (RPC), or two systems may collaborate by exchanging data through brokering mechanisms that are often referred to as message-oriented middleware (MOM). Traditionally, however, solutions in this market have usually been much more at home in homogenous, closed LAN-based environments. Solutions have often been expensive, mainframe-oriented, or bound to a specific platform or programming language. RPC mechanisms illustrate the point vividly. Traditional RPC mechanisms included large-scale solutions such as the Distributed Computing Environment (DCE), and later, object-oriented RPC architectures such as Microsoft's Distributed Component Object Model (DCOM), the Common Object Request Broker Architecture (CORBA), or the Remote Method Invocation (RMI) APIs supported by Java. 
All of these technologies have great merit and have served as the basis of many enterprise success stories. However, there are significant obstacles to their widespread adoption in open environments. Each component architecture is commonly tied to either a platform (DCOM), a programming language (RMI), or a small number of solution vendors. Reliable deployment of these technologies often requires installation of large, complex client-side libraries in the case of DCOM and CORBA. The greatest challenge for many of these technologies, however, are firewalls and proxy servers. Many firewall solutions employ a combination of packet filtering and network address translation to provide security. For instance, your company may block all traffic except that on the default HTTP port 80; moreover, many firewalls inspect the packets themselves to determine whether they constitute valid HTTP traffic. Finally, the actual target of an RPC call may have no publicly routable IP address; instead, packets destined for it are translated by a proxy mechanism to a private network address behind the firewall. These security mechanisms often foil RPC mechanisms, which often rely on other ports (typically, port 135 and a range of higher-numbered ports) and direct linkage to a known routable IP address. On the other hand, extensible markup language (XML) is extremely well-suited for expressing complex data types, such as arrays, master-detail relationships, records, and the like. Like HTTP, it is an open and platform-independent standard that has rapidly achieved widespread adoption. Perhaps it is not surprising that efforts are underway to link the two standards. The best known initiative to date is the Simple Object Access Protocol (SOAP). The SOAP protocol is the brainchild of many organizations, notably Microsoft, IBM, and others. Like XML, which is in part a greatly simplified descendant of the older SGML language, SOAP embodies a 'less is more' viewpoint. 
One of its architects, Don Box, describes it as a 'no new technology' approach. SOAP is merely a specification -- to be precise, an XML implementation language with a specific schema. It packages the information contained within a remote procedure call and provides standards for error handling. The SOAP content specifies the resource on the server responsible for performing the invoked functionality, known as an endpoint. But how it is carried out is entirely at the server's discretion. As with all XML content, the schema defining the structure of the specific RPC call between client and server is referenced as an XML namespace. The content of the call (to get the last trading price of a stock with the symbol DIS) is contained within the SOAP 'body', which is in turn contained within a SOAP 'envelope' that provides the XML schema reference for the SOAP standard itself... Microsoft has made SOAP a critical technology for its .Net initiative; SOAP is the underlying RPC protocol that links events on HTML forms to specific triggers on Microsoft servers. Java and Apache tools for SOAP now exist as well. Whether SOAP will ultimately prevail as the dominant RPC mechanism in the Internet era remains to be seen..."

• [January 25, 2001] "Leveraging the UML Metamodel: Expressing ORM Semantics Using a UML Profile." By David Cuyler (Sandia National Laboratories). In Journal of Conceptual Modeling Issue Number 17 (December, 2000). "This paper is a proposal for a UML Profile to facilitate expression of Object Role Modeling semantics in terms of the UML 1.3 Metamodel. The profile uses the extension mechanisms inherent to UML to clarify usage and semantics where necessary, and it proposes the use of the XML Metadata Interchange (XMI) specification for model exchange. Once expressed in terms of the UML Metamodel, ORM models can then be shared among UML-based tools and can be stored, managed and controlled via UML-based repositories. 
The paper provides an example of an ORM model fragment converted to the XMI format, in accordance with the profile.... Object Role Modeling (ORM), as defined by the work of Dr. Terry Halpin and with a heritage in Natural Language Information Analysis Method (NIAM), has one of the richest content models of any persistent modeling grammar. ORM is unique among information modeling techniques as it can be used to document a persistent data model for both relational and object schemata. Dr. Halpin has recently published several works comparing ORM with UML, and in them has implied that conversion of an ORM model to UML might be possible. This paper provides a definition, in the form of a UML Profile, that provides the extensions necessary to perform this conversion and to accurately reflect the semantic content of an ORM model. ORM semantics and usage differ from those typically associated with UML primarily in the following areas: [1] What would normally be considered an Attribute in UML is represented in ORM as an Association (FactType). [2] A typical ORM Constraint restricts the allowed population of an AssociationEnd (Role) or a set of AssociationEnds. This contrasts with the UML, where constraints typically govern whole Associations, Classes, or Behavioral Features. [3] The ORM analysis process relies heavily on sample populations of associations (Links) to assist in the determination of Constraints. This is not consistently used in UML techniques. [4] ORM methods are typically used to model persistent data stores, helping to optimize the data structure and reduce the incidence of anomalies in the population of the data store. UML is typically used to model run-time characteristics of software..." See Appendix D: "Semantic Content of the Example ORM Source Model Expressed in XMI 1.1 According to this Profile."

• [January 25, 2001] "Reasoning about XML Schema Languages using Formal Language Theory." 
By Dongwon Lee (UCLA/CSD), Murali Mani (IBM Almaden Research Center), and Makoto Murata (IBM Tokyo Research Lab). Technical Report, IBM Almaden Research Center, RJ#10197, Log#95071, November 16, 2000. 37 pages. "A mathematical framework using formal language theory to describe and compare XML schema languages is presented. Our framework uses the work in two related areas -- regular tree languages and ambiguity in regular expressions. Using these work as well as the content in two classical references, we present the following results: (1) a normal form representation for regular tree grammars, (2) a framework of marked regular expressions and model groups, and their ambiguities, (3) five subclasses of regular tree grammars and their corresponding languages to describe XML content models: regular tree languages, TD(1) (top-down input scan with 1-vertical lookahead), single-type constraint languages, TDLL(1) (top-down and left-right input scan with 1-vertical and 1-horizontal lookaheads), and local tree languages, (4) the closure properties of the five language classes under boolean set operations, (5) a classification and comparison of a few XML schema proposals and type systems: DTD, XML-Schema, DSD, XDuce, RELAX, and (6) properties of the grammar classes under two common operations: XML document validity checking and type resolution (i.e., XML document interpretation)... As the popularity of XML increases substantially, the importance of XML schema language to describe the structure and semantics of XML documents also increases. Although there have been about a dozen XML schema language proposals made recently, no comprehensive mathematical analysis of such schema proposals has been available. We believe that providing a framework in abstract mathematical terms is important to understand various aspects of XML schema languages and to facilitate their efficient implementations. 
Towards this goal, in this paper, we propose to use formal language theory, especially regular tree grammar theory, as such a framework... Our work relies largely on work in two related areas - regular tree languages and ambiguous regular expressions. Tree languages and regular tree languages have a long history dating back to the late 1950s. One of the main reasons for the study in this period was because of their relationship to derivation trees of context-free grammars. A description of tree languages, and their closure properties can be obtained from the book available online. Our contributions to this field consist of defining a normalized grammar representation for regular tree languages and subclasses of regular tree languages based on the features of most of the XML schema proposals. A field related to regular tree languages is regular hedge languages studied in [Tak75, Mur00a]. A regular hedge language defines a set of hedges, where a hedge is an ordered list of trees. A regular hedge language is essentially similar to a regular tree language because we need to only add a special root symbol to every hedge in a regular hedge language to make it a regular tree language... Ambiguity in regular expressions is described in [BEGO71]. Here, the authors give several results relevant to our paper: (1) every regular language has a corresponding unambiguous regular expression, (2) we can construct an automaton called Glushkov automaton in [BKW98] corresponding to a given regular expression that preserves the ambiguities, and (3) given a non-deterministic finite state automaton, we can obtain a corresponding regular expression with the same ambiguities. Ambiguous regular expressions and model groups are studied in the context of SGML content models in [BKW98]. Here the authors describe the concept of fully marked regular expressions, and 1-ambiguous regular expressions and languages. 
Further, they construct Glushkov automaton that preserves the 1-ambiguity in regular expressions. Our work includes several extensions to the existing work on ambiguity in regular expressions: (1) we extend the notion of a marked regular expression to allow two or more symbols in a marked regular expression to have the same subscript, (2) we define ambiguities in such marked regular expressions and model groups, (3) we extend Glushkov automaton to our marked regular expressions, and (4) we define prefix and suffix regular expressions or model groups, that can be used to introducing element-structure of a document which is the topic of this paper... [Conclusion:] A mathematical framework using formal language theory to compare various XML schema languages is presented. In our framework, a normal form representation for regular tree grammars, ambiguities in general marked regular expressions and model groups, and various subclasses of regular tree languages are defined. Further, a detailed analysis of the closure properties and expressive power of the different subclasses is presented. Finally, results on the complexity of membership checking and type resolution for various XML schema languages are presented. One class of grammars which we did not describe in great detail is TDLL(1) grammars without deterministic constraint. However, we believe that this class of grammars will play an important role because of its several useful features -- (a) membership checking and type assignment can be done in linear time in the event model, and (b) it is strictly more expressive than TDLL(1) grammars with deterministic constraint and single-type constraint grammars, though strictly less expressive than TD(1) grammars. We expect a future XML processing system will behave as follows: A server does XML processing, and the result of XML processing is an XML document and a regular tree grammar, as in XDuce. 
The server will try to evaluate if the regular tree grammar has an equivalent representation as, say, a TDLL(1) grammar. If yes, it will convert the grammar to that form. The server will then send the document and the grammar to the client. Now the client gets a document and a TDLL(1) or regular tree grammar. The client might wish to do more processing, but might be limited by memory considerations. If the grammar the client gets is a regular tree grammar, and the client has memory limitations, it will try to convert the grammar into the 'tightest' possible TDLL(1) grammar, and then do the processing..." Paper also in Postscript format.

• [January 25, 2001] "Object Role Modelling and XML-Schema." By Linda Bird and Andrew Goodchild (Distributed System Technology Center - DSTC), and Terry Halpin (Microsoft Corporation). Presented at ER 2000 (19th International Conference on Conceptual Modeling, October 9-12, 2000). 14 pages. Abstract: "XML is increasingly becoming the preferred method of encoding structured data for exchange over the Internet. XML-Schema, which is an emerging text-based schema definition language, promises to become the most popular method for describing these XML-documents. While text-based languages, such as XML-Schema, offer great advantages for data interchange on the Internet, graphical modelling languages are widely accepted as a more visually effective means of specifying and communicating data requirements for a human audience. With this in mind, this paper investigates the use of Object Role Modelling (ORM), a graphical, conceptual modelling technique, as a means for designing XML-Schemas. The primary benefit of using ORM is that it is much easier to get the model 'correct' by designing it in ORM first, rather than in XML. To facilitate this process we describe an algorithm that enables an XML-Schema file to be automatically generated from an ORM conceptual data model. 
Our approach aims to reduce data redundancy and increase the connectivity of the resulting XML instances... ORM was chosen for designing XML schemas for three main reasons. Firstly, its linguistic basis and role-based notation allows models to be easily validated with domain experts by natural verbalization and sample populations. Secondly, its data modeling support is richer than other popular notations (Entity-Relationship (ER) or Unified Modeling Language (UML)), allowing more business rules to be captured. Thirdly, its attribute-free models and queries are more stable than those of attribute-based approaches..." On 'Object Role Modelling,' see www.orm.net

• [January 25, 2001] "Commentary: Java, .Net Rivalry Still On. [Gartner Viewpoint.]" By David Smith, Daryl Plummer and Mark Driver (Gartner Analysts). In CNET News.com (January 25, 2001). "Microsoft's .Net vision has given the company a direction in which to focus its efforts at driving the future of software development and deployment. .Net, along with XML, has made the confrontation between Microsoft's offerings and Sun Microsystems' Java much less important for overall future success in e-business development. As this has occurred, Microsoft has recognized that continuing as an overt and legal antagonist to Java has had diminishing returns for the past few years. Since Java has gained enough momentum to ensure its place as a primary language and platform for business applications, Microsoft has increasingly been cast as an outsider in the quest for enterprise platform solutions. The settlement paves the way -- legally -- for Microsoft to enhance its Java strategy by adding support for the language to its .Net framework, as Gartner predicts it will by the end of this year. The C# language, which hasn't seen the official light of day yet, will likely suffer as a result of Microsoft's increased focus on Java. 
Uncertainty over C#'s future because of difficult positioning -- coupled with the fact that those who would likely use it are also those likely to use Java -- means that the success of C# will fall victim to the extreme pain Microsoft is feeling from its lack of Java support. Gartner continues to believe that Microsoft will introduce more Java support in the .Net platform by year's end. However, it is important to note that without significant support for Java 2 platforms (specifically Java Server Pages and 'servlets'), Microsoft will use non-Sun (but Java-like) technology to provide a Java-like language for .Net...."

• [January 25, 2001] "IBM Technology Lets B2B Fingers Do the Walking." By Wylie Wong. In CNET News.com (January 25, 2001). "IBM is giving the open-source community a Java technology that will allow businesses to connect to a giant online directory for conducting e-commerce transactions. IBM is donating the software code for a Java application programming interface (API), or a set of instructions, that connects businesses to a giant online 'Yellow Pages' created by Microsoft and IBM. The online directory, called the Universal Description Discovery and Integration (UDDI) Business Registry, will help companies advertise their services and find one another so they can conduct Web transactions. The project is supported by more than 100 companies, including Hewlett-Packard, Intel, Sun Microsystems and Nortel Networks. Big Blue is giving away the Java technology to its own open-source effort called the IBM developerWorks Open Source Zone. Open-source efforts allow anyone to modify and redistribute the software. Bob Sutor, IBM's program director for e-business standards strategy, said companies can use the donated technology, called UDDI for Java, so they can link their services to the online registry. Companies supporting the open-source project include Compaq Computer, Bowstreet, CrossGain, DataChannel. 
Sutor said the group will soon meet to discuss how to improve the product in the future. Sutor added that the Java technology and UDDI project are all part of IBM's efforts to create the technology for Web-based software and services..."

• [January 25, 2001] "ebXML Registry Services." By ebXML Registry Project Team. Working Draft 1/20/2001. Version 0.84. 55 pages. "This document defines the interface to the ebXML Registry Services as well as interaction protocols, message definitions and XML schema. A separate document, ebXML Registry Information Model [RIM], provides information on the types of metadata that is stored in the Registry as well as the relationships among the various metadata classes... The ebXML Registry provides a set of services that enable sharing of information between interested parties for the purpose of enabling business process integration between such parties based on the ebXML specifications. The shared information is maintained as objects in a repository and managed by the ebXML Registry Services defined in this document. The ebXML Registry architecture consists of an ebXML Registry and ebXML Registry clients. Clients communicate with the Registry using the ebXML Messaging Service in the same manner as any two ebXML applications communicating with each other. Future versions of this specification may extend the Registry architecture to support distributed Registries. This specification defines the interaction between a Registry client and the Registry. Although these interaction protocols are specific to the Registry, they are identical in nature to the interactions between two parties conducting B2B message communication using the ebXML Messaging Service as defined by [MS] and [CPA]. As such, these Registry specific interaction protocols are a special case of interactions between two parties using the ebXML Messaging Service. Appendix A supplies the Schemas and DTD Definitions (ebXMLError Message DTD, ebXML Registry DTD). 
Appendix B explains the UML class and sequence diagrams. UML diagrams are used as a way to concisely describe concepts. They are not intended to convey any specific implementation or methodology requirements. The term 'managed object content' is used to refer to actual Registry content (e.g. a DTD, as opposed to metadata about the DTD). The term 'ManagedObject' is used to refer to an object that provides metadata about a content instance (managed object content)..." [cache]

• [January 25, 2001] "ebXML Registry Information Model." By ebXML Registry Project Team. Working Draft 1/19/2001. Version 0.55. 39 pages. "This document specifies the information model for the ebXML Registry. A separate document, ebXML Registry Services Specification [RS], describes how to build Registry Services that provide access to the information content in the ebXML Registry. The Registry provides a stable store where content submitted by a Submitting Organization is persisted. Such content is used to facilitate ebXML-based business to business (B2B) partnerships and transactions. Submitted content may be XML schema and documents, process descriptions, UML models, information about parties and even software components. A set of Registry Services that provide access to Registry content to clients of the Registry is defined in the ebXML Registry Services Specification [RS]. This document does not provide details on these services but may occasionally refer to them. The Registry Information Model provides a blueprint or high-level schema for the ebXML Registry. Its primary value is for implementers of ebXML Registries. It provides these implementers with information on the type of metadata that is stored in the Registry as well as the relationships among metadata classes. The Registry information model: (1) Defines what types of objects are stored in the Registry; (2) Defines how stored objects are organized in the Registry; (3) Is based on ebXML metamodels from various working groups. 
Implementers of the ebXML Registry may use the information model to determine which classes to include in their Registry implementation and what attributes and methods these classes may have. They may also use it to determine what sort of database schema their Registry implementation may need. The Registry Information Model may be implemented within an ebXML Registry in form of a relational database schema, object database schema or some other physical schema. It may also be implemented as interfaces and classes within a Registry implementation..." [cache]

• [January 25, 2001] "ebXML Registry [Proposed XML syntax]." Posted by Len Gallagher. 12-Jan-2001. "As requested by Scott Nieman, attached is a proposal for XML syntax that could be supported in Phase I of the ebXML Registry. It is based very closely on XML syntax specified in the OASIS Registry/Repository technical specification... This is a proposal to ebXML Regrep that it adopt an XML-based syntax to support simple requests to an ebXML Registry. Following the definitions of Focused Query, Totally Ad Hoc Query, Constrained Ad Hoc Query, and Content Based Query given in the proposed requirements document distributed by Farrukh Najmi earlier this week, this proposal is somewhere between a Focused Query and a Constrained Ad Hoc Query. This proposal is much more flexible than a Focused Query because it allows Boolean predicates on the visible attributes of each class specified in the ebXML Registry Information Model (RIM). We are assuming that ebXML Regrep can agree on a simple syntax for Boolean predicates over a pre-determined set of character string, integer, or date attributes! The proposal is much easier to implement than Constrained Ad Hoc Query because it does not require the implementation to parse an unfamiliar Query language. The proposed syntax is more like a script to be followed rather than a language that must be parsed. 
The script depends on the classes and relationships defined in the RIM, so even very simple syntax may result in quite powerful requests to the Registry. This is demonstrated in the Example section below, which shows how each of the example OQL queries in Farrukh's proposed requirements document can be expressed in this simple XML..." [cache]

• [January 25, 2001] "X-Ray - Towards Integrating XML and Relational Database Systems." By Gerti Kappel, Elisabeth Kapsammer, Stefan Rausch-Schott, and Werner Retschitzegger (Institute of Applied Computer Science, Department of Information Systems IFS, University of Linz, Altenbergerstrasse 69, A-4040 Linz, Austria). Email: {gk, ek, srs, wr}@ifs.uni-linz.ac.at. Pages 339-353 (with 29 references) in Proceedings of the 19th International Conference on Conceptual Modeling, (ER'2000). Salt Lake City, USA, October 9-12, 2000. Springer Verlag: LNCS 1920 [ISSN 0302-9743]. Abstract. "Relational databases get more and more employed in order to store the content of a web site. At the same time, XML is fast emerging as the dominant standard at the hypertext level of web site management describing pages and links between them. Thus, the integration of XML with relational database systems to enable the storage, retrieval and update of XML documents is of major importance. This paper presents X-Ray, a generic approach for integrating XML with relational database systems. The key idea is that mappings may be defined between XML DTDs and relational schemata while preserving their autonomy. This is made possible by introducing a meta schema and meta knowledge for resolving data model heterogeneity and schema heterogeneity. Since the mapping knowledge is not hard-coded but rather reified within the meta schema, maintainability and changeability is enhanced. The meta schema provides the basis for X-Ray to automatically compose XML documents out of the relational database when requested and decompose them when they have to be stored... 
The main contribution of this paper is to describe X-Ray, an approach for mapping between XML DTDs and relational schemata. The mapping knowledge is not hard-coded but rather reified in terms of instances of a meta schema thus supporting autonomy of the participating DTDs and relational schemata as well as a generic integration thereof. On the basis of the meta schema, XML documents may be automatically composed out of data stored within an RDBS and vice versa decomposed into relational data without any loss of information. The X-Ray prototype builds on former experience in the area of data model heterogeneity and schema heterogeneity, and is currently used for case studies to investigate the validity of the developed meta schema. Future work comprises short-term tasks such as supporting the whole set of XML concepts like implicit ordering and entity definitions, as well as long-term tasks such as integrating the XML Linking Language (XLink) and the XML Pointer Language (XPointer). The latter will support the mapping of several XML documents and links between them to relational structures and vice versa. Another important aspect will be the investigation for simplifying the mapping between heterogeneous DTDs and relational schemata by, e.g., simplifying the given DTDs before mapping them. In this respect it will be also analyzed, how far the definition of the mapping knowledge may be automated on the basis of the reasonable mapping patterns described above. Leaving optimization issues aside, an automatically generated default mapping should be possible. If both legacy DTDs and legacy relational schemata are involved, however, schema heterogeneity will impede an automatic mapping." See also the associated technical report. [cache]

• [January 24, 2001] TMQL Requirements. Ann Wrightson recently announced the availability of a draft specification for TMQL requirements. The draft is referenced from the 'Topic Maps Query Language' resources on eGroups. 
"TMQL - Topic Maps Query Language is the public discussion group working on the design and development of TMQL. TMQL is a new standardization project of ISO (JTC1 SC34 WG3) and topicmaps.org - the organizations which developed ISO Topic Maps and XTM (XML Topic Maps). The goal of this working group is the development of a 'SQL' for Topic Maps. Standard co-editors are: H. Holger Rath (empolis) and Ann Wrightson (Ontopia)." The requirements document (Version 0.4) is "a working draft of a document in preparation concerning requirements for TMQL, circulated following the first (informal) editing meeting for TMQL Requirements, January 2001. This editing meeting agreed that the scope of TMQL needed to be clarified, and in particular that 'TMQL' had hitherto been used as a catch-all for a range of topic map related operational concepts which now needed to be clarified and worked out in detail. To this end, this document is structured as follows: (1) Introductory/general stuff, including a draft for a formal 'introduction and scope' to the requirements document as it will go to WG3. (2) Topic Maps in Modern Distributed Systems: covers usecases and technical infrastructure considerations; intended to allow the question 'are these the right requirements?' to have a rational answer. (3) Requirements for a Reference Abstract Data Model: as much as the TMQL abstraction needs to standardize, about the nature of topic map data - currently a bunch of ideas, not a proposal. (4) Requirements for a Topic Map Querying Language: see also the accompanying 'Examples' document. Responses and comments are invited, by Feb 02, 2001 if possible..." See the 'tmql-wg' group ('tmql-wg@egroups.com') at eGroups; the list is administered by H. Holger Rath.

• [January 24, 2001] "Migrating from XML DTD to XML-Schema using UML." A Rational Software White Paper. Reference: TP-189 10/00. 11 pages. 
"Today, in developing XML-enabled applications there is a need to ensure that XML documents are constrained to a specific application-defined schema. With the XML 1.0 recommendation from the World Wide Web Consortium (W3C) we are provided the Document Type Definition (DTD) schema language. While the XML 1.0 DTD technology offers a solution today, there is no doubt that the upcoming XML-Schema language will play an important role in the future of XML and will likely eventually replace XML DTDs in the mainstream. Hence the dilemma: should you invest in an XML DTD solution today or wait for the XML-Schema specification to be finalized? For those who have already invested in XML, the issue is more about protecting their investment by successfully migrating from XML DTDs to XML Schemas at an appropriate time. This paper outlines one solution to this issue by demonstrating a set of rules developed to automate the generation of W3C XML-Schema from a Unified Modeling Language (UML) model representing the contents of an XML DTD. This white paper also outlines the modeling of W3C XML 1.0 DTD schemas using the UML and provides an overview of related functionality provided in Rational Rose. The paper assumes familiarity with the UML language, XML, XML DTDs and introduces W3C XML-Schema... This work also builds upon a previous white paper, co-authored with CommerceOne that describes an approach to modeling SOX based XML schemas. That paper did not go so far as to describe a UML Profile as it is considered a tactical effort, SOX as a technology is also expected to be superceded by XML-Schema. Once XML-Schema recommendation is finalized by the W3C, it is the intent of Rational Software to update the UML Profile described here to target XML-Schema directly and obviate the need for such a conversion script." See the PDF version for the full text. 
Note: "W3C XML 1.0 DTD Support was added to the 'Rational Rose 2000e, Enterprise Edition' product; this was the industry's first UML-based tool for developing DTDs and demonstrates Rational Software's commitment to make XML-enabled applications a reality for our customers. The 2000e product was a landmark in that it also included the first UML-based tool for analyzing web artifacts and the industry's first UML-based data modeling solution." [cache]

• [January 24, 2001] "XML-based Components for Federating Multiple Heterogeneous Data Sources." By Georges Gardarin, Fei Sha, and Tuyet-Tram Dang-Ngoc. Pages 506-519 in Conceptual Modeling - ER '99. Proceedings of the 18th International Conference on Conceptual Modeling [Paris, France, November, 15-18, 1999]. Lecture Notes in Computer Science #1728, edited by Jacky Akoka, Mokrane Bouzeghoub, Isabelle Comyn-Wattiau, and Elisabeth Métais. "Several federated database systems have been built in the past using the relational or the object model as federating model. This paper gives an overview of the XMLMedia system, a federated database system mediator using XML as federating model, built in the Esprit Project MIRO-Web. The system is composed of four main components: a wrapper generator using rule-based scripting to produce XML data from various source formats, a mediator querying and integrating relational and XML sources, an XML DBMS extender supporting XML on top of relational DBMSs, and client tools including a Java API and an XML query browser. The results demonstrate the ability of XML with an associated query language (we use XML-QL) to federate various data sources on the Internet or on Intranets.... In the past few years, research in semistructured data has become a hot topic. Different approaches have been proposed for efficiently managing semistructured data from diverse sources. 
Significant experiences are the Lore system developed at Stanford University built from scratch for managing semistructured data, Ozone built on top of the O2 object-oriented DBMS, and STRUDEL developed at AT&T for managing web-based data. In this paper we present our experience in federating multiple data sources using XML and XML-QL, a query language jointly developed by AT&T and INRIA. The model is a slight variation of OEM... The goal of the XMLMedia system is to provide in an efficient way integrated access to multiple data sources on the Internet or on intranets using XML protocols and tools based on Java components. The components can be assembled together or used separately in different application contexts. Possible applications are the constitution of electronic libraries from existing sources, the collect of information for trading systems, the assembling of fragments of data for technology watch, and more generally knowledge management and information diffusion in the enterprise. Used together, the components form a unique solution to support iterative definition and manipulation of semistructured databases derived from existing data sources or loosely structured files in Internet/Intranet environments..." See also the document references. • [January 23, 2001] Versioning Extensions to WebDAV, which is intended to be moved into the standards track as a "Proposed Standard." The last call comment period ends February 1, 2001. Reference: IETF Internet-Draft 'draft-ietf-deltav-versioning-12'. January 20, 2001. 95 pages. By Geoffrey Clemm (Rational Software), Jim Amsden (IBM), Chris Kaler (Microsoft), and Jim Whitehead (U.C.Irvine). The resource properties defined in the versioning extensions to WebDAV are expressed in XML notation. Document abstract: "This document specifies a set of methods, headers, and resource types that define the WebDAV Versioning extensions to the HTTP/1.1 protocol. 
WebDAV Versioning will minimize the complexity of clients that are capable of interoperating with a variety of versioning repository managers, to facilitate widespread deployment of applications capable of utilizing the WebDAV Versioning services. WebDAV Versioning includes: (1) version history management, (2) automatic versioning for versioning-unaware clients, (3) workspace management, (4) baseline management, (5) activity management, (6) variant management, and (7) URL namespace versioning." • [January 23, 2001] "Java API for XML Registries (JAXR)." Proposed by Sun Microsystems, JAXR "provides an API for a set of distributed Registry Services that enables business-to-business integration between business enterprises, using the protocols being defined by ebXML.org, OASIS, ISO 11179. In addition, the JAXR specification assumes that all communication between registry and registry clients will be based on the Java API for XML Messaging (JAXM) specification. The Java API for XML Messaging 1.0 defines how XML messages are exchanged between a registry client and a registry implementation. This specification is key to ensuring interoperable communication between any ebXML registry client and any ebXML Registry implementation. The goal is to leverage the security services of the Java platform, Standard Edition and Java 2 platform, Enterprise Edition where possible..." • [January 23, 2001] "Latest Java Spec Boosts XML Support." By Roberta Holland. In eWEEK (January 21, 2001). "Java developers are welcoming the stronger XML support expected in the next version of the Java 2 Enterprise Edition standard. Sun Microsystems Inc. expects J2EE 1.3 to debut in the third quarter of this year, with increased support for Extensible Markup Language and upgrades to Enterprise JavaBeans, JavaServer Pages and servlets. 
'The point of using Java is it's platform-independent; the point of using XML is it's universal,' said Anthony Siciliani, a developer with Digitas Inc., of Boston, at a Sun event here last week. Siciliani added that support is important because XML and Java need to work together for e-business applications. Using both, 'you can transport your data and communicate between applications,' he said. 'That's the way of the future.' Rich Green, Sun's vice president of Java software, said J2EE is gaining adoption because the standard's portability frees users from a particular vendor or platform. The next version is currently under review. Sybase Inc., Art Technology Group Inc., BEA Systems Inc., Bluestone Software Inc. -- whose acquisition by Hewlett-Packard Co. was finalized last week -- Borland Software Corp., SilverStream Software Inc., iPlanet E-commerce Solutions, Iona Technologies plc. and Hitachi America Ltd. have each released application servers that are compliant with J2EE Version 1.2. The vendors agreed that J2EE compliance is one of the first questions customers ask about. 'You've got to have the [J2EE] brand to even go in and talk to customers,' said Scott McReynolds, senior systems consultant for Sybase, in Emeryville, California..." • [January 23, 2001] "Level 1 DOM Compatibility Table. Methods and Properties." By Peter-Paul Koch. Includes notes on IE5 Win and Mac, Netscape 6, Opera 5 and Konqueror. "These methods and properties only work in browsers that support the W3C Level 1 DOM. This page is based on Explorer 5 and Netscape 6, but many other browsers support bits of the Level 1 DOM. See the browsers page for details... This page notes the new methods and properties that Version 5 browsers should support and the (in)compatibilities of Netscape 6 and Explorer 5, who (surprise!) don't support everything as well as theory says they should. I wrote an introduction to the Level 1 DOM so that you understand what a node is and why you need them. 
On this page, first a large table of methods and properties and then four cases of trying to make generated events work (none cross-browser)..." See also the description of the W3C DOM mailing list (wdf-dom): "This list studies the JavaScript implementation of the W3C DOM in the various browser..." • [January 23, 2001] "Xparse-J User Documentation. [EXPLORING XML.]" By Michael Classen. From WebReference.com. January 2001. ['Need a small XML parser to embed in your Java applet? Have a look at Xparse-J, the smallest Java parser on the planet.'] "Xparse-J aspires to be the smallest Java XML parser on the planet. Xparse-J favors compactness over conformance, so it is mainly useful for being embedded in Java applets for simple XML processing tasks, such as parsing RSS as demonstrated in RSSApplet. Xparse-J is a literal translation of Xparse, an XML parser in 5k of JavaScript, into Java. I added a helper class JSArray to mimic the Javascript array built-in data structure. Xparse-J reads XML documents from a given string and presents the resulting document as a tree structure comprised of com.exploringxml.xml.Node and com.exploringxml.xml.JSArray. Both Node and JSArray have basic functionality for navigating the document tree and accessing its data. All character encodings that are supported by Java can be processed. Xparse-J does not conform to any standard XML API such as SAX or DOM. Most of their functionality is not needed for simple XML processing tasks, and would substantially increase the size of the parser. If you favor conformance over compactness, have a look at Aelfred and TinyXML, among others. The parser does not read DTDs and the error handling is minimal, so presenting it with documents that have been checked before, e.g., on the server side, is recommended..." See source and binaries on the WebReference.com web site. • [January 23, 2001] "Netscape 6, Part IV: DOM Differences and Commonalities with IE5.x." By Yehuda Shiran. From WebReference.com. 
January 2001. ['Netscape 6 and IE5.x are both based on the W3C's Document Object Model (DOM) but there are some differences. Learn how to find common ground to quilt DOM-based cross-browser scripts.'] "This column is the fourth part of our series on Netscape 6. In this column we'll start looking at the differences between Netscape 6's Document Object Model (DOM) and Internet Explorer 5.x's DOM. The DOM gives scripters a standardized way to access various elements of an HTML page, and should greatly simplify writing cross-browser scripts in the long run. Both browsers support a basic set of the W3C's methods and attributes, as well as the innerHTML property (not included in the W3C DOM, but included by popular demand). The differences are found in the second tier of functionality. A good example is the difference in modeling of the root nodes of the DOM tree, document and documentElement objects. Netscape 6 also models the ownership relationship in a document, while Internet Explorer 5.x does not. Another advantage of Netscape 6's DOM is its ability to model fragment nodes, which Internet Explorer does not support. In this column, you will learn: (1) How to distinguish between the two root nodes (2) How to use the lower root node, browser-independently (3) How to use the ownership relationship (4) How to reference objects in Netscape 6 (5) How to create and remove attributes on the fly (6) How to create document fragments (7) How to use the innerHTML property, browser-independently..." • [January 23, 2001] "Microsoft Officials Hail 'Orchestration'." By Paul Krill. In InfoWorld (January 23, 2001). "Microsoft officials, touting their BizTalk Server 2000 business process integration package, hailed 'orchestration' as the next critical phase in integrating software systems. 
Orchestration, enabled by BizTalk, 'lets you rapidly build and define a business process,' said Michael Risse, general manager of Microsoft's .NET enterprise servers group, during an informational session at corporate branch offices in Mountain View, Calif. The concept entails building a mechanism for directing system interaction once systems have been integrated, said Dave Wascha, product manager for BizTalk Server 2000. Orchestration, for example, could enable companies to better deal with merging different systems after a merger or acquisition via XML (extensible markup language) or software conversions, according to Microsoft... Redmond, Wash.-based Microsoft's BizTalk Server 2000 enables enterprise application integration, business-to-business integration, and business process automation, company officials said. XML is a key part of the BizTalk platform, company officials said. This quarter the company will begin rolling out XML-based, industry-specific tools for process automation, Wascha said. Tools for industries such as health care are expected. Although the product currently enables documentation of XML instructions for integrating with electronic data interchange (EDI) systems, a future version will more tightly integrate with EDI, by turning XML into EDI instructions, Wascha said..." See documentation in "Orchestrating Business Processes with Microsoft BizTalk Server 2000": "Microsoft BizTalk Server 2000 provides business-to-business integration services and the advanced BizTalk Orchestration technology to build dynamic business processes that span applications, platforms, and businesses over the Internet. This article explains the concepts that a system developer or architect must understand to integrate BizTalk Orchestration Services and BizTalk Messaging Services in a business-to-business scenario..." • [January 23, 2001] Java API for XML Registries 1.0. 
JAXR "provides an API for a set of distributed Registry Services that enables business-to-business integration between business enterprises, using the protocols being defined by ebXML.org, OASIS, ISO 11179. In addition, the JAXR specification assumes that all communication between registry and registry clients will be based on the Java API for XML Messaging (JAXM) specification. The Java API for XML Messaging 1.0 defines how XML messages are exchanged between a registry client and a registry implementation. This specification is key to ensuring interoperable communication between any ebXML registry client and any ebXML Registry implementation. The goal is to leverage the security services of the Java platform, Standard Edition and Java 2 platform, Enterprise Edition where possible." Details: "This JSR requests the creation of the Java API for XML Registries 1.0 specification (JAXR). JAXR may be viewed as analogous to Java Naming and Directory Interface (JNDI) but designed specifically for internet sharing of XML-related business information. This specification will describe Java API's designed specifically for an open and interoperable set of registry services that enable sharing of information between interested parties. The shared information is maintained as objects in a compliant registry. All access to registry content is exposed via the interfaces defined for the Registry Services. Currently there are numerous open standards for distributed registries. Examples include OASIS, eCo Framework, ebXML. In addition there also exists industry consortium led efforts such as UDDI which may eventually be donated to a standard body. JAXR will provide a uniform and standard API for accessing information from these registries within the Java platform. It is planned that this JSR will leverage work currently under way in the ebXML Registry Working Group, OASIS, ISO, W3C, IETF and potentially other relevant open standards bodies. 
This JSR does not aim to define either business Registry standards, XML messaging standards or XML schemas for particular tasks. These standards belong in standards bodies such as OASIS or IETF. Instead this JSR aims to define standard Java APIs to allow convenient access from Java to emerging open Registry standards, such as the ebXML Registry standard. The JAXR 1.0 specification will be provided initially as an optional package, but may be incorporated into the Java 2 Enterprise Edition platform as soon as this is practical and there is sufficient demand to warrant such integration. JAXR 1.0 will specify API's enabling the Java Community to develop portable eBusiness applications and tools that support emerging industry standards for XML registries on the internet. Among candidate capabilities are: support for industry standard XML registry functionality, support for registration of member organizations and enterprises, support for submission and storing of arbitrary registry content, support for life cycle management of XML and non-XML registry content, support for user-defined associations between registry content, support for user-defined multi-level classification of registry content along multiple user defined facets, support for registry content querying based on defined classification schemes, support for registry content querying based on complex ad hoc queries, support for registry content querying based on keyword based search, support for sharing of web services, support for sharing of business process between partners, support for sharing of schemas between partners, support for sharing of business documents between partners, support for trading partner agreement assembly and negotiation, support for schema assembly, support for heterogeneous distributed registries, support for enabling publish/subscribe XML Messaging between parties..." • [January 22, 2001] "Penguin signs up for Texterity's PDF-to-XML conversion. 
First production-level contract for breakthrough technology." By Mark Walter. In The Seybold Report on Internet Publishing Volume 5, Number 5 (January 2001), page 34. "Is it possible to automatically convert PDF files into valid XML documents? Texterity believes the answer is yes, and book publisher Penguin Putnam is putting it to the test. The two companies recently announced that Penguin Putnam will convert existing author's works from PDF format into the Open E-Book XML format using Texterity's fully automated TextCafe service. Texterity, founded in 1991 by consultant Martin Hensel, started as specialists in SGML and XML DTD development, document analysis, composition and conversion. Over the course of many projects, it has developed considerable expertise in document conversion, a thankless task that inevitably accompanies any SGML/XML project...The Penguin conversion project is the first major production contract for TextCafe. The deal calls for Texterity to convert thousands of backlist Penguin Putnam books, primarily novels, into Open E-Book format. Penguin Putnam is the U.S. affiliate of Pearson's Penguin Group and owner of a variety of children's and adult trade imprints, from Avery to Viking... From a technical standpoint, TextCafe is an achievement: the first automated and commercially available PDF-to-XML conversion service." • [January 22, 2001] "XML: New formula for E-Learning." By Cheryl Gerber. In Federal Computer Week (January 22, 2001). "As the electronic-learning market matures, a growing number of vendors and federal agencies are embracing XML -- Extensible Markup Language -- to streamline the way e-learning software is built and handles information. XML provides a standard way to tag or mark up information, such as student data and course material, so that it is easy to read and exchange. 
Among its many uses, XML helps e-learning vendors develop applications faster, reuse course content more easily, and smooth data exchange between the Web-based courseware, or content, and the learning management system, which is the student administration system. It also allows agencies to make their e-learning systems more useful through tighter integration with other software, such as human resources management systems and e-commerce Web sites. Although surely gaining in popularity, the use of XML is still not universal in the e-learning market. And when it is used, it is not always done so in a consistent manner. For now, the lesson for agencies interested in using XML-enabled e-learning products is to understand clearly the benefits they want to obtain and choose products carefully to make sure they can deliver those benefits. Interior is waiting to buy a learning management system until it finds one that will work with the enterprise resource planning software from SAP Public Sector and Education Inc., which the department is in the midst of deploying. 'We want to use XML to link e-learning to employee records, to link training management to HR management,' said Ross Allan, a computer specialist at DOI University, Washington, D.C. Allan said he is also hoping to use XML to tie the department's bureaus together. Learnframe uses XML in three areas of software development, Gavin said. First, it is used to communicate a mapping of user requests to server-side requests. Second, it defines reporting queries, and third, XML is used as a data input and export mechanism. E-learning content provider SkillSoft, Nashua, N.H., began using XML in the past year. 'When we moved from our first-generation to our second-generation tool development, we changed the design to use XML internally instead of using old database formats,' said Mark Townsend, SkillSoft's vice president of product development. Sun Educational Services also embraced XML last year. 
'We decided in the last six months to move from a proprietary format to XML to store user profile and personal preference information,' said Chuck Young, data architect for Sun Educational Services, Broomfield, Colo., and the Sun technical representative on the Instructional Management Systems standards board. The IMS Global Learning Consortium, run by Eduprise in Boston, originally developed the IMS standard from the academic community. It has been using XML for two years. Although most learning management system providers use the oldest e-learning specification, created by the Aviation Industry Computer-Based Training Committee (AICC), the standard gaining the most traction today is the Shareable Courseware Object Reference Model (SCORM), which is based mostly on XML. To encourage the widespread use of e-learning, the White House Office of Science and Technology Policy developed SCORM three years ago. 'AICC has made it possible to use XML files, although it's a retrofit,' said Bryan Chapman, an e-learning analyst at brandon-hall.com, a research firm in Sunnyvale, Calif. The AICC standard 'doesn't take advantage of how XML is structured to show relationships and subrelationships in groupings. The standard existed long before XML was available. On the other hand, SCORM is centered around how XML is structured.' The SCORM standards group hopes its basis in XML will encourage not only cross-application but also cross-industry data interchange. 'We are trying to adopt in the e-learning world some of the successes we have seen with XML in the e-business world. It's a user-friendly, easy language for e-business transactions between suppliers and their customers,' said Jerry West, technical director at the Advanced Distributed Learning Co-Laboratory, the organization leading SCORM development." • [January 20, 2001] "XML-Enabling CICS Software to Ship." By Tom Sullivan. In InfoWorld (January 19, 2001). 
"A Stillwater, Okla.-based startup, Hostbridge Technology, is poised to release a product on Monday that lets customers XML-enable CICS (Customer Information Control System) transactions. ['CICS is an application server that provides industrial-strength, online transaction management for mission-critical applications.'] According to market research firm IDC, in Framingham, Mass., CICS comprises the most common category of host-based e-business transactions within an IBM enterprise server environment. Furthermore, these still power 80 percent of the world's mission-critical financial and other banking applications. The product, which takes the company's name, Hostbridge, runs on the mainframe under CICS. By virtue of the CICS interface, it can handle all transactions before the 3270 data stream starts, generates the transactions as XML, then spits them out to a middle-tier server that understands XML, such as IBM's WebSphere or BEA Systems' WebLogic. As a product, Hostbridge differentiates itself from other Web-to-host software in that there is no screen-scraping involved. An IDC report written by analyst Sally Cusack stated that this gives customers a simpler means to integrate CICS data into e-business applications. Screenscraping, on the other hand, is limited in its fragility and lack of flexibility... 'One of the ways we'll promote the product in the future is as a BizTalk connector to CICS,' Tuebner said." • [January 20, 2001] "CICS and XML: Weaving a New Web." From IBM ['What is B2B (Business to Business) data processing? Why is XML (eXtensible Markup Language) so important for B2B? And how does it relate to CICS? Geoff Sharman reports.'] "... Since many enterprises run commercial systems based on CICS, a common requirement is to transform an XML message into a request to a CICS system with data from the message being passed in a COMMAREA. 
It's relatively straightforward to implement this by writing a Java servlet which accepts a message from a browser and creates an ECI (External Call Interface) request which is passed to CICS via the CICS Transaction Gateway. XML parser technology to validate the incoming message and build an XML response message is already available from IBM. Larger enterprises may have to deal with hundreds of different message types and will want a much higher degree of automation in processing XML messages, plus the assurance that messages will not be lost in transit. To meet these needs, IBM is developing program generation tools to automate the task of creating transformations and advanced message parsers which are designed to work with CICS. These will enable customers to exploit their investments in CICS-based business systems, exchanging information automatically and integrating it directly with their business systems..." See "What is CICS?" • [January 19, 2001] Introduction to CSS3. Reference: W3C Working Draft, 19-January-2001, edited by Eric A. Meyer. This ("informative") working draft document describes details of modularization in the future Cascading Style Sheets Level 3 (CSS3) specification and describes the CSS test suite. The 'Module Overview' section supplies a list of all the CSS3 modules, together with names of the document editors and deadlines marking the time at which modules should be ready for Working Draft publication. Excerpt: "The members of the CSS&FP Working Group have decided to modularize the CSS specification. This modularization will help to clarify the relationships between the different parts of the specification, and reduce the size of the complete document. It will also allow us to build specific tests on a per module basis and will help implementors in deciding which portions of CSS to support. 
Furthermore, the modular nature of the specification will make it possible for individual modules to be updated as needed, thus allowing for a more flexible and timely evolution of the specification as a whole... As the popularity of CSS grows, so does interest in making additions to the specification. Rather than attempting to shove dozens of updates into a single monolithic specification, it will be much easier and more efficient to be able to update individual pieces of the specification. Modules will enable CSS to be updated in a more timely and precise fashion, thus allowing for a more flexible and timely evolution of the specification as a whole. For resource constrained devices, it may be impractical to support all of CSS. For example, an aural browser may be concerned only with aural styles, whereas a visual browser may care nothing for aural styles. In such cases, a user agent may implement a subset of CSS. Subsets of CSS are limited to combining selected CSS modules, and once a module has been chosen, all of its features must be supported." [cache] • [January 19, 2001] CSS3 Module: Multi-column layout. Revised working draft. Reference: W3C Working Draft, 18-January-2001, edited by Håkon Wium Lie. This specification "builds upon the box model module, and provides a facility whereby stylesheet authors can allow content to flow from one column to another, specify column width, and allow the number of columns to vary, all depending on available space." • [January 19, 2001] Software Development Magazine Volume 9, Number 2 (February 2001), pages 32-34. Part of the author's "Share and Share Alike: Will P2P technology transform the B2B world?" Barnhart gives Groove's GDK product "four out of five stars." • [January 19, 2001] "SAX2: THE SIMPLE API FOR XML." By Eldar A. Musayev. In Dr. Dobb's Journal Issue 321 (February 2001), pages 130-133. "SAX, the 'Simple API for XML,' is an efficient and high-performance alternative to the Document Object Model. 
In this article, I'll describe SAX, then show you how you can use it in Visual Basic applications via the Microsoft (MSXML) parser." Additional resources include sax2.txt (listings) and sax2.zip (source code). • [January 19, 2001] "Riding the XML Tiger." "[Review of XML Spy.]" By Chris Minnick. In Software Development Magazine Volume 9, Number 2 (February 2001), pages 23-27. Feature Review. "XML Spy does an admirable job of keeping up with the ever-changing standards and vocabularies..." [See other XML articles by the author on the Minnick Web Services web site.] • [January 19, 2001] "Transora And GlobalNetXchange Form MegaHub." By Steve Konicki. From TechWeb News. January 16, 2001. "Wal-Mart had better watch out. Competing retailers may soon match the supply-chain leader in its ability to directly connect with consumer-packaged goods companies. GlobalNetXchange, the retail marketplace founded by Sears, Carrefour, and Oracle, and Transora, the consumer packaged-goods exchange formed by 54 companies, including Coca-Cola and General Mills, are teaming up to let retailers send purchase orders directly to manufacturers. The joint interoperability venture, dubbed MegaHub, will handle standards-based XML data exchange and integration to users' back-office systems. 'Our industry has several major exchanges, and there was a fair amount of anxiety among manufacturers and retailers about whether exchanges were going to spend time warring,' Transora CEO Judy Sprieser says. 'Transora and GlobalNetXchange decided to build the MegaHub to make our services complementary.' CPGMarket.com, a European consumer packaged-goods exchange founded by SAPMarkets, Nestle, and Pernod Ricard Group, has already signed up to be MegaHub's first customer. Transora, which went live late last year, said recently that it will integrate with Novopoint.com, an exchange for raw materials suppliers and manufacturers, and Foodtrader.com, an exchange for grocery store product buyers and sellers. 
It also has revealed plans to work with the WorldWide Retail Exchange, whose members include 31 leading retailers. A Transora spokesman says MegaHub will likely be used to integrate these exchanges with Transora..." See the recent announcement, "Globalnetxchange And Transora Form Joint Enterprise to Enable Inter-Exchange Communication. New Entity Will Provide Connectivity and Translation Services To Major Exchanges." - "Transora, the global eMarketplace for the consumer packaged goods industry, and GlobalNetXchange (GNX), the global business-to-business online marketplace for the retail industry, announced today the joint formation of a 'megahub' that will make it possible for companies to collaborate with multiple trading partners via a single exchange connection. Beginning immediately, Transora and GNX will invite several major exchanges in diverse industries to become equity participants in the new entity... The megahub will provide exchanges with low cost EDI and XML transport and translation services over the Internet. The megahub supports exchange-to-exchange interoperability, which facilitates cross-value chain applications such as joint promotions management, Collaborative Planning, Forecasting and Replenishment (CPFR) and other services between member companies. Exchanges that join the megahub can bundle connectivity and translation capabilities into their service offerings for resale to customers." • [January 19, 2001] "Microsoft's VSA To Ease Web App Development." By Barbara Darrow. From TechWeb News. January 16, 2001. "Microsoft hopes to bring the point-and-click programming model it pioneered for desktop applications with Visual Basic for Applications (VBA) to back-end applications available over the Web. On Tuesday, the company will unveil Visual Studio for Applications (VSA) in San Francisco and will release it to beta. 
ISVs, integrators, and corporate IT shops will be able to use the Integrated Development Environment to easily write a VSA runtime that will -- with as little as three lines of code -- let users from their browsers access business logic running on servers. Then, with the developer's workbench, they can customize their own server-based application code. Microsoft is pointing VSA as a key part of the company's proposed .Net Framework. Critics will no doubt paint VSA as Windows-centric. Indeed, the runtime and VB.Net server applications are reliant on Windows, but will be accessible from any browser and can interoperate with outside applications via any XML language, Microsoft executives said. The user's browser sends a request to server-based business logic, which in turn loads VSA custom code. That code is compiled using the same compiler as Visual Basic.Net and is cached so the server need not bring down for updates, they said." See the recent announcement from Microsoft: "Microsoft Announces Visual Studio for Applications. First Technology that Enables Easy Customization for Web Applications." - "... VSA provides corporations and ISVs that want to take full advantage of the business opportunities presented by the Web with the ability to tailor Web applications with specialized business logic. ISVs and corporations can easily integrate VSA customization technology into their homegrown and packaged distributed applications. This enables customers to use the familiar Visual Basic event driven model to customize core business logic running on the server without intimate knowledge of the application. By integrating VSA, ISVs and corporations can enable end users to customize those applications. VSA, through its built-in support for Visual Studio.NET and the Microsoft .NET Framework, enables customers and ISVs to harness the power of the .NET platform. 
This powerful combination provides the scalability and performance necessary for customized business logic and simplifies the process of developing, deploying and maintaining enterprise-level applications." • [January 19, 2001] "JavaTalk: Parsing XML using the Java API for XML Processing (JAXP)." By John Wetherill (Staff engineer, Sun Microsystems). In SunServer Magazine Volume 15, Number 1 (January 2001), pages 5-6. ['If you're an enterprise Java developer, chances are good that you will at one point be required to incorporate XML in a Java program. Naturally, there are many Java XML APIs available today -- one of these is JAXP, the Java API for XML Processing.'] "XML is platform-neutral, hierarchical, human-readable, easily parsable by machine, and thus is the ideal format for data exchange in e-commerce and B2B applications. If you're an enterprise Java developer, chances are good that you will at one point be required to incorporate XML in a Java program. Naturally, there are many Java XML APIs available today, with more coming. Let's dive right in and take a look at one of these: JAXP, the Java API for XML Processing, recently developed under the Java Community Process. JAXP is an abstract layer that sits on top of two existing XML APIs: SAX (Simple API for XML) and DOM (Document Object Model). Here we will focus only on using JAXP with SAX. A followup article will look at the use of JAXP with DOM, as well as other APIs to process XML from Java... SAX, developed by participants in the XML-DEV mailing list, uses a callback approach to serially parse an XML document. A parser engine interprets the XML content and makes callbacks on methods in a document handler. One drawback of the original SAX specification is that the vendor-provided parser engine must be obtained in a vendor-specific manner, and programs written to that engine are not portable across arbitrary parsers. This is where JAXP comes in. 
JAXP is an abstract layer that sits on top of SAX, and uses the factory-design pattern to provide vendor-neutral access to the SAX parser. SAX defines two components which handle the interpretation of an XML document: a parser, which is a vendor supplied engine that does the grunt-work of reading and interpreting the XML, and a DocumentHandler, which receives callbacks from the parser during the parsing process. In general, writing a SAX-based Java program to parse XML content involves a DocumentHandler class containing handler methods that respond to the callbacks generated by the parser, obtaining the parser, and telling the parser which DocumentHandler to use during parsing... JAXP supports the input of XML from InputStreams, Files, URIs and SAX input sources, providing a basis for accessing XML from a huge variety of application contexts. For more information, and to obtain Sun's JAXP reference implementation, visit: java.sun.com/xml." • [January 19, 2001] "Standardized Communications To Mobile Devices: The First Step In Turning Hype To Reality." By Kristopher Tyra (HiddenMind Technology). In SunServer Magazine Volume 15, Number 1 (January 2001), pages 10-11, 19. ['Wireless solutions must offer the enterprise increased productivity for its mobile professionals by maximizing the device capabilities without changing the structure of the back-office logic.'] "Wireless application protocol (WAP) is the early, simple leader for developing wireless data applications. Many players in the wireless space are currently supporting WAP and its server/browser concept built around Handheld Device Markup Language (HDML) and/or Wireless Markup Language (WML). While these technologies are providing initial means for accessing the Internet from wireless devices, they have limitations that need to be addressed before wireless interactive services can move to the next level. 
Markup languages and browsers require a constant connection, thus devouring limited bandwidth and rendering the device useless when out of range. Also, WAP-enabled devices have a limited user interface and lack compatibility between microbrowsers and gateways from different vendors... A distributed computing model that maximizes both RPC and XML for mobile devices is ideal to satisfy this type of communications. This framework would allow a mobile device to make a function call into the backend system and in turn, the back office can make function calls into the device. This would allow the development and deployment of wireless workflow applications that offer PC-type functionality to mobile professionals via a handheld device... [see figure 'Wireless architecture connecting a mobile device to an enterprise back office system using XML in a wireless RPC model']... Okay, so what is necessary for this to take place? What is needed for this to become a reality? We know that XML documents can be transmitted across the Internet or intranet to and from back-office applications. To perform this function to a wireless device, we must be able to easily translate an XML document to and from a forms-based application and also establish and maintain network-based connections to a backend system. XML is a natural choice for this function because it provides a common method for identifying data for business-to-business transactions and is expected to become the dominant format for electronic data interchange. But to tie both the wireless device and the backend server together in the same network, we must package XML with an RPC solution that is lightweight enough for devices, yet substantial enough for servers. A Java profile sits on the device and makes RPC calls in and out, allowing mobile devices to connect and disconnect from the back office at will. 
A wireless server will be needed to ensure all communications and transactions between devices are completed in an orderly manner. A Jini-based architecture can act seamlessly as a directory of services and devices -- managing all programs and applications. When coupled with an information router, Jini enables users to access and share services and resources while allowing the network location of the user to change. This ability to manage multiple services and sessions in an RPC model allows advanced two-way communication of data. The RPC engine must support asynchronous operations and the communication between servers and devices must be independent. This functionality will allow the end user to toggle between multiple sessions on a mobile device..." • [January 19, 2001] "XML - Not Just Another Hurdle. [Butler Group 'Contra Column' - January 2001.]" By Ken Gibbs. In SunServer Magazine Volume 15, Number 1 (January 2001). "Two years ago, it would have been difficult to understand the desperate need for a common information communication text-driven language in place of Electronic Data Interchange (EDI) -- the need, that is, for a language that is capable of carrying both data and messages, has the flexibility to be styled for any market, easy to use, and is instantly readable. EDI has been used successfully for more than a decade between partners, and indeed forms the basis for many trade communication networks. Its only complaints were the lack of human readability and that 'it takes an expert to use it.' Tools were created to translate the EDI information from alliance companies into usable data for the application. Today we are seeing the arrival of market-specific eXtensible Markup Language (XML) protocols for inter-company as well as intra-company enterprise application integration. These protocols, known as schemas, are taking over from and extending the EDI application arenas. 
However, it does not mean the death of EDI -- EDI will be with us for years to come -- but the so-called 'superior' XML language will supersede EDI. It is said that 'XML is much more flexible and easier to implement than EDI.' IT Week, in November 2000, carried an article about Distributed Datanet, a brand new infrastructure to support e-business in the plant market for growers and cultivators. It is EDI-based and not XML-based. When I first used XML, it was very simple to understand in the form of data elements in between two tags (data descriptors). This made it easy for a person to clearly read and interpret the information. With the introduction of further facets, for example attributes, an element can now consist of other 'related' elements as 'a nesting' and be given a particular format. This and other more powerful syntax, causes the XML to look very syntactical and, to the non-programming eye, confusing. To program interfaces to XML, the industry has/is developing tools to cater for this more complex, more flexible, more powerful syntax. These include tools for XML modeling, XML caching, XML style sheets, XML transformation engines, XML syntax checkers, XML editors, and XML trace tools. As with projects of old, using the wrong tool can crucify a project. One team's successful toolset can mean disaster for another. Tools should be selected with care to obtain the tool whose power matches that which is required for the development. It's important to match the roles of the players and their task with their relevant necessary tools, just as we did 10 years ago for a development project using 4GL toolsets. We are told that when looking at our XML application and business partner interoperation strategies, it is important to understand what we will be doing with XML, and then identify the tasks that go into achieving those goals. 
XML, hailed as the simple approach to desperate integration, has created the need for yet another new skill; not a five-minute 'yes got it' skill, but a more involved 'how else can I do the same thing' type skill. Schemas are themselves presented as yet more XML describing the schema elements and attributes. We are in danger of replacing the three letters of EDI with XML, but the rest stays the same. The probable message is that we do not need EDI skills anymore if using XML, but we will need to retrain to use XML. The business analyst is still responsible for modeling the business process, no matter which language is being used. The model reflects the flow of XML documents across systems or partners based on a complete end-to-end integration. Of course, using XML does not mean using schemas, but the information format still requires them to be issued as a standard to the application. XML applications still need to be carefully planned and agreed upon." • [January 19, 2001] "XML-RPC: It Works Both Ways." By Dave Warner (Federal Data Corporation). From the O'Reilly Network Python DevCenter (January 17, 2001). "In my previous XML-RPC article, I showed how to access the Meerkat XML-RPC server (written in PHP) from Python. This time around, we will go behind the scenes to view the structure of an XML-RPC method call and response. Then we will develop an XML-RPC server using Fredrik Lundh's xmlrpclib. Lastly, we'll touch on some real-world concerns about XML-RPC, SOAP, and look at the W3C's working group that is driving development of XML-based distributed application standardization. A peek under the hood: So how does xmlrpclib work? Its approach to the construction of method calls and responses is quite simple. A method called dumps (dump string) turns native Python objects into XML. The process is called 'marshalling'. The reverse process, called 'unmarshalling', is performed by loads (load string). 
Although transparent to the casual user, we can access these methods to peek under the hood. The dumps method takes three arguments: a tuple containing the parameters, the method name, and a method response. The last two are optional. Following good object-oriented design, xmlrpclib's dumps insists that the parameters are passed in the form of a tuple. This prevents modification of the parameters past this point. Let's examine the structure of a method call used with Meerkat. At this point, no connection is needed. We are simply examining what will be sent, not actually communicating with a server... For ease of use, xmlrpclib wins hands down for communicating with Meerkat. But does XML-RPC have a future? Yes! Although the lure of wide-spread corporate backing is pushing many toward the Simple Object Access Protocol (SOAP), XML-RPC is in use today in some surprising places. Red Hat uses xmlrpclib.py in its new subscription service, and also internally. Their implementation of xmlrpclib ships with Red Hat 7 and includes enhancements by Cristian Gafton that include https support. Additionally, Userland's Frontier is fully XML-RPC aware, as is the Apache Java Project's Turbine framework and, by extension, Jetspeed. So, where does that leave SOAP? Recently, the W3C established a working group on XML protocols. They have published a draft requirements document. Although they are using SOAP as their starting point, it is unlikely that SOAP will remain unchanged by this standardization process. I don't expect SOAP to go away any time soon, but I do expect it to be in flux for the next year or so. Hopefully, the example above has shown that XML-RPC is not just a step on the way to SOAP and beyond, but a useful, simple tool in its own right. When XML-RPC was first introduced, Jon Udell asked the question: 'Does distributed computing have to be any harder than this?' Much of the time, no." See also "XML-RPC." • [January 19, 2001] "Groove Signs 100 Partners In Three Months." 
By Ed Scannell. In InfoWorld (January 18, 2001). "Groove Networks, a peer-to-peer startup founded by Lotus Notes creator Ray Ozzie, appears to be off to a fast start. The company announced Wednesday it has signed up 100 partners in its first three months of operation. The range of partners signing on have all agreed to develop, deploy, and support a variety of peer-computing solutions designed for business use. These solutions, generally, are intended to reduce time-to-decision and time-to-problem resolutions, according to Groove officials. At the time of its introduction last October, Groove officials said the platform was the first in the industry designed to accommodate peer-computing applications specifically for business use. The platform allows users to communicate over the Internet without a central server present... Groove's partners -- which are mostly small companies -- include BAE Systems, eoffice, Full Moon Interactive, fusionOne, Perot Systems, STM Wireless, and Zero Gravity Technologies..." See "A New Groove," by Steve Gillmor and Ray Ozzie. • [January 19, 2001] "XML Media Types." Network Working Group, Request for Comments: 3023. January 2001. By MURATA Makoto, Simon St.Laurent, and Dan Kohn. Abstract: "This document standardizes five new media types: (1) text/xml, (2) application/xml, (3) text/xml-external-parsed-entity, (4) application/xml-external-parsed-entity, and (5) application/xml-dtd. [They are designed] for use in exchanging network entities that are related to the Extensible Markup Language (XML). This document also standardizes a convention (using the suffix +xml) for naming media types outside of these five types when those media types represent XML MIME (Multipurpose Internet Mail Extensions) entities. XML MIME entities are currently exchanged via the HyperText Transfer Protocol on the World Wide Web, are an integral part of the WebDAV protocol for remote web authoring, and are expected to have utility in many domains. 
Major differences from RFC 2376 are: (1) the addition of text/xml-external-parsed-entity, application/xml-external-parsed-entity, and application/xml-dtd, (2) the +xml suffix convention (which also updates the RFC 2048 registration process), and (3) the discussion of 'utf-16le' and 'utf-16be'." Discussion: "XML entities are currently exchanged on the World Wide Web, and XML is also used for property values and parameter marshalling by the WebDAV [RFC2518] protocol for remote web authoring. Thus, there is a need for a media type to properly label the exchange of XML network entities. Although XML is a subset of the Standard Generalized Markup Language (SGML) ISO 8879 [SGML], which has been assigned the media types text/sgml and application/sgml, there are several reasons why use of text/sgml or application/sgml to label XML is inappropriate. First, there exist many applications that can process XML, but that cannot process SGML, due to SGML's larger feature set. Second, SGML applications cannot always process XML entities, because XML uses features of recent technical corrigenda to SGML. Third, the definition of text/sgml and application/sgml in [RFC1874] includes parameters for SGML bit combination transformation format (SGML-bctf), and SGML boot attribute (SGML-boot). Since XML does not use these parameters, it would be ambiguous if such parameters were given for an XML MIME entity. For these reasons, the best approach for labeling XML network entities is to provide new media types for XML. Since XML is an integral part of the WebDAV Distributed Authoring Protocol, and since World Wide Web Consortium Recommendations have conventionally been assigned IETF tree media types, and since similar media types (HTML, SGML) have been assigned IETF tree media types, the XML media types also belong in the IETF media types tree. Similarly, XML will be used as a foundation for other media types, including types in every branch of the IETF media types tree. 
To facilitate the processing of such types, media types based on XML, but that are not identified using text/xml or application/xml, should be named using a suffix of +xml as described in Section 7. This will allow XML-based tools -- browsers, editors, search engines, and other processors -- to work with all XML-based media types..." [cache] • [January 19, 2001] "The 'application/xhtml+xml' Media Type." IETF 'draft-baker-xhtml-media-reg-00.txt'. By Mark A. Baker (Sun Microsystems Inc.). "This document defines the application/xhtml+xml MIME media type for XHTML based markup languages; it is not intended to obsolete any previous IETF documents, in particular RFC 2854 which registers text/html. This document was prepared by members of the W3C HTML working group based on the structure, and some of the content, of RFC 2854, the registration of text/html. Please send comments to www-html@w3.org, a public mailing list with archives at http://lists.w3.org/Archives/Public/www-html/... This document only registers a new MIME media type, application/xhtml+xml. It does not define anything more than is required to perform this registration. The HTML WG expects to publish further documentation on this subject, including but not limited to, information about rules for which documents should and should not be described with this new media type, and further information about recognizing XHTML documents..." [cache] • [January 19, 2001] "XML Schema Extension Mechanisms." By David E. Cleary. December 2000. "This presentation documents the two methods for extending XML Schema with application specific information. It includes examples of real world uses of these extensions today." Examples: "(1) XMLSchema-hasFacetAndProperty.xml - This is the schema for the facet and property appinfo extension. The interesting thing to note is that the human readable documentation in the schema specifies how an application should use this extension. 
For instance, it specifies how you determine what facets and properties user defined datatypes support via walking the basetype chain. (2) enum2.xml - This is an annotated version of TextNumbers that includes localized enumerations. Applications that support this extension can use these localized enumerations for their UI instead of relying on the English version. (3) appinfo.xml - This is the XML Schema definition for the appinfo element taken directly out of the Schema for Schemas. It shows it allows an attribute called source that is of type uriReference. It also supports mixed content (i.e., both character data as well as child elements) and uses the "any" wildcard to specify it can have any content. Make note of the processContents attribute that is set to lax, which sets validation rules. (4) string.xml - A fragment of the XML Schema datatypes schema. This schema uses an appinfo extension that specifies what property and facets a datatype supports. This extension is also used in generating the datatypes specification." [cache] • [January 19, 2001] "4th Generation XML Application Development." By David E. Cleary. December 2000. "This presentation discusses an application development methodology that relies on molding XML instance data to your application as opposed to writing your application based on the XML vocabulary used. It details how Progress Software uses schema annotation to map XML data to business logic and includes an example of using this methodology to map XML instance data to existing Java classes... The SchemaMapper application requires the Xerces-J Parser from Apache to be in your classpath. If you are using Microsoft's JVM, you can do this by adding an entry for xerces.jar in the registry. To use the SchemaMapper, you just give it a qualified filename of an XML instance document. The instance document must conform to a schema located in the Schemas directory." See the announcement. 
[cache] • [January 19, 2001] "Six Steps to a Successful XML Integration Project." By Tony Stewart (GoAmerica Communications). In Dr. Dobb's Journal Issue 321 (February 2001). ['Is XML the key to your integration project? XML is versatile, but often it's also hyped as a magic bullet. Tony Stewart offers six tips to follow when adding XML to your arsenal.'] "Without question, the emerging data-description standard known as XML has received considerable fanfare as the technology of choice for implementing Web-based business-to-business transactions, and as the glue to connect internal systems that previously operated as independent islands. Yet, based on my own experiences with XML, as well as discussions I participated in and a paper I presented at a recent international workshop on 'Making Best Use of XML in the Enterprise,' it is obvious that any XML integration project is littered with potential land mines for the inexperienced and unwitting. Here then is a six-step process that is intended to improve the odds for a successful XML integration project: 1. Focus on addressing internal business issues. 2. Avoid the HTML trap. 3. Recognize that system integration is a major undertaking. 4. Decide which type of XML application(s) you are building. 5. Use prototypes and pilot projects to reduce risk and build support. 6. Don't do it alone or restrict yourself to your own organization... XML offers great potential for integrating independent systems. Yet, adopters need to walk in with their eyes wide open. Rather than being led astray by the hype, focus on the core issues, the same key success factors that we've known about for years. Plan carefully, address your real business issues, build internal support, mitigate the technical risks, leverage the work that's been done by others, and find appropriate partners. Follow these steps, and you'll be well on your way to a successful XML integration project." 
• [January 19, 2001] "The Missing Link: Will ebXML Bring True Global B2B?" By Natalie Walker Whitlock (Casaflora Communications). From IBM developerWorks XML library. January 2001. "The United Nations and the OASIS group have joined together as unlikely allies in the ongoing turf war over B2B standards. Offering a new set of XML-based specifications, ebXML makes the bold promise to enable a single global electronic marketplace. XML, XTML, SOAP, xCBL, SGML, TCP/IP -- the Internet has become a veritable alphabet soup of languages and standards. Do we really need yet another? According to OASIS and the United Nations, the answer is 'yes.' Sponsored by the Organization for the Advancement of Structured Information Standards (OASIS) in partnership with the United Nations itself (under the Center for Trade Facilitation and Electronic Business, or CEFACT), ebXML hopes to be the definitive standard that will unite all existing and emerging XML standards. 'I'm excited about the ebXML effort because it's our only chance in this decade to establish an international e-commerce standard,' says ebXML Transport/Routing and Packaging Chairman and Executive Steering Committee member Rik Drummond. Indeed, enthusiasm for the standard is running at an all-time high after OASIS and UN/CEFACT presented a proof-of-concept demo of the ebXML technology at a news conference in San Francisco on December 12th, 2000. A similar demonstration was held the previous month in Tokyo. According to ebXML leaders, the final core technical infrastructure is near completion and is expected to be ratified next March -- two months ahead of schedule... ebXML was designed to address one of the key failings of e-commerce, interoperability. The wide adoption of XML has helped the cause, but with it comes another hodgepodge of standards. For example, there are currently some 500 XML schema "standards" -- and counting. 
ebXML intends to rein in the melange, and provide a single lingua franca for business-to-business communications, thus taking worldwide e-commerce to the next level... The potential benefits to electronic business are certainly promising. For example, the use of ebXML would standardize trading partnerships by classifying the technical boundaries of trading partner profiles and agreements using TPA. It would also standardize contracts and enable extensive supply chain information to be included in a standard electronic business document. ebXML will not provide any sort of reference implementation of its specifications, however. That will be left up to the companies involved to adopt the standards and provide their own implementations. Says Drummond, 'Our focus is to construct a general-purpose message, with a header that supports multiple payloads, while allowing sufficient digital signatures within and among related messages'." Also available in PDF format. [cache] • [January 19, 2001] "Sun to lose key player in Web software push." By Mary Jo Foley. In CNET News.com (January 17, 2001). "A key player behind Sun Microsystems' software strategy is poised to resign on the eve of the company's Feb. 5 Web-services strategy launch. Marco Boerries, Sun's vice president of Webtop and application software, has submitted his resignation, effective Jan. 26, according to several sources close to the company... Sun is gearing up to unveil its battle plan for combating archrival Microsoft in the Web services arena. The Sun-America Online joint venture iPlanet is expected to launch new Webtop remote-access software, one of the elements upon which Sun is building its hosting infrastructure. This infrastructure will form Sun's backbone over which service providers will deliver Web services, according to sources... On Feb. 
5, Sun is expected to position its Java-based iPlanet suite of software as the technology upon which developers can build 'Smart Services' -- one of the brand names for Web services Sun is kicking around--said sources familiar with the company's plans. Sun is expected to contrast its 'open' approach to Microsoft's Windows-centric .Net software-as-a-service initiative, which Microsoft first detailed last June. As evidence of its openness, Sun will talk about Web services as interchangeable, almost plug-and-play-like applications comprising multiple components from multiple companies... One analyst noted that Sun faces a formidable challenge in presenting a coherent Web services plan to customers and developers. Sun Research has developed a Web services toolkit, code-named Brazil, that could fill one such hole -- if and when Sun delivers a commercial version of it. Sun executives have said Brazil is one element of the company's long-term Web services strategy. 'Sun has the pieces to put together a compelling Web services strategy,' said Uttam Narsu, Giga Information Group senior industry analyst. 'But right now, their products are like a jigsaw puzzle whose pieces don't quite fit. There are lots of gaps in their lineup, and it's too heavily Java-based.' Narsu noted that Sun hasn't been vocal enough in backing evolving Web services standards, such as the Simple Object Access Protocol (SOAP) transport mechanism. And even though a Sun employee is credited as being the father of XML (Extensible Markup Language), Sun has become a 'laggard' in backing the data-sharing protocol that is integral to the Web services model, Narsu said. To date, 'Sun has adopted a spritzer strategy for XML,' Narsu said. 'They spray a little here and a little there. They've ended up with a bit of a credibility gap on this front'." • [January 19, 2001] "Introducing Visual Studio for Applications." By Andrew Clinick. MSDN Online. 
January 16, 2001 "Those of you who follow Windows Script have probably been wondering what role script plays in the .NET world, and perhaps even what the script team has been doing for the last year or so. After all, the script team has released major upgrades to the script engines nearly every six to 12 months, but it's been a long time since Version 5.0 shipped, which was the last major upgrade to Windows Script technologies. Today is a very exciting day for the Windows Script team, and more importantly the bigger team that we are now part of, with the announcement of Visual Studio for Applications (VSA). VSA represents two years of work from the existing Windows Script team and the Visual Basic for Applications (VBA) group to create a product that takes the best of Windows Script and VBA to create an application-customization technology designed specifically to take advantage of .NET... Prior to VSA, Microsoft had two technologies for integrating a language into applications that would allow the application to be customized after the application had shipped. This was achieved by hosting a language engine, be it a Windows Script engine (such as Visual Basic Scripting Edition or JScript) or VBA. Once the engine had been loaded, the application would load the appropriate customization code (a.k.a. script) and provide an object model for the script writer to use. While the concept was identical, the implementation for a Windows Script engine or VBA was very different. So if you integrated Windows Script, there was no easy way to upgrade to VBA or vice versa. You had to integrate VBA using completely different integration interfaces, so you'd be starting from scratch. In addition, you'd probably have to integrate both so that existing script code would still work alongside the new VBA code. As we looked into the new features people were asking for -- types and better error handling for example -- it became clear that VBScript would eventually evolve into real VB. 
Rather than try to implement two versions of the same language, we sat down with the Visual Basic team and worked out how to converge the languages. This means that the new Visual Basic Script engine is true VB. It has exactly the same compiler as Visual Basic, so any code you can use in Visual Basic you will be able to use in the script engine. In addition to language features that use the Visual Basic compiler, that means that script code is now fully compiled, as is any Visual Basic.NET code..." • [January 18, 2001] Proposed Draft Unicode Technical Report #27: Unicode 3.1. Reference: Version 1.0, http://www.unicode.org/unicode/reports/tr27/tr27-1, 2000-01-17; edited by Mark Davis, Michael Everson, Asmus Freytag, Lisa Moore, et al. Document summary: "This document defines Version 3.1 of the Unicode Standard. It overrides certain features of Unicode 3.0.1, and adds a large number of coded characters. This draft is for review with the intention of it becoming a Unicode Standard Annex." The specification has been approved by the Unicode Technical Committee for public review; it is a 'Proposed Draft', to be taken as "a work in progress." Details: "The primary feature of Unicode 3.1 is the addition of 44,946 new encoded characters. These characters cover several historic scripts, several sets of symbols, and a very large collection of additional CJK ideographs. For the first time, characters are encoded beyond the original 16-bit codespace or Basic Multilingual Plane (BMP or Plane 0). These new characters, encoded at code positions of U+10000 or higher, are synchronized with the forthcoming standard ISO/IEC 10646-2. Unicode 3.1 and 10646-2 define three new supplementary planes. Unicode 3.1 also features corrected contributory data files, to bring the data files up to date against the much expanded repertoire of characters. All errata and corrigenda to Unicode 3.0 and Unicode 3.0.1 are included in this specification. 
Major corrigenda and other changes having a bearing on conformance to the standard are listed in Article 3, Conformance. Other minor errata are listed in Article 5, Errata. Most notable among the corrigenda to the standard is a tightening of the definition of UTF-8, to eliminate a possible security issue with non-shortest-form UTF-8." The TR provides charts which contain the characters added in Unicode 3.1. They are shown together with the characters that were part of Unicode 3.0. New characters are shown on a yellow background in these code charts. They include: (1) Greek and Coptic; (2) Old Italic; (3) Gothic; (4) Deseret; (5) Byzantine Musical Symbols; (6) Musical Symbols; (7) Mathematical Alphanumeric Symbols; (8) CJK Unified Ideographs Extension B; (9) CJK Compatibility Ideographs Supplement; (10) Tag Characters. Note Section '13.7 Tag Characters', which provides clarification on the restricted use of 'Tag Characters U+E0000-U+E007F: "The characters in this block provide a mechanism for language tagging in Unicode plain text. The characters in this block are reserved for use with special protocols. They are not to be used in the absence of such protocols, or with protocols that provide alternate means for language tagging, such as markup. The requirement for language information embedded in plain text data is often overstated...This block encodes a set of 95 special-use tag characters to enable the spelling out of ASCII-based string tags using characters which can be strictly separated from ordinary text content characters in Unicode. These tag characters can be embedded by protocols into plain text. They can be identified and/or ignored by implementations with trivial algorithms because there is no overloading of usage for these tag characters--they can only express tag values and never textual content itself. In addition to these 95 characters, one language tag identification character and one cancel tag character are also encoded. 
The language tag identification character identifies a tag string as a language tag; the language tag itself makes use of RFC 1766 language tag strings spelled out using the tag characters from this block...Because of the extra implementation burden, language tags should be avoided in plain text unless language information is required and it is known that the receivers of the text will properly recognize and maintain the tags. However, where language tags must be used, implementers should consider the following implementation issues involved in supporting language information with tags and decide how to handle tags where they are not fully supported. This discussion applies to any mechanism for providing language tags in a plain text environment...Language tags should also be avoided wherever higher-level protocols, such as a rich-text format, HTML or MIME, provide language attributes. This practice prevents cases where the higher-level protocol and the language tags disagree." See: (1) Unicode in XML and other Markup Languages [Unicode Technical Report #20 == W3C Note 15-December-2000], and (2) "XML and Unicode." • [January 18, 2001] "Tutorial: Using XSL Formatting Objects." By J. David Eisenberg. From XML.com. January 17, 2001. ['The W3C's XSL Formatting Objects technology provides an XML language for specifying the layout of documents. In the first article of our XSL FO tutorial series we show you how to set up your pages.'] "The World Wide Web Consortium's specification for Extensible Stylesheet Language (XSL) comes in two parts: (1) XSLT, a language for transforming XML documents, and (2) XSL Formatting Objects (XSL FO), an XML vocabulary for specifying formatting semantics... . XSL Formatting Objects is itself an XML-based markup language that lets you specify in great detail the pagination, layout, and styling information that will be applied to your content. The XSL FO markup is quite complex. 
It is also verbose; virtually the only practical way to produce an XSL FO file is to use XSLT to produce a source document. Finally, once you have this XSL FO file, you need some way to render it to an output medium. There are few tools available to do this final step. For these reasons, XSL FO has not caught on as quickly as XSLT. Rather than explain XSL FO in its entirety, this article will give you enough information to use the major features of XSL FO. Our case study will be a short review handbook of Spanish that will be printed as an insert for a Spanish language learning CD-ROM. We'll use the Apache Software Foundation's FOP tool to convert the FO file to a PDF file... In the next article, we'll show you how to use XSLT to make it much easier to create the FO elements. You'll also learn how to put lists and tables into your documents." • [January 18, 2001] "XML-Deviant: XPointer and the Patent." By Leigh Dodds. From XML.com. January 17, 2001. ['Does a Sun patent threaten the future of hypertext on the web, or are XML developers getting unnecessarily alarmed by the licensing terms on the XPointer spec? The XML-Deviant reports.'] "The issuing of patents on software has been an industry hot topic for a while. Now it's the XML community's turn to get burned. The XML-Deviant reports this week on a Sun patent causing consternation on XML-DEV. In 1997 Sun Microsystems and Jakob Nielsen, the noted web design and usability guru, were granted a patent on a 'Method and system for implementing hypertext scroll attributes' by the US Patent Office. The patent describes the process of using a string to define an external anchor for an HTML document. The string is defined in the link to the HTML document, and the web browser, on loading the document defined by the link, will scroll to the first occurrence of the text string within the document -- hardly an innovation. 
The first sign that this, like many similar software patents, was going to be an issue was in June 2000, when XPointer moved to Candidate Recommendation. At that time it was noted that the Sun patent may affect the XPointer specification. The issue didn't resurface until December 2000 when Sun published the terms and conditions under which they would allow the community to develop XPointer applications. This appeared to remove one of the issues holding up progress on both the XPointer and XLink specifications, and so last week the W3C were able to publish a new revision of XPointer as a second Last Call Working Draft. A notable addition to the specification are the following comments: XPointer is affected by a technology patent held by Sun Microsystems. The legal terms and conditions offered by Sun to XPointer implementors can be found in the archives of the public comments list. This would appear to be the first time that reference to such terms and conditions has appeared in a W3C working draft. Not surprisingly it has generated some comment on XML-DEV... [For example:] (1) Drawing attention to the comments in the new draft, Elliotte Rusty Harold called foul on both Sun's patent and their terms and conditions: 'I recommend complete rejection of this specification until such time as Sun's patent can be dealt with more reasonably.' (2) Len Bullard also expressed surprise that the W3C had not contested the issue more firmly with Sun: 'I guess I am mystified that something this basic could have been allowed for so long. Why has the W3C not made more trouble for Sun on this one?' (3) Posting to the XMLhack comments board, Rick Jelliffe believed that Sun should feel ashamed: 'I find it very frustrating to see simple uses of standards subverted in this way. When we make standards (ISO, IETF, even W3C) surely it is to make technology available to people, not to merely provide a fresh crop of ideas for bandits to plunder.' 
(4) Tim Bray went as far as to pronounce XPointer 'Dead On Arrival' if Sun did not relax their terms: 'The responsible thing for Sun to do would be to issue an official declaration that the patent has no standing in respect of XPointer, and that to the extent that it does, Sun grants an unrestricted, free, license to anyone to implement and use it without incurring any obligations of any kind on account of the patent... [elsewise] XPointer, D.O.A.' Alongside XLink, XPointer is a key component of next generation web technologies, and as such is too important to suffer 'death on arrival.' Len Bullard [independently and simultaneously declaring XPointer D.O.A.] summed up the potential outcome in his own inimitable style: 'We have to take this very seriously. We are being drawn toward an event horizon of a black hole of singular patents that can paralyze the evolution of infrastructure of the Internet for decades. This is very real, very bad, and must be propagated to as many lists where technical discussions are held quickly. Sun may be asserting what they consider to be a valid patent. Que bueno. We know for a fact there is prior art. We know the patent covers a vital part of the web document design. This must be overturned and a case made clear both to corporations and individuals that work for these corporations that pursuing such patents will face patient and persistent opposition and may cost them considerable business. It has to stop...'" • [January 18, 2001] "A Scalable Process for Information Standards." By Jon Bosak. From XML.com. January 17, 2001. ['The Chair of the OASIS Process Advisory Committee explains how OASIS has developed a standards process to cater for the fast-moving world of XML.'] "It is becoming increasingly evident that the primary role of XML is to provide a syntactic framework for the development of standard semantics to enable the exchange of human-readable structured data in corporate contexts. 
Developers of corporate XML applications will often benefit from the ability to confer upon certain XML specifications the status of formal standards, or, in other words, to develop certain specifications under the aegis of a formal standards development organization (SDO). An SDO for XML specifications should ideally have the following qualities. (1) It should legally exist. In practice, this means that it should be incorporated, preferably as a nonprofit organization. (2) All intellectual property developed by the organization should be made available to the public under a nondiscriminatory license allowing free use. (3) Authority for governance of the organization should be vested in its voting members. (4) The criteria for membership should be such as to encourage the participation of all interested parties, with reasonable and nondiscriminatory dues set at a level appropriate to the maintenance of the organization. (5) The organization should be governed by a formal democratic process. (6) The formal deliberations of the organization should be archived and publicly visible. OASIS, the Organization for the Advancement of Structured Information Standards, is a nonprofit corporation founded in 1993 whose already open and formal process has recently been revised to speed it up and implement some procedures electronically. The designers of the new OASIS technical committee (TC) process have attempted to capture the most effective features of standards development groups ranging from traditional bodies such as ISO and IEEE to web-oriented groups like IETF and W3C. These features have been combined in a set of rules that attempts to use technology to bridge the contradictory goals of speed and formality. The intention has been to create a process that operates at warp speed while retaining the audit trail, IPR policies, and formal authority of a legally chartered nonprofit organization. 
The OASIS TC process is intended to be as efficient as possible while remaining legally accountable. Its implementation constitutes OASIS as an organization for the rapid development of open XML standards and puts the present and future control over standards developed under the process firmly in the hands of those who will have to use them. It must be remembered, however, that the explicit trade-off of central control for speed and scalability puts the burden of overseeing the development of such standards on those same users, including in particular the commercial organizations that have an interest in deploying them." • [January 18, 2001] "Web App Server Vendors Look Beyond Java." By Antone Gonsalves. From TechWeb News. January 15, 2001. "Sun Microsystems Inc. will showcase next week the latest version of the Java enterprise platform -- but the star technology might have to share the spotlight with new e-commerce technologies. Java-creator Sun and its partners will meet with media and analysts in San Francisco Tuesday to showcase Java 2 Enterprise Edition (J2EE) v1.3, which encompasses the many specifications that define the services and technologies available within the Java enterprise platform. Those services include transaction management and messaging, as well as the Enterprise JavaBean component model, and Java Server Pages for dynamic Web page generation... But while stressing the importance of the J2EE, application server vendors who will join Sun next week said the future of Web development will also include newly emerging technologies based on Extensible Markup Language (XML). Collectively, those technologies will provide the glue for connecting businesses over the Web by exposing applications as Web services, so they can be accessed irrespective of the underlying platform. The emerging technologies include UDDI, SOAP, WSDL, and XAML. SOAP, or Simple Object Access Protocol, has been submitted to the World Wide Web Consortium. 
The others are under development by various vendor groups, which have all promised to turn their work over to an independent standards body... IBM, a major Java development house, also has announced its support for the emerging XML-based technologies, as well as Microsoft Corp. The concept of Web services is core to Microsoft's Internet strategy, called Microsoft.Net. The strategy is focused on XML and the Windows operating system, not Java. 'There will be two competing stacks of functionality for implementing Web services,' said Mike Gilpin, industry analyst for Giga Information Group, Cambridge, Mass. 'There will be the Microsoft.Net stack and the J2EE stack. Many people will use one or the other, and perhaps both, depending on their particular situation'." • [January 18, 2001] "IPlanet Improves Legacy Integration. [Perspective And Research About The IT Industry.]" By Matthew G. Nelson. In InformationWeek Issue 819 (January 08, 2001), page 129. "IPlanet E-Commerce Solutions this week will bolster its efforts to help companies integrate various commerce and communications systems and languages with two updated products. IPlanet, a Sun-Netscape alliance, is shipping Integration Server 2.1, which helps translate and process legacy information for Internet systems. The software is based on enterprise application integration technology that Sun Microsystems got when it acquired Forte Software Inc. in 1999 for $540 million. IPlanet also offers expanded XML support in a new version of its ECXpert software, which is designed to help companies integrate their commerce systems with those of their buyers and suppliers. ECXpert 3.5 also supports Secure HTTP, a new standard that helps protect data as it traverses the Internet, and includes the ability to prioritize documents as they're processed through the queue. That's an added benefit for commerce-service providers that want to charge special rates for priority service as well as track usage. 
The new Integration Server 2.1, which uses XML as a backbone for messaging, now uses the more robust Java Messaging Service as an additional level of transport protocol support. A previous version of the Integration Server used only HTTP as a transport protocol. Sanjay Sarathy, director of product marketing for iPlanet Application Services, says the upgraded iPlanet products will make it easier for customers to integrate legacy applications with Internet systems rather than having to build new ones. 'Companies are trying to broaden the range of the documents with which they work, and XML is becoming a language-neutral system for the document type,' he says. The iPlanet Integration Server 2.1 is available now for $39,500 per CPU. It supports Solaris, Windows NT, HP-UX, AIX, and OS/390. IPlanet ECXpert 3.5 is also available, priced at $50,000 per CPU with a two-CPU minimum. It runs on Solaris, AIX, and Windows NT."

• [January 17, 2001] "XHTML Basic: Modularization in Action." By Molly E. Holzschlag. In WebTechniques Volume 6, Issue 2 (February 2001), pages 36-39. ['Molly Holzschlag declares, HTML is dead. It has passed on. It's pushing up daisies. Long live XHTML Modularization!'] "XHTML 1.0, as a reformulation of HTML as an XML application, was the first to move away from the limitations of HTML toward the extensibility offered by XML. But XHTML 1.0 is limited in that it allows only three document type definitions (DTDs), each modeled after those found in HTML 4.0: Strict, Transitional, and Frameset. While XHTML 1.0 is well structured and well formed, you can't tap into some of the most powerful aspects of XML and related technologies. You can't write your own DTDs in XHTML 1.0. You can't use schemas. So extensibility really doesn't exist yet in the XHTML game, and XHTML 1.0 remains geared toward Web browsers. XHTML 1.0 must be viewed as a means of allowing HTML authors -- many of whom bootstrapped their way into the field -- to painlessly gain entrance to the world of XML. Combining the familiar vocabulary of HTML with the strong syntactical influence of XML, XHTML 1.0 provides authors with a means of working with XML that makes perfect sense. Move to another XML application such as SMIL, SVG, or WML, and authoring those applications becomes much less daunting. An author can immediately see how XML, as a metalanguage for creating applications, can influence a wide range of languages. Learn one, and another becomes accessible. This relationship is the most compelling argument that XHTML 1.0 is not only reasonable, but necessary... XHTML includes a list of defined modules that express specific aspects of functionality. Then, these modules can be implemented using a DTD, which can, in a sense, be seen as the core. You can combine one, or two, or five, or more. You can even write your own additions, provided that you follow the recommended DTD and driver syntax. 
XML schemas are also expected to be implemented into this model, which means there's more than one way to approach a given challenge. Hence, if you want to write documents for a PDA, you can choose only those modules that let you do that. Extend that to a Web page, and you may want to add some extra modules to support your needs. In turn, these DTDs create XHTML subsets. Subsets can be shared by many (as is the case with XHTML Basic, which I'll discuss shortly), or completely customized for a given application. If you're following along with this concept, you should begin to see the method to the madness, that herein lies the 'X' in XHTML. Finally, extensibility has arrived via the power an author has over DTDs and the potential addition of XML schemas. While the uses of modularization will become much more diverse in the future, XHTML Basic is a perfect example of how to make modularization work today. By adding content to the shell I've created in Listing 1, you can create documents that are accessible by Web browsers and existing mobile devices such as cell phones and Palm handhelds. The reason you can do this is that the vocabulary is defined by HTML. But the syntactical rules evolve from XML, which in turn has given us XHTML. And that, in turn, has inspired modularization, taking us to an entirely new level of extensibility."

• [January 17, 2001] The Unicode Consortium has published Proposed Draft Unicode Technical Report #27: Unicode 3.1. Reference: Version 1.0, http://www.unicode.org/unicode/reports/tr27/tr27-1, 2000-01-17; edited by Mark Davis, Michael Everson, Asmus Freytag, Lisa Moore, et al. Document summary: "This document defines Version 3.1 of the Unicode Standard. It overrides certain features of Unicode 3.0.1, and adds a large number of coded characters. This draft is for review with the intention of it becoming a Unicode Standard Annex." The specification has been approved by the Unicode Technical Committee for public review; it is a 'Proposed Draft', to be taken as "a work in progress." Details: "The primary feature of Unicode 3.1 is the addition of 44,946 new encoded characters. These characters cover several historic scripts, several sets of symbols, and a very large collection of additional CJK ideographs. For the first time, characters are encoded beyond the original 16-bit codespace or Basic Multilingual Plane (BMP or Plane 0). These new characters, encoded at code positions of U+10000 or higher, are synchronized with the forthcoming standard ISO/IEC 10646-2. Unicode 3.1 and 10646-2 define three new supplementary planes. Unicode 3.1 also features corrected contributory data files, to bring the data files up to date against the much expanded repertoire of characters. All errata and corrigenda to Unicode 3.0 and Unicode 3.0.1 are included in this specification. Major corrigenda and other changes having a bearing on conformance to the standard are listed in Article 3, Conformance. Other minor errata are listed in Article 5, Errata. Most notable among the corrigenda to the standard is a tightening of the definition of UTF-8, to eliminate a possible security issue with non-shortest-form UTF-8." The TR provides charts which contain the characters added in Unicode 3.1. They are shown together with the characters that were part of Unicode 3.0. 
New characters are shown on a yellow background in these code charts. They include: (1) Greek and Coptic; (2) Old Italic; (3) Gothic; (4) Deseret; (5) Byzantine Musical Symbols; (6) Musical Symbols; (7) Mathematical Alphanumeric Symbols; (8) CJK Unified Ideographs Extension B; (9) CJK Compatibility Ideographs Supplement; (10) Tag Characters. Note Section '13.7 Tag Characters', which provides clarification on the restricted use of Tag Characters U+E0000-U+E007F: "The characters in this block provide a mechanism for language tagging in Unicode plain text. The characters in this block are reserved for use with special protocols. They are not to be used in the absence of such protocols, or with protocols that provide alternate means for language tagging, such as markup. The requirement for language information embedded in plain text data is often overstated...This block encodes a set of 95 special-use tag characters to enable the spelling out of ASCII-based string tags using characters which can be strictly separated from ordinary text content characters in Unicode. These tag characters can be embedded by protocols into plain text. They can be identified and/or ignored by implementations with trivial algorithms because there is no overloading of usage for these tag characters--they can only express tag values and never textual content itself. In addition to these 95 characters, one language tag identification character and one cancel tag character are also encoded. The language tag identification character identifies a tag string as a language tag; the language tag itself makes use of RFC 1766 language tag strings spelled out using the tag characters from this block...Because of the extra implementation burden, language tags should be avoided in plain text unless language information is required and it is known that the receivers of the text will properly recognize and maintain the tags. 
However, where language tags must be used, implementers should consider the following implementation issues involved in supporting language information with tags and decide how to handle tags where they are not fully supported. This discussion applies to any mechanism for providing language tags in a plain text environment...Language tags should also be avoided wherever higher-level protocols, such as a rich-text format, HTML or MIME, provide language attributes. This practice prevents cases where the higher-level protocol and the language tags disagree." See Unicode in XML and other Markup Languages [Unicode Technical Report #20 == W3C Note 15-December-2000].
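
Three technical points in the summary above (characters encoded at U+10000 or higher, the rejection of non-shortest-form UTF-8, and tag characters that spell out ASCII strings) can be illustrated with a small Python sketch. The function names are hypothetical and belong to no Unicode API; U+10330 is simply a sample code point from the Gothic block, and the sketch assumes the language tag identification character is U+E0001 and that each tag character sits at U+E0000 plus the corresponding ASCII value.

```python
from typing import List

LANGUAGE_TAG = "\U000E0001"  # assumed: language tag identification character
TAG_BASE = 0xE0000           # assumed: tag character = TAG_BASE + ASCII code


def utf16_code_units(ch: str) -> List[int]:
    """Return the 16-bit code units UTF-16 needs for one character."""
    data = ch.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


def is_valid_utf8(data: bytes) -> bool:
    """True if the bytes decode as strict UTF-8.

    Python's decoder rejects non-shortest ("overlong") forms, which is
    the behaviour the tightened UTF-8 definition calls for.
    """
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def spell_language_tag(tag: str) -> str:
    """Spell a printable-ASCII language tag (e.g. 'en') with tag characters."""
    if not all(0x20 <= ord(c) <= 0x7E for c in tag):
        raise ValueError("tag must be printable ASCII")
    return LANGUAGE_TAG + "".join(chr(TAG_BASE + ord(c)) for c in tag)


def strip_tag_characters(text: str) -> str:
    """Ignore tags by dropping every character in U+E0000-U+E007F."""
    return "".join(c for c in text if not 0xE0000 <= ord(c) <= 0xE007F)
```

A character outside the BMP needs two UTF-16 code units (a surrogate pair), the two-byte sequence `b"\xc0\xaf"` is an overlong spelling of `/` that a strict decoder refuses, and `strip_tag_characters` shows why the report calls the algorithm for ignoring tags trivial: a single range check suffices.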

• [January 17, 2001] "Competitive Compatible Marketplace for the Java 2 Platform, Enterprise Edition (J2EE)(TM) Technology Grows to Nine Products From Leading Server Vendors. Leading J2EE Technology Licensees Are Shipping J2EE Compatible Solutions" - "Sun Microsystems, Inc. today announced at its Java 2 Platform, Enterprise Edition (J2EE) press event in San Francisco that nine J2EE technology licensees are now shipping J2EE-technology compatible products. Each of these nine vendors has passed every test in the J2EE Compatibility Test Suite (CTS) and is certified to meet all the requirements for using the J2EE technology brand. There are 25 companies that currently license the J2EE platform, which represents 76% - 90% of the application server market. Today's media event includes demonstrations of shipping J2EE compatible solutions for enterprise applications from major industry vendors: Art Technology Group; BEA Systems, Inc.; Bluestone Software, Inc.; Borland Corporation; IONA Technologies; iPlanet E-Commerce Solutions, a Sun-Netscape Alliance; SilverStream Software; and Sybase. This milestone confirms enterprise customer demand for and the software market's continued adoption of J2EE technology -- the industry standard for enterprise application development -- and reaffirms the J2EE platform's value proposition of faster solution delivery to market, interoperability and streamlined connectivity. Compatibility verifies that J2EE technology based configurations work across a heterogeneous network. It is the cornerstone of J2EE technology's success and a key benefit for end customers. 
By delivering J2EE compatible products, today's highlighted J2EE technology licensees give their customers the best the J2EE platform has to offer: (1) faster solution delivery time to market: with J2EE technology, enterprise development is simplified through re-usable components and the J2EE blueprints, best practice guides which help developers provide simpler, faster, and more efficient applications to customers; (2) freedom of choice: using J2EE technology makes it possible for J2EE technology based components and applications developed for one application server to run on another vendor's application server as well. For the customers of J2EE technology compatible vendors, this means solutions that are interoperable, portable, scalable, easy to adapt, and manage -- better value for their investment; (3) simplified connectivity to existing applications, with strong support for XML integration, customers achieve platform-independent, standards-based data interchange. Easy connections to existing applications streamline enterprise application integration securing enhanced interoperability and legacy investment protection."

• [January 17, 2001] "Sun promotes server version of Java." By Stephen Shankland. In CNET News.com News (January 16, 2001). "Sun Microsystems on Tuesday touted the popularity of the server version of its Java software, but IBM once again has problems with its rival's plans. Java 2 Enterprise Edition (J2EE) is a collection of software components for running e-commerce software on back-end servers. It provides a level of abstraction that insulates companies writing server software from particular details, such as what e-mail or database software is being used. J2EE is a key part of Sun's ongoing effort to steer the development of the Internet in a way that favors its technology, rather than that of its chief rival, Microsoft. While Sun has been trying to use J2EE to standardize e-commerce, Microsoft has begun creating its system, Microsoft.Net, for assembling e-commerce and other Internet operations. The next version of J2EE is expected to be completed in the third quarter of this year, Rich Green, Sun's vice president of Java software development, said at a news conference Tuesday. The upcoming version 1.3 adds support for XML (Extensible Markup Language), better security, guaranteed server availability and wireless technology. Though J2EE was introduced in December 1999, software built to the J2EE standard has been slow coming because of the complexity of creating it and making sure it passes Sun's rigorous tests. On Tuesday, Sun trotted out eight companies to show off shipping J2EE products: Art Technology Group, BEA Systems, Bluestone Software, Borland, Iona Technologies, the Sun/America Online iPlanet group, SilverStream Software and Sybase... Although IBM objected to Sun's requirements for passing conformance tests and being allowed to label products as J2EE-compliant, Big Blue eventually agreed to pass the tests. The company's next version of WebSphere e-commerce software will be J2EE-compliant, said Scott Hebner, WebSphere marketing director. 
But Hebner argues J2EE isn't sufficient. For example, it doesn't have enough support for new Web standards such as Universal Description Discovery and Integration, Simple Object Access Protocol, or XML..."

• [January 17, 2001] "OASIS Unites Efforts to Develop XML Security Services Standard." - "Organizations supporting divergent security standards united in an effort to develop a common XML specification through the OASIS Security Services Technical Committee. OASIS, the global XML interoperability consortium, hosted the first meeting of its new technical committee, which will define an XML framework for exchanging authentication and authorization information. Initially formed within OASIS to complete the S2ML security standard, the new committee agreed to accept submissions of other relevant technologies, including AuthXML. 'Our goal is to work together to advance a common security standard,' said Eve Maler of Sun Microsystems, chair of the OASIS Security Services Technical Committee. 'Everyone agrees that consensus is critical. Through its open technical process, OASIS provides the safe environment necessary for real collaboration.' 'The result of our work at OASIS will be a single security services standard that will be widely accepted in the industry,' predicted Marc Chanliau of Netegrity. 'We brought S2ML to OASIS with that objective in mind, and we're confident that the technical committee has the critical mass to achieve our goal.' 'Supporters of AuthXML welcome the opportunity to work within OASIS for the good of true interoperability and the XML community at large,' commented Eric Olden of Securant Technologies. 'By channeling the momentum of AuthXML into the committee, we look forward to advancing the development of a common, unified standard.' The OASIS Security Services Technical Committee includes representatives from Baltimore Technologies, Cisco, Commerce One, DataChannel, Entegrity, Entrust, Hewlett-Packard, IBM, Jamcracker, Netegrity, Oblix, OpenNetwork, Securant, SilverStream, Sun Microsystems, Tivoli, Verisign, Vordel and WebMethods. Membership is expected to increase in the coming months. 
'Interest in advancing this work is extremely high,' said Karl Best, director of technical operations for OASIS. He added that record numbers of companies and individuals have joined the Consortium specifically to participate in developing a common security standard. The technical committee plans to publish draft specifications by June 2001 and to submit a formal specification to the OASIS membership by September 2001. Norbert Mikula of DataChannel, member of the OASIS Board of Directors and chair of its technical advisory committee, characterized the development schedule as 'very aggressive.' He advised, 'Any organization affected by the issue of security should get involved now.'" See also (1) "AuthXML Standard for Web Security" and (2) "Security Services Markup Language (S2ML)."

• [January 16, 2001] "The W3C XML Schema Specification in Context." By Rick Jelliffe. From XML.com. January 10, 2001. ['This article compares the W3C XML Schema Definition Language with XML document instances and DTDs, SGML DTDs, Perl regular expressions, and alternative schema technologies such as RELAX and Schematron.'] "This article gives simple comparisons between the W3C XML Schemas and [related formalisms:] W3C XML instances, W3C XML DTDs, ISO SGML DTDs, ISO SGML meta-DTDs, Perl regular expressions. And some technologies that have arisen as a response to it: JIS RELAX, Schematron, DSD. It does not provide an exhaustive list of all W3C XML Schemas features. The information was prepared with the October Candidate Recommendation versions in mind. W3C XML Schemas does not operate on marked-up instances per se, but on the information set of a document after it has been parsed, after any entity expansion and attribute value defaulting has occurred. Think of it as if it were a process looking at the W3C DOM API. The result of schema-validating a document is a set of outcomes giving, in particular, any violations of constraints -- there is currently no standard API for this; however the W3C XML Schemas specification gives a complete list of the constraint violations; an enhanced information set, the post-schema-validation information set, which can include various details about type and facets -- there is currently no standard API for this either; however, the W3C XML Schemas specification gives a complete list of the additional information... W3C XML Markup Declarations (DTDs) are geared to provide simple datatyping on attributes sufficient to support graph-structures in the document only. W3C XML Schemas are intended to provide a systematic datatyping capability. W3C XML DTDs provide a basic macro facility, parameter entities, with which many good effects can be achieved. W3C XML Schemas reconstructs the most common of these in various high-level features..."

• [January 16, 2001] "XML-Deviant: Old Ghosts: XML Namespaces." By Leigh Dodds. From XML.com. January 10, 2001. ['The XML Namespaces ghost returned to haunt the XML community this Christmas. However, developers on XML-DEV fought back with a new proposal to bring predictability to the use of URIs as namespace identifiers.'] "While some of us were enjoying holiday celebrations, XML-DEV was haunted once more by that ghostly question, 'what does a Namespace URI resolve to?' This time, however, the community was reluctant to descend into a two thousand message discussion. This article summarizes the promising progress made to date. To say that XML Namespaces have been hotly debated over the last two years is a major understatement. Much confusion has been the result of using URIs, or more commonly URLs, as the unique identifier for a Namespace. For many developers this naturally raises the question: 'What does this URL resolve to?' In the Namespace FAQ you'll find that the answer is undefined. It can be something or nothing. The Namespace Recommendation is not forthcoming on what could or should be placed at a URL used as a Namespace identifier. Much debate has revolved around whether the specification should have gone further and mandated some behavior, or whether its job is complete in simply defining a means of uniquely identifying XML elements... RDDL is an extension of XHTML Basic. It provides a simple way to document your namespaces and provide links to additional resources useful when processing documents containing those Namespaces. As an XHTML-based specification it is directly browseable, so human-oriented documentation can also be included. The resources referenceable from an RDDL document are not restricted in any way and could include schemas (of varying types), CSS documents, stylesheets, executable code, etc."

• [January 16, 2001] "Sun readies latest model of J2EE." By Tom Sullivan. In InfoWorld (January 16, 2001). "Sun Microsystems on Tuesday will throw a coming-out party for J2EE (Java 2 Platform Enterprise Edition) in San Francisco to detail the latest dot release of the Java-based programming environment for enterprise applications. New in Version 1.3 are features that hone the themes of simplified connectivity, faster time to market, interoperability, and freedom of choice to integrate best-of-breed solutions. Also new in J2EE 1.3 are JMF 2.1 (Java Media Framework), EJB 2.0 (Enterprise Java Beans), and enhanced XML support including JAXP (Java API for XML Parsing) and JAXM (Java API for XML Messaging). At the event Sun will also tout its accomplishments with J2EE -- fulfilling promises to deliver the product, bringing to it a value proposition, and creating a market around it -- and will detail a new version. Perhaps the most ambitious of Sun's promises is building a market around J2EE. All told, nearly 25 application server vendors are currently licensees of J2EE. Nine vendors thus far have achieved J2EE certification and brought compliant application servers to market. The application server vendors slated to be in attendance are SilverStream Software, ATG, BEA Systems, Borland, Iona Technologies, Sybase, Bluestone Software (acquired recently by Hewlett-Packard), and iPlanet E-Commerce Solutions. 'Sun has been successful at developing a market. There is without a doubt a market for J2EE and [Version] 1.3 will further that,' said Arny Epstein, CTO of Silverstream Software, a Billerica, Mass.-based application server vendor that is J2EE certified. The notion that Sun has grown a market for J2EE is more true for customers than it is for vendors, said Mike Gilpin, vice president and research leader at analyst house Giga Information Group in Cambridge, Mass..."

• [January 16, 2001] A URN Namespace for Norman Walsh. Network Working Group, Internet Draft 'draft-nwalsh-urn-ndw-01'. By Norman Walsh, URI: http://nwalsh.com/~ndw/. July 2000. 5 pages. Abstract: "This document describes a URN namespace that is engineered by Norman Walsh for naming personal resources such as XML Schema Namespaces, Schemas, Stylesheets, and other documents." Description: "For some years, the author has been producing internet resources: documents, schemas, stylesheets, etc. In addition to providing URLs for these resources, the author wishes to provide location-independent names. In the past, this has been accomplished with Formal Public Identifiers (FPIs). FPIs provided the author with a mechanism for assigning unique, permanent location-independent names to resources. The Extensible Markup Language (XML) requires that all resources provide a system identifier which must be a URI and XML Namespaces require authors to identify namespaces by URI alone (it is not possible to provide an FPI for an XML Namespace identifier). Motivated by these observations, the author would like to assign URNs to some resources in order to retain unique, permanent location-independent names for them. This namespace specification is for a formal namespace..." [cache]

• [January 16, 2001] URN Namespace for Literate Programming: Anthony B. Coates. By Anthony B. Coates. Network Working Group, Internet Draft 'draft-coates-urn-namespace-01.txt'. October 17, 2000. URI: http://www.theoffice.net/abcoates/. "This document describes a URN namespace for use in identifying XML namespaces for use with applications created by the author, Anthony B. Coates. In particular, the author develops applications for literate programming. XML namespaces require a URI to identify them. While URLs have commonly been used for this purpose, the recent controversy over the use of relative URLs for namespaces has highlighted the deficiency in using locators (which are generally assumed to name an actual resource) for disambiguating XML namespaces. This has moved the author to request a URN namespace with the ID 'abc' (the author's initials). For professional reasons, the author finds himself changing continent every few years, and hence uses e-mail and Web redirection services whose domain names are not under his control, and hence are subject to change without notice. The assignment of a persistent URN would remove any inherent dependence on URLs outside of the author's control. This namespace specification is for a formal namespace." See: "Namespaces in XML." [cache]

• [January 16, 2001] "A Roundup of [XML] Editors. XML Matters #6." By David Mertz, Ph.D. (Transformer, Gnosis Software, Inc.). From IBM developerWorks, XML library (January 2001). ['In this column David Mertz gives an up-to-date review of a half-dozen leading XML editors. He compares the strengths, weaknesses and capabilities of each -- especially for handling text-heavy prose documents. The column addresses the very practical question of just how one goes about creating, modifying, and maintaining prose-oriented XML documents.'] "Working with marked-up prose: Perhaps it is obvious enough that the very first requirement of any approach to working with XML documents is assurance that we are producing valid documents in the process. When we use a DTD (or an XML schema), we do so because we want documents to conform to its rules. Whatever tools are used must assure this validity as part of the creation and maintenance process. Most of the tools and techniques I discuss are also a serviceable means of working with more data-oriented XML documents, but the emphasis in this column is working with marked-up prose. A few main differences stand out between prose-oriented XML and data-oriented XML. Some XML dialects, moreover, fall somewhere between these categories, or outside of them altogether (MathML or vector graphic formats are neither prose nor data in the usual ways). Prose-oriented XML formats are generally designed to capture the features one expects on a printed page (and therefore in a word processor). While most such formats aim to capture semantic rather than typographic features (e.g., the concept 'foreign word' rather than the font style 'italic'), their connection to traditional written and read materials is close. On the other hand, data-oriented XML formats mirror more closely the contents of (relational) database formats; the contents can often be thought of as records/attributes (rows/columns), and one expects patterns of recurrent data fields. 
In prose-oriented XML dialects, one tends to encounter a great deal of mixed content; in data-oriented XML dialects, one tends to encounter little or no mixed content. That is, most 'data' [in prose-oriented XML] is text that has character-level markup scattered through it. In terms of a DTD, one sees elements that look something like the following example: <!ELEMENT p (#PCDATA | code | img | br | i | b | a)* >... Of course, block-level markup is also used to provide overall organization, but character-level markup is important in prose-oriented XML formats in a way that it rarely is in data-oriented XML formats. In my experience, the dichotomy between these markup levels poses the biggest challenge for XML editing tools to handle gracefully. Let's see how a few tools and approaches hold up under these requirements... Basic XML editing tools effectively add three things to a generic text editor to make them more tailored for XML editing: (1) Integrated validation of documents; (2) Hierarchical (tree) views of XML documents; (3) Integrated 'preview' of transformed XML documents (to HTML, using XSLT or CSS2, generally)..." Article also in PDF format [cache].
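
As a rough illustration of the mixed-content distinction drawn in the Mertz column, the following Python sketch uses only the standard library's `xml.etree.ElementTree`. The two sample documents and the helper names `has_mixed_content` and `text_runs` are hypothetical and are not taken from the article; the prose sample merely follows the quoted content model for `p`.

```python
import xml.etree.ElementTree as ET

# Hypothetical prose paragraph that fits the quoted content model
# (#PCDATA | code | img | br | i | b | a)*.
PROSE = "<p>Use the <code>xsl:template</code> element <i>sparingly</i>.</p>"

# Hypothetical data-oriented record: every element holds only text.
DATA = "<row><id>7</id><name>Ada</name></row>"


def has_mixed_content(elem):
    """Return True if elem has child elements interleaved with real text."""
    if len(elem) == 0:
        return False  # no child elements, so the content cannot be mixed
    if elem.text and elem.text.strip():
        return True  # text before the first child
    # Text that follows a child is stored on that child's .tail attribute.
    return any(child.tail and child.tail.strip() for child in elem)


def text_runs(elem):
    """List the non-blank character-data runs sitting directly inside elem."""
    runs = [elem.text or ""]
    runs.extend(child.tail or "" for child in elem)
    return [run for run in runs if run.strip()]


prose_root = ET.fromstring(PROSE)
data_root = ET.fromstring(DATA)

print(has_mixed_content(prose_root))
print(has_mixed_content(data_root))
print(text_runs(prose_root))
```

An editing tool has to preserve every one of those interleaved text runs when the inline elements around them are added or removed, which is one way to see why character-level markup is harder to handle than record-like structures.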

• [January 16, 2001] "Microsoft SOAP Toolkit: Version 2.0 and Version 1.0." From Microsoft. January 02, 2001. ['Microsoft SOAP Toolkit Update. Now available are new versions of the SOAP Toolkit for developing Web Services with today's tools.'] "The Microsoft SOAP Toolkit version 1.0 is an unsupported MSDN sample. While some of you have already used the MSDN code to solve real problems you face today, most of you approached the MSDN sample in the way it was intended -- to experiment with Web Services. Over the past few months of successful experimentation with Web Services, you have provided us with feedback that you're ready to move forward from the experimentation phase to begin developing and deploying enterprise-scale applications that implement and consume Web Services. Therefore, starting with the SOAP Toolkit version 2.0, Microsoft will deliver the SOAP infrastructure you need to build production Web Services. The Microsoft SOAP Toolkit version 2.0 is developed and supported by the same team that delivers the Microsoft XML Parser (MSXML), and represents core Web Services technology that will be used across a number of Microsoft's own software products. Thus, having fulfilled its primary purpose, the December release of SOAP Toolkit version 1.0 is the final update of the MSDN sample. The Microsoft SOAP Toolkit version 2.0 adds complete support for the Web Services Description Language (WSDL) 1.0 Specification, Universal Description, Discovery and Integration (UDDI), and the SOAP Specification version 1.1. While the Beta 1 release doesn't include full support of these specifications, subsequent beta releases will fill out the missing functionality with a feature-complete version delivered by the end of the first quarter of 2001. Microsoft is also announcing that the final release of SOAP Toolkit version 2.0 will be fully supported by Microsoft product support. 
With the SOAP Toolkit version 2.0, Microsoft has responded to customer requests to make some critical architectural changes, primarily to support the latest level of standards specifications and to implement a programming model designed for rapid developer productivity. These changes might require that you modify some of the existing application code that you've written using the SOAP Toolkit version 1.0 sample code."

• [January 16, 2001] "MSXML 3.0 MergeModule Redistribution Package." From Microsoft. January 2001. "Distribute MSXML 3.0 with your applications using the Microsoft Windows Installer technology. The Microsoft XML Parser (MSXML 3.0) MergeModule Redistribution Package makes it possible for developers to distribute MSXML 3.0 with their applications using the Microsoft Windows Installer technology. This redistribution package includes file version 8.0.7820.0 of the MSXML 3.0 file. You can use the MSXML 3.0 MergeModule Redistribution Package for redistribution on downlevel platforms, including Microsoft Windows 98, Microsoft Windows NT 4.0, Microsoft Windows 2000, and Windows Me. This article includes complete file version information for all MSXML releases. Merging can be performed with Orca.exe, which can be installed from the MSI SDK. In the merging sample below, MSXML3.MSM is merged with the MyTestFeature feature in the MyTest.MSI file, and a log file named Test.log is created." The Microsoft XML Parser MergeModule Redistribution Package is available for download.

• [January 15, 2001] OLIF Format Specification Version 0.9 - May-5-2000. "The OLIF specification (especially the header) is patterned after the TMF format and the Corpus Encoding Standard. The purpose of the OLIF format is to provide a standard method to describe data that is being exchanged among NLP tools and/or translation vendors, while introducing little or no loss of critical data during the process. OLIF is XML-compliant. It also uses various ISO standards for date/time, language codes, and country codes. OLIF files are intended to be created automatically by export routines and processed automatically by import routines. OLIF files are 'well-formed' XML documents that can be processed without explicit reference to the OLIF DTD. However, a 'valid' OLIF file must conform to the OLIF DTD, and any suspicious OLIF file should be verified against the OLIF DTD using a validating XML parser. Since XML syntax is case sensitive, any XML application must define casing conventions. OLIF uses mixed-case."
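
The well-formed versus valid distinction in the OLIF description can be sketched in Python. This is a minimal sketch under stated assumptions: the element names `entryList` and `entry` are hypothetical placeholders, not taken from the OLIF DTD, and the standard library parser used here checks well-formedness only. Validation against a DTD would need a validating parser, which this sketch does not attempt.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if text parses as well-formed XML (no DTD validation)."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


# Hypothetical element names, used only to illustrate the casing point.
GOOD = "<entryList><entry>table</entry></entryList>"

# The end tag </Entry> does not match the start tag <entry>: XML names are
# case sensitive, so this document is not well-formed.
BAD_CASE = "<entryList><entry>table</Entry></entryList>"

print(is_well_formed(GOOD))
print(is_well_formed(BAD_CASE))
```

The second sample shows why a format has to fix its casing conventions: a mismatch in case alone is enough for a parser to reject the file before any DTD is consulted.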

• [January 15, 2001] "OpenMath, MathML and XSL." By David Carlisle (The Numerical Algorithms Group Ltd, Oxford UK). In SIGSAM Bulletin Volume 34, Number 2 (June 2000), pages 6-11 (with 9 references). "The paper describes OpenMath and MathML, two closely related languages that are currently being developed to encode mathematical expressions, and also discusses the XML transformation language, XSL, that may be used to perform translations between these languages. XML uses a system of namespaces to avoid name clashes between different languages. The languages discussed in this paper all have such a namespace defined. OpenMath and MathML, together with XSL, are proving to be a very effective combination that offers real promise of delivering documents with mathematical content marked up in a semantically rich language, and being presented in conventional mathematical notation. This should be possible without the reader having to install any non-standard or proprietary software. In the short term, the Mozilla browser is the nearest to delivering such a system, but it seems likely that other browsers will soon implement at least some of the necessary tools. Internet Explorer will certainly implement XSL, and Amaya (the W3C's testbed browser) currently implements most of Presentation MathML. Third-party plug-ins or applets may be used to display MathML in browsers which do not have native MathML rendering."

• [January 15, 2001] "Annex C on XML." Informative. From ISO CD 15046-18. Extensible Markup Language (XML). C.1 Introduction: "The purpose of this Annex is to give a short introduction to XML and the rationale behind the design of this standard. This standard uses XML in a way inspired by OMG's XML Metadata Interchange (XMI) specification. XML is an open, platform independent and vendor independent standard. It supports the international character set standards of ISO 10646 and Unicode. The XML standard is programming language neutral and API-neutral. A range of XML APIs are available, giving the programmer a choice of access methods to create, view, and integrate XML information. The cost of entry for XML information providers is low. XML's tag structure and textual syntax make it as easy to read as HTML, and it is clearly better for conveying structured information. The cost of entry for automatic XML document producers and consumers is also low. A growing set of tools is available for XML development. XMI is an XML-based standard for the exchange of object-oriented metadata models. The purpose of XMI is to allow exchange of UML models between modelling tools in a vendor neutral way. It is based on OMG's Meta Object Facility and on CORBA data types. XMI can in theory be used to exchange data based on UML models, but is not primarily designed for this purpose. This standard is therefore designed based on the principles of XMI, but simplified and adapted to suit the needs of this family of standards, and thus more specialised to allow exchange of data based on UML directly. Clauses C.2 and C.3 give introductions to XML and XMI, respectively. Clause C.4 outlines some of the differences between XMI and this standard and clause C.5 gives some references for further reading..." [cache]

• [January 15, 2001] "Incident Object Description and Exchange Format." IETF INTERNET DRAFT 'draft-terena-itdwg-iodef-requirements-00.txt'. By Jimmy Arvidsson, Andrew Cormack, Yuri Demchenko, and Jan Meijer. November 15, 2000. "The purpose of the Incident Object Description and Exchange Format is to define a common data format for the description, archiving and exchange of information about incidents between CSIRTs (including alert, incident in investigation, archiving, statistics, reporting, etc.). This document describes the high-level requirements for such a description and exchange format, including the reasons for those requirements. Examples are used to illustrate the requirements where necessary... This document defines requirements for the Incident Object Description and Exchange Format (IODEF), which is the intended product of the Incident Taxonomy and Description Working Group (ITDWG) at TERENA. IODEF is planned as a standard format which allows CSIRTs to exchange operational and statistical information; it may also provide a basis for the development of compatible and inter-operable tools for Incident recording, tracking and exchange. Another aim is to extend the work of IETF IDWG (currently focused on Intrusion Detection exchange format and communication protocol) to the description of incidents as higher level elements in Network Security. This will involve CSIRTs and their constituency related issues. The IODEF set of documents, of which the current document is the first, will contain IODEF Data Model and XML DTD specification..." Note that one of the deliverables of the Incident Taxonomy and Description Working Group (TF-CSIRT) is an "Incident Object Elements Description and XML Data Type Description (XML DTD)." [cache]

• [January 15, 2001] XML Encoding for SMS Messages. Internet-Draft 'draft-koponen-sms-xml-00.txt'. November 16, 2000. Expires: May 17, 2001. By Juha P. T. Koponen, Teemu Ikonen, and Lasse Ziegler (First Hop Ltd.).

• [January 15, 2001] IAP: Intrusion Alert Protocol. Internet Engineering Task Force, IDWG Internet Draft 'draft-ietf-idwg-iap-03.txt'. "Intrusion Alert Protocol (IAP) is an application-level protocol for exchanging intrusion alert data between intrusion detection elements, notably sensor/analyzers and managers, across IP networks. The protocol's design is compatible with the goals for the HyperText Transfer Protocol (HTTP). The specification of alerts carried using this protocol is described in a companion document of the intrusion detection working group of the IETF."

• [January 15, 2001] "Voxeo Opens Telephony to Web Developers." By Shannon Cochran. In DDJ News (November 2000). "'Telephony's historic protocols and media and tools are viciously hard to get into,' counsels John Jainschigg, Editor-in-Chief of Computer Telephony. Jainschigg says the challenge exists 'because there's a whole new vocabulary to learn, and because the protocols address...the vagaries and arcane characteristics of a heterogeneous global analog/digital hybrid phone network.' But a startup called Voxeo may be poised to help web developers enter this formidable new world. Founded this year by Jonathan Taylor and Gary Reback (of Microsoft antitrust fame), Voxeo plans to operate as an ASP of telephony, charging customers for use of its phone-to-web infrastructure. Its goal is to simplify the process of creating and deploying phone/Web applications... Hoping to attract developers to its infrastructure, Voxeo has launched a 'developer community portal' featuring tutorials, reference applications, a graphical design tool, and 24-hour technical support for developers. The services are all free, at least until January 15, 2001. Voxeo's applications reside on traditional web servers, but use phone markup languages -- VoiceXML, Microsoft's Web Telephony Engine, or Voxeo's own CallXML language -- to describe the presentation of a phone call instead of a web page. 'CallXML is useful for handling call control actions such as placing an outbound call or conferencing calls together. CallXML is also useful for easily detecting keyed input from a telephone,' explains the developer site. 'VoiceXML is useful for handling voice recognition as well as connecting users to existing web content.' Voxeo's system can also incorporate Perl, PHP, ColdFusion, JRun, and Active Server Pages. Voxeo Designer, also offered on the site, is a visual design environment for the phone mark-up languages.
Possible applications within the Voxeo system include unified messaging, follow-me find-me, Internet call waiting, phone-enabled instant messaging, web-to-phone notifications, virtual call centers, broadcast fax or voice messages, and multimedia conference calling. The developer's site includes open-source programs of these types: Voxeo is planning to move some of them to SourceForge. Voxeo is also currently providing a network for test deployment."

• [January 15, 2001] "ADL and SCORM." By Kevin Cox. In Web Tools Newsletter. July 10, 2000. "In 1997 the Department of Defense in the USA initiated the Advanced Distributed Learning (ADL) initiative. A major part of the initiative has become the development of a Shareable Course Object Reference Model (SCORM). SCORM addresses the following problems: [1] Moving a course (including student information) from one learning platform to another (e.g., from WebCT to LearningSpace); [2] Creating reusable chunks of course material for use in other courses; [3] Searching for course material. The techniques used to overcome these problems are the same as in other areas. It is the same idea as behind Microsoft's .NET. SCORM defines a standard way for defining and accessing information about learning 'objects'. Once you have a common language (standard) then systems that are built using the language can 'talk' to each other. How does it do it? It does it by defining the data and its meaning using XML. So far SCORM has defined: [1] an XML-based specification for representing course structures (so courses can be moved from one server/LMS to another); [2] a set of specifications relating to the run-time environment, including an Application Programming Interface (API), content-to-LMS (Learning Management System) data model, and a content launch specification; [3] and a specification for creating metadata records for courses, content, and raw media elements. Conceptually XML is not too hard to understand. XML is an extension of HTML so that users can define their own tags for their own purposes. SCORM XML definitions are a set of Tags that define things about courses. If you have a choice of Learning Management Systems consider SCORM compliance in your selection criteria..."

• [January 15, 2001] "Intensional and Extensional Languages in Conceptual Modelling." By Marko Niinimaki (Department of Computer Science, University of Tampere, Finland). In Proceedings of the Tenth European-Japanese Conference on Information Modelling and Knowledge Bases, Hotelli Riekonlinna, Saariselka, Lapland, Finland, May 8-11, 2000. "In conceptual modelling, there are many competing views of the role of the modelling language. In this paper, we propose a clarifying classification of different kinds of languages. This classification, based on the semantical background theory of each kind of language, divides the modelling languages into three categories: extensional modelling languages, languages based on concept calculus (intensional languages) and hybrid languages. The classification provides the background for studies of applicability of a modelling language. Based on an example, we observe that some features of the intensional approach are actually only terminologically different from those of the extensional one. We observe, too, that because of the clear semantic background, hybrid languages seem promising. Using them in conceptual modelling would benefit from a good methodology."

• [January 15, 2001] "First Steps in an Information Commerce Economy: Digital Rights Management in the Emerging EBook Environment." By Eamonn Neylon (Manifest Solutions). In D-Lib Magazine [ISSN: 1082-9873] Volume 7, Number 1 (January 2001). "The delivery of digital content to consumers in a trusted manner allows business models to be tried that are different from existing forms of publishing. Thus rights management technologies take a central place in the development of the eBook ecology by providing the ability to enforce and negotiate usage restrictions. This emphasis on the control of usage rather than access is critical in distinguishing eBook publishing from other types of publishing that have gone before it. Press coverage of the eBook marketplace could lead readers to believe that an explosion is imminent in this new method of publishing. However, there are outstanding issues that need to be addressed for the current hype to bear some semblance to reality. All stakeholders will need to become active in informing how the eBook evolves as an economically viable resource type... DRM vendors have been actively participating in two industry groups: the Open eBook Forum (OeBF) and the Electronic Book Exchange (EBX). The DRM vendors see eBooks as a new market ripe for the use of protection technologies. Their interests are primarily in establishing principles from which actual business implementations can be established. Within these groups there is an emphasis on the expression of rights and the enforcement of expressed rights. eBooks present two interesting problems to rights management systems. These can be simply stated as how to: (1) Express the conditions and usages that are permitted by the rights-holder -- while respecting the pre-existing entitlements of the consumer; (2) Enforce those usage conditions in a range of environments that have different levels of trust -- and are not necessarily connected to an online authority.
The enforcement of rights requires a consistent expression of rights to allow different systems to consistently interpret what is required. Two candidate languages have been promoted as means to achieve a universal means of expressing what is conferred in a sale or license of content. ContentGuard's eXtensible Rights Markup Language (XrML) is a licensable specification for expressing rights in XML based on work conducted at Xerox PARC. XrML has been criticized for its lack of process for developing the language, and there have been concerns about the terms under which the specification may be licensed. Open Digital Rights Language (ODRL) is an alternative rights expression language, still in its infancy, which is being proposed by IPR Systems to the World Wide Web Consortium as an open standard to be developed within the established process of the W3C..." See Open Ebook Initiative.

• [January 15, 2001] "An Application of XML and XLink Using a Graph-Partitioning Method and a Density Map for Information Retrieval and Knowledge Discovery." By Damien Guillaume and Fionn Murtagh (Université Louis-Pasteur, University of Ulster). 1999. In ASP Conference Series, Vol. 172, Astronomical Data Analysis Software and Systems VIII, edited by D. M. Mehringer, R. L. Plante, and D. A. Roberts. "We have defined an XML language for astronomy, called AML (Astronomical Markup Language), able to represent meta-information for astronomical objects, tables, articles and authors. The various AML documents created have links between them, and an innovative tool can cluster the documents with a graph-partitioning algorithm using the links. The result is displayed on a density map similar to Kohonen Self-Organising Maps. AML and its advantages will be briefly described, as well as the clustering program, which is one of the many possible applications of AML... Using the ideal features of XML for information retrieval, and its associated language for the links, XLink, we have implemented a new tool for knowledge discovery and used it with astronomical documents. This tool seems very useful, and the only drawback is that, as it is using distributed resources dynamically, it cannot be used in real-time because of the time required to download the documents. Its computational requirements are, however, not far from real-time, with access time dominating processing time..."

• [January 15, 2001] "Voxeo Speaks Out. [Vision Thing.]" By Bill Michael and John Jainschigg. In ComputerTelephony (January 05, 2001), pages 32-36. ['What's left to do, after breaking up Microsoft? Star antitrust attorney Gary Reback has turned entrepreneur - his startup, Voxeo, is building a network of open-platforms-in-the-sky for telephony application development and hosting.'] "Gary Reback has quit practicing law and founded a high-tech startup - in the process, handing biz/tech journalists one of the year's juiciest storyline hooks. Reback's new company, Voxeo (Scotts Valley, CA - 831-439-5130) proposes to solve the 'telephony problem' for web-centric developers. They're building a community, providing low-or-no-cost tools, and inventing cooperative strategies to simplify XML-based telephony application development and remote testing; exploiting popular and broadly-supported XML variants such as VoiceXML and WML (plus a CallXML variant of their own) at toplevel. They're engineering middleware that coordinates between TML (Telephony Markup Language - our generic term for beasts like VoiceXML) scripts and the scary, underlying phone stuff; and permits remote integration of telephony to existing e-commerce back-ends. Ultimately, they aim to become profitable by hosting the 'telephony parts' of applications on a national network of servers. In a sense, it's a 'global platform play.' But with a distinctly open, Internet twist. Broadly-supported telephony markup languages at the front end neatly abstract away from the underlying middleware and hardware, eliminating fears of lock-in. The development model is simple: compose a one-call app description, post it, and watch the middleware run it on an arbitrary number of lines. The prohibitive cost of buying a development platform is eliminated, as are fears that the app you write on a small development system won't scale neatly to higher line-counts and traffic metrics.
Voxeo has designed their middleware so that (in theory) you don't have to worry about that stuff. And of course, integrating with your existing e-commerce infrastructure is close-to-transparent, since it's all happening at the level of XML scripts browser data-models, and arbitrary URL references..."

• [January 14-17, 2001] ARTS-IXRetail XML Event at NRF Annual Convention. "ARTS activities planned for the National Retail Federation Annual Convention at the Javits Center, New York City January 14 - 17, 2001 include: (1) Sunday January 14 at 1:00 - 3:15 in Room IE-16 will be the first public demonstration of the new IXRetail standard using XML to connect various applications from different vendors across multiple platforms. You will see XML connect POS to Price Management and Inventory including transactions from RF, wireless and the Internet. The XML messages will build on the work of ActiveStore schemas and use the new ARTS XML Data Dictionary. Retailers including The Limited and Nordstrom will speak to the importance of XML and associated standards. (2) Monday January 15 at 2:00 - 3:30 in Room IE-08 is the ARTS Member meeting. This is your chance to tell the Board what standards you need to more effectively operate your business. What devices should be included in the UnifiedPOS specification, what new business functions and supporting data should be added to the Data Model, and in what areas should IXRetail focus its XML message development? A full report on activities completed in 2000 will be provided. Members and prospective members are welcome to attend."

• [January 14, 2001] Resource Directory Description Language (RDDL). Version 'January 14, 2001'. Edited by Jonathan Borden (The Open Healthcare Group) and Tim Bray (Antarcti.ca Systems). Jonathan Borden notes: "The most substantial changes [in RDDL draft 20010114] are in the specification of the xl:role and xl:arcrole attributes. I've also placed a preliminary RELAX schema (derived from MURATA Makoto's XHTML Basic RELAX schema) as a new resource in the document (see http://www.rddl.org/#RELAX and http://www.rddl.org/#ZIP )." "This document defines Resource Directory Description Language (RDDL). A Resource Directory provides a text description of some class of resources and of other resources related to that class. It also contains a directory of links to these related resources. An example of a class of resources is that defined by an XML Namespace. Examples of such related resources include schemas, stylesheets, and executable code. A Resource Directory Description is designed to be suitable for service as the body of a resource returned by dereferencing a URI serving as an XML Namespace name. The Resource Directory Description Language is an extension of XHTML Basic 1.0 with an added element named resource. This element serves as an XLink to the referenced resource. The Resource Directory Description 1.0 DTD has been defined according to Modularization for XHTML. This document defines the syntax and semantics of the Resource Directory Description Language, and also serves as a Resource Directory Description for the namespace http://www.rddl.org/." The RDDL spec, DTDs and other contents of the directory, zipped for download ('rddl-20010114.zip'), cache. [specification cache]
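
The resource element described above can be read with any namespace-aware XML parser. The following is a minimal sketch in Python, not code from the RDDL specification: it assumes the element is named `resource` in the `http://www.rddl.org/` namespace (as the abstract states) and that its links use the standard XLink `href`, `role`, and `arcrole` attributes. The sample document, the `example.org` role URIs, and the `list_resources` helper are hypothetical and exist only for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace names: the RDDL one is taken from the abstract above,
# the XLink one is the standard W3C XLink namespace.
RDDL_NS = "http://www.rddl.org/"
XLINK_NS = "http://www.w3.org/1999/xlink"

# Hypothetical sample directory; the role/arcrole URIs are placeholders,
# not values defined by the RDDL draft.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:rddl="http://www.rddl.org/"
      xmlns:xlink="http://www.w3.org/1999/xlink">
  <body>
    <rddl:resource xlink:href="example.dtd"
                   xlink:role="http://example.org/hypothetical/nature/dtd"
                   xlink:arcrole="http://example.org/hypothetical/purpose/validation">
      <p>A DTD for the example vocabulary.</p>
    </rddl:resource>
    <rddl:resource xlink:href="example.xsl"
                   xlink:role="http://example.org/hypothetical/nature/stylesheet">
      <p>A stylesheet for the example vocabulary.</p>
    </rddl:resource>
  </body>
</html>"""


def list_resources(xml_text):
    """Return one dict per rddl:resource element, in document order."""
    root = ET.fromstring(xml_text)
    found = []
    for el in root.iter("{%s}resource" % RDDL_NS):
        found.append({
            "href": el.get("{%s}href" % XLINK_NS),
            "role": el.get("{%s}role" % XLINK_NS),
            "arcrole": el.get("{%s}arcrole" % XLINK_NS),  # None when absent
        })
    return found


if __name__ == "__main__":
    for res in list_resources(SAMPLE):
        print(res["href"], res["role"], res["arcrole"])
```

A consumer would typically select a related resource (a schema, a stylesheet) by filtering this list on the role and arcrole values it understands.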

• [January 14, 2001] "How to validate XML." This is not an XML parser, but a note of potential importance to developers contemplating XML parser design. From Joe English. "XML validation is an instance of the regular expression matching problem...The most commonly-used technique to solve this problem is based on finite automata. There is another algorithm, based on derivatives of regular expressions, which deserves to be more widely known..." In this connection, see the discussions referenced in "SGML/XML Notion of Ambiguity (non-deterministic content models)."
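
To make the derivative idea concrete, here is a minimal sketch in Python. It is not Joe English's code: the tuple representation of content models, the helper names, and the sample model are all assumptions chosen for illustration. The derivative of a content model with respect to an element name is the model that must match the remaining children; a child sequence is valid if, after taking the derivative for each child in turn, the resulting model accepts the empty sequence.

```python
# Content models as nested tuples (a hypothetical representation).
EMPTY = ("empty",)   # matches no sequence at all
EPS = ("eps",)       # matches only the empty sequence


def sym(name):
    return ("sym", name)


def seq(a, b):
    # Smart constructor: simplify while building so models stay small.
    if a == EMPTY or b == EMPTY:
        return EMPTY
    if a == EPS:
        return b
    if b == EPS:
        return a
    return ("seq", a, b)


def alt(a, b):
    if a == EMPTY:
        return b
    if b == EMPTY:
        return a
    if a == b:
        return a
    return ("alt", a, b)


def star(a):
    if a in (EMPTY, EPS):
        return EPS
    if a[0] == "star":
        return a
    return ("star", a)


def nullable(r):
    """True if the model accepts the empty sequence."""
    tag = r[0]
    if tag in ("eps", "star"):
        return True
    if tag in ("empty", "sym"):
        return False
    if tag == "seq":
        return nullable(r[1]) and nullable(r[2])
    return nullable(r[1]) or nullable(r[2])  # alt


def deriv(r, x):
    """Derivative of model r with respect to element name x."""
    tag = r[0]
    if tag in ("empty", "eps"):
        return EMPTY
    if tag == "sym":
        return EPS if r[1] == x else EMPTY
    if tag == "alt":
        return alt(deriv(r[1], x), deriv(r[2], x))
    if tag == "star":
        return seq(deriv(r[1], x), r)
    # seq: x is consumed by the first part, or by the second part
    # when the first part can match the empty sequence.
    first = seq(deriv(r[1], x), r[2])
    if nullable(r[1]):
        return alt(first, deriv(r[2], x))
    return first


def matches(model, children):
    """Validate a list of child element names against a content model."""
    for name in children:
        model = deriv(model, name)
        if model == EMPTY:
            return False  # no continuation can succeed
    return nullable(model)


# Hypothetical content model: (title, para*, note?)
MODEL = seq(sym("title"), seq(star(sym("para")), alt(sym("note"), EPS)))

if __name__ == "__main__":
    print(matches(MODEL, ["title", "para", "para", "note"]))
    print(matches(MODEL, ["title", "note", "para"]))
```

Unlike the automaton approach, nothing is compiled ahead of time: the model is rewritten one child at a time, and validation can stop as soon as the derivative becomes the empty model.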

• [January 14, 2001] Simple RDDL 'parser' for use on Microsoft's .NET platform. "The source archive, the API documentation, and a test web application that can extract the resources out of any online RDDL directory are available [online]. The API isn't quite usable enough for the type of processing that we're looking for yet but it's a start. I'd appreciate any and all comments that you might have..." From Jason Diamond, Sun, 14 Jan 2001.

• [January 13, 2001] DocBook Schemas. Norm Walsh (Sun Microsystems) recently announced the (alpha) release of schemas for DocBook, including W3C XML Schema, RELAX, and TREX. "I have just updated the experimental XML Schema for DocBook. You can get the new version (and the ChangeLog) at http://www.oasis-open.org/docbook/xmlschema/4.1.2.3/. ['The DocBook XML Schema V4.1.2.3 attempts to be an accurate translation of the DocBook XML V4.1.2 DTD. In this version, the parameterization of the schema is roughly identical to the parameterization of the DTD. This may change as I begin to experiment with the construction of derivative schemas.'] I have also produced a RELAX schema for DocBook V4.1.2, although I have no tool that validates all of RELAX so, while I believe it is correct, I can't demonstrate it; see http://www.oasis-open.org/docbook/relax/4.1.2.1/. Finally, I produced a TREX schema from the RELAX schema. I tweaked it a bit by hand (removing a few redundancies), so it's not quite a simple mechanical transformation. See http://www.oasis-open.org/docbook/trex/4.1.2.1/." Norm also released a preliminary version of the JRefEntry DTD, which represents a customization of the DocBook RefEntry model: "the purpose of this customization is to mirror the order and nature of structured comment tags in JavaDoc documentation." See the JRefEntry web page for details.

• [January 13, 2001] Zvon XML Schema Reference. By Miloslav Nic. January 09, 2001. "This reference is based on W3C Candidate Recommendations for XML Schema Part 1: Structures and XML Schema Part 2: Datatypes. This reference will be upgraded when the standard is finalized. This reference consists of two parts: (1) Schema browser - based on the analysis of normative XML Schema; (2) DTD browser - based on the analysis of non-normative DTD. Main features of the XML Schema reference include: (1) Clickable indexes and schemas. (2) Click on 'Annotation Source' leads to the relevant part of the specification." The Zvon web site provides tutorials for a wide range of XML-related technologies (DOM, XSLT, CSS, XML DTDs, XHTML, XLink, XPointer, SVG, etc.).

• [January 13, 2001] "Intrusion Detection Message Exchange Format. Comparison of SMI and XML Implementations." Intrusion Detection Working Group. IETF Draft 'draft-ietf-idwg-xmlsmi-01.txt'. By Glenn Mansfield (Cyber Solutions, Inc.) and David A. Curry (Internet Security Systems). "The purpose of the Intrusion Detection Message Exchange Format (IDMEF) is to define data formats and exchange procedures for sharing information of interest to intrusion detection and response systems, and to the management systems which may need to interact with them. Two implementations of the IDMEF data format have been proposed: one using the Structure of Management Information (SMI) to describe a MIB, and the other using a Document Type Definition (DTD) to describe XML documents. Both representations appear to have their good and bad traits, and deciding between them is difficult. To arrive at an informed decision, the working group tasked the authors to identify and analyze the pros and cons of both approaches, and to present the results in the form of an Internet-Draft. The initial version of this draft was reviewed by the IDWG at the February, 2000 interim meeting where it was tentatively decided that the XML/DTD solution was best at fulfilling the IDWG requirements. This decision was finalized at the March, 2000 IETF IDWG meeting." [cache]

• [January 13, 2001] "Intrusion Detection Message Exchange Requirements." Intrusion Detection Exchange Format Working Group. Internet Engineering Task Force, Internet Draft 'draft-ietf-idwg-requirements-04.txt'. By Mark Wood (Internet Security Systems, Inc.). "The purpose of the Intrusion Detection Exchange Format is to define data formats and exchange procedures for sharing information of interest to intrusion detection and response systems, and to the management systems which may need to interact with them. This Internet-Draft describes the high-level requirements for such communication, including the rationale for those requirements where clarification is needed. Scenarios are used to illustrate the requirements." [cache]

• [January 13, 2001] "Intrusion Detection Message Exchange Format. Extensible Markup Language (XML) Document Type Definition." Intrusion Detection Working Group. IETF Internet Draft 'draft-ietf-idwg-idmef-xml-01.txt'. By David A. Curry (Internet Security Systems, Inc.). 2000-07. [cache]

• [January 13, 2001] "Intrusion Detection Exchange Format Data Model." Internet Engineering Task Force, Internet Draft 'draft-ietf-idwg-data-model-03.txt'. By Herve Debar, Ming-Yuh Huang, and David J. Donahoo. "The purpose of the Intrusion Detection Exchange Format is to define data formats and exchange procedures for sharing information of interest with intrusion detection and response systems, and with the management systems that may need to interact with them. This Internet-Draft describes a proposed data model to represent the information exported by the intrusion-detection systems, including the rationale for this model. This information is herein referred to as 'Alert'..." [cache]

• [January 13, 2001] "IDMEF Data Model and XML DTD." Provisional 'draft-ietf-idwg-idmef-xml-02.txt'. By D. Curry, H. Debar, M. Huang. December 05, 2000. 86 pages. [This provisional version "is (under version -02 of the XML draft) an attempt to merge the data model and the XML representation, to avoid divergences between the two."] "The purpose of the Intrusion Detection Message Exchange Format (IDMEF) is to define data formats and exchange procedures for sharing information of interest to intrusion detection and response systems, and to the management systems that may need to interact with them. The goals and requirements of the IDMEF are described in [req document]. This Internet-Draft describes a proposed data model to represent the information exported by the intrusion-detection systems, including the rationale for this model, and a proposed implementation of this data model, using the Extensible Markup Language (XML). The rationale for choosing XML is explained, a Document Type Definition (DTD) is developed, and examples are provided. An earlier version of this implementation was reviewed, along with other proposed implementations, by the IDWG at its September 1999 and February 2000 meetings. At the February meeting, it was decided that the XML solution was best at fulfilling the IDWG requirements." Extracted from http://www.semper.org/idwg-public/0247.html.

• [January 12, 2001] "Pocket computers get the picture." By Rachel Lebihan. From ZDNet (January 11, 2001). "Until now, the graphics for pocket computers have been poor, but with the popularity of personal digital assistants (PDAs) increasing, clearer graphic displays are in the pipeline. The Commonwealth Scientific and Industrial Research Organisation (CSIRO) is targeting the pocket PC platform with its recently developed software that implements a format emerging as the major standard for Web graphics - Scalable Vector Graphics (SVG) 1.0. An open industry standard based on XML, SVG images remain clear and detailed no matter how much you zoom or rescale them and it has the backing of industry bigwigs including Adobe, Sun Microsystems and Kodak... The CSIRO software works on Windows CE based handheld devices and has sparked interest in Europe and the US, although 'we would very much like to talk with more Australian industries,' Ackland said. The CSIRO claims to be in discussions with an Australian utilities company. 'This is an area where we see mobile opportunities becoming very popular,' Ackland said. 'Builders could check house plans on site, electricity workers could view complex network diagrams.' PDAs displaying SVG images will be available in the next six months, according to Ackland."

• [January 12, 2001] "Living Language." By Mark Pesce. From Feedmag.com (January 2001). ['HTML revolutionized the way information is shared worldwide. Can a new language do the same for the human genome? Mark Pesce reports'] "Everything's coming up roses! Or more precisely, Arabidopsis thaliana, a somewhat nondescript white flower selected by biologists as the model in the plant kingdom for genetic research. Related to broccoli and cauliflower, A. thaliana is the most studied plant in human history; every week new papers are published about its properties. In an astounding leap forward, the journal Nature just announced that the entire genome for A. thaliana has been sequenced, a first for the vegetal world. As plants go, it was a relatively easy task; the complete genome runs to 120 megabytes of information, compared to 1.6 gigabytes for wheat and a hefty 3 gigabytes for humanity. What makes this discovery just a bit different from the ever-increasing flow of genetic revelations is that, in another first, Nature has announced that all genomic information presented in its pages -- and on its Web site -- will be published in GEML, or Gene Expression Markup Language, a lingua franca defining a common standard for the bits of life... WHAT IS GEML exactly? It's a DTD (Document Type Definition) for the common expression of genetic information. Those of you who have done any Web design are likely familiar with another DTD -- HTML -- and its 'tags,' those little bits of formatting information enclosed by the '<>' symbols. In HTML there are tags such as 'TITLE' (which gives a page its title), 'B' (for bold), 'IMG' (for images) and so forth. GEML has its own tags, which define the kinds of data that interest geneticists...
GEML, together with some clever computer programs, could help scientists greatly accelerate the process of winnowing the chaff from the grain of our genetics, allowing them to share their complementary (and often conflicting) databases of identified gene sequences to produce a more accurate map of ourselves. Craig Venter, CEO of Celera Genomics, has openly speculated that mapping the human genome onto gene sequences could take the next fifty years; with GEML, this estimate could easily be cut in half, provided that geneticists in competitive commercial organizations find it more profitable to share what they've learned than to keep it hoarded away and hidden from view. GEML ISN'T alone. It has a competitor, another DTD known as CellML, used to define the complex interactions that take place within cells. CellML takes an integrated approach to describing all of the processes within a living cell -- its genes, proteins, enzymes, and chemical reactions, the pathways and connections between each part of the whole. CellML seems well suited to the kinds of work that supercomputers do -- creating simulations of incredibly complex systems -- while GEML only defines the genetics that create the cell. Neither GEML nor CellML may be the final word in this convergence between biology and information. And, despite Metcalfe's Law -- which states that the value of a thing increases as more and more people use it -- the CEOs of the genomics companies are at least a little afraid that if knowledge advances too widely, their hard-earned advantages will slip away like water through their fingers. The year 2001 is to the genomics industry what the year 1991 was to informatics. The pieces are all in place for an incredible explosion in discovery, creativity, and wealth. But they're locked behind the prison walls of fear..."

• [January 12, 2001] "Semantic Web Technologies Workshop - Report." By Martin Bryan (Technical Manager, The Diffuse Project). From the Diffuse Project, December 2000. "This brief summary of some of the key points raised during presentations to the IST Semantic Web Technologies Workshop held by the Information Society Directorate of the European Commission in Luxembourg on 22nd/23rd November 2000 covers information gleaned from the following presentations..." See the main conference entry. [cache]

• [January 12, 2001] "RDF Terminology and Concepts." Edited by Graham Klyne. "This is currently a live working document, being a collection of suggestions from participants in the W3C RDF Interest Group." [cache]