Cover Pages: XML and Databases

Provisional references to resources on XML and Databases.

"All the relational vendors are trying hard to find ways of storing XML in relational databases. There are basically two approaches: flat storage and shredded storage. Flat storage stores an XML document in a cell of a table, shredded storage normalizes it into millions of rows and columns... In my view neither works well at all; but unfortunately, relational databases are what we've got to work with on the average consumer PC, and more advanced technologies like object databases have made little headway." — Michael Kay, GenealogyXML, 2004-02-03.

General Resources
Articles, Papers, News
Software: Projects, Frameworks, Packages, Products

General Resources

"XML and Databases." By Ronald Bourret. "This paper briefly discusses the relationship between XML and databases and lists some of the software available to process XML documents with databases. Although it is not intended to be exhaustive or provide in-depth evaluations of all the available software, I hope that it describes some of the major issues in using XML with databases. It is somewhat biased towards relational databases simply because that is where my experience is..."

XML Database Products. By Ronald Bourret. Updated November 08, 2000 or later. "The number of products for using XML with databases is growing with amazing speed -- new products seem to enter the market weekly. In this Web page, I have tried to capture the current state of the market, gathered from Web sites, product reviews, XML webzines, and other XML resource guides. . . Although complete description of how to use XML with databases is beyond the scope of this page, a brief review will help you choose what product is right for you. XML documents fall into two broad categories: data-centric and document-centric. Data-centric documents are those where XML is used as a data transport. They include sales orders, patient records, and scientific data and their physical structure -- the order of sibling elements, whether data is stored in attributes or PCDATA-only elements, whether entities are used -- is often unimportant. A special case of data-centric documents is dynamic Web pages, such as online catalogs and address lists, which are constructed from known, regular sets of data. Document-centric documents are those in which XML is used for its SGML-like capabilities, such as in user's manuals, static Web pages, and marketing brochures. They are characterized by irregular structure and mixed content and their physical structure is important. To store and retrieve the data in data-centric documents, you will need a database that is tuned for data storage, such as a relational or object-oriented database, and some sort of data transfer software. This may be built in to the database or might be third-party middleware. Depending on your needs, you may need Web-publishing abilities as well..."

Document Storage and Management. Software product listing in the "Free XML tools and software" list, by Lars Marius Garshol. This section lists tools for supporting document management, such as document databases and search engines. (1) XML document database systems Systems for persistently storing XML documents and providing access to their structure and individual parts; storing XML documents as blobs does not qualify. (2) XML document management utilities. (3) XML search engines.

"XML and Query Languages." See this reference collection for a number of research projects developing XML database management solutions in conjunction with XML-based query engines (e.g., SIM - The Structured Information Manager; Lore).

[March 06, 2001] XML Database Discussion List. A posting from Kimbro Staken (Chief Technology Officer, dbXML Group L.L.C.) announced the formation of a new mailing list for general discussions about XML database technologies. The mailing list is hosted by the XML:DB XML Database initiative. The list is designed as a "vendor neutral open forum and discussion of any topic related to XML database technology and standards is acceptable and encouraged." The forum is not intended for marketing, although announcements are acceptable if the list guidelines are followed. The new list had some 60 subscribers as of February 26, 2001, and is publicly archived. [Discussion]

Articles, Papers, News, Reviews

[August 14, 2008] Using XML and Databases: W3C Standards in Practice. By Bill Trippe (Gilbane, Senior Analyst) and Dale Waldt (Gilbane Contributing Analyst). From Gilbane Group Analysts. White Paper sponsored by EMC 'Where Information Lives'. 24 pages. February 2008. Reprinted with permission. This paper provides an excellent survey of technical issues, standards, and use cases relevant to the adoption of XML-based technologies for industrial strength applications supporting enterprise content management. From the Executive Summary:

XML has emerged as a powerful format for representing data in a wide variety of fields, from technical data to finance to healthcare. Unlike traditional data formats, such as relational data, XML has a hierarchical structure that can be used to model virtually any type of data. In addition, XML is far more flexible and forgiving of change than other formats.

XML presents a number of interesting challenges and opportunities for data storage. Relational databases and full-text search mechanisms that have been the backbone of many applications are not designed to manage XML content effectively. A new class of databases has emerged that is designed specifically to manage XML content. Typically called 'XML Native Databases' or just 'XML databases,' they incorporate functionality that greatly improves the management, searching, and manipulation of XML to produce the most effective XML data management solution.

The World Wide Web Consortium (W3C), the standards organization that developed XML, has also developed many standards that can be used to access, search, process, and store XML data. XML databases take advantage of these standards to provide efficient and precise access, query, storage, and processing capabilities not found in traditional database technology. The result is that applications using XML databases are more efficient and better suited for managing XML data.

These W3C standards, including XML Schemas, XSLT, DOM, XLink, and XQuery, are well established and tested in real world applications. The XML databases that take advantage of them provide the platform for industrial strength applications to manage XML content.

Like any new technology, adoption is slow at first. Then as the technology matures and understanding on how to best deploy increases, applications emerge that demonstrate the advantages of the approach. Today, we can find many applications to manage XML content that demonstrate the power and flexibility that can only be achieved through XML-native databases. Information intensive companies such as the airline and manufacturer described in this paper have achieved significant technical and business benefits from their use of XML standards and database technology over alternative approaches.

[July 03, 2008] "Oracle Updates Family of Open Source Berkeley DB Embeddable Databases." Announcement 2008-06-30: "Oracle Updates Entire Family of Oracle Berkeley DB Embeddable Databases. Continues Commitment to Open Source." — Oracle has announced new releases of Oracle Berkeley DB, Oracle Berkeley DB XML, and Oracle Berkeley DB Java Edition. The new releases and enhancements signify Oracle's commitment to continued innovation across the Oracle Berkeley DB product family, while maintaining the open source dual license business model. All three members of the Oracle Berkeley DB family are designed for high performance, reliability and embeddability within applications that have predictable data access patterns and therefore do not require a query language like SQL. For these applications, SQL is often unnecessary and may slow the overall performance. The software libraries are directly linked into the application, eliminating the performance penalty of client-server architectures. Oracle Berkeley DB Release 4.7 now provides improved caching efficiency, faster database recovery, a cycling master feature for replicated (HA) environments, support for the Java Direct Persistence Layer (DPL) API, and QNX RTOS support. Oracle Berkeley DB XML is an open source, embeddable XML database with XQuery-based access to documents stored in containers and indexed based on their content. Oracle Berkeley DB XML is built on top of Oracle Berkeley DB and inherits its rich features and attributes. Like Oracle Berkeley DB, it runs in process with the application with no need for human administration. Oracle Berkeley DB XML Release 2.4 offers support for XQuery update, cost-based query optimization for significantly improved performance and XQilla for XQuery processing. XQilla is an XQuery and XPath 2 library and command line utility written in C++, implemented on top of the Xerces-C library; it is made available under the terms of the Apache License v2.
Oracle Berkeley DB Java Edition Release 3.3 provides improved scalability, caching improvements, more efficient in-memory and on disk storage, support for Google Android and Apache Maven... See also Oracle Berkeley DB.

[November 08, 2005] "XML and Semi-Structured Data." By C. M. Sperberg-McQueen (World Wide Web Consortium). From ACM Queue Volume 3, Number 8 (October 2005), pages 34-41. Special Issue on Semi-Structured Data. "XML makes several contributions to solving the problem of semi-structured data, the term database theorists use to denote data that exhibits any of the following characteristics: (1) Numerous repeating fields and structures in a naive hierarchical representation of the data, which lead to large numbers of tables in a second- or third-normal form representation; (2) Wide variation in structure; (3) Sparse tables. XML provides a natural representation for hierarchical structures and repeating fields or structures. Further, XML document type definitions (DTDs) and schemas allow fine-grained control over how much variation to allow in the data: Vocabulary designers can require XML data to be perfectly regular, or they can allow a little variation, or a lot. Because the core semantics of an XML document rely not on particular application software but on declarative semantics that are (or should be) explicitly documented, the use of XML really does help ensure data longevity and reusability. Sometimes a rather thin, syntax-oriented, semantically vacuous layer of commonality is all that is needed to simplify things dramatically..."

[November 08, 2005] ACM Queue Special Issue on Semi-Structured Data. Edited by Charlene O'Hanlon. ACM Queue Volume 3, Number 8 (October 2005). ISSN: 1542-7730. "Some believe semi-structured data is nothing more than a fancy term for a data structure left unfinished, and others who firmly believe semi-structured data is the best way to describe data that doesn't easily fit into the traditional database structure. Semi-structured data, because of its unstructured-yet-structured nature, presents its own set of problems, such as schema discovery and determining the proper method to perform essential database operations such as extraction, integration, translation, and storage of data. Fortunately, there has been much research and testing to make semi-structured data a better neighbor with its traditional database counterparts...": [1] "Unstructured, But Not Really: Data that Doesn't Fit the Mold" [print] page 8, by Charlene O'Hanlon; [2] "Managing Semi-Structured Data" [print] pages 18-24, by Daniela Florescu (Oracle); [3] "Learning from The Web" [print] pages 26-32, by Adam Bosworth (Google); [4] "XML and Semi-Structured Data" [print] pages 34-41, by C. M. Sperberg-McQueen (World Wide Web Consortium); [5] "Order from Chaos" [print] pages 42-49, by Natalya Noy (Medical Informatics, Stanford University); [6] "Why Your Data Won't Mix" [print] pages 50-58, by Alon Halevy (University of Washington); [7] "The Cost of Data", pages 62-64 [Curmudgeon column], by Chris Suver (Microsoft).

[August 2005] "Firing Up the Hybrid Engine." By Anjul Bhambhri. From IBM DB2 Magazine Online (August 2005). "Because enterprises have, in aggregate, trillions of dollars invested in relational data and relational database management systems (RDBMSs), simply replacing RDBMSs with a pure XML store isn't an option. Adding an XML-only database into the infrastructure adds yet another integration and complexity challenge. IBM is about to introduce true-native support for both XML and relational data. This evolutionary technology, now in beta tests with a small group of IBM customers, provides hybrid relational/XML storage from the ground up. That means DB2 will no longer need the XML Extender (just as it doesn't need an SQL Extender). DB2 will simply handle XML natively. (There are varying definitions of "native" XML support. To clear up the confusion about what's typically called "native" today, see the sidebar.) In the hybrid version, XML is handled as a new data type. Nearly every DB2 component, tool, and utility has been enhanced to recognize and handle this new data type. The new storage paradigm retains XML in a parsed, annotated tree form — similar to the XML Document Object Model (DOM) — that's separate from the relational data store. On top of both data stores (relational and XML) sits one hybrid database engine. That single engine can process XQuery, XPath, SQL, and SQL/XML. The engine features a bilingual query compiler with parsers for both SQL and XQuery. So developers can access information using either language (or both together) according to what makes the most sense in specific situations. A hybrid DB2 provides the flexibility to shift (between XML and SQL) paradigms as information management needs change. Storing relational and XML data in a database management system that understands and supports both models at every level (from the client, through the engine, down to the disk) provides flexibility and consistently fast performance.
The XML data inherits the same backup and recovery, optimization, scalability, and high availability DB2 offers for relational data. Ultimately, a unified XML/relational database keeps things simple by avoiding the need to integrate XML and relational data from separate stores..." [PDF format, cache]

[March 2005] "The IBM Approach to Unified XML/Relational Databases." IBM Technical Report on Unified XML/relational storage. "'Native XML Storage' uses a physical storage model that is representative of the logical model or XML document. The XML document must be the fundamental basis for logical modeling, logical storage and physical storage to accurately represent and render the XML document. This approach leaves no layer or portion of the data engine exempt from understanding this XML model, from the data model through the engine down to disk and back to the client. The result is a data storage model with the flexibility to handle any XML statement in any column and uniform and exceptional performance across document and collection sizes... Stand-alone XML-only database products are currently available — however, these products are only XML databases and do not include support for relational data or data models other than XML. Because these products do not offer the capability or flexibility of unified offerings from the major relational vendors, they are not covered in detail in this document. The unified support offered by major vendors — such as, Oracle, Microsoft, Sybase and IBM — can be loosely grouped into four categories, with each vendor offering support for one or more of the following: (1) Shred, or decompose, the XML into relational or object relational form; (2) Store the XML intact in character form in a character large object (CLOB); (3) Store the XML in encoded binary form in a binary large object (BLOB); (4) Store the XML in a truly native repository... The approach IBM has taken is to support both shredded and true native storage. Support for shredding is important because XML can be used to feed existing relational schemas. Since documents can grow large and will be updatable in many cases, the advantages of non-BLOB storage for XML documents, which include storing at the node level of granularity instead of at the document level, are significant...
IBM provides a truly native unified XML/relational database, supporting the XML data model from the client through the database down to the disk and back again. By deeply implementing XML into a database engine that previously was purely relational, IBM offers superior flexibility and performance relative to other offerings." [cache]
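
The first two storage categories in the report's list (shredding versus storing the document intact) can be illustrated with a minimal Python sketch. This is not IBM's implementation: SQLite stands in for a relational database, and the sample document, table names, and column names are all hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample document; not taken from the report.
ORDER_XML = (
    '<order id="1001">'
    '<item sku="A1" qty="2"/>'
    '<item sku="B7" qty="1"/>'
    '</order>'
)


def store_intact(conn, xml_text):
    """Keep the whole document in one text cell (CLOB-style storage)."""
    conn.execute("CREATE TABLE IF NOT EXISTS order_docs (doc TEXT)")
    conn.execute("INSERT INTO order_docs (doc) VALUES (?)", (xml_text,))


def store_shredded(conn, xml_text):
    """Decompose the document into rows and columns (shredding)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS order_items "
        "(order_id INTEGER, position INTEGER, sku TEXT, qty INTEGER)"
    )
    root = ET.fromstring(xml_text)
    order_id = int(root.get("id"))
    # The position column records document order, which rows alone do not.
    for position, item in enumerate(root.findall("item")):
        conn.execute(
            "INSERT INTO order_items VALUES (?, ?, ?, ?)",
            (order_id, position, item.get("sku"), int(item.get("qty"))),
        )


conn = sqlite3.connect(":memory:")
store_intact(conn, ORDER_XML)
store_shredded(conn, ORDER_XML)

# Intact storage returns the document unchanged, but SQL cannot see inside it.
intact_doc = conn.execute("SELECT doc FROM order_docs").fetchone()[0]

# Shredded storage can be aggregated with ordinary SQL.
total_qty = conn.execute(
    "SELECT SUM(qty) FROM order_items WHERE order_id = 1001"
).fetchone()[0]
```

In this sketch the intact copy round-trips exactly, while the shredded copy supports relational queries at the cost of a mapping step on every insert.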

[March 2005] "Comparing XML and Relational Storage: A Best Practices Guide." IBM Technical Report on Unified XML/relational storage. "While there have been years of research into physical and logical database design in purely relational systems, little definitive work has been done on the influence of XML on the logical and physical database design of unified XML/relational systems. The bulk of the influence of XML on logical and physical database design is based on fundamental properties of XML that make it different from the relational model: (1) XML is self-describing. A given document contains not only the data, but also the necessary metadata. As a result, an XML document can be searched or updated without requiring a static definition of the schema. Relational models, on the other hand, require more static schema definitions. All the rows of a table must have the same schema. (2) XML is hierarchical. A given document represents not only base information, but also information about the relationship of data items to each other in the form of the hierarchy. Relational models require all relationship information to be expressed either by primary key or foreign key relationships or by representing that information in other relations. (3) XML is sequence-oriented — order is important. Relational models are set-oriented — order is unimportant. What is a unified XML/relational database? Case 1a: Data has inherent hierarchical relationships; Case 1b: Data has multiple inherent hierarchical relationships; Case 2: Data has containment relationships; Case 3: Data has sparse attributes or a large number of attributes; Case 4: Schema evolution; Case 5: Highly variable or multiple schema... A true native XML data store is more than merely a data store that exposes XML to its clients — it must represent the XML throughout the entire data engine stack from client to disk and back out again.
While XML storage may seem best for XML data, and relational storage best for relational data, in many cases this does not hold true. At times, relational storage proves best for XML data and XML storage proves best for tabular data..." [cache]
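
The three properties the guide lists (self-describing, hierarchical, sequence-oriented) can be seen in a small Python sketch. The sample document is hypothetical, and a Python set of tuples is used only as a rough stand-in for a relation.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document used only for illustration.
DOC = (
    '<recipe name="bread">'
    '<steps><step>mix</step><step>bake</step><step>cool</step></steps>'
    '</recipe>'
)

root = ET.fromstring(DOC)

# Self-describing: element names travel with the data, so no external
# schema is needed to discover what the document contains.
tags = [element.tag for element in root.iter()]

# Hierarchical: the parent/child relationship is part of the document.
steps = root.find("steps")

# Sequence-oriented: sibling order is preserved and meaningful.
ordered_steps = [step.text for step in steps]

# Set-oriented: a relation has no inherent row order, so two "tables"
# holding the same rows are equal regardless of insertion order.
rows_a = {("mix",), ("bake",), ("cool",)}
rows_b = {("cool",), ("mix",), ("bake",)}
same_relation = rows_a == rows_b

# To keep document order in a relational form, the order has to be
# stored explicitly, for example as a position column.
rows_with_position = list(enumerate(ordered_steps))
```

The explicit position column in the last line is the kind of extra modeling a relational schema needs once sibling order matters.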

[December 14, 2004] "Sleepycat Software Releases Berkeley DB XML 2.0. Major Upgrade of Native XML Database Adds New Support for Emerging XML Data Access Standard and Up To 10x Performance Increase." - "Sleepycat Software, makers of Berkeley DB XML, the leading open source, native XML database, today announced the general availability of Berkeley DB XML 2.0. The major new release includes support for XQuery 1.0, the emerging standard for XML data access, as well as significant performance and usability enhancements. 'This is a major upgrade of Berkeley DB XML and is a significant advancement in native XML database development,' said Mike Olson, CEO of Sleepycat Software. 'Our new XQuery support benefits customers that have been waiting for a standard for XML databases similar to the SQL standard for relational databases.' 'Berkeley DB XML combines the extremely reliable database engine of Berkeley DB with the tremendous ease-of-use of native XML storage,' said Steve Bishop, CTO at WildCard Systems, Inc., a leading developer of solutions for electronic payment systems using pre-paid smart cards. 'Sleepycat's new support of XQuery gives us greater confidence in moving critical financial transaction support infrastructure to Berkeley DB XML.' 'Since AllPeers is a consumer application, it was extremely important for us when choosing a database to find a system that is fast, lightweight, embeddable and needs zero administration,' said Matthew Gertner, CTO of AllPeers, developers of a peer-to-peer information sharing platform. 'Berkeley DB XML not only meets all of these requirements but also offers excellent native support for XML. The AllPeers platform supports a massive peer-to-peer network for the sharing of millions of files. By storing the XML metadata for each file in Berkeley DB XML, we have achieved much faster time-to-market and a cleaner, more consistent architecture.' 
New features in Berkeley DB XML 2.0 include: (1) XQuery 1.0 support that allows application portability through complying with the July 2004 draft of the XQuery standard. (2) XPath 2.0 support that allows the selection of a portion of an XML document. (3) PHP API support to easily enable developers using the popular PHP scripting languages to work with XML documents. (4) Improved query performance that can be up to 10 times faster for multi-megabyte XML documents. (5) Ability to control storage granularity of documents (whole documents or nodes) to optimize query performance. (6) XML document streaming into database dramatically simplifies how documents are stored. Documents can now be streamed in from an URI, memory, or file..."

[September 02, 2004] dbXML 2.0 Production Release Provides Open Source Native XML Database. A communiqué from Tom Bradford reports on the recent production release of dbXML Version 2.0 by the dbXML Group. dbXML is a Native XML Database "capable of storing and indexing collections of XML documents in both native and mapped forms for highly efficient querying, transformation, and retrieval. In addition to these capabilities, the server may also be extended to provide business logic in the form of scripts, classes and triggers." New features in the dbXML Version 2.0 release include journaling transactions, XSLT transformations, full text indexing and full text querying, pluggable security models, a new command line system, new client/server APIs, SSL connection support, JSP Tag Library support, and embedded database APIs. dbXML 2.0 is an open source project governed by the terms of the GNU General Public License. This version of dbXML is basically "a complete rewrite of the dbXML 1.0 code, which forked into the Apache Xindice project. dbXML was developed using the Java 2 Standard Edition version 1.4, and should operate properly on all platforms to which J2SE 1.4 has been ported." The dbXML Group also "provides commercial licenses for situations where utilization under the terms of the GPL are inappropriate. Those using or deploying dbXML in a commercial environment may wish to consider contacting the group to discuss commercial licensing and support."

[April 29, 2004] "Databases Flex Their XML: IBM, Microsoft, Oracle, and Sybase Compete in Our Data Management Gymnastics." By Sean McCown. In InfoWorld (April 23, 2004). "If you could do one thing to improve integration and automate processes with customers and business partners, it would be to implement XML, which has become the standard for exchanging information between disparate systems because it is easily transformed into any format. With very little effort, the same file can be sent to several different customers with their own specific needs. XML eases the development effort for the transmitting company and gives recipients a safety net for altering the way they use the data without having to alter how they receive it. Being able to merge, query, and transform transmitted data with relational data is becoming as essential to businesses as data warehouses themselves. The good news is that the four leading relational databases, namely Oracle Database, IBM DB2, Sybase ASE (Adaptive Server Enterprise), and Microsoft SQL Server, not only can store XML data, but they hide much of the complexity of working with XML. Depending on which of these relational databases you use, however, the XML features you will have to work with may be extremely rich or limited in important ways. What does a fashionable XML database provide? Four basic functions: the ability to consume, store, search, and generate XML. The extent to which the database supports these functions and the methods it uses to accomplish them are what make for a successful implementation of XML in a database. I examined these four areas in Oracle Database 10g, IBM DB2 Universal Database V8.1, Sybase ASE 12.5.1, and Microsoft SQL Server 2000. I tested how they imported and read XML files, their options for saving the data, their indexing and query capabilities, and their options for creating XML and graded them based on the ease, flexibility, and speed with which they handled the most common XML operations. 
Of course, these products have many other capabilities beyond handling XML..." See other details in the InfoWorld special report.

[February 26, 2004] "Getting Reacquainted with dbXML 2.0." By Tom Bradford. From XML.com (February 25, 2004). "The goal of the dbXML project has been to produce a high quality, small footprint XML database that just works. dbXML is a native XML database written in Java. Native XML databases (NXDs) are databases that store XML using an internalized format for faster overall processing and representational flexibility. NXDs also provide support for indexing XML for improved query performance. Because it utilizes Java's memory mapped I/O and overlapping socket I/O, dbXML requires Java 1.4 or higher... In version 2.0 dbXML supports basic journaling transactions under the hood. At present, all transactions are implicit unless you're accessing dbXML using the database's lowest level APIs. Explicit transaction APIs will be exposed via the client/server APIs in a future release... The database now has a pluggable security model. There are currently three security managers to choose from. (1) NoSecurityManager provides no security whatsoever and is used when authentication is not needed to access the database. (2) SimpleSecurityManager provides simple security, where a single user name and password is used for the entire database. The user name and password are defined in the database's system.xml configuration file. (3) DefaultSecurityManager is so named because it is the default security manager. It provides access control based on users and roles stored in the database's system collections. dbXML 1.0 leveraged CORBA to provide client/server communications. While CORBA made dbXML accessible to many platforms and languages, it also came with its share of headaches. For version 2.0, it was decided that CORBA would no longer be used. dbXML 2.0 utilizes a web services hub called Project Labrador to provide client/server communications. Currently, Labrador only supports REST and the XML-RPC protocol. As a result, dbXML only supports these modes of access. 
A future version of Labrador will support SOAP; when it does, dbXML will automatically inherit this capability. This project has evolved quite a bit since version 1.0 and is very likely to evolve considerably in the coming year. It is already a mature product, with some rather high profile users, and is in a very good position to become the dominant open source XML database, if not one of the more popular XML databases in general..."

[December 09, 2003] "Software AG Increases XML Support Within Its Natural Development Environment for Windows, UNIX and Linux Platforms. Natural version 6 Enables Developers to Access XML Documents Stored in the Company's Tamino XML Server Without Learning an XML-Specific Query Language." - "Software AG, Inc., a pioneer in XML solutions, today announced the availability of Natural version 6 for Windows, UNIX and Linux platforms. Software AG's popular 4GL development environment, which is currently installed at approximately 3,000 organizations worldwide, now enables Natural programmers to access XML documents stored in the company's Tamino XML Server without needing to learn an XML-specific query language. Natural version 6 also allows developers working in Windows to access Natural programs running in UNIX or on a mainframe -- a capability the company calls Single-Point-of-Development. Both capabilities are designed to increase the speed and convenience of using Natural in an open systems environment. The announcement was made at the 2003 XML Conference and Exposition. Thanks to an expanded XML tool kit and new language constructs, users of Natural version 6 can process XML documents with greater ease and flexibility. For example, developers can gain access to Software AG's Tamino XML Server using familiar Natural DML (Data Manipulation Language) statements, meaning that XML documents residing in Tamino can be queried from Natural without the developer having to learn XPath, XQuery or a similar XML-specific protocol. In addition, Websites and HTML pages can now be designed more easily and efficiently in Natural version 6 through the incorporation of XSL (Extensible Stylesheet Language) support and the implementation of a revised Web interface...
The Single-Point-of-Development interface allows a Windows PC running Natural version 6 to access Natural programs running on Unix and mainframes -- thus combining the flexible development potential found on a Windows operating system with the stability and performance of mainframe and Unix. Using Single-Point-of-Development, programs created in Windows can be modified directly on the server platform, thereby addressing versioning and synchronizing issues flowing from the need to save code separately on multiple platforms. Single-Point-of-Development is available not only for the core Natural system, but also for four Natural add-ons: Natural Construct, Natural Engineer, Predict and Mainframe Navigator. These additional Natural engineering tools can therefore also be used via the Single-Point-of-Development interface..."

[September 30, 2003] "What's Next for SQL Server?" By Lisa Vaas. In eWEEK (September 26, 2003). "Users demanded SQL Server bond tighter with Visual Studio .Net, and Microsoft Corp. has since heeded the call, putting into beta testers' hands a version that opens the database up to .Net-compliant languages. The next version of SQL Server, code-named 'Yukon,' was originally slated for a spring 2004 release. That deadline was pushed out to the second half of next year after customers said they expected Yukon to fit hand-in-glove with the next version of .Net, code-named Whidbey. The Yukon beta was released in July to some 2,000 customers and partners. eWEEK recently talked with Microsoft Group Product Manager Tom Rizzo to find out how the .Net integration that customers demanded, along with upcoming features such as native XML and Web Services support, will benefit enterprises." [Rizzo:] "From the data level, we have things like native XML support. You take data from SQL Server, put it into XML format and ship it to anything that understands XML, such as Oracle has some XML support, and [IBM's DB2 database]. XML is ultimate interoperability -- it's an industry-standard format, and it's self-describing. You know both the schema of the data as well as the data itself. You don't lose the context when you pass your data around. We upped the level of XML support in Yukon through a number of things. In 2000 we had XML support but -- it was shredding. (Shredding is the parsing of XML tag components into corresponding relational table columns.) In Yukon the key thing is we have an XML type. Like you have STRING and NUMBERS and all that inside the database, now you can declare with the native data type XML. Although we had XML support in 2000, and many leveraged it and were happy with it, now we have native support... One reason we [moved to a native data type for XML] is to support XQuery.
Also to support XQuery we had to build code so as to combine XML with relational query language. You can take the relational sorts of queries you're used to in the database world, where people select things from tables with filters on that data. You can combine XQuery statements with such relational queries..."
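
The shredding that Rizzo contrasts with a native XML type can be illustrated with a minimal Python sketch. It uses SQLite and the standard library rather than SQL Server, and the table name `order_items` and the sample order document are hypothetical. Once the document is shredded, ordinary relational filters apply, but the document itself no longer exists in the database.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample document; not taken from the article.
ORDER_XML = """<order id="17">
  <item sku="A1" qty="2"/>
  <item sku="B7" qty="5"/>
</order>"""

def shred_order(conn, xml_text):
    """Parse one order document and spread its parts across relational columns."""
    root = ET.fromstring(xml_text)
    order_id = int(root.get("id"))
    rows = [(order_id, item.get("sku"), int(item.get("qty")))
            for item in root.findall("item")]
    conn.executemany("INSERT INTO order_items VALUES (?, ?, ?)", rows)
    return len(rows)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE order_items (order_id INTEGER, sku TEXT, qty INTEGER)")
shredded = shred_order(conn, ORDER_XML)

# A plain relational query over the shredded data.
big_items = conn.execute("SELECT sku FROM order_items WHERE qty > 3").fetchall()
```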

[July 25, 2003] "The Future of XML Documents and Relational Databases. As New Species of XML Documents Are Emerging, Vendors Are Unveiling Increased RDBMS Support for XML." By Jon Udell. In InfoWorld (July 25, 2003). "Having absorbed objects, the RDBMS vendors are now working hard to absorb XML documents. Don't expect a simple rerun of the last movie, though. We've always known that most of the information that runs our businesses resides in the documents we create and exchange, and those documents have rarely been kept in our enterprise databases. Now that XML can represent both the documents that we see and touch -- such as purchase orders -- and the messages that exchange those documents on networks of Web services, it's more critical than ever that our databases can store and manage XML documents. A real summer blockbuster is in the making. No one knows exactly how it will turn out, but we can analyze the story so far and make some educated guesses. The first step in the long journey of SQL/XML hybridization was to publish relational data as XML. BEA Chief Architect Adam Bosworth, who worked on the idea's SQL Server implementation, calls it 'the consensual-hallucination approach -- we all agree to pretend there is a document.' XML publishing was the logical place to start because it's easy to represent a SQL result set in XML and because so many dynamic Web pages are fed by SQL queries. The traditional approach required programmatic access to the result set and programmatic construction of the Web page. The new approach materializes that dynamic Web page in a fully declarative way, using a SQL-to-XML query to produce an XML representation of the data and XSLT to massage the XML into the HTML delivered to the browser. Originally these virtual documents were created using proprietary SQL extensions such as SQL Server's 'FOR XML' clause. There's now an emerging ISO/ANSI standard called SQL/XML, which defines a common approach. 
SQL/XML is supported today by Oracle and DB2. It defines XML-oriented operators that work with the native XML data types available in these products. SQL Server does not yet support an XML data type or the SQL/XML extensions, but Tom Rizzo, SQL Server group product manager at Redmond, Wash.-based Microsoft, says that Yukon, due in 2004, will... Most of the information in an enterprise lives in documents kept in file systems, not in relational databases. There have always been reasons to move those documents into databases -- centralized administration, full-text search -- but in the absence of a way to relate the data in the documents to the data in the database, those reasons weren't compelling. XML cinches the argument. As business documents morph from existing formats to XML -- admittedly a long, slow process that has only just begun -- it becomes possible to correlate the two flavors of data..."
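
The "publish relational data as XML" step described above can be sketched in a few lines of Python. This is an illustration only: it does not use the `FOR XML` clause or the SQL/XML operators, the helper `rows_to_xml` and the table `po` are hypothetical, and it assumes column names are already valid XML element names.

```python
import sqlite3
import xml.etree.ElementTree as ET

def rows_to_xml(cursor, root_tag="rows", row_tag="row"):
    """Wrap each row of an executed query in an element, one child per column."""
    columns = [d[0] for d in cursor.description]
    root = ET.Element(root_tag)
    for record in cursor:
        row = ET.SubElement(root, row_tag)
        for name, value in zip(columns, record):
            # Assumes the column name is a valid XML element name.
            ET.SubElement(row, name).text = "" if value is None else str(value)
    return root

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE po (id INTEGER, customer TEXT)")
conn.executemany("INSERT INTO po VALUES (?, ?)", [(1, "Acme"), (2, "Initech")])

doc = rows_to_xml(conn.execute("SELECT id, customer FROM po ORDER BY id"))
xml_text = ET.tostring(doc, encoding="unicode")
```

The resulting "virtual document" exists only as a query result, which is the sense of Bosworth's remark that everyone agrees to pretend there is a document.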

[July 21, 2003] "XQuery and SQL: Vive la Différence." By Ken North. In DB2 Magazine (Quarter 3, 2003). "Sometimes SQL and XML documents get along fine. Sometimes they don't. A new query language developed by SQL veterans is promising to smooth things over and get everything talking again. It's impossible to discuss the future of the software industry without discussing XML. XML has become so important that SQL is no longer the stock reply to the question, 'What query language is supported by all the major database software companies?' The new kid on the block is XQuery, a language for running queries against XML-tagged documents in files and databases. A specification published by the World Wide Web Consortium (W3C) and developed by veterans of the SQL standards process, XQuery emerged because SQL -- which was designed for querying relational data -- isn't a perfect match for XML documents. Although SQL works quite well for XML data when there's a suitable mapping between SQL tables and XML documents, it isn't a universal solution. Some XML documents don't reside in SQL databases. Some are shredded or decomposed before their content is inserted into an SQL database. Others are stored in native XML format, with no decomposition. And the nature of XML documents themselves poses other challenges for SQL. XML documents are hierarchical or tree-structured data. They're self-describing in that they consist of content and markup (tags that identify the content). In SQL databases, such as DB2, individual rows don't contain column names or types because that information is in the system catalog. The XML model is different. As with SQL, schemas that are external to the content they describe define names and type information. However, it's possible to process XML documents without using schemas. XML documents contain embedded tags that label the content. But unlike SQL, order is important when storing and querying XML documents. 
The nesting and order of elements in a document must be preserved in XML documents. Many queries against documents require positional logic to navigate to the correct node in a document tree. When shredding documents and mapping them to columns, it's necessary to store information about the document structure. Even mapping XML content to SQL columns often requires navigational logic to traverse a document tree. Other requirements for querying XML documents include pattern matching, calculations, expressions, functions, and working with namespaces and schemas... For these and other reasons, the W3C in 1998 convened a workshop to discuss proposals for querying XML and chartered the XML Query Working Group..."
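
The point about document order and positional logic can be shown with a small Python sketch. The sample `recipe` document is hypothetical, and ElementTree implements only a subset of XPath, so this stands in for a full XPath or XQuery engine.

```python
import xml.etree.ElementTree as ET

# Hypothetical document in which sibling order carries meaning.
RECIPE = ET.fromstring(
    "<recipe><step>Boil water</step><step>Add pasta</step><step>Drain</step></recipe>"
)

# Positional predicates are 1-based, as in XPath.
second = RECIPE.find("step[2]").text
last = RECIPE.find("step[last()]").text

# A shredded copy must store the position explicitly, because rows in a
# relational table have no inherent order.
shredded = [(pos, step.text)
            for pos, step in enumerate(RECIPE.findall("step"), start=1)]
```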

[July 14, 2003] "Sleepycat Boosts Database." By Lisa Vaas. In eWEEK (July 14, 2003). "Sleepycat Software Inc. last week tossed its open-source database into the XML ring with the release of code for Berkeley DB XML, a native XML database that's built on top of its open-source embedded database, Berkeley DB. Berkeley DB XML offers a single data repository for storage and retrieval of native XML and non-XML data, avoiding the XML conversion overhead that occurs with relational databases that have been retrofit with XML adapters, officials said. The database supports XPath 1.0, a World Wide Web Consortium standard language for addressing parts of an XML document. It offers flexible indexing, giving application developers the ability to control query performance and tune data retrieval... Having Berkeley DB as the base engine for the XML offering means that the new product will inherit advanced database features such as concurrent access, transactions, recovery and replication, officials said. It will scale up to 256 terabytes for the database and up to 4GB for individual keys and values... The release of the open source code heralds the end of a 12-month beta program that comprised some 5,000 companies, many of them huge names such as 3M Co., Amazon.com Inc., BEA Systems Inc., Lucent Technologies Inc.'s Bell Labs, The Boeing Co., Cisco Systems Inc., Hewlett-Packard Co., IBM and NEC America Inc. Those big names are testimony to the traction XML is gaining in the enterprise, said Sleepycat officials, in Lincoln, Mass. Sleepycat's software is sold using a typical open-source scheme: free to download and use or fee-based to ship a product whose source code is withheld. The company has 200 paying customers, according to officials..." See details in the news story "Sleepycat Software Releases Berkeley DB XML Native XML Database."

[May 27, 2003] IBM Announces General Availability of DB2 Information Integrator V8.1. IBM has announced the general availability of DB2 Information Integrator V8.1 which "provides the foundation for a strategic information integration framework that helps customers to access, manipulate, and integrate diverse and distributed information in real time. The new product enables businesses to abstract a common data model across data and content sources and to access and manipulate them as though they were a single source. IBM's DB2 software helps businesses increase efficiencies by enabling them to centrally manage data, text, images, photos, video and audio files stored in a variety of databases. The new IBM product is most appropriate for projects whose primary data sources are relational data augmented by other XML, Web, or content sources." Core components in the DB2 Information Integrator include a Federated Data Server, a Replication Server for Mixed Relational Databases, and a Local Database Server. The federated data server allows administrators to use integrated graphical tools to configure data source access and define integrated views across diverse and distributed data; XML schema can be automatically mapped into relational schema. "DB2 Information Integrator V8.1 supports the predominantly read-access scenarios common to enterprise-wide reporting, knowledge management, business intelligence, portal infrastructures, and customer relationship management."

[May 20, 2003] "The Center of the Universe." By Ken North. In Intelligent Enterprise Volume 6, Number 9 (May 31, 2003). ['XML, Web services, analytics, and other hot technologies have the leading relational DBMS providers working overtime to remain the best choice for managing all of your data. Here's a look at what IBM, Microsoft, and Oracle are doing.'] "Whatever form software takes in the next decade, databases will continue as the primary tool for managing data. DBMSs from rivals IBM, Microsoft, and Oracle will provide persistent data management for Web services, embedded applications, Web stores, grid services, and other software. But SQL DBMS products will increasingly be judged on how well they support traditional tasks (such as transaction processing) while evolving to provide new capabilities (such as integrated business analytics). The latest releases of data management software from the big three vendors unite SQL with multidimensional and document-centric (XML) data and grid computing. Whether an organization follows a best-of-breed approach or taps a single vendor to build an IT infrastructure, problems can arise with interoperability, data aggregation, and data and application integration. That's why XML, XML-based messaging, XML-enabled databases, and Web services have become increasingly important. But XML is only one of the fields on which the database software giants are competing. Although there are now fewer SQL database vendors than a decade ago, competition remains fierce among IBM, Oracle, and Microsoft. Each company tries to gain an edge over the others by complementing their database platforms with broad-spectrum software offerings such as vertical market applications and developer tools. The DBMS products from each of these vendors provide parallel processing, extensible servers, online analytic processing (OLAP), tight integration with messaging software, and support for XML and Web services. 
The products diverge when it comes to programming database server plug-ins, querying multidimensional data sets, persisting message queues, orchestrating the flow of Web services, and processing audio, video, and other rich data types. This overview of the different strategies vendors are following sheds light on their plans for developing technologies to extend the SQL DBMS to handle business intelligence (BI), XML, Web services, and grid requirements..." See also Ken North's interviews with: [1] Rob High and Nelson Mattos of IBM; [2] Andrew Mendelsohn of Oracle; [3] Jim Gray and Michael Rys of Microsoft.

[May 12, 2003] "DB Updates Ease Web Services." By Lisa Vaas. In eWEEK (May 12, 2003). "Best-of-breed XML database developers Ipedo Inc., Sonic Software Corp. and Sleepycat Software Inc. are enhancing their respective native XML database software to make it easier for enterprises to use and manage XML data in Web services environments... Ipedo, for example, late this month will release Version 3.3 of its XML Information Hub, which boasts three new components. The first, called content conversion, automatically converts PDF, Microsoft Corp. Word and other non-XML documents into XML. The auto-organization component organizes, merges and transforms inbound content according to business rules, said Ipedo officials, in Redwood City, Calif. The third new piece, a universal XML Query engine, provides local and remote content and data source searching and updating using the XQuery standard. Some of the Ipedo upgrade's new features are compelling for user Thor Anderson, who is manager of program development at Collegis Inc. Anderson is working with Texas A&M University to take online digital library resources and put them into a repository with additional, educationally specific metadata. 'The more that [XQuery] engine is improved and sped up and usable, that's important,' said Anderson... Separately, Sonic late this summer will roll out a suite of integration products, called Sonic Business Integration Suite, that includes Sonic XML Server, a renamed and enhanced version of the Excelon XIS (Extensible Information Server) native XML database that the company acquired last fall. Enhancements to the XML database include a Web services-style interface laid over the XML processing and storage engine within XIS, said Sonic officials, in Bedford, Mass... Sleepycat next month will release Version 4.2 of Berkeley DB, its open-source embedded database..." See: "Ipedo Enhancements Boost Award-Winning XML Information Hub. 
Content Conversion, Auto-Organization, Universal XQuery, Web Services Views Reduce Cost and Complexity of Information Delivery."

[May 12, 2003] "Berkeley DB XML: An Embedded XML Database." By Paul Ford. From XML.com (May 07, 2003). ['Paul Ford introduces the embeddable Berkeley DB XML database. For many years the open source Berkeley DB libraries have been a popular choice for embedded database applications. It has been so ubiquitously used that chances are, you rely on some software product that embeds Berkeley DB. It is therefore pretty exciting when SleepyCat, the maintainers of Berkeley DB, announce that they will be releasing an XML-aware version of their database software.'] "Berkeley DB XML is an open source, embedded XML database created by Sleepycat Software. It's built on top of Berkeley DB, a 'key-value' database which provides record storage and transaction management. Unlike relational databases, which store data in relational tables, Berkeley DB XML is designed to store arbitrary trees of XML data. These can then be matched and retrieved, either as complete documents or as fragments, via the XML query language XPath. Berkeley DB XML is written in C++; APIs for Berkeley DB XML exist for C/C++, Java, Perl, Python, and TCL, and more language interfaces are currently under development... An XML database has several advantages over key-value, relational, and object-oriented databases: (1) XML data is dropped straight into the database; it does not need to be manipulated or extracted from a document in order to be stored. (2) When inserted into the database, most (in Berkeley DB XML, all) aspects of an XML document, including white space, are maintained exactly. (3) Queries return XML documents or fragments, which means that the hierarchical structure of XML information is maintained... Berkeley DB XML, even in beta, is a promising solution for XML storage and retrieval. According to [John] Merrells, it is being evaluated by 'several serious commercial enterprises.'
Based on Berkeley DB, it has a well-proven foundation for data storage, and SleepyCat's prior releases have proven them to be a reliable provider of well-documented open source tools for data storage. SleepyCat allows for commercial licensing of their open source tools, which may make this solution attractive for corporations that are skittish about open source. It is also worth noting that Berkeley DB XML users essentially get Berkeley DB 'for free' with the product. In other words, it's easy to mix and match regular DB data sources with XML data sources. This combination may provide a strong alternative to relational and object-oriented databases... Since any data storage technology requires a significant investment in time and effort, this strong level of community and corporate support is encouraging; Berkeley DB XML, currently in its infancy, seems likely to be around for a long time, and by offering a standard embedded interface it may provide a very useful tool for programmers in need of robust data storage who want to avoid the overhead of a relational database. The tool has some growing to do, but even in its current form many programmers will find it a useful tool with a logical, powerful interface..."
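
The three advantages listed in the article (documents stored as-is, content preserved exactly, queries returning fragments) can be mimicked with a toy in-memory container. This sketch does not use the Berkeley DB XML API; the class `TinyXmlStore` and its methods are hypothetical, and ElementTree paths cover only a subset of XPath.

```python
import xml.etree.ElementTree as ET

class TinyXmlStore:
    """Toy in-memory container: documents are kept whole, keyed by name."""

    def __init__(self):
        self._docs = {}

    def put(self, name, xml_text):
        ET.fromstring(xml_text)      # reject text that is not well-formed
        self._docs[name] = xml_text  # store the original text unchanged

    def get(self, name):
        return self._docs[name]

    def query(self, path):
        """Return (document name, matching fragment) pairs for a path expression."""
        hits = []
        for name, text in self._docs.items():
            for node in ET.fromstring(text).findall(path):
                hits.append((name, ET.tostring(node, encoding="unicode")))
        return hits

store = TinyXmlStore()
store.put("a.xml", "<book><title>XML  Basics</title><year>2003</year></book>")
store.put("b.xml", "<book><title>Databases</title></book>")
titles = store.query("./title")
```

A real native XML database would index the documents instead of re-parsing them on every query; the sketch only shows the storage and retrieval model.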

[May 01, 2003] "A Normal Form for XML Documents." By Li-Yan Yuan (Professor, Department of Computing Science, University of Alberta, Canada). 40 pages. Reading reference for the course "Modern Database Management Systems" (Winter Term, 2003); "this course covers research topics in advanced database management systems as well as emerging database technologies, with emphasis on XML data and XML support for object-oriented database management systems... Given a DTD, and a set F of FDs, ( D, F ) is in XML normal form (XNF) if and only if for every nontrivial FD of the form S --> p.@l or S --> p.S, it is the case that S --> p is implied by F." The presentation references the paper "A Normal Form for XML Documents", by M. Arenas and L. Libkin, published in the Proceedings of ACM PODS02. [cache]

[May 01, 2003] "An Information-Theoretic Approach to Normal Forms for Relational and XML Data." By Marcelo Arenas and Leonid Libkin (University of Toronto). Paper for presentation at the 22nd ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems (PODS 2003), San Diego, USA, [June 9-12] 2003. "Normalization as a way of producing good database designs is a well understood topic. However, the same problem of distinguishing well designed databases from poorly designed ones arises in other data models, in particular, XML. While in the relational world the criteria for being well designed are usually very intuitive and clear to state, they become more obscure when one moves to more complex data models. Our goal is to provide a set of tools for testing when a condition on a database design, specified by a normal form, corresponds to a good design. We use techniques of information theory, and define a measure of information content of elements in a database with respect to a set of constraints. We first test this measure in the relational context, providing information theoretic justification for familiar normal forms such as BCNF, 4NF, PJ/NF, 5NFR, DK/NF. We then show that the same measure applies in the XML context, which gives us a characterization of a recently introduced XML normal form called XNF. Finally, we look at information theoretic criteria for justifying normalization algorithms. Several [other] papers attempted a more formal evaluation of normal forms, by relating it to the elimination of update anomalies. Another criterion is the existence of algorithms that produce good designs: for example, we know that every database scheme can be losslessly decomposed into one in BCNF, but some constraints may be lost along the way... Our [research] goal was to find criteria for good data design, based on the intrinsic properties of a data model rather than tools built on top of it, such as query and update languages. 
We were motivated by the justification of normal forms for XML, where usual criteria based on update anomalies or existence of lossless decompositions are not applicable until we have standard and universally acceptable query and update languages. We proposed to use techniques from information theory, and measure the information content of elements in a database with respect to a set of constraints. We tested this approach in the relational case and showed that it works: that is, it characterizes the familiar normal forms such as BCNF and 4NF as precisely those corresponding to good designs, and justifies others, more complicated ones, involving join dependencies. We then showed that the approach straightforwardly extends to the XML setting, and for the case of constraints given by functional dependencies, equates the normal form XNF of ["A Normal Form for XML Documents", by M. Arenas and L. Libkin, published in the Proceedings of ACM PODS02] with good designs. In general, the approach is very robust: although we do not show it here due to space limitations, it can be easily adapted to the nested relational model, where it justifies a normal form NNF..." [cache]

[April 29, 2003] "Bluestream Upgrades XML Database." By Lisa Vaas. In eWEEK (April 29, 2003). "Bluestream Database Software Corp. has released an upgrade to its native XML database that features smoother handling of collaborative content management with XML and binary data. XStreamDB 3.0 has a new resource manager that enables content management via integration with Web or print authoring and publishing software. It now supports Corel XMetaL for XML content authoring using that software's word processor-like view of content. The new version also improves support for Altova XMLSpy for editing data-centric XML documents... Other new features include faster full-text search and indexing, built-in WebDAV server, event triggers, automated backup and the new XStreamDB 3.0 Server Console application for easier server administration. XStreamDB 3.0 also now supports binary document types with MIME-type attributes in addition to native XML document support. It has also acquired derivation by extension and attribute groups, adding to its existing W3C schemas support. The database now has full-text search that features LIKE wildcard matching, found word marking, phrase search and proximity search. XStream 3.0 is a cross-platform database server that runs on Windows NT/2000/XP, Solaris, Mac OS X or Linux. It features a choice of Java API, WebDAV or the XStreamDB 3.0 Explorer application for access to documents and data. The database also supports XQuery with update extensions, full text search, shared resource management and XML schemas with automatic validation. XStreamDB is compliant with the ACID (Atomicity Consistency Isolation Durability) standard for open-source database systems. That standard indicates that the database supports "all or nothing" transactions -- those that either work to their conclusion or refrain from changing data. XStreamDB is a pure Java technology-based server and requires Java 1.3 or 1.4 Runtime Environment..." 
See the announcement "Bluestream Releases XStreamDB 3.0 Native XML Database. Major Upgrade Adds Features for Collaborative Content Management."
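
The "all or nothing" transaction behavior mentioned in the article can be demonstrated with a short Python sketch. It uses SQLite from the standard library, not XStreamDB, and the table `docs` and the helper `store_batch` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (name TEXT PRIMARY KEY, body TEXT)")
conn.commit()

def store_batch(conn, docs):
    """Insert every document or none of them."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.executemany("INSERT INTO docs VALUES (?, ?)", docs)
        return True
    except sqlite3.IntegrityError:
        return False

ok = store_batch(conn, [("a.xml", "<a/>"), ("b.xml", "<b/>")])
# The second row repeats a primary key, so the whole batch is rolled back,
# including the otherwise valid "c.xml" row.
failed = store_batch(conn, [("c.xml", "<c/>"), ("a.xml", "<dup/>")])
count = conn.execute("SELECT COUNT(*) FROM docs").fetchone()[0]
```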

[April 05, 2003] Book announcement: XML Data Management: Native XML and XML-Enabled Database Systems, by Akmal Chaudhri, Awais Rashid, and Roberto Zicari. Addison Wesley, 2003. ISBN: 0201844524. 688 pages. The book is divided into five parts each containing a coherent and closely related set of chapters; these are self-contained and can be read in any order: Introduction; Native XML Databases; XML and Relational Databases; Applications of XML; Performance and Benchmarks. Topics covered include: (1) The power of good grammar and style in modeling information to alleviate the need for redundant domain knowledge; (2) Tamino's XML storage, indexing, querying, and data access features; (3) The features and APIs of open source eXist; (4) Berkeley DB XML's ability to store XML documents natively; (5) IBM's DB2 Universal Database and its support for XML applications; (6) Xperanto's method of addressing information integration requirements; (7) Oracle's XMLType for managing document centric XML documents; (8) Microsoft SQL Server 2000's support for exporting and importing XML data; (9) A generic architecture for storing XML documents in a relational database; (10) XOO7, XMach-1, XMark, and other benchmarks for evaluating XML database performance. The Preface and Chapter 1 ("Information Modeling with XML") are available online. See also the online Table of Contents.

[March 03, 2003] "ISO-ANSI Working Draft XML-Related Specifications (SQL/XML)." Draft text for: Information technology -- Database languages -- SQL -- Part 14: XML-Related Specifications (SQL/XML). // Technologies de l'information -- Langages de base de donnée -- SQL -- Partie 14: «Specifications à XML» (SQL/XML). Edited by Jim Melton. ISO/ANSI WD Reference: WG3:DRS-020, H2-2002-365. August, 2002. ISO Reference: ISO/IEC JTC 1/SC 32/WG 3. Date: 2002-08-09. ISO/IEC 9075-14:200x(E). Produced by ISO (International Organization for Standardization) and ANSI (American National Standards Institute). ISO/IEC JTC 1/SC 32/WG 3; ANSI TC NCITS H2. 154 pages. "This part of ISO/IEC 9075 defines ways in which Database Language SQL can be used in conjunction with XML. This standard defines mappings from SQL to XML, and from XML to SQL. The mappings from SQL to XML include: (1) Mapping SQL character sets to XML character sets; (2) Mapping SQL <identifier>s to XML Names; (3) Mapping SQL data types (as used in SQL-schemas to define SQL-schema objects such as columns) to XML Schema data types; (4) Mapping SQL data values to XML data values; (5) Mapping an SQL table to an XML document and an XML Schema document; (6) Mapping an SQL schema to an XML document and an XML Schema document; (7) Mapping an SQL catalog to an XML document and an XML Schema document. The mappings from XML to SQL include: [1] Mapping Unicode to SQL character sets; [2] Mapping XML Names to SQL <identifier>s..." For an overview, see "Standards: SQL/XML is Making Good Progress," by Andrew Eisenberg (IBM) and Jim Melton (Oracle Corp), ACM SIGMOD Record Volume 31, Issue 2 (June 2002). The update describes new work as of June 2002: "in three parts. The first part provides a mapping from a single table, all tables in a schema, or all tables in a catalog to an XML document. The second of these parts includes the creation of an XML data type in SQL and adds functions that create values of this new type.
These functions allow a user to produce XML from existing SQL data. Finally, the 'infrastructure' work that we described in our previous article included the mapping of SQL's predefined data types to XML Schema data types. This mapping has been extended to include the mapping of domains, distinct types, row types, arrays, and multisets..." [cache]
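
The mapping between SQL identifiers and XML Names can be sketched as follows. This is a simplified, ASCII-only illustration of the `_xHHHH_` escaping style associated with SQL/XML, not the normative algorithm: the real rules cover the full Unicode name-character classes and further special cases, and the function names here are hypothetical.

```python
import re

def sql_identifier_to_xml_name(identifier):
    """Simplified, ASCII-only escaping of characters that cannot appear in an XML Name."""
    out = []
    for i, ch in enumerate(identifier):
        is_start_ok = ch.isascii() and (ch.isalpha() or ch == "_")
        is_name_ok = is_start_ok or (ch.isascii() and (ch.isdigit() or ch in ".-"))
        if ch == "_" and identifier[i + 1:i + 2] == "x":
            out.append("_x005F_")            # keep a literal "_x" unambiguous
        elif (i == 0 and is_start_ok) or (i > 0 and is_name_ok):
            out.append(ch)
        else:
            out.append("_x%04X_" % ord(ch))  # e.g. a space becomes _x0020_
    return "".join(out)

def xml_name_to_sql_identifier(name):
    """Reverse the escaping above by decoding each _xHHHH_ sequence."""
    return re.sub(r"_x([0-9A-F]{4})_", lambda m: chr(int(m.group(1), 16)), name)

escaped = sql_identifier_to_xml_name("employee name")
restored = xml_name_to_sql_identifier(escaped)
```

Here `escaped` is `employee_x0020_name`, and decoding it restores the original identifier.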

[March 03, 2003] "Special Characters, Database Mappings." By John E. Simpson. From XML.com (February 26, 2003). ['John E. Simpson discusses XML special characters and SQLX.'] "... Yes, an XSLT processor, just like any other application which expects legitimate XML as input, will choke on ampersands, less-than symbols, and so on instead of their entity-reference forms. If you use a GUI-based XHTML or HTML editor, you may have noticed that you're free to enter any old character into a document, even the 'dangerous' markup-significant ones. What's more, the editor even shows you a literal ampersand, instead of something horrible like &amp;. If you examine the raw source behind the GUI cosmetics, though, you'll find entity references scattered around even though you well know you didn't key them in yourself. The editor is in effect mediating between the markup- and non-markup-based worlds in the same way that your preprocessor would need to do... In a comment posted shortly after [the last month] column's publication, technical writer, editor, and Oracle database guru Jonathan Gennick directed me to an emerging ISO/ANSI standard called SQLX. Billed as the place 'where SQL meets XML,' SQLX is a joint effort by representatives of IBM, Oracle, Sybase, Microsoft, and Northrop-Grumman to establish a standard for the 'ways in which Database Language SQL can be used in conjunction with XML.' As Gennick says, 'no need to reinvent the wheel' by proposing some alternative method of mapping identifiers..." The SQLX Workgroup website references an August 2002 ISO-ANSI Working Draft for XML-Related Specifications (SQL/XML).
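
The mediation between raw text and markup that Simpson describes can be illustrated with Python's standard library. The sample string is hypothetical; `escape` and `unescape` come from `xml.sax.saxutils`.

```python
from xml.sax.saxutils import escape, unescape
import xml.etree.ElementTree as ET

raw = "Fish & Chips <fresh>"

# Markup-significant characters become entity references.
escaped = escape(raw)

# The escaped text can be embedded in a document and parsed; the raw
# string would make the fragment ill-formed.
fragment = "<dish>%s</dish>" % escaped
parsed = ET.fromstring(fragment)

# The parser hands back the original characters.
recovered = parsed.text
```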

[February 12, 2003] "XML Data Binding." By Eldon Metz and Allen Brookes. In Dr. Dobb's Journal #346 Volume 28, Issue 3 (March 2003), pages 26-36. Special Issue on XML Development, edited by Jonathan Erickson. ['XML data binding utilities dramatically simplify the task of writing XML-enabled applications by automatically creating a data binding for you.'] "An XML binding is programming language code that represents XML data, thereby ensuring that the documents conform to their schema. The generated code enables the transfer of XML data to/from instances of the generated classes. While XML data-binding tools may not always be useful when writing code to process XML, they usually do save time in coding, testing, and maintenance..." See also Ronald Bourret's "XML Data Binding Resources" and the program listings.
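
A data binding of the kind the article describes might look like the following hand-written Python sketch. A binding tool would generate such a class from a schema; here the `Book` class and its element layout are hypothetical and written by hand to show the round trip between XML and typed objects.

```python
from dataclasses import dataclass
import xml.etree.ElementTree as ET

@dataclass
class Book:
    """Typed view of a hypothetical <book> element."""
    isbn: str
    title: str
    pages: int

    @classmethod
    def from_xml(cls, text):
        node = ET.fromstring(text)
        return cls(isbn=node.get("isbn"),
                   title=node.findtext("title"),
                   pages=int(node.findtext("pages")))

    def to_xml(self):
        node = ET.Element("book", isbn=self.isbn)
        ET.SubElement(node, "title").text = self.title
        ET.SubElement(node, "pages").text = str(self.pages)
        return ET.tostring(node, encoding="unicode")

book = Book.from_xml(
    '<book isbn="0201844524"><title>XML Data Management</title><pages>688</pages></book>'
)
round_trip = Book.from_xml(book.to_xml())
```

Application code then works with `book.pages` as an integer rather than navigating the document tree, which is the time saving the authors point to.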

[January 17, 2003] "Driving ODBC, JDBC Drivers to XML Web Services." By Vance McCarthy. In Enterprise Developer News (January 13, 2003). "It will get much easier in 2003 for developers using ODBC and JDBC to upgrade to XML-based web services, at least according to DataDirect Technologies, one of the leading providers of database driver technologies to software providers and end users. DataDirect, a long-time provider of OEM and end-user driver tools for both the ODBC and JDBC worlds, is now bearing down on the idea of using XML technologies to bring its driver-based technologies into web services. [Said] Brian Reed, DataDirect's vice president of market intelligence: 'SQL is for data at rest. XML is for data in motion. There is nothing more optimized than a relational database, if you're talking about stored data. But XML makes data easier to share... XML and SOA [Service-Oriented Architecture] are bringing the ability to standardize middleware, and allow the application to more easily move into the infrastructure -- and not just be inside a silo. This creates a dynamic infrastructure that will make it easier to change new things and still keep data interoperable'... Based on this picture, Reed said DataDirect is aggressively working with leading web services providers -- including Microsoft, Oracle, IBM, Sybase and others -- to migrate the ODBC/JDBC world into a new world of XML-driven loosely-coupled connectivity... DataDirect Connect for .NET 1.1 adds support for distributed transactions on Oracle and Sybase databases, based on Microsoft's Distributed Transaction Coordinator (MS DTC) as the transaction manager. Using MS DTC enables developers to implement 'serviced components' that require distributed transaction support and use ADO.NET data providers...
In addition, MS DTC can be used to (1) update multiple databases and files from a single application, (2) update geographically distributed databases, and (3) update databases that have been partitioned for scalability. MS DTC uses a two-phase commit protocol to ensure that all the resource managers commit the transaction or all abort it, to ensure data integrity... DataDirect's jXTransformer is DataDirect's XML software component for transforming data between relational and XML formats in Java programs. Rather than require developers and database professionals to learn database-specific tools, jXTransformer uses a language that is very similar to XQuery, the new draft SQL/XML standard, to enable developers to create XML from relational data or updating relational data from XML input. The goal, Reed said, is to let developers code once using a simple component, and reuse it across multiple databases without learning complex database-specific XML extensions. jXTransformer provides a Java API and a simple language, and a GUI tool for writing queries that will map and transform data between relational and XML formats. jXTransformer uses an API for data access and does not require any database changes, letting developers use existing stored procedures, reports, and queries without changing anything in the database. The tool reads data from relational databases and transforms data into any desired XML structure, and creates simple or complex hierarchical XML documents and XML document fragments..."
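
The two-phase commit protocol mentioned above can be sketched generically in Python. This is a toy model of the protocol, not the MS DTC API; the `ResourceManager` class and `two_phase_commit` function are hypothetical, and a real coordinator would also log decisions durably and handle timeouts.

```python
class ResourceManager:
    """Toy participant; `can_commit` simulates its vote in phase one."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "idle"

    def prepare(self):
        self.state = "prepared" if self.can_commit else "refused"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"

def two_phase_commit(participants):
    """Phase 1: collect votes. Phase 2: commit only if every vote was yes."""
    votes = [p.prepare() for p in participants]
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.abort()
    return False

good = [ResourceManager("oracle"), ResourceManager("sybase")]
mixed = [ResourceManager("oracle"), ResourceManager("sybase", can_commit=False)]
committed = two_phase_commit(good)
rolled_back = not two_phase_commit(mixed)
```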

[January 17, 2003] "IBM Preparing Xperanto Deliverables." By Paul Krill. In InfoWorld (January 16, 2003). "IBM in the first half of this year is pledging to offer the first products based on its Xperanto technology for integrating multiple data points, as part of IBM's OnDemand initiative for leveraging existing technology assets. Xperanto represents a significant extension of IBM's DB2 database technology, allowing for federated access to data, regardless of whether the data resides in DB2 or in data management systems from vendors such as Oracle, Sybase, and Microsoft, said Nelson Mattos... IBM believes there is 'a major shift happening in the data management industry, which is moving away from the notion of a data management system that is only managing information that is physically stored in the repository toward a data management infrastructure that is managing, integrating, accessing, and analyzing all the information in the enterprise,' he added. IBM differs from Oracle in that Oracle favors a centralized approach to data management, Mattos contended. 'Oracle encourages customers to solve the integration problem by centralizing or moving all the data into the Oracle system, and that does not allow customers to obtain information on demand because if I'm going to centralize, I need to know what information I need to move into the Oracle system,' which is not always doable these days, Mattos said. Oracle officials, however, said IBM with Xperanto is not offering anything new as far as data federation because both IBM and Oracle already have federated data management capabilities..." See IBM Research's Xperanto project and references in "IBM: Xperanto Rollout To Start In Early 2003. Long-Promised Information Integrator on the Horizon."

[November 15, 2002] "Normalizing XML, Part 1." By Will Provost. From XML.com. November 13, 2002. ['Will Provost's Schema Clinic series on XML.com has so far taken an object-oriented view of W3C XML Schema design. This month, Will has written the first of a two-part series that examines the relational aspects of schema design. The series examines guidelines that achieve the goal of normalization -- the principles guiding database design -- using the mechanisms provided by W3C XML Schema.'] "The goal is to see what relational concepts we can usefully apply to XML. Can the normal forms that guide database design be applied meaningfully to XML document design? Note that we're not talking about mapping relational data to XML. Instead, we assume that XML is the native language for data expression, and attempt to apply the concepts of normalization to schema design. The discussion is organized loosely around the progression of normal forms, first to fifth. As we'll see, these forms won't apply precisely to XML, but we can adhere to the law's spirit, if not its letter. It's possible to develop guidelines for designing W3C XML Schema (WXS) that achieve the goals of normalization: (1) Eliminate ambiguity in data expression; (2) Minimize redundancy -- some would say, 'eliminate all redundancy'; (3) Facilitate preservation of data consistency; (4) Allow for rational maintenance of data. In this first of two parts, we'll consider the first through third normal forms, and observe that while there are important differences between the XML and relational models, much of the thinking that commonly goes into RDB design can be applied to WXS design as well. ... the key concept of reducing redundancy through key association is alive and well in W3C XML Schema design. While I'd love to finish on this bright note, I must report that there are devils inhabiting the details. 
In part two of this article, I'll point them out and discuss the implications for WXS design, as well as addressing the subtler fourth and fifth normal forms..."
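
[Editorial illustration, not from the article.] The idea of "reducing redundancy through key association" can be shown with a small document in which orders refer to a customer by key instead of repeating the customer's details. The element and attribute names below are invented for illustration, and the check is written in plain Python rather than with W3C XML Schema identity constraints, which is what the article itself discusses.

```python
import xml.etree.ElementTree as ET

# Customer details appear once; each order carries only the customer's key.
NORMALIZED = """
<sales>
  <customers>
    <customer id="c1"><name>Acme</name><city>Boston</city></customer>
  </customers>
  <orders>
    <order number="1001" customer="c1"/>
    <order number="1002" customer="c1"/>
  </orders>
</sales>
"""


def dangling_references(xml_text):
    """Return order numbers whose customer attribute matches no customer id."""
    root = ET.fromstring(xml_text)
    known = {c.get("id") for c in root.iter("customer")}
    return [o.get("number") for o in root.iter("order")
            if o.get("customer") not in known]
```

An empty result means every key reference resolves, which is roughly the consistency a key/keyref pair is meant to enforce.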

[November 05, 2002] "Look at Storage Issues Before You Leap Into XML." By Kevin Dick (Kevin Dick Associates). In Application Development Trends Volume 9, Number 11 (November 2002), pages 45-49. Adapted from Chapter 5 of XML: A Manager's Guide, Second Edition, by Kevin Dick, Addison-Wesley. ['Organizations can avoid missteps by first selecting the right storage model for a project: a DBMS, a content management system or a native XML store.'] "XML documents are data that can be either at rest or in transit. Therefore, enterprises that want to successfully deploy XML must figure out how to manage XML in both of these states. For XML at rest, developers must first decide on the type of store to use. For XML in transit, they must first decide on the server infrastructure to deploy. It is not uncommon for projects using XML to stall while figuring out how to address the storage issue. The confusion stems from the fact that there are three vastly different choices: a database management system (DBMS), a content management system (CMS), or a native XML store. The appropriate choice depends on the characteristics of your XML data. What if you use XML as a data interchange format? In this case, a source application encodes data from its own native format as XML, and a target application decodes the XML data into its own native format. XML is an intermediate data representation. Both the source and target applications already have persistent storage mechanisms, almost certainly DBMSs of one sort or another. There is really no need to store the XML documents persistently themselves, except perhaps for logging purposes..."

[September 25, 2002] "Introduction to Xindice. An Open Source Native XML Database System." By Arun Gaikwad (Independent Software Consultant). From IBM developerWorks, Web architecture, XML zone. September 2002. ['This article is an introduction to an Open Source Native XML Database System, called Xindice (pronounced zeen-dea-chay). It is also an introduction to Native XML Database concepts.'] "Xindice is an Open Source Native XML Database System. In this article, you will learn how to: (1) Install Xindice; (2) Create and delete collections; (3) Insert and delete documents into these collections; (4) Use XQuery to query these documents. You can perform these operations on the command line or embed them in Java programs using the Java API. You will also learn to use the Java API to write JDBC style programs to communicate with Xindice. An XML Database System is something which you may think is unnecessary but once you start using it, you wonder how you would survive without it. I say this from personal experience. When I first heard of Native XML Database Systems about two years ago, I completely ignored them thinking that it was just hype. At that time, I was involved in the development of a project for a large financial brokerage company. We were using XML to send and receive financial feed data. It was necessary to save the feed data in some kind of permanent storage. As a Relational Database programmer, my first choice was to use a Relational Database System to save these XML documents. I decided to use CLOBs (Character Large Objects) with a modern RDBMS to save these documents. Since the RDBMS supported a Java API to insert and retrieve CLOBs, this was a very easy task. As our project evolved, I found that this approach had a major drawback. This was nothing but DIDO (Document In, Document Out). Retrieving partial documents or nodes from a DOM tree was not possible. 
I would have found a tool which saved the XML documents, performed database-like queries on nodes, and retrieved partial or full documents very useful. This is when NXDs came into the picture. If I had to do this project all over again, I would definitely use an NXD. If you need simple DIDO functionality, you might want to use an RDBMS to save your documents, but for extended functionality such as Query and Update you should consider an NXD. Sometimes people try to save XML documents into Normalized Relational Database tables by mapping the document nodes into Relational format. This is not always easy. It is relatively easy to build an XML document from RDBMS tables, but not to store them because XML documents are hierarchical and almost free format..." Also available in PDF format.
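
[Editorial illustration, not from the article.] The "Document In, Document Out" limitation described above can be reproduced with any relational store that keeps XML as opaque text. The sketch below uses SQLite and a TEXT column as a stand-in for the CLOB approach; the table name, the sample feed, and the function price_of are assumptions made for the example, and it does not use Xindice.

```python
import sqlite3
import xml.etree.ElementTree as ET

FEED = (
    "<feed>"
    "<quote symbol='IBM'><price>81.5</price></quote>"
    "<quote symbol='SUNW'><price>3.2</price></quote>"
    "</feed>"
)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE feeds (id INTEGER PRIMARY KEY, doc TEXT)")
conn.execute("INSERT INTO feeds (doc) VALUES (?)", (FEED,))


def price_of(conn, symbol):
    """Fetch whole documents, then parse each one to reach a single node."""
    for (doc,) in conn.execute("SELECT doc FROM feeds"):
        root = ET.fromstring(doc)  # the database cannot look inside the text
        node = root.find(f"./quote[@symbol='{symbol}']/price")
        if node is not None:
            return float(node.text)
    return None
```

Every lookup pulls and parses complete documents in the application; a native XML database moves the node-level query into the store itself.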

[September 16, 2002] "Tame the Information Tangle: XML Data Management Systems." By Paul Sholtz. In New Architect Magazine Volume 7, Issue 10 (October 2002), pages 36-40. "Encoding information in XML and exposing it on the Web will help overcome these hurdles and enable fine-tuned, database-like queries on a global scale. Of course, if all the world's data is to be encoded in XML, we'll need more efficient ways to store and manage large volumes of XML data. To address that need, a new breed of document storage and management systems has appeared that's been specially optimized for publishing XML documents on the Web... If creating and maintaining relational data mappings seems like too much work for the scope of your XML application, one attractive alternative is to use a native XML database (NXD). The concept of a native XML database was first introduced by Software AG during the marketing campaign for its Tamino product line. Since then, the term has come into common usage among other companies developing similar products. NXDs are optimized for the storage and management of XML documents. Like other modern data management systems, they provide support for transactions, security, concurrent access, and query languages. Formally, a native XML database can be defined as a data management system that exhibits the following characteristics: (1) XML documents are the fundamental unit of logical storage in the system (similar to the way in which rows in a table are the fundamental unit of logical storage in a relational database system). (2) The system defines a logical model for XML documents, and stores and retrieves documents according to that model. At the very least, the model must include support for elements, attributes, PCDATA, and document order. Some examples of logical models that meet these requirements include the XPath data model and the XML InfoSet. (3) The system is independent of any underlying physical storage model. 
For example, it could be implemented using relational, hierarchical, object-oriented, or proprietary storage formats. NXDs are often a good choice for storing document-centric XML information. For example, NXDs support XML query languages that let you perform highly specialized queries like "find all documents where the second paragraph contains an italicized word." Most NXDs provide other powerful and sophisticated text-searching features, such as thesaurus support, word stemming (for matching all forms of a word: swim, swam, and swimming, for example), and proximity searches (find all instances where the word "lake" occurs within five words of "swim"). These are extremely useful features when you're working with traditional documents, although they are usually much less important if you are working with data-centric XML information. There are other reasons you might want to consider using an NXD. Many such repositories are able to understand a DTD or an XML Schema, and can therefore provide data validation on the fly, as information is stored or updated. NXDs can also persist information such as document order, processing instructions, comments, CDATA sections, and entity usage, while many systems that attempt to store XML data into relational databases cannot..."
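
[Editorial illustration, not from the article.] The two example queries quoted above can be approximated with Python's standard library. The element names para and i, the sample documents, and both function names are assumptions for the sketch; "within five words" is read here as a word-position difference of at most five, which is one possible interpretation. A real NXD would evaluate such queries against its own indexes.

```python
import re
import xml.etree.ElementTree as ET

DOCS = {
    "a.xml": "<doc><para>Plain text.</para><para>Some <i>emphasis</i> here.</para></doc>",
    "b.xml": "<doc><para>An <i>early</i> italic.</para><para>Nothing special.</para></doc>",
}


def second_para_has_italic(xml_text):
    """True if the second <para> child contains an <i> element at any depth."""
    root = ET.fromstring(xml_text)
    second = root.find("./para[2]")  # positional predicate: second para child
    return second is not None and second.find(".//i") is not None


def within_n_words(text, first, second, n=5):
    """True if some occurrence of `first` is at most n word positions from `second`."""
    words = re.findall(r"\w+", text.lower())
    pos_a = [i for i, w in enumerate(words) if w == first]
    pos_b = [i for i, w in enumerate(words) if w == second]
    return any(abs(i - j) <= n for i in pos_a for j in pos_b)


matches = [name for name, xml_text in DOCS.items() if second_para_has_italic(xml_text)]
```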

[August 06, 2002] "Managing Change." By Adam Bosworth (Vice President, Engineering, BEA Systems Inc). In XML & Web Services Magazine Volume 3, Number 5 (August/September 2002). "How can running instances of applications handle changes in business logic? That's the question I posed a few weeks back in an e-mail to a few key internal architects at BEA discussing some of the problems I think we still need to solve. Then I left on a four-day, five-country trip that left me out of the loop on e-mail. The question was meant to address a challenge faced by our customers with really long-running workflows and "conversations." In such cases, the business logic may change while the instances of the prior version of the application are still far from complete. Previously, I had thought this would not be an issue because people would not want to change the business logic of running instances, but simply deploy new applications with the new logic. However, numerous discussions with customers proved that the real world is a weird and wonderful place; people really do want to change business logic on the fly... [Customers?] If it is metadata, they are storing it in XML. If it is state that is essentially transient, they are increasingly managing it in XML. They are doing this because it is easy to write tools to analyze, migrate, and reshape XML to handle change. Customers have learned the hard way that this isn't true of either Java serialization or relational databases. With databases in particular, one of our customers' biggest problems, considering the highly dynamic world they live in, is the inflexibility of data in continuously running systems. Database administrators spend untold fortunes coping with this. Even after working with XML for six years, I'm still pleasantly surprised at the prevalent use of XML for metadata. 
I believe that we are at a point where the two biggest revolutions in computer science of the last 20 years, object-oriented computing and relational databases, have failed us. Because our systems must be available 24x7 for years on end, the methods we have for accommodating change just don't work. Customers running complex operations such as fabrication systems can never shut them down, but they constantly want to fine-tune the operations. In so doing they need to change the shape of the information they need, but cannot easily do so... So who needs an XML database? Anyone dealing with change..."

[August 05, 2002] "Oracle Goes XML." By Timothy Dyck. In eWEEK (August 02, 2002). "Oracle Corp. is the first among the big relational database vendors to make major changes to its database in response to XML, shaking up the generally overpriced and underperforming native XML database market something fierce but having a lesser effect on current Oracle database sites. Oracle9i Database Release 2 continues to provide the largest range of features available in a database... All the major database players are moving to strengthen support for XML data and XML query languages in their products. In the case of IBM's DB2 and Microsoft's SQL Server databases, XML technologies and SQL will be on the same level as data access techniques. However, Oracle has gotten there first with its XML DB engine. XML DB is a combination of three technologies: a large set of SQL functions that allows XML data to be manipulated as relational data (through a view or special SQL functions) as well as to retrieve relational table data in XML format; a native XML data type called XMLType that can store XML data either in an object-relational storage format that maintains the XML DOM (Document Object Model) or as the original text document; and a special hierarchical XML index type to speed access to hierarchies of XML files stored in Oracle9i's XML file repository. XML DB also supports XML Schema, the latest standard for defining the structure of XML documents, although it doesn't support the upcoming XML query language, XQuery. Instead, XML DB uses a combination of XPath and SQL to manipulate XML. The database includes an Extensible Stylesheet Language Transformation engine, made accessible through the built-in copy of Apache, that can retrieve XML data from XML DB and transform it into HTML or other formats...
Previous versions of Oracle and other relational databases support the option of storing XML as text data or extracting data from XML and storing it in normal relational tables, but the interim option of storing data in a format that maintains DOM fidelity (including comments, namespaces, the distinction between elements and attributes, and element ordering) is valuable and is the distinguishing feature of a native XML database. The DOM format doesn't require XML documents to be re-parsed when accessed, and this, in combination with XML and SQL index types, should provide good performance."

[August 02, 2002] "The Next Generation Database - XDB." By Greg Mable. In XML Journal Volume 3, Issue 6 (June 2002). "... With the advent of Web services, applications are now free to communicate in a common format - that of an XML document - anywhere on the Web. Where the Web was once built on static content linked together via hypertext, XML takes it to the next level. Instead of users surfing the Internet via HTML pages linked with hyperlinks, we can now build Web-based applications that can be linked via XML documents. Imagine a user clicking on a link to a Web site. This in turn fires off an exchange of an XML document to another application. Here's the key: the XML document. This will be the primary means of information exchange and message passing. With the need to process XML documents comes the need to be able to store, retrieve, and report on them. Hence the need for a management system to handle the flood of XML documents that an application will process. This is where an XML database, XDB, comes in... So what is an XML database, or an XDB? In this article I define what it is, when and why you will need to use one, and what impact it will have on the business world. By the time you finish reading, you just may realize the importance of an XDB and will want to grab your surfboard to ride the next big wave. There are no requirements for how an XDB is expected to physically store XML documents. Some XDBs are built on an object database, others might use compressed files with an indexing scheme, and still others might be built on top of a relational database. At this time XDBs can be classified into two basic types (with a third type on the horizon): native and XML enabled. Native XML database: A native XML database (NXDB) is simply one that was designed from the ground up to store XML documents. It might make use of a preexisting technology such as object-oriented data storage techniques, but its mission is to store, retrieve, and update XML documents. 
XML-enabled database: In the second type, an XML-enabled database (XEDB), extensions are added to a preexisting database management system to support XML documents. An XEDB can be built on top of an existing object-oriented or relational database management system. An XEDB provides a mapping layer between the XML documents and its database structures as well as support for XML-based tools to retrieve and update XML documents. Convergence of NXDB and XEDB: The third type of XDB is in its formative stages, and like a wave approaching the beach, it is about to crest. It can be considered a convergence of the two other types: an XDB that is designed to handle XML documents but is built on a preexisting database technology, combining them into a unified data model and a single repository. In this article I'll briefly describe an example for each of these types..."

[July 30, 2002] "Adventures in High-Performance XML Persistence, Part 1. A High-Performance TCL-scripted XSLT Engine." By Cameron Laird (Vice president, Phaseit, Inc.). From IBM developerWorks, XML Zone. July 2002. ['XML storage is too sprawling a topic to offer easy answers. There's no one fastest XML database, nor fastest XML processing language. Still, it's helpful to understand the basic concepts of XML persistence so you can apply them to your specific situation. This article begins a new developerWorks series on high-performance XML by offering an explanation of common industry practices in XML persistence -- that is, storage of data beyond the lifetime of a single process.'] "You're responsible for large, mission-critical XML programs. You have dozens, or maybe thousands, of simultaneous users. Your XML pilot programs have gone well, and you've deployed more and more features. Your systems are in constant use, and response time is starting to stall. You start to wonder, 'What does it take to maximize XML performance?' The answer: You don't want to maximize your XML performance. You need to meet engineering requirements. Perhaps you need to manage scalability, or boost the responsiveness of specific applications. Don't hunt for the fastest XML storage. In that direction lie $700 hammers and the other symptoms of counter-productive obsession. Instead, learn to apply the basic concepts of XML so you can engineer the persistence needed for your own situation... The first principle of designing XML persistence is that any solution must make for a comfortable organizational fit. If your company requires use of Java technology, and a particular XML database has a poor Java binding, don't choose it. No matter how high its performance on standard benchmarks, it's likely that your co-workers will not make good use of it. Working with an unfamiliar technology will annoy them, and they're unlikely to achieve favorable results. 
On the other hand, suppose you work in an environment that provides a great deal of support for a database such as DB2. However well or poorly your XML content fits the DB2 persistence model, you should seriously consider DB2 storage. Sufficiently enthusiastic, well-equipped, and motivated expertise is likely to overcome modest mismatches on the technical level, as this article will show you. The principal categories of XML persistence center on these technologies: (1) Native file system, (2) Relational database management systems (RDBMS), (3) Special-purpose XML database managers, (4) Other data managers. The easiest XML storage is native: Keep XML document instances as named files in a file system. This is the most transparent and flexible persistence method, and should be your default starting point for new designs. ... No one XML persistence method is right for all scales of problem. Start with familiar technologies for your needs to store XML data. Make a clear distinction between policy requirements for transacting or storing data formatted as XML, and application-specific design requirements for data security and performance. Choose persistence methods compatible with the technologies your organization uses..."

[June 18, 2002] "XML Stores Get Richer Queries." By Matt Hicks. In eWEEK (June 17, 2002). "Native XML database developers X-Hive Corp., Excelon Corp., Ipedo Inc. and Software AG are adding more support in upcoming releases for emerging standards for such functions as querying. Much of the focus for the developers with their latest crop of XML databases is on bolstering querying capabilities through the XQuery XML data retrieval standard. X-Hive, for example, last week began shipping Version 3.0 of its X-Hive/DB, which supports XQuery, said officials in Rotterdam, Netherlands. Separately, Excelon, in a point release to its Extensible Information Server, due next month, will add full XQuery support. Also in releases planned over the next year and a half, the Burlington, Mass., company aims to support the XForms standard for handling XML forms, officials said. Ipedo, of Redwood City, Calif., is beefing up its current XQuery support. Version 3.1 of its namesake XML database, due next week, will be able to perform updates in the querying language. That release will also include support for the WebDAV, or Web-based Distributed Authoring and Versioning, protocol so documents from popular client applications can be published into the database server, officials said. For its part, Software AG plans to add full XQuery support in the next major release of its Tamino XML Server, Version 4.11, due by the end of the year. That release will also include validation of XML Schema; enterprise-level backup and restore of the database; and improved tools for Web services features such as Universal Description, Discovery and Integration directories, said company officials, in Darmstadt, Germany. All these companies are looking to extend their technological lead over Oracle Corp., IBM and Microsoft Corp., which offer XML add-ons and have plans to embed XML support deeper within their database engines..."

[May 06, 2002] "XML in Java: Data Binding with Castor. A Look at XML Data Binding for Java Using the Open Source Castor Project." By Dennis M. Sosnoski (President, Sosnoski Software Solutions, Inc.). From IBM developerWorks, XML Zone. April 2002. ['XML data binding for Java is a powerful alternative to XML document models for applications concerned mainly with the data content of documents. In this article, enterprise Java expert Dennis Sosnoski introduces data binding and discusses what makes it so appealing. He then shows readers how to handle increasingly complex documents using the open source Castor framework for Java data binding. If your application cares more about XML as data than as documents, you'll want to find out about this easy and efficient way of handling XML in Java.'] "Most approaches to working with XML documents in applications put the emphasis on XML: You work with documents from an XML point of view and program in terms of XML elements, attributes, and character data content. This approach is great if your application is mainly concerned with the XML structure of documents. For many applications that care more about the data contained in documents than the documents themselves, data binding offers a much simpler approach to working with XML... The document models discussed in previous articles of this series are the closest alternatives to data binding. Both document models and data binding build document representations in memory, with two-way conversions between the internal representation and standard text XML. The difference between the two is that document models preserve the XML structure as closely as possible, while data binding is concerned only with the document data as used by your application... Data binding is a great alternative to document models in applications that use XML for data exchange. It simplifies your programming because you no longer need to think in terms of XML. 
Instead, you can work directly with objects that represent the meaning of the data as used by your application. It also offers the potential for better memory and processor performance than document models... Data binding can provide other benefits beyond just programming simplicity. Since it abstracts many of the document details, data binding usually needs less memory than a document model approach. Consider, for instance, the two data structures shown in the earlier figures: The document model approach uses 10 separate objects, as compared to two for data binding. With a lot less to build, it may also be faster to construct the data binding representation for a document. Finally, access to the data within your program can be much faster with the data binding approach than with a document model, since you control how the data is represented and stored. I'll get back to these points later. If data binding is such great stuff, when would you want to use a document model instead? The two cases that require a document model are: (1) Your application is really concerned with the details of the document structure. If you're writing an XML document editor, for instance, you'll want to stick to a document model rather than using data binding. (2) The documents you're processing don't follow fixed structures. For example, data binding wouldn't be a good approach for implementing a general XML document database..."
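
[Editorial illustration, not from the article.] The contrast between a document model and data binding can be sketched without Castor. The example below is in Python rather than Java, and the Customer class, the sample XML, and the functions bind_customer and unbind_customer are all hypothetical; Castor generates or maps such classes itself rather than relying on hand-written conversion code.

```python
from dataclasses import dataclass
import xml.etree.ElementTree as ET

XML = "<customer><name>Ann Smith</name><zip>98059</zip></customer>"


@dataclass
class Customer:
    """Application object: only the data the program cares about."""
    name: str
    postal_code: str


def bind_customer(xml_text):
    """XML to object: structural details of the document are discarded."""
    root = ET.fromstring(xml_text)
    return Customer(name=root.findtext("name"), postal_code=root.findtext("zip"))


def unbind_customer(customer):
    """Object to XML: rebuild the text form from the bound data."""
    root = ET.Element("customer")
    ET.SubElement(root, "name").text = customer.name
    ET.SubElement(root, "zip").text = customer.postal_code
    return ET.tostring(root, encoding="unicode")
```

Application code then reads customer.name directly instead of navigating elements and text nodes, which is the simplification the article attributes to data binding.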

[April 24, 2002] "Database Future Debated." By Paul Krill. In InfoWorld (April 24, 2002). "Whether the future of databases is the traditional, relational and SQL model with XML technologies incorporated into it or a new XML-based model is a matter of debate, according to panelists during a session Tuesday [2002-04-23] at the Software Development Conference & Expo. The fate of XML and SQL dominated the discussion, which featured officials from companies such as Oracle, Sun Microsystems, and IBM. 'I think that XML will become the dominant format for data interchange,' with its flexibility and ability to provide self-description, said Don Chamberlin, a database technology researcher at IBM. Relational databases, he said, will be fitted with front ends to support XML and process queries based on the XQuery standard... Sun's Rick Cattell, a distinguished engineer at the company, had a less dominant outlook for XML, saying very few people are going to store XQuery data in an XML format. 'I think the momentum behind relational databases is insurmountable,' Cattell said, adding that he was drawing on his experience with object-oriented databases, which were unable to unseat relational databases in enterprise IT shops. Developers, Cattell said, will need tools to convert relational data to XML and vice versa. Another panelist, Daniela Florescu, chief technology officer at XQrl, said she was 'pretty optimistic [about] the performance of XML databases.' Documents will be stored natively in XML, she said. XQrl offers a version of the XQuery XML query language. Currently, performance on the Web is hindered because of translations between Java and XML data formats, Florescu said. 'I don't think we will have good performance as long as we have people marshalling data from XML to Java and back,' Florescu said... Panelists also touched on topics such as tuple space technology, which is intended to make it easier to store and fetch data by recognizing patterns.
Tuple space technology is 'interesting, but I wouldn't predict that it's going to take over the world,' since much more research needs to be done and most people are not building production applications based on it, Cattell said. Cattell also said in-memory database technology is a 'no-brainer,' but there is not enough memory available yet to accommodate it... Panelist Jim Melton, consulting member of the technical staff at Oracle, said he is part of a vendor group called SQLX that has been working for a year to define ways to bring SQL and XML closer together. The group in mid-2003 plans to publish a specification called SQL/XML, which will contain publishing functions for the two formats..."

[April 11, 2002] "Database Strategies for Unstructured Content." By Stuart J. Johnston. In XML Magazine Volume 3, Number 3 (April/May 2002), pages 18-27. ['Relational and native XML database developers take diverse approaches to managing free-form information data.'] "... Consider using a native XML database as a consolidation point for data that has to be exchanged in an industry-standard format. With its ability to represent and query data from many sources as pure XML, a native XML database can enable levels of data correlation, aggregation, and information mining that would be difficult or impossible to achieve without a central place to standardize data formats and protocols. Although the field is still incipient, native XML databases might help enterprises respond to changes in the business environment more quickly and cheaply than custom approaches... a survey conducted in March 2001 at the Data Administration Management International Symposium by Intellor Group and Wilshire Conferences found that 12 percent of companies had already implemented a native XML database or planned to within the following 12 months. Indeed, advocates say that the use of native XML databases as middle-tier servers between conventional relational databases such as IBM's DB2 and XML-based Web services is catching on. Rather than using translators, the pure XML databases can speed processing time for electronic transactions while off-loading demand from the large-scale, enterprise database. Most native XML databases can communicate with relational systems relatively easily using either ODBC or JDBC drivers, through Extensible Stylesheet Language Transformations (XSLT), or in some cases using XPath. Rather than make the relational database translate SQL data into XML 'on the fly,' argue native XML aficionados, why not off-load most of that work to a native XML database? 
This approach has several benefits, including consolidation of all XML data in a single repository designed specifically to handle XML information and documents. However, the market is still formative, and many of the products themselves are not yet mature. Many of the players so far have come out with only version 1.0 releases, although a few have 2.0 and 3.0 releases now. But that doesn't mean that even the 1.0 releases aren't useful today, or that now wouldn't be a good time to get up to speed, try some pilot projects, and gain a measure of understanding as to what's good and bad about them. Key to the functionality of XML databases is support for several XML standards or proposed standards, although not every product will support all of them. The proposed standards include XSLT, Document Type Definitions (DTDs), and XML Schemas, as well as XPath (an XML language for addressing parts of XML documents), and XQuery. XQuery is an XML-based query language, an emerging World Wide Web Consortium (W3C) proposed standard for querying data in XML, which includes XPath 2.0 as a subset..."

[March 15, 2002] "XPERANTO: Bridging Relational Technology and XML." From International Business Machines Corporation, DB2 Developer Domain. By Catalina Fan, John Funderburk, Hou-in Lam, Jerry Kiernan, and Eugene Shekita (IBM Almaden Research Center, San Jose, CA 95120) and Jayvel Shanmugasundaram (Cornell University). [March 2002.] 9 pages. ['The cutting edge of data management research! The XPERANTO research project enables XML-based applications to leverage relational database technology by using XML views of existing relational data.'] "XML has emerged as the standard data-exchange format for Internet-based business applications. These applications introduce a new set of data management requirements involving XML. However, for the foreseeable future, a significant amount of business data will continue to be stored in relational database systems. Thus, a bridge is needed to satisfy the requirements of these new XML-based applications while still leveraging relational database technology. This paper describes the design and implementation of the XPERANTO middleware system, which we believe achieves this goal. In particular, XPERANTO provides a general framework to create and query XML views of existing relational data. One of the features provided by XPERANTO is the ability to create XML views of existing relational data. XPERANTO does this by automatically mapping the data of the underlying relational database system to a low-level default XML view. Users can then create application-specific XML views on top of the default XML view. These application-specific views are created using XQuery, a general-purpose, declarative XML query language currently being standardized by W3C. XPERANTO materializes XML views on demand, and does so efficiently by pushing down most computation to the underlying relational database engine. Another feature provided by XPERANTO is the ability to query XML views of relational data. 
This is important because users often desire only a subset of a view's data. Moreover, users often need to synthesize and extract data from multiple views. In XPERANTO, queries are specified using the same language used to specify XML views, namely XQuery. XPERANTO executes queries efficiently by performing XML view composition so that only the desired relational data items are materialized. In summary, XPERANTO provides a general means to publish and query XML views of existing relational data. Users always use the same declarative XML query language (XQuery) regardless of whether they are creating XML views of relational data or querying those views. ... XPERANTO exposes relational data as an XML view. Users can then query these XML views using a general-purpose, declarative XML query language (XQuery), and they can use the same query language to create other XML views. Thus, users of the system always work with a single query language. In addition to providing users with a powerful system that is simple to use, the declarative nature of user queries allows XPERANTO to perform optimizations such as view composition and pushing computation down to the underlying relational database system." See also "IBM Federated Database Technology," by Laura Haas and Eileen Lin.
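
The idea of a low-level default XML view can be illustrated with a small sketch. This is not XPERANTO code: it uses SQLite and Python's standard library as stand-ins, and the element names (`db`, `row`) and the helper names `default_xml_view` and `build_demo_view` are assumptions made for the example. Unlike XPERANTO, which materializes views on demand and pushes computation down to the relational engine, this sketch builds the whole view in memory before querying it.

```python
import sqlite3
import xml.etree.ElementTree as ET


def default_xml_view(conn, tables):
    """Build a default XML view: one element per table, one <row> per tuple,
    and one child element per column. Element names are illustrative only."""
    root = ET.Element("db")
    for table in tables:
        table_el = ET.SubElement(root, table)
        # Table names cannot be bound as SQL parameters; this sketch assumes
        # they come from a trusted list.
        cursor = conn.execute(f"SELECT * FROM {table}")
        columns = [description[0] for description in cursor.description]
        for row in cursor:
            row_el = ET.SubElement(table_el, "row")
            for name, value in zip(columns, row):
                column_el = ET.SubElement(row_el, name)
                column_el.text = "" if value is None else str(value)
    return root


def build_demo_view():
    """Create a tiny in-memory database and return its default XML view."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER, customer TEXT)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)",
                     [(1, "Smith"), (2, "Jones")])
    view = default_xml_view(conn, ["orders"])
    conn.close()
    return view


if __name__ == "__main__":
    view = build_demo_view()
    # Query the view with ElementTree's limited XPath subset (not XQuery).
    matches = view.findall("./orders/row[customer='Jones']")
    print([row.findtext("id") for row in matches])
```

An application-specific view would then be defined as a query over this default view, which is the layering the abstract describes.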

[February 11, 2002] "Combining UML, XML and Relational Database Technologies. The Best of All Worlds For Robust Linguistic Databases." By Larry S. Hayashi and John Hatton (SIL International). Pages 115-124 in Proceedings of the IRCS Workshop on Linguistic Databases (11-13 December 2001, University of Pennsylvania, Philadelphia, USA. Organized by Steven Bird, Peter Buneman and Mark Liberman. Funded by the National Science Foundation). "This paper describes aspects of the data modeling, data storage, and retrieval techniques we are using as we develop the FieldWorks suite of applications for linguistic and anthropological research. Object-oriented analysis is used to create the data models. The models, their classes and attributes are captured using the Unified Modeling Language (UML). The modeling tool that we are using stores this information in an XML document that adheres to a developing standard known as the XML Metadata Interchange format (XMI). Adherence to the standard allows other groups to easily use our modeling work and because the format is XML, we can derive a number of other useful documents using standard XSL transformations. These documents include (1) a DTD for validating data for import, (2) HTML documentation of diagrams and classes, and (3) a database schema. The latter is used to generate SQL statements to create a relational database. From the database schema we can also generate an SQL-to-XML mapping schema. When used with SQL Server 2000 (or MSDE), the database can be queried using XPath rather than SQL and data can be output and input using XML. Thus the Fieldworks development process benefits from both the maturity of its relational database engine and the productivity of XML technologies. With this XML in/out capability, the developer does not need to translate between object-oriented data and relational representation. The result will be, hopefully, reduced development time. 
Another further implication is the potential for an increased interoperability between tools of different developers. Mapping schemas could be created that allow FieldWorks to easily produce and transfer data according to standard DTDs (for example, for lexicons or standard interlinear text). Data could then be shared among different tools -- in much the same way that XMI allows UML data to be used in different modeling tools..."

[January 11, 2002] "An Introduction to the XML:DB API." By Kimbro Staken. From XML.com. January 09, 2002. ['The growing number of native XML databases all have different programming interfaces. The XML:DB API is an open source project to provide a unified API for native XML databases.'] "In my last article, 'Introduction to dbXML', I provided an example that used the XML:DB API to access the dbXML server. This time around we'll take a more detailed look at the XML:DB API in order to get a better feel for what the API is about and how it can help you build applications for native XML databases (NXD). Currently, there are about 20 different native XML databases on the market. Among them are commercial products such as Tamino, X-Hive and Excelon. And open source NXDs include dbXML (now renamed Apache Xindice), eXist, and Ozone/XML. While this selection is a nice thing to see in an emerging market, it makes developing applications quite a bit more difficult. Each NXD defines its own API which prevents the development of software that will work with more than one NXD without coding for each specific server. If you've worked with relational databases, then you've likely worked with ODBC or JDBC to abstract away from proprietary relational database APIs. The goal of the XML:DB API is to bring similar functionality to native XML databases. The XML:DB API project was started a little over a year ago by the XML:DB Initiative and is currently still evolving. Most of the core framework is stable, and it has already been implemented by dbXML/Xindice and eXist. There's also a reference implementation in Java available, and there are several other implementations in progress, including some for commercial databases... There is much more to the XML:DB API than what's illustrated in this simple example and short article. But I have given you a better idea of what the API is and how it is used. 
If you want to find out more you should take a look at the XML:DB API site and the dbXML developers guide. The eXist documentation also contains some information about developing with the API. While there is still a lot of work to do on the XML:DB API, what is available today is already usable and provides a solid framework to build on. In fact, projects like Apache Xindice are using the XML:DB API as the primary Java API for accessing the server. Participating in API development is open to anyone who's interested; feel free to join the project mailing list and contribute to the development of the XML:DB API."

[January 11, 2002] "Working out the Bugs in XML Databases." By John Cox. In Network World Volume 19, Number 1 (January 07, 2002), page 24. The article summarizes the pros and cons of special XML repositories. ['As network executives begin to experiment with Web services, they're likely to find that they need a new kind of data store: the XML database. There's a growing belief that XML-based information needs its own database.'] "XML database software products are designed to efficiently store and manage the growing numbers of XML documents that users are creating, especially in Web interactions with business partners and customers. Advocates cite several advantages of XML databases compared with traditional databases: simplicity, ease of application development, ability to search and query XML documents, and fast document retrieval. There's no formal, standard definition of an XML database, although the XML:DB Initiative describes such a database as one that defines a logical model for an XML document (not for the data in the document), and manages documents based on that model. The key point is the database 'thinks and acts' based on XML - XML goes in, and XML comes out, even though these products can physically store the documents in an object or relational database or a proprietary storage model, such as indexed files. The lack of formal definition is just one issue that raises the hackles of critics. They also point to the immaturity of the products and of XML standards; the absence of a standard, reliable query language to match the SQL used in relational databases; and possible data integrity problems... Analysts expect these benefits to fuel a fast-growing market. IDC estimates enterprise spending for XML databases will grow by 130% annually, reaching $700 million in 2004. 
XML databases will complement relational databases, according to IDC analyst Anthony Picardi - the former being better suited for storing and processing XML documents, the latter for numbers and text. There are plenty of choices for network executives to evaluate, with at least two dozen native XML database products (see XML Database Products). The key vendors include Software AG and eXcelon - which stores documents in its ObjectStore object-oriented database. There are a host of smaller vendors, such as NeoCore, IXIA and XYZFind, working on XML database products. There are also a number of open source projects. One is Xindice, formerly dbXML Core, which now is being handled by The Apache Software Foundation..."

[January 10, 2002] "On Database Theory and XML." By Dan Suciu (University of Washington). In SIGMOD Record Volume 30, Number 3 (2001). 7 pages (with 64 references). "Over the years, the connection between database theory and database practice has weakened. We argue here that the new challenges posed by XML and its applications are strengthening this connection today. We illustrate three examples of theoretical problems arising from XML applications... [We describe] three XML research problems, inspired from our own work. XML's semistructured data model represents a paradigm shift for theoretical database research. It is not the first one: for example the object-oriented data model can also be considered a paradigm shift, which generated a vast amount of theoretical and applied research. This time, however, the shift comes from outside the community (XML was imposed on us) and this, at least, settles easily the question of applicability. It offers us a chance both to apply research on old topics (query containment) and to conduct research on new topics (typechecking)... Today the most promising approach to typechecking remains that based on type inference. The XDuce language defines a type inference system for a functional language with recursion; the XQuery algebra defines a type inference system using XML Schema as its type system. Since we know that this approach cannot be as robust as typechecking in general-purpose programming languages, a study of its applicability and limitations is needed. [XML Storage:] XML data is a labeled tree; a relation is a table. The problem of storing XML data in one or several tables is a challenging one, both for theoreticians and practitioners. Since the tree is meant to describe some irregular structure while tables are by definition regular, we are attempting to store some irregular data into a regular data type. 
In addition to the pure combinatorial aspect, there is a logical aspect to the storage problem: given a storage mapping, one needs to be able to translate queries formulated over the XML data into relational queries formulated over the relational storage. The combination of combinatorics and logic make the problem particularly appealing. Several approaches have been tried so far. The simplest is to store XML as a graph, in a ternary relation (two columns for the edges, the third for the labels and/or data values). This approach is explored by Florescu and Kossman. The price one pays for its simplicity is that many self-joins of the edge table are required in order to reconstruct a given XML element: one join for each subelement. Shanmugasundaram et al. ["Relational databases for querying XML documents: limitations and opportunities"] use the DTD (or XML-Schema) to derive a relational schema. One table is created for each element type that can occur in a collection position. This technique works well in practice whenever one has a schema for the XML document. A subtle problem is that the resulting storage is very sensitive to that schema. For example if the content of <person> changes from (name, phone) to (name, phone*) then we need to move all phone numbers to a separate table, although perhaps the XML document has changed very little. The case when the XML document has no schema, or when the schema changes frequently is harder, and has a more dramatic impact on performance... The challenge in any storage schema is that it has to be flexible enough to accommodate any XML data, yet it has to be as efficient as regular data storage when the XML data happens to be regular. Finding the largest regular subset in an irregular data instance is a problem which can be formulated and addressed theoretically..." [source]
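
The edge-table approach and its join cost can be sketched briefly. The following is an illustration only, not the scheme from the Florescu and Kossmann work: it uses SQLite as the relational store, the function names `shred`, `names_and_phones`, and `demo` are hypothetical, and it adds a fourth `value` column for text content where the abstract describes a ternary relation that folds labels and data values into one column.

```python
import sqlite3
import xml.etree.ElementTree as ET
from itertools import count

DEMO_XML = (
    "<people>"
    "<person><name>Ann</name><phone>555-0100</phone></person>"
    "<person><name>Bob</name><phone>555-0199</phone></person>"
    "</people>"
)


def shred(conn, xml_text):
    """Store the XML tree as one row per parent-child edge."""
    conn.execute(
        "CREATE TABLE edge (source INTEGER, target INTEGER, label TEXT, value TEXT)"
    )
    ids = count(1)

    def walk(element, parent_id):
        node_id = next(ids)
        text = (element.text or "").strip() or None
        conn.execute("INSERT INTO edge VALUES (?, ?, ?, ?)",
                     (parent_id, node_id, element.tag, text))
        for child in element:
            walk(child, node_id)

    walk(ET.fromstring(xml_text), 0)  # 0 stands for the document root


def names_and_phones(conn):
    """Rebuild each <person>: one self-join of the edge table per subelement."""
    return conn.execute(
        """
        SELECT n.value, p.value
        FROM edge AS person
        JOIN edge AS n ON n.source = person.target AND n.label = 'name'
        JOIN edge AS p ON p.source = person.target AND p.label = 'phone'
        WHERE person.label = 'person'
        ORDER BY person.target
        """
    ).fetchall()


def demo():
    conn = sqlite3.connect(":memory:")
    shred(conn, DEMO_XML)
    rows = names_and_phones(conn)
    conn.close()
    return rows


if __name__ == "__main__":
    print(demo())
```

Every additional subelement to be reconstructed adds another self-join to the query, which is the cost the abstract attributes to this storage mapping.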

[December 20, 2001] "E-business Middleman. Native XML Databases." By Maggie Biggs. In InfoWorld Issue 51 (December 17, 2001), pages 37-38. ['Native XML databases tap heterogeneous back-end databases to feed Web-based applications and trading partners. A native XML database makes good economic sense for enterprises that must support XML document handling and interaction with multiple back-end data sources. In addition, native XML databases can simplify the management of enterprise data processing performance... An emerging technology, native XML databases are currently best suited to early adopters willing to experiment. When existing shortcomings -- such as query and update handling -- are resolved, these databases promise to make XML handling much more manageable for most IT shops.'] "Without a doubt, XML is fast becoming the lingua franca of b-to-b data exchange. As the use of XML increases, executives and IT managers must begin factoring in the growing number and differing types of XML solutions now coming to market before they can determine the most cost-effective XML strategy to implement. Recently major relational database vendors, such as Oracle and Microsoft, have introduced XML-enabling technologies in their products: Oracle's XDB and Microsoft's SQLXML. Rival IBM has offered an XML Extender for its DB2 database for some time. Another promising, more manageable approach to XML in the enterprise is the emerging NXDB (native XML database). An NXDB does not replace your existing enterprise data sources. Rather it acts as an intermediate cache that sits between back-end data sources and middle-tier application components. Using an NXDB provides two principal benefits. First, it's likely your enterprise has multiple back-end data sources and various types of middle-tier applications. 
Rather than liberally sprinkling XML capabilities across the middle tier and back end, which may significantly increase technology expenditures, you could add the XML support you need by implementing an NXDB. An NXDB supplies the programmatic interfaces and data access methods necessary to support multiple applications and data sources. Second, you might use an NXDB to augment the processing power of your primary enterprise databases. Rather than devote primary database processing cycles to XML translation, storage, and retrieval during peak hours, moving these operations to an NXDB can free primary databases for more important tasks, such as transaction processing. Interaction between the NXDB and your back-end data sources can then be performed at times of the day or night that allow you to optimize processing performance and reduce the load on back-end databases that must also serve other applications and end-users. Many of the XML handling capabilities recently added to RDBMSes provide functionality similar to that provided by an NXDB. This has caused some confusion and begs the question, What constitutes an NXDB? An NXDB differs from an RDBMS in three key ways... Executives and IT managers should consider NXDBs when formulating an XML strategy. However, NXDBs are an emerging technology; querying and update capabilities are still maturing..."

[December 13, 2001] "IBM Spills Beans on Xperanto Database Initiative." By Tom Sullivan and Ed Scannell. In InfoWorld (December 13, 2001). "XML has been causing quite a splash in the database world, particularly in the last few weeks, and IBM is the latest vendor to detail plans for the standard. In IBM's research labs, the company is working on a project, code-named Xperanto, which will be a native XML database that acts as a subset of DB2, said Janet Perna, general manager of Armonk, N.Y.-based IBM's data management solutions group. By using XML and relying on the XML query language XQL, Xperanto will be a critical piece of IBM's long-term vision to marry structured and unstructured data. 'The value of this is it's the next step beyond a federated database,' Perna said. That step, Perna added, is information integration. IBM has application integration via its WebSphere products, business process integration from its recent CrossWorlds acquisition, and Xperanto acts as a dedicated server for data or information integration. 'We have a new class of software that really is about information integration,' Perna said. Nelson Mattos, a distinguished engineer and director of information integration at IBM's Silicon Valley Labs, said that the customer pain point Xperanto is aimed at is how to tie together all the systems in an organization... Mattos continued that Xperanto will be the materialization of IBM's work on a number of Web services-related standards, including XQuery, XML Schema, UDDI (Universal Description, Discovery, and Integration), SOAP (Simple Object Access Protocol), WSDL (Web Services Description Language) and WSFL (Web Services Flow Language). The end goal of IBM's integration strategy is to be able to combine structured and unstructured data, thereby enabling access to a broader array of data sets within an organization, such as Office files. 
So organizations would be able to access the content in the Word files that reside on individual employees' desktop systems. Both Microsoft and Oracle said they are working to enhance XML support in the database as well as toward the same goal of providing users more insight into all of the intelligence within an organization... Within IBM's strategy, DB2 handles structured data, OLTP (Online Transaction Processing), BI (business intelligence), and Web applications, while the Content Manager software takes care of unstructured information, such as rich media and flat files. Perna said that the widespread adoption of XML has made the idea of combining structured and unstructured data come alive..."

[December 11, 2001] "Integrating Network-Bound XML Data." By Zachary Ives, Alon Halevy, and Dan Weld. In IEEE Data Engineering Bulletin 24/2 (June 2001). "Although XML was originally envisioned as a replacement for HTML on the web, to this point it has instead been used primarily as a format for on-demand interchange of data between applications and enterprises. The web is rather sparsely populated with static XML documents, but nearly every data management application today can export XML data. There is great interest in integrating such exported data across applications and administrative boundaries, and as a result, efficient techniques for integrating XML data across local- and wide-area networks are an important research focus. In this paper, we provide an overview of the Tukwila data integration system, which is based on the first XML query engine designed specifically for processing network-bound XML data sources. In contrast to previous approaches, which must read, parse, and often store XML data before querying it, the Tukwila XML engine can return query results even as the data is streaming into the system. Tukwila features a new system architecture that extends relational query processing techniques, such as pipelining and adaptive query processing, into the XML realm. We compare the focus of the Tukwila project to that of other XML research systems, and then we present our system architecture and novel query operators, such as the x-scan operator. We conclude with a description of our current research directions in extending XML-based adaptive query processing..." Also in Postscript. A related paper ("An XML Query Engine for Network-Bound Data") has been submitted for publication (2001). See other XML-related publications from the University of Washington Database Research Group, including those of Zachary G. Ives.
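
The contrast between parse-then-query and returning results while data is still arriving can be shown with a short sketch. It is not Tukwila's x-scan operator: it uses the `XMLPullParser` class from Python's standard library, and the `book`/`title` vocabulary, the `DEMO_CHUNKS` data, and the function name `stream_titles` are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Two network "packets" of one document, split at a tag boundary.
DEMO_CHUNKS = [
    "<books><book><title>A</title></book>",
    "<book><title>B</title></book></books>",
]


def stream_titles(chunks):
    """Yield each <title> as soon as its enclosing <book> has fully arrived."""
    parser = ET.XMLPullParser(events=("end",))

    def drain():
        # Emit results for every element completed by the data fed so far.
        for _event, element in parser.read_events():
            if element.tag == "book":
                yield element.findtext("title")
                element.clear()  # release the finished subtree

    for chunk in chunks:
        parser.feed(chunk)
        yield from drain()
    parser.close()
    yield from drain()


if __name__ == "__main__":
    for title in stream_titles(DEMO_CHUNKS):
        print(title)
```

Because `stream_titles` is a generator, the first title is available to the caller before the second chunk has been read from the source.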

[December 06, 2001] "Interview: Oracle's Jeremy Burton Talks XML." By Michael Vizard and Steve Gillmor. In InfoWorld (December 5, 2001). "At its Oracle OpenWorld user conference in San Francisco this week, Oracle jumped into the Web services fray with both feet. In a bid to gain marketshare in the J2EE (Java 2, Enterprise Edition) application server world, Oracle has added support for Web services standards in Oracle9iAS, along with better clustering, wireless capabilities, and bolstered security. To make XML-based Web services integral to its flagship relational database, Oracle is detailing a project called XDB (XML database support), which is designed to provide high-performance XML document storage. Oracle's vice president of worldwide marketing Jeremy Burton spoke to InfoWorld editors Michael Vizard, Steve Gillmor, Martin LaMonica, Tom Sullivan, and Mark Leon about Oracle's views on emerging software technologies and architectures... [What are the different levels of 'XML support'?] Burton: [The first level is] if we look at the tags in the XML document and we can build an index based on those tags, and although the document is stored in kind of one big chunk, we can make the document searchable from standard SQL. The next level I guess would be is as the document goes in, you do a certain amount of parsing of the XML document and you apply structure to the data as it goes into the database. Now, the overhead is that you've got to do parsing so there's a performance hit there. But the upside is that you can add a bit more structure to that document, you don't store it in one piece. So if you just want to search abstracts, for example, or conclusions of documents that are tagged with XML, then you can do that. So maybe you've got XML documents which have an introduction, body, and conclusion. You could store the introduction, body, and conclusion separately and search those three things independently. 
Or, if you want to take more of a hit when the document goes in, you can write some code to parse the entire document so you can do more granular searching. The big problem with doing a lot of parsing and adding structure is that you take a performance hit. Why? Because when you've got a relational database, a table is a different shape to a series of nested structures, which is what an XML document is... So most [vendors], right now, they can take the document and store it in a database and make it opaque. I think ourselves and probably IBM can index the document as it goes in and make it searchable. And I guess there's much debate over how much structure you can add to a document and how searchable you can make it. I'd say we do some things that IBM can't do, but it's not a long-term defensible advantage. So we set about kind of solving the problem for good, and one of the things that we have [is] called XDB, a project at Oracle. One of the underlying technologies that our new system is based on is objects. If you look at an XML document, it's a series of nested structures. If you look at objects in the database, it's a series of nested structures. And the XML document by and large is exactly the same shape as the object in the database. If you're then [storing] that object, you can do it in a very high-performance way -- there's no huge amount of parsing and flattening of structures and then reconstituting them later. The beauty of it as well from the way we've implemented our objects is that your application need not know nor care whether they're dealing with SQL, relational information, or structured XML-related information. And we also store the document in a very highly compressed way. We add structure to the document, we store it in the objects. And then the tags we pull out and store in a separate metadata directory. 
All the time we're building a directory of every tag available that your company deals with, and it often means we can store the information in a very highly compressed form..."
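
The remark about pulling tags out into a separate directory can be made concrete with a toy sketch. This is an assumption-laden illustration of the general tag-dictionary idea, not a description of how Oracle's XDB is implemented; the names `encode`, `decode`, and `round_trip` are hypothetical, and attributes and mixed content are ignored for brevity.

```python
import xml.etree.ElementTree as ET

DEMO_DOC = (
    "<doc><introduction>Hi</introduction>"
    "<body>Text</body><conclusion>Bye</conclusion></doc>"
)


def encode(element, tag_ids):
    """Replace each tag name with a small integer from a shared dictionary."""
    tag_id = tag_ids.setdefault(element.tag, len(tag_ids))
    text = (element.text or "").strip()
    return (tag_id, text, [encode(child, tag_ids) for child in element])


def decode(node, tag_names):
    """Rebuild an element tree from the encoded form and the tag directory."""
    tag_id, text, children = node
    element = ET.Element(tag_names[tag_id])
    element.text = text or None
    element.extend([decode(child, tag_names) for child in children])
    return element


def round_trip(xml_text):
    tag_ids = {}  # the shared directory: tag name -> id
    encoded = encode(ET.fromstring(xml_text), tag_ids)
    tag_names = {tag_id: name for name, tag_id in tag_ids.items()}
    restored = ET.tostring(decode(encoded, tag_names), encoding="unicode")
    return tag_ids, restored


if __name__ == "__main__":
    directory, restored = round_trip(DEMO_DOC)
    print(directory)
    print(restored)
```

Each distinct tag name is stored once in the directory however often it occurs, which is where the space saving in such a scheme would come from.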

[November 30, 2001] "Oracle to Boost XDB, Web Services Support." By Mark Leon and Tom Sullivan. In InfoWorld (November 30, 2001). "Oracle says it will raise the bar in XML database and Web services support next week at the Oracle OpenWorld conference in San Francisco. The buzz surrounds Oracle's XDB (XML database support) that the company reports will utilize the object database functions it built in the mid-90s that allow users to store XML as an object and index the tags, then store it in a compressed table. Oracle has also revealed it will use OpenWorld to unveil added support for Web services standards and J2EE 1.3 in the Oracle application server. 'We are also putting support for Web services in our JDeveloper tool,' said Jeremy Burton, vice president of worldwide marketing at Oracle in Redwood Shores, Calif. As for the advantages of XDB, Burton explained that XML documents are stored 'natively' but will not require special treatment. 'You can use SQL, and don't need to master the new XML query languages like XQuery,' Burton said. 'The object technology that makes this possible was a hammer looking for a nail,' Burton said, 'and XML is suddenly the biggest nail on the planet. If XML had not come along, you could argue that a lot of our object development efforts were wasted effort.' 'With XDB,' Burton continued, 'the overhead of managing XML, parsing for example, largely disappears.' Meanwhile, Oracle remains lukewarm to XQuery, the emerging XML query language from the World Wide Web Consortium (W3C). 'It is still very early stages for XQuery,' said John Magee, senior director of Oracle 9i product marketing in Redwood Shores, Calif. 'Any vendor who says they are offering [XQuery] is blowing smoke and not doing their customers any service.' IBM is more actively pushing XQuery. 'We plan to have an alpha version [of XQuery] in DB2 next year,' said Nelson Mattos, director and distinguished engineer at IBM's Silicon Valley Lab in San Jose, Calif. 
And Gordon Mangione, vice president of SQL Server at Microsoft, in Redmond, Wash., said, 'The next big [development] is XQuery, how you query against XML data. It will be part of our strategy to present SQL Server as a Web service.' The biggest XML issue for all the database vendors is how to store the data. 'The debate has always been,' explained Magee, 'do you store XML in document format or do you break it up into relational tables?' He said the jury is in and the consensus is that you need to do both. 'You can store an XML document as a BLOB [Binary Large OBject] and search on it with Oracle Text,' said Magee, 'or you can define a mapping that maps XML data to the standard relational format.' The issue is important because there is a new crop of XML database vendors that claim the ability to do it better. 'We store the XML document natively, just as it is,' said Lawell Kiing, vice president of software development at XML Global Technologies in Vancouver, Canada. 'Our query engine is based on the latest draft of XQuery.'... Where other vendors are concerned, Software AG plans to add XQuery to Tamino, its XML database. Currently Tamino uses a query engine based on XPath. 'Oracle essentially [performs XQuery functions] with smoke and mirrors, storing the entire XML document as a column in a relational table and allowing you to search it with their text engine,' said John Taylor, senior technology officer at Software AG in Reston, Va. No one, however, expects these XML pure plays to take on the relational vendors directly. 'Software AG initially positioned Tamino as an alternative to relational databases,' said Susan Funke, an analyst with IDC in Framingham, Mass. 'But they quickly realized the better strategy was to pursue symbiotic partnerships with other vendors including IBM, BEA, Sun, and HP to promote its solution.'... Analysts say integration is behind the XML database craze. 
'With more b-to-b sites, people are going to be passing XML documents back and forth,' said Carl Olofson, an analyst with IDC in Framingham, Mass. 'It will be really important to store these things'..."

[November 30, 2001] "XML Dominates Database File Formats." By Tom Sullivan and Ed Scannell. In InfoWorld Issue 48 (November 26, 2001), pages 1, 17-18. "With Oracle's annual OpenWorld conference on the horizon, database vendors are preparing for battle once again. This time around, the big three -- IBM, Oracle, and Microsoft -- are brandishing XML as the not-so secret weapon for making their databases faster and using it to anchor Web services. Microsoft is readying its charge into the enterprise-class arena with the forthcoming version of SQL Server, code-named Yukon. Yukon is being hailed as an XML-savvy, back-end engine for Microsoft's .Net Web services initiative. The second, and perhaps more important, design goal for Yukon is language independence, said Barry Goffe, group product manager of the .Net enterprise server group. 'We've had this vision for a long time: to have a multilanguage database,' Goffe said. To create that multilanguage database, Microsoft is arming Yukon to host XML natively in the database and is making XML a definable column type, which enables XML data to more effectively be searched and retrieved, said Stan Sorenson, director of server marketing at Microsoft. Although the delivery date for Microsoft's next generation of SQL Server has thus far been a closely guarded secret, a company official said that the Redmond, Wash.-based giant has a specific time frame in mind for the product to be finished... Microsoft has been adding support for emerging XML standards in a series of Web releases, the latest of which, SQLXML 2.0, came in October. SQLXML 2.0 contains support for XSD (XML Schema Definition), a specification designed to ease data integration from the World Wide Web Consortium (W3C) standards body in Cambridge, Mass. Oracle and IBM, meanwhile, are also sharpening their XML battle-axes. 
Jeremy Burton, vice president of worldwide marketing at Oracle in Redwood Shores, Calif., said that Oracle will be touting an XML-related technology called XDB (XML database support) at the company's annual Oracle OpenWorld conference the week after next in San Francisco. Burton, however, declined to provide further details about the future of XDB, now part of 9i. Not to be outdone, IBM officials say they have a rolling head start on Oracle and Microsoft, noting they have already delivered key database-related technology pieces and have employed all the right programming standards and protocols. Big Blue, in fact, is stressing that the combination of DB2 and its XML Extender provides the functional equivalent of Oracle's XDB technology, said Jeff Jones, director of strategy for IBM data-management solutions at IBM's labs in San Jose, California..."

[November 29, 2001] "Introduction to dbXML." By Kimbro Staken. From XML.com. November 28, 2001. ['An introduction to dbXML, an open source native XML database. The dbXML database offers XPath-based query over collections of semi-structured XML documents.'] "In this article we'll take a look at a native XML database implementation, the open source dbXML Core. What it Offers: The dbXML Core has been under development for a little more than a year. The current version is 1.0 beta 4 with a 1.0 final release expected to appear shortly. Full source code is available from the dbXML web site. Most of the basic native XML database features are covered, including: (1) storage of collections of XML documents, (2) multi-threaded database engine optimized for XML data, (3) schema independent semi-structured data store, (4) pre-parsed compressed document storage, (5) XPath query engine, (6) collection indexes to improve query performance, (7) XML:DB XUpdate implementation for updates, (8) XML:DB Java API implementation for building applications, and (9) complete command line management tools. Proper transaction support is the major missing feature right now; it will appear in the 1.5 release... Like all native XML databases dbXML is just a tool. It will be right for some jobs and completely wrong for others, and like all tools the best way to find out if it works is to try it. This is an exciting time for dbXML; it's on the verge of an initial production release and will soon be receiving a new name and a new home. Development of dbXML is coming under the stewardship of the Apache Software Foundation XML sub-project, and dbXML will be renamed Xindice in the process. The project has come a long way, and now is the best time to get involved to help shape the future of open source native XML database technology."
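To illustrate what "XPath-based query over collections" of documents means in practice, here is a minimal sketch in Python. It does not use dbXML or the XML:DB API; the `COLLECTION` dictionary, the document keys, and the `query_collection` helper are hypothetical names invented for this example, and the standard library's ElementTree supports only a subset of XPath.

```python
import xml.etree.ElementTree as ET

# Hypothetical in-memory "collection": document key -> XML text.
# A real native XML database would persist and index these documents.
COLLECTION = {
    "order-1": "<order><customer>Ada</customer><item sku='A1'>Widget</item></order>",
    "order-2": "<order><customer>Bob</customer><item sku='B2'>Gadget</item>"
               "<item sku='A1'>Widget</item></order>",
    "note-1": "<note><to>Ada</to></note>",
}


def query_collection(collection, xpath):
    """Run one XPath expression against every document in the collection.

    Returns a list of (document key, matching element) pairs. The documents
    need not share a schema; documents without a match contribute nothing.
    """
    hits = []
    for key in sorted(collection):
        root = ET.fromstring(collection[key])
        for element in root.findall(xpath):
            hits.append((key, element))
    return hits


# Find every <item> with sku "A1", regardless of which document holds it.
results = query_collection(COLLECTION, ".//item[@sku='A1']")
matched_keys = [key for key, _ in results]
print(matched_keys)
```

The sketch parses every document on each query; the pre-parsed storage and collection indexes listed in the article exist to avoid exactly that cost.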

[November 01, 2001] "Introduction to Native XML Databases." By Kimbro Staken. From XML.com. October 31, 2001. ['The choice of storage system for XML-based applications is a crucial one. In his article, Staken explains the strengths and weaknesses of native XML stores as opposed to conventional databases. Staken's "Introduction to Native XML Databases" is the first in a three-part series that will also take in the dbXML project and the XML:DB API.'] "The term native XML database (NXD) is deceiving in many ways. In fact many so-called NXDs aren't really standalone databases at all, and don't really store the XML in true native form (i.e., text). To get a better idea of what a NXD really is, let's take a look at the NXD definition offered by the XML:DB Initiative, of which the author is a participant. A native XML database... (1) Defines a (logical) model for an XML document -- as opposed to the data in that document -- and stores and retrieves documents according to that model. At a minimum, the model must include elements, attributes, PCDATA, and document order. Examples of such models are the XPath data model, the XML Infoset, and the models implied by the DOM and the events in SAX 1.0. (2) Has an XML document as its fundamental unit of (logical) storage, just as a relational database has a row in a table as its fundamental unit of (logical) storage. (3) Is not required to have any particular underlying physical storage model. For example, it can be built on a relational, hierarchical, or object-oriented database, or use a proprietary storage format such as indexed, compressed files... An NXD can store any type of XML data, but probably isn't the best tool to use for something like an accounting system where the data is very well-defined and rigid. 
Some potential application areas include Corporate information portals, Catalog data, Manufacturing parts databases, Medical information storage, Document management systems, B2B transaction logs, and Personalization databases... NXDs aren't a panacea and they're definitely not intended to replace existing database systems. They're simply another tool for the XML developers' tool chest, and when applied in the right circumstances they can yield significant benefits. If you have lots of XML data to store, then an NXD is worth a look, and might just prove to be the right tool for the job..."

[August 24, 2001] "CPI: Constraints-Preserving Inlining Algorithm for Mapping XML DTD to Relational Schema." By Dongwon Lee and Wesley W. Chu (University of California at Los Angeles, Department of Computer Science, Los Angeles). 27 pages. To be published in Journal of Data and Knowledge Engineering. "As Extensible Markup Language (XML) is emerging as the data format of the Internet era, there are increasing needs to efficiently store and query XML data. One path to this goal is transforming XML data into relational format in order to use relational database technology. Although several transformation algorithms exist, they are incomplete in the sense that they focus only on structural aspects and ignore semantic aspects. In this paper, we present the semantic knowledge that needs to be captured during transformation to ensure a correct relational schema. Further, we show an algorithm that can: (1) derive such semantic knowledge from a given XML Document Type Definition (DTD), and (2) preserve the knowledge by representing it as semantic constraints in relational database terms. By combining existing transformation algorithms and our constraints-preserving algorithm, one can transform XML DTD to relational schema where correct semantics and behaviors are guaranteed by the preserved constraints. Experimental results are also presented... One way to query XML data is to reuse the established relational database techniques by converting and storing XML data in relational storage. Since the hierarchical XML and the flat relational data models are not fully compatible, the transformation is not a straightforward task. To this end, several XML-to-relational transformation algorithms have been proposed (Deutsch et al., 1998; Florescu and Kossmann, 1999; Shanmugasundaram et al., 1999). For instance, Shanmugasundaram et al. 
(1999) presents 3 algorithms that focus on the table level of the schema while Florescu and Kossmann (1999) studies different performance issues among 8 algorithms that focus on the attribute and value level of the schema. They all transform the given XML Document Type Definition (DTD) to relational schema. Similarly, Deutsch et al. (1998) presents a data mining-based algorithm that instead uses XML documents directly without a DTD. Although they work well for the given applications, they miss one important point. That is, the transformation algorithms only capture the structure of a DTD and ignore the hidden semantic constraints... Our experimental results reveal that constraints can be systematically preserved during the conversion from XML to relational schema. Such constraints can also be used for semantic query optimization or semantic caching... Despite the obstacles in converting from XML to relational models and vice versa, there are several practical benefits: (1) Considering the present market that is mostly dominated by RDB products, it is not easy nor practical to abandon RDB to support XML. It is very likely that industries would be reluctant to adopt the new technology if it does not support the existing RDB techniques as they were reluctant towards object-oriented database in the past. (2) By using RDB as an underlying storage system, the mature RDB techniques can be leveraged. That is, a vast number of sophisticated techniques (e.g., OLAP, Data Mining, Data Warehousing, etc.) developed for RDB can be applied to XML data with minimal changes. (3) The integration of a large amount of XML data on the Web with the legacy data in relational format is possible. We strongly believe that devising more accurate and efficient conversion methodologies between XML and relational models is very important and our CPI algorithm can serve as an enhancement for such conversion algorithms. 
The prototype of CPI algorithm is available online at http://www.cobase.cs.ucla.edu/projects/xpress/. The interested readers are welcome to experiment, improve and extend further." Also in Postscript format.

[August 11, 2001] "More XML Fundamentals. [Web Developer.]" By Michael S. Dougherty. In DB2 Magazine (Quarter 3, 2001). "In this article, I'll discuss some more advanced concepts and show how XML ties into DB2 Universal Database (UDB)... Like a database, XML involves storage, schema, query languages, and plenty of interfaces. However, XML lacks many core database tools and features, such as indexes, security, data integrity, large capacity, and triggers. When determining how to implement XML with a database, one of the first points to consider is whether the XML implementation will encompass data storage or the overall design of Web pages or will be used primarily for document management. This question is important because using XML with document management is very different than using XML with data storage retrieval. XML handles document management with the Document Object Model (DOM) using a native XML database (one designed specifically for XML storage) or a content management system (an application designed to manage documents that is built with native XML). DOM is an API for accessing content within a Web browser that is written to include information about document structure. DOM allows the developer to dynamically access and update the content, structure, and style of documents. Using the DOM is excellent for document management, but often is not necessary for data management. In spirit with the first installment of this article, we shall focus on data management... In the last article, the author described how to use XML to connect to DB2 UDB. Because DB2 is a relational database, the most common connection mechanisms include Microsoft's Open Database Connectivity (ODBC), Sun's Java Database Connectivity (JDBC), and newer hybrids such as Object Linking and Embedding (OLE). The main liability of using these class libraries, as well as those that access native database drivers, is that they are too complex for standard XML use. 
Currently, XML interfaces provide the best support for update, delete, insert, and query messages. The interface for handling multiple objects in the database will not be much more complex. Therefore, the XML classes in the development environment provide simple functionality, and may not be sufficient for the types of connectivity requirements of some applications... The primary use of relational databases like DB2 UDB 7.2 with regard to XML is to integrate XML styles, tables, and object mapping to the dynamic appearance of Web pages. When mapping data with XML and relational databases, you can choose from several options. Remember that XML is basically a hybrid similar to object databases in data modeling, so it can represent data from an RDBMS adequately. There are plenty of software products that effectively and automatically map XML objects and classes to relational database tables and directly into XML databases. Database vendors such as IBM, Microsoft, Oracle, and Sybase have developed tools to assist in converting XML documents into relational tables..."

[July 31, 2001] "XML-Native Databases. [Internet Technology.]" By David F. Carr. In InternetWorld July 15, 2001. ['With new XML query tools, some say efficiency and preserving metadata make them the way to go. But are they the only way?'] "If XML is really, truly becoming all that it's been cracked up to be, then 'XML-native' databases should have a chance of loosening the stranglehold relational databases have long had on corporate data. When an XML database vendor talks, however, it's most often an object-oriented database vendor's lips that are moving. One of the bigger fish in this small but growing pond, eXcelon, was formerly known as Object Design, for example. Object Design, maker of the ObjectStore database, has been redefined as a subsidiary of eXcelon, reflecting the latter's emphasis on its XML-centric product line. The case for XML databases is simple. Vendors talk about how much more convenient it is for XML developers to work entirely within XML and not have to worry about how their data will be mapped onto relational database tables, while XML-native developers complain about the overhead required to disassemble an XML data structure for storage, then put it back together again. 'It bridges the gap between the database structure and the way the data is actually used,' says Tim Matthews, president of an XML database startup called Ipedo. 'When you can get XML content in XML by using XML, that bridges the gap between a developer and a database administrator.'... Then there were the object-relational databases that were supposed to give us the best of both worlds. Object-relational databases were supposed to revolutionize the management of Web content. These hybrid products would be better able to organize all the arbitrary relationships between HTML pages, embedded multimedia content, and so on. The object-relational model is still very much with us: Now it's being adapted to the management of XML data. 
In much the same fashion, object-relational mapping middleware is being recast as XML middleware. The reason for the parallel evolution is that XML is essentially a way of modeling object hierarchies as marked-up text. But there are key differences. For example, XML and the tools for creating and querying XML documents are accessible to a large pool of Web developers instead of being limited to a small community of object-oriented programmers... 'None of these people have defined technically what an XML database is -- it's entirely a marketing term,' says Ronald Bourret, a freelance programmer, researcher, and writer. Nevertheless, Bourret says, you might consider an XML-native product if your application truly revolves around XML and performance is crucial, as long as the vendor can substantiate claims of superior performance. Besides maintaining an open-source middleware solution called XML-DBMS, Bourret maintains a catalog of XML data management solutions on his personal Web site at www.rpbourret.com. Alternatives to 'going native' include middleware that translates between XML and relational databases and relational database vendors' who claim that their products are now 'XML-enabled'."

[July 23, 2001] "Ixiasoft Speeds Access to XML Data. Vendor's TextML database is optimized for handling documents in XML format." By Amy Johnson. In ComputerWorld July 16, 2001. ['Niche: Native XML database that handles XML documents more efficiently than relational databases.'] "Bill Bean, vice president of business development at American LegalNet Inc., needed a high-performance database to serve up 170,000 files to more than 1 million users per month. The catch was that the Encino, Calif.-based online supplier of electronic legal forms kept its files in XML format. After some comparison trials that pitted relational databases against the TextML native XML database from Canadian start-up Ixia Inc. (known in the U.S. as Ixiasoft), the company went with the latter. Speed tests showed that the native product was at least 30% faster. 'It's fast, and it works,' says Bean, whose Web site, www.uscourtforms.com, went live in January. What makes TextML faster than a relational database, says Ixiasoft CEO Philippe Gelinas, is that it keeps information in original XML documents, rather than breaking it down into pieces and storing it in tables and cells as relational databases require. That conversion step is a significant performance drain, he says. In addition, the rigidity of relational database structures makes modifications to accommodate changes in the XML document structure a complex process. '[A native XML database] is a solid technology for managing for XML,' says Nick Wilkoff, an analyst at Forrester Research Inc. in Cambridge, Mass. XML files are designed in a hierarchical fashion, which is difficult to map to a relational database's table structure, he explains. But according to Wilkoff, the challenge for Ixiasoft is making a native XML database the preferred choice over relational databases for managing XML data. This could be a difficult idea to sell to IT departments that have a large investment in relational database infrastructure and programming skills... 
TextML runs on a Windows NT 4.0 or Windows 2000 server. Since the product relies on some features of the Windows operating system that are hard to duplicate on Unix, support for Unix is still up in the air, Wilkoff says. TextML functions as a black box, so developers must build an application around it so end users can retrieve XML data, says Gelinas. Ixiasoft supplies an application programming interface for developers to build those applications, based on Microsoft Corp.'s COM+. The product will also support Microsoft's .Net Web-based services initiative."

[July 13, 2001] "An Efficient Data Extraction and Storage Utility For XML Documents." By Ismailcem Budak Arpinar, John Miller, and Amit P. Sheth (LSDIS Lab, Computer Science Department, University of Georgia). Pages 293-295 in Proceedings of the 39th Annual ACM Southeast Conference (ACMSE'01), Athens, Georgia, March 2001. "A flexible filtering technique and data extraction mechanism for XML documents are presented. A relational database schema is created on the fly to store filtered and extracted XML elements and attributes. Building an XML based workflow process repository provides a motivation. Dynamic XML technology combined with Java reflection provides for an efficient traversal method for XML hierarchies to locate the elements/attributes to be filtered... The repository, involving the Data Extraction and Storage Utility (i.e., Extractor), has the following main capabilities: (1) Filtering of XML objects that need to be extracted, (2) Generating relational schemas for on-the-fly storage of XML documents, (3) Loading data from XML documents into relational tables, (4) Re-creating original XML documents as needed, (5) Querying, browsing, and versioning. Our scheme has superiority over other storage alternatives for XML documents in terms of practicality and flexibility. Practicality arises because of the obvious acceptance and wide use of Relational Database Management Systems (RDBMSs); flexibility is provided by selective extraction mechanism (i.e., filtering) employed by the Extractor, which is not available in similar approaches using a RDBMS. Other approaches, such as XML databases (e.g., Lore), might have superiority over our approach in terms of efficient storage and querying XML documents... Recently, XML gains a great acceptance as a data interchange format on the Web. Thus, providing storage and querying capabilities for XML attains interests of many researches. However, a broadly accepted solution is still missing. 
We believe that our approach provides for a flexible and practical solution until XML DBMSs are improved and standardized. Furthermore, the XML based workflow repository provides easy exchange of workflow process definitions between companies, and an integration tool to enable coordination of companies' business processes." Also available in PDF format.

[May 11, 2001] "Tutorial: Mapping DTDs to Databases." By Ronald Bourret. From XML.com. May 09, 2001. ['XML and database expert Ron Bourret discusses mapping DTDs to database schemas, and vice versa. In his in-depth article, Bourret discusses both table-based and object-relational mappings. The article describes best practices.'] "A common question in the XML community is how to map XML to databases. This article discusses two mappings: a table-based mapping and an object-relational (object-based) mapping. Both mappings model the data in XML documents rather than the documents themselves. This makes the mappings a good choice for data-centric documents and a poor choice for document-centric documents. The table-based mapping can't handle mixed content at all, and the object-relational mapping of mixed content is extremely inefficient. Both mappings are commonly used as the basis for software that transfers data between XML documents and databases, especially relational databases. An important characteristic in this respect is that they are bidirectional. That is, they can be used to transfer data both from XML documents to the database and from the database to XML documents. One consequence is that they are likely to be used as canonical mappings on top of which XML query languages can be built over non-XML databases. The canonical mappings will define virtual XML documents that can be queried with something like XQuery. In addition to being used to transfer data between XML documents and databases, the first part of the object-relational mapping is used in "data binding", the marshalling and unmarshalling of data between XML documents and objects... Most XML schema languages can be mapped to databases with an object-relational mapping. The exact mappings depend on the language. DDML, DCD, and XML Data Reduced schemas can be mapped in a manner almost identical to DTDs. The mappings for W3C Schemas, Relax, TREX, and SOX appear to be somewhat more complex. 
It is not clear to me that Schematron can be mapped. In the case of W3C Schemas, a complete mapping to object schemas and then to database schemas is available. Briefly, this maps complex types to classes (with complex type extension mapped to inheritance) and maps simple types to scalar data types (although many facets are lost). "All" groups are treated like unordered sequences and substitution groups are treated like choices. Finally, most identity constraints are mapped to keys. For complete details, see http://www.rpbourret.com/xml/SchemaMap.htm." See also by Ronald Bourret: (1) "XML and Databases," and (2) "XML Database Products."
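As a rough illustration of the table-based mapping and its bidirectional use, the following Python sketch loads a flat, data-centric document into a table and regenerates the document from it. It is not Bourret's software: the `customers` document, the helper names `xml_to_table` and `table_to_xml`, and the choice to type every column as TEXT are assumptions made for this example, using SQLite from the standard library.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical data-centric document: one element per table, one child
# element per row, one grandchild element per column. No mixed content.
DOC = """<customers>
  <customer><id>1</id><name>Ada</name></customer>
  <customer><id>2</id><name>Bob</name></customer>
</customers>"""


def xml_to_table(conn, xml_text):
    """XML -> database: create a table named after the root and load rows.

    Element names are interpolated into SQL, so this sketch assumes a
    trusted document whose element names are valid SQL identifiers.
    """
    root = ET.fromstring(xml_text)
    rows = list(root)
    table, row_tag = root.tag, rows[0].tag
    columns = [child.tag for child in rows[0]]
    column_defs = ", ".join(f"{name} TEXT" for name in columns)
    conn.execute(f"CREATE TABLE {table} ({column_defs})")
    placeholders = ", ".join("?" for _ in columns)
    for row in rows:
        values = [row.findtext(name) for name in columns]
        conn.execute(f"INSERT INTO {table} VALUES ({placeholders})", values)
    return table, row_tag, columns


def table_to_xml(conn, table, row_tag, columns):
    """Database -> XML: rebuild the same table/row/column element layout."""
    root = ET.Element(table)
    query = f"SELECT {', '.join(columns)} FROM {table} ORDER BY rowid"
    for record in conn.execute(query):
        row = ET.SubElement(root, row_tag)
        for name, value in zip(columns, record):
            ET.SubElement(row, name).text = value
    return root


conn = sqlite3.connect(":memory:")
table, row_tag, columns = xml_to_table(conn, DOC)
rebuilt = table_to_xml(conn, table, row_tag, columns)
round_trip = [(c.findtext("id"), c.findtext("name")) for c in rebuilt]
print(round_trip)
```

Because the mapping models the data rather than the document, whitespace, comments, and any mixed content would be lost on the round trip, which is why the tutorial calls it a poor choice for document-centric documents.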

[May 11, 2001] "DTDs and XML Documents from SQL Queries. [XML Matters #9.]" By David Mertz, Ph.D. (Bricolateur, Gnosis Software, Inc.) From IBM developerWorks. May 2001. ['This column discusses the public-domain sql2dtd and sql2xml utilities that allow RDBMS-independent generation of portable XML result sets. SQL queries that extract data from relational databases can provide very practical ad hoc document-type information for the representation of query results in XML.'] "The previous "XML Matters" column discussed some of the theory and advantages underlying various data models. One conclusion of that column was that RDBMSs are here to stay (with good reasons), and that XML is best seen in this context as a means of transporting data between various DBMSs, rather than as something to replace them. XPath and XSLT are useful for certain "data querying" purposes, but their application is far less broad and general than that of RDBMSs, and SQL, in particular. However, for lack of space, I am deferring a discussion of the specific capabilities (and limits) of XPath and XSLT until a later column. A number of recent RDBMSs, including at least DB2, Oracle, and probably others, come with built-in (or at least optional) tools for exporting XML. However, the tools discussed in this column are intended to be generic; in particular, the DTDs generated by these tools will remain identical for the same query performed against different RDBMSs. I hope this will further goals of data transparency. Simplifying too much: What you might imagine as the most obvious way to convert relational database data to XML is also generally a bad idea. That is, it would be simple enough -- conceptually and practically -- to do a table-by-table dump of all the contents of an RDBMS into corresponding XML documents... Suppose that A and B each has its own internal data storage strategy (for example, in different RDBMSs). 
Each maintains all sort of related information that is not relevant to the interaction between A and B, but they also both have some information they would like to share. Suppose, along these lines, that A needs to communicate a particular kind of data set to B on a recurrent basis. One thing A and B can do is agree that A will periodically send B a set of XML documents, each of which will conform to a DTD agreed to in advance. The specific data in one transmission will vary with time, but the validity rules have been specified in advance. Both A and B can carry out their programming, knowing the protocol between them. One way to develop this communication between A and B is to develop DTDs (or schemas) that match the specific needs of A and B. Then A will need to develop custom code to export data into the agreed DTDs from A's current RDBMS; and B will need to develop custom code to import the same data (into a differently structured database). Then, finally, the communication channel can be opened. However, a quicker way -- a way that is likely to leverage existing export/import procedures -- usually exists. The Standard Query Language (SQL) is a wonderfully compact means of expressing exactly what data interests you within an RDBMS database. Trying to bolt XML native techniques like XPath or XSLT onto a relational model will probably feel unnatural, although they can certainly express querying functions within XML's basically hierarchical model. Many organizations have already developed well-tested sets of SQL statements for achieving known tasks. Often, in fact, RDBMSs provide means for optimizing stored queries. While there are certainly cases where designing rich DTDs for data exchanges makes sense, in many or most cases, using the structuring information implicit in SQL queries as an (automatic) basis for XML data transmissions can be a good solution. 
While SQL queries can combine table data in complex ways, the result from any SQL query is a rather simple row-and-column arrangement. Query output has a fixed number of columns, with each row filling in values for every fixed column. (That is, as well as not changing in number, neither the value type nor the names of columns change within a SQL result -- even though both these things could change in XML documents.) The potential of XML to represent complex nesting patterns of elements is just simply not going to be deeply exercised in representing SQL results. Nonetheless, several important aspects of an SQL query can and should be represented in an XML DTD beyond simply row/column positions... In general, sql2dtd can generate the DTD from an SQL query but does not itself query any database. sql2xml performs queries via ODBC and optionally utilizes sql2dtd to get a DTD (or it can generate DTD-less XML documents). These tools help with only approximately half the process contemplated between A and B. A and B can quickly arrive at DTDs using these tools, and A can equally quickly generate the output XML documents conforming with these DTDs. But B, at its end, still needs to do all the work involved in parsing, storing and processing these received documents. Later columns will discuss B's job in some more detail."
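The idea of using the structure implicit in an SQL query as the basis for a DTD and an XML result set can be sketched as follows. This is not the sql2dtd or sql2xml code: unlike sql2dtd, which works from the query text alone, this simplified Python sketch reads column names from an executed query, and it uses SQLite rather than ODBC. The `parts` table, the `resultset` and `row` element names, and the `query_to_dtd_and_xml` helper are all assumptions made for illustration.

```python
import sqlite3
import xml.etree.ElementTree as ET


def query_to_dtd_and_xml(conn, sql):
    """Derive a flat DTD and an XML document from one SQL query.

    Every query result has a fixed set of named columns, so the DTD is
    simply: a result set holds rows, and each row holds one element per
    column. Assumes the column names are valid XML element names.
    """
    cursor = conn.execute(sql)
    columns = [description[0] for description in cursor.description]

    dtd_lines = [
        "<!ELEMENT resultset (row)*>",
        f"<!ELEMENT row ({', '.join(columns)})>",
    ]
    dtd_lines += [f"<!ELEMENT {name} (#PCDATA)>" for name in columns]

    root = ET.Element("resultset")
    for record in cursor:
        row = ET.SubElement(root, "row")
        for name, value in zip(columns, record):
            # NULLs become empty elements in this sketch.
            ET.SubElement(row, name).text = "" if value is None else str(value)

    return "\n".join(dtd_lines), ET.tostring(root, encoding="unicode")


# Hypothetical sample data standing in for party A's database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parts (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO parts VALUES (?, ?)",
                 [(1, "fan belt"), (2, "fan assembly")])

dtd, xml_doc = query_to_dtd_and_xml(conn, "SELECT id, name FROM parts ORDER BY id")
print(dtd)
print(xml_doc)
```

Running the same query against a different RDBMS with the same column names would yield the same DTD, which is the portability property the column emphasizes.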

[May 11, 2001] "XML Databases Offer Greater Search Capabilities." By Charles Babcock. In Interactive Week Volume 8, Number 18 (May 01, 2001), pages 11-13. "The Extensible Markup Language is emerging not only as a Web page markup standard, but as a database technology with the potential to simplify and speed future Web operations. With databases that store whole documents in their native XML format, an archive becomes easier to search by title, author, keywords or other attributes. The development will broaden information that is available over the Web and make speedy content serving more practical, database experts said. The World Wide Web Consortium (W3C) last week released its XML Schema specification, which defines how to use XML -- a larger and more useful tagging language than its predecessor, HTML. At the same time, pioneering efforts to implement XML in database systems for managing XML documents are gaining steam. Software AG leads the field with its Tamino XML Database, and 9-month-old start-up Ipedo announced its own XML Database System last week. In the meantime, relational database vendors IBM, Oracle and Sybase continue to upgrade their products to give them more XML-handling capabilities... Both Ipedo and Software AG implement their own versions of the W3C's proposed specification for the XML Query language, now known as XQuery for short. The XQuery draft specification was released Feb. 16, 2001. Once it becomes a released specification, the use of XML documents and XML databases will proliferate, experts predicted. Ipedo is trying to capitalize on speed by urging its customers to equip their database servers with a gigabyte or more of memory. The Ipedo XML Database System dispenses with many of the time-consuming input/output operations of traditional databases by having the database engine and much of the data it works with reside in main memory. 
The move adds $1,500 or more to the cost of the server on which the database resides, but augments the speed already inherent in serving XML documents from an XML database, Matthews said. Software AG of Darmstadt, Germany has sold 300 copies of its mainframe-style Tamino product since the system was launched in 1999. 'Content delivery is one of our greatest strengths,' said John Taylor, Software AG's director of product marketing for Tamino. He conceded that customers wouldn't buy an XML database primarily to manage large financial accounts. On the other hand, Taylor added, emerging query languages such as XQuery, which was co-authored by IBM and Software AG, will make it possible to query the XML database using 'keys' and retrieve related information from a variety of documents. Just as Structured Query Language queries the relational database, pulling out data related to a primary key or identifier, XQuery will be able to query a large set of documents based on the name of an author, date filed, subject or keywords in the document, Taylor said..."

[May 08, 2001] "XML Databases Gain Momentum." By L. Scott Tillett. In InternetWeek (May 07, 2001). "As companies turn to XML as a common language for conducting intercompany business and as organizations publish more content using XML, IT shops are warming up to using specialized XML databases to manage content. When XML database developer Ipedo launches this week with a repository for XML content, it will join a host of such vendors that have emerged in recent months, including B-Bop, Ixia and X-Hive. Longtime vendor Software AG has offered a native XML database product since 1999. IT services firm ProLogic Inc. began testing Ipedo's XML Database to manage content for a Defense Department project. The project focuses on digitizing technical manuals, such as those used to repair helicopters. The manuals, called interactive electronic technical manuals (IETMs), enable repair technicians to take notebook computers instead of thick repair books with them to the hangars when they work on aircraft... Storing commonly used documents in an XML database saves having to translate documents from their native formats as they're needed. That usually requires custom JavaScript code. XML databases could also help users overcome a fundamental shortcoming of relational databases made by Microsoft, Oracle and others. Because relational databases structure data in rows and columns, it's difficult to express the relationship among different data records. XML databases let data be structured hierarchically, thereby grouping documents that relate to one another, said Glenn Copen, director of application development at ProLogic. For example, the process of repairing a fan assembly may call for replacement of the fan belt first. Advocates of XML databases say relational databases work well for handling transactional data, while XML databases are better for data about multilayered processes that require context..."

[May 08, 2001] "Ipedo Unveils XML Database Breakthrough. First Main-Memory, Native-XML Database Offers Performance Advantage for Leading-Edge Internet and Wireless Applications." - "Ipedo, Inc., a leading provider of software products that ensure rapid delivery of dynamic content over the Internet, today introduced the Ipedo XML Database, a breakthrough product that is the first to combine native-XML information storage and processing with ultra-fast main-memory performance. Used standalone or in conjunction with existing databases and file systems, the Ipedo XML Database can deliver the performance levels required by the new generation of XML-intensive Web services, B2B marketplaces and wireless applications. The Ipedo XML Database simplifies XML content management, enabling e-businesses to achieve the flexibility of a dynamic and reusable XML content infrastructure without sacrificing performance. Specialized XML handling features and core performance allow companies to improve search relevance and style management for large Web sites, enable dynamic B2B portal content assembly and accelerate content customization for wireless devices. The Ipedo XML Database stores XML data natively in its structured hierarchical form, which eliminates the complex process of mapping the XML data tree structure to two-dimensional tables, substantially improving overall performance. Utilizing the W3C's XML query standard XPath, XML document collections can be queried directly in XML syntax. The Ipedo XML Database also contains an XSLT transformation engine that combines data access and transformation in a single step. The all-Java design integrates easily with the leading application servers to speed the development of next generation e-business applications. SOAP, DOM and XPath APIs address the needs of systems integrators, application developers and database administrators. 
At the core of the Ipedo XML Database is Ipedo's Active Edge architecture, a combination of sophisticated network caching and intelligent main-memory data processing techniques. Optimized for 64-bit systems, large amounts of memory can be directly utilized for processing, resulting in more than a ten-times performance boost with excellent ROI. The Ipedo XML Database is available now for Windows 2000, Windows NT, Sun Solaris and Red Hat Linux. Pricing on a per server basis starts at $50,000."

[April 20, 2001] "Ipedo XML Database." Company white paper. April, 2001. Excerpts: "The Ipedo XML Database stores XML data natively in its structured, hierarchical form. Queries can be resolved much faster because there is no need to map the XML data tree structure to tables. This preserves the hierarchy of the data and increases performance. Working alongside a relational database and file system, a native XML database adds flexibility and speed. Ipedo's sophisticated user-defined indexing allows the database to directly address any node or element of an XML document. This indexing method allows for much finer-grained access to XML document information. Integrated XML Query and Translation: The Ipedo XML Database can be queried directly using the W3C's XPath query language that was designed to retrieve information directly from XML documents. The result of an effort to provide a common syntax and semantics for functionality shared between Extensible Stylesheet Language Transformations (XSLT) and XPointer, XPath allows you to address parts of a document. It also provides basic facilities for manipulation of strings, numbers and booleans. By combining an XPath query with an XSL transformation, the Ipedo XML Database allows data access and XML transformation to be completed in one step. The result is a fast query that skips time-consuming steps and a result document already in the desired output format. Performance Caching and In-Memory Database Processing: Ipedo's XML Database is built using unique in-memory database architecture. Running in the in-memory mode, the entire database is mapped directly into virtual address space in main memory, eliminating disk access to retrieve data. This model provides direct, high-speed access to data, significantly reducing disk I/O while increasing performance.
Running Ipedo's XML Database in this mode provides data in real-time, and allows developers to focus their efforts on minimizing the number of program instructions required to perform database functions, instead of worrying about minimizing disk operations. Ipedo's XML Database includes Hot Indexing technology, which allows administrators to make memory versus performance trade-offs. Ipedo's Hot Indexing distinguishes heavily used data and loads those indexes into memory for faster access. Hot Indexing adds flexibility to in-memory data processing to yield an order of magnitude better performance than conventional systems... Ipedo, Inc. provides high-performance data delivery and management products optimized for Internet and wireless applications. Based on its Active Edge performance technology, Ipedo's products include directory, caching, and XML database servers. Ipedo's products provide rapid personalization, instant delivery, and scalable data access to very large user populations, ideal for ASPs, ISPs, Web portals, B2B exchanges, wireless services, and next-generation Internet telephony." Comments from the XML-DEV post of Samantha Cichon: "Ipedo, Inc. [is looking] for serious testers, who can offer constructive feedback, to participate in our beta program for our XML database. The Ipedo XML Database is a native XML database with fast XSLT processing, geared to optimize your system's performance. Featured in this release are: (1) Support for SOAP and HTTP Servlet; (2) Query through XPath's direct access to document; (3) Integrated XML Transformation through XSLT; (4) Persistent DOM allows access to the document object model after it has been loaded; (5) Data pre-indexed for faster queries; (6) Schema-based dynamic indexing."

[April 13, 2001] "Progress Towards Standards for XML Databases." By Maria Chinwala, Rakesh Malhotra, and John A. Miller. Pages 277-284 in Proceedings of the 39th Annual ACM Southeast Conference (ACMSE'01), Athens, Georgia (March 2001). Also in PostScript.

[April 13, 2001] "Putting XML in Context with Hierarchical, Relational, and Object-Oriented Models. [XML Matters #8.]" By David Mertz, Ph.D. (Ideationist, Gnosis Software, Inc.). From IBM developerWorks (April 2001). ['On the way to making a point about how XML is best suited to work with databases, David Mertz discusses how XML fits with hierarchical, relational, and object-oriented data modeling paradigms.'] "XML is an extremely versatile data transport format, but despite high hopes for it, XML is mediocre to poor as a data storage and access format. It is not nearly time to throw away your (SQL) relational databases that are tuned to quickly and reliably query complex data. So just what is the relationship between XML and the relational data model? And more specifically, what's a good design approach for projects that utilize both XML and relational databases -- with data transitions between the two? This column discusses how abstract theories of data models, as conceptualized by computer scientists, help us develop specific multirepresentational data flows. Future columns will look at specific code and tools to aid the transitions; this column addresses the design considerations... The problem for many XML-everywhere (and XML-only) aspirations is that at the core of an RDBMS are its relations -- in particular, the set of constraints that exists between tables. Enforcing the constraints is what makes RDBMSs so useful and powerful. While it would surely be possible to represent a constraint set in XML for purposes of communicating it, XML has no inherent mechanism for enforcing constraints of this sort (DTDs and schemas are constraints of a different, more limited sort). Without constraints, you just have data, not a data model (to slightly oversimplify matters). Some XML proponents advocate adding RDBMS-type constraints into XML; others suggest building XML into RDBMSs in some deep way. 
I believe that these are extremely bad ideas that arise mostly out of a "buzzword compliance" style of thinking. Major RDBMS vendors have spent many years of effort in getting relational matters right, and especially right in a way that maximizes performance. You cannot just quickly tack on a set of robust and reliable relational constraints to the representation in XML that, really, is closer to a different modeling paradigm. Moreover, the verbosity and formatting looseness of XML are, at heart, quite opposite to the strategies RDBMSs use to maximize performance (and, to a lesser extent, reliability), such as fixed record lengths and compact storage formats. In other words, go ahead and be excited by XML's promise of a universal data transport mechanism, but keep your backend data on something designed for it, like DB2 or Oracle (or on Postgres or MySQL for smaller-scale systems)." Article also in PDF format.

[April 03, 2001] "Find a Home For Your XML Data. The Sudden Rise of XML Puts a New Twist on The Old Problem of Data Storage." By Mark Leon. In InfoWorld (April 02, 2001). "Customers say they want them, vendors are scrambling to provide them, and opinions vary as to how to set them up correctly. They are XML databases, a way to store, search, and retrieve all that mission-critical business data that is finding expression in XML format. Currently, XML rivals HTTP, HTML, and SQL as one of the big hits on the top 10 chart of information management standards. But XML's strength, its great capability of facilitating the flow of semistructured data among applications and heterogeneous systems, also introduces several new problems. One of the more pressing problems is how to store and manage XML data. . . Microsoft, Oracle, and IBM have already added XML extensions to their relational databases, but these efforts will not satisfy everyone for a number of reasons... 'The relational database design does not easily support indexing or searching XML,' says Satish Maripuri, president and COO of eXcelon. 'An object database such as ours offers a more natural way to store, search, and retrieve XML data. This is why we took a bet with XML.' Analysts give eXcelon high marks for what it has been able to accomplish both in simplifying its product and in adapting it to XML, but few are willing to predict the bet will pay off. The reasons were the by-now-familiar advantages of XML: It is more flexible than EDI (electronic data interchange) and cheaper because it can, via the Internet, bypass expensive, private VANs (value-added networks)... the big boys of data storage have not been sitting on their hands. Oracle, IBM, and Microsoft have added XML extensions to their relational offerings... Not exactly a household name in the United States, Software AG of Germany owns a substantial share of the global database market with its Adabas product. And now the company, with U.S.
headquarters in Reston, Va., thinks it has an edge in the XML storage space. 'We released Tamino in September of 1999,' says John Taylor, director of product marketing at Software AG. 'Tamino is not a relational database, nor is it an object database modified for XML. It is, rather, a database built from the ground up specifically for XML.' The interface for Tamino is HTTP, and Taylor says his company is working with the World Wide Web Consortium (W3C) to develop the next XML query language. 'The issue of query and retrieval is key,' Taylor says. 'You can use extensions to SQL for this, but to do that you need to break the XML hierarchy into a set of relational tables. This means queries will necessarily contain a complex set of join statements. With XPath, our query language, we can replace all that with one line.' Taylor says that more than 280 customers are currently using Tamino. One of these is the California Board of Equalization in Sacramento, Calif. The board collects about $37 billion in taxes (primarily sales tax) for California. 'We started looking at XML to facilitate the electronic filing of taxes,' says Larry Hanson, data architect for the board. 'Before long we also realized XML would be the best way to store tax returns, tax schedules, and tax-related messages'..."
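Taylor's contrast between a single path expression and a set of relational joins can be illustrated with a minimal Python sketch. It uses the limited XPath subset in the standard library's `xml.etree.ElementTree`, not Tamino's query interface, and the tax-return element and attribute names are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: element and attribute names are assumptions,
# not taken from Tamino or the Board of Equalization.
DOC = """
<returns>
  <return year="2000" filer="Acme">
    <schedule name="A"><line no="1">1200.00</line></schedule>
    <schedule name="B"><line no="1">300.00</line></schedule>
  </return>
  <return year="2001" filer="Acme">
    <schedule name="A"><line no="1">1500.00</line></schedule>
  </return>
</returns>
"""

def schedule_a_amounts(xml_text):
    """Return the text of every line on a schedule named 'A'."""
    root = ET.fromstring(xml_text)
    # One path expression walks return -> schedule -> line; a shredded
    # relational layout would need a join per level of the hierarchy.
    return [line.text
            for line in root.findall("./return/schedule[@name='A']/line")]

amounts = schedule_a_amounts(DOC)
print(amounts)
```

In a shredded relational layout the same question would typically join a returns table to a schedules table to a lines table, which is the complexity the quoted passage refers to.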

[September 12, 2001] "Requirements for XML Document Database Systems." By Airi Salminen (Dept. of Computer Science and Information Systems, University of Jyväskylä, Jyväskylä, Finland) and Frank Wm. Tompa (Department of Computer Science, University of Waterloo, Waterloo, ON, Canada). Paper to be presented at ACM Symposium on Document Engineering, November 2001. 10 pages, with 52 references. "The shift from SGML to XML has created new demands for managing structured documents. Many XML documents will be transient representations for the purpose of data exchange between different types of applications, but there will also be a need for effective means to manage persistent XML data as a database. In this paper we explore requirements for an XML database management system. The purpose of the paper is not to suggest a single type of system covering all necessary features. Instead the purpose is to initiate discussion of the requirements arising from document collections, to offer a context in which to evaluate current and future solutions, and to encourage the development of proper models and systems for XML database management. Our discussion addresses issues arising from data modelling, data definition, and data manipulation... Effective means for the management of persistent XML data as a database are needed. We define an XML document database (or more generally an XML database, since every XML database must manage documents) to be a collection of XML documents and their parts, maintained by a system having capabilities to manage and control the collection itself and the information represented by that collection. It is more than merely a repository of structured documents or of semistructured data. As is true for managing other forms of data, management of persistent XML data requires capabilities to deal with data independence, integration, access rights, versions, views, integrity, redundancy, consistency, recovery, and enforcement of standards. 
A problem in applying traditional database technologies to the management of persistent XML documents lies in the special characteristics of the data, not typically found in traditional databases. Structured documents are often complex units of information, consisting of formal and natural languages, and possibly including multimedia entities. The units as a whole may be important legal or historical records. The production and processing of structured documents in an organization may create a complicated set of documents and their components, versions and variants, covering both basic data and metadata... Data model, DDL, and DML design must be coordinated if the resulting system is to be consistent. Much effort has been devoted to data definition for the purpose of validation and to query language features. We believe that now the highest priority is to define a complete data model that covers enterprise and document data, serves as a means to define conceptual schemas, and defines the mechanism to answer whether any two items of data are equivalent. We are encouraged by the move towards convergence of the XPath and XQuery data models; if convergence with the DOM and Infoset models were undertaken, a complete and stable database model might evolve. DDLs and DMLs can then be defined to include all components of the model. We believe that priority should also be given to developing mechanisms to manage collections of DTDs and other document definitions along with managing the documents themselves. This is especially important in the context of managing diverse collections of documents, each of which encompasses many versions and variants and subject to various levels of validity. The purpose of the paper is to initiate discussion of the requirements for XML databases, to offer a context in which to evaluate current and future solutions, and to encourage the development of proper models and systems for XML database management. 
A well-defined, general-purpose XML database system cannot be implemented before database researchers and developers understand the needs of document management in addition to the needs of more traditional database applications..."

[February 19, 2001] "Practical XML with Linux, Part 3. XML database tools for Linux. Hierarchical, relational, and object databases." By Uche Ogbuji (CEO and principal consultant, Fourthought, Inc.). In LinuxWorld (February 2001). ['Your stash of XML documents is probably growing exponentially. Uche Ogbuji provides an overview of database types, then surveys the wide range of tools available for storing and managing XML data stores.'] "There are almost as many uses of XML as there are XML users, but there are only two ways of looking at how XML documents are organized. XML's roots lie in SGML, which was originally conceived as a way of structuring documents for machine preparation and human consumption. XML has inherited much of that bias toward documents, and is often used for presentation-oriented publishing (POP). Examples include books, slide presentations, and company Websites. POP formats tend to have elements and text that flow in a flexible and free-form manner. XML has also gained popularity as the basis for data formats suitable for exchange between computer programs: consumed by machines but able to be inspected by humans. This is known as messaging-oriented middleware (MOM) because of its role in the infrastructure of applications. Examples include serialized objects, automated purchase orders, and Mozilla bookmark files. MOM formats tend to be highly regular, with elements making up well-defined fields with content according to strict data typing. MOM and POP formats often impose different needs on XML databases, based on the differences in usage patterns and format. We will decide whether certain Linux database technologies are more appropriate for MOM or POP documents. There are many ways of structuring databases. The relational model, used by well-known DBMSs like PostgreSQL and Oracle, is probably the most popular for new systems, but there are many other approaches. 
Databases can be: Hash-based systems; Hierarchical databases; Relational and object/relational databases; Object databases; Multi-dimensional databases; Semistructured databases... support the notion that it is impractical to have a rigid schema for data that models the real world, given the fluidity of the real world. Many of its concepts are a natural fit for XML and related technologies like the Resource Description Framework (RDF). There is a growing body of work on how to effectively manage XML data in hierarchical, relational, and object databases."

[February 08, 2001] "Object Database Management Systems." From Barry & Associates, Inc. "Object Database Management Systems (ODBMSs) are designed to work well with object programming languages such as C++ and Java. These articles provide a background on ODBMSs and their use..." [Announcement: "Object Database Technology Overview Available on the Internet. Barry & Associates, Inc. today announced publication of an extensive overview of object database management system (ODBMS) technology on the Internet. The overview includes more than 70 ODBMS articles featuring examples, definitions, and commentary. The articles are available for no cost. Doug Barry, who has been involved with ODBMS technology since 1987, prepared the articles. Mr. Barry said, 'Object databases are useful in many types of architectures. Embedded systems, financial systems, web sites using XML for B2B applications or online catalogs, telecommunications, airline reservations, and very large database applications are just a few examples where ODBMSs are being used today. ODBMSs are designed to work well with object programming languages such as Java and C++. These articles provide a background on ODBMSs and their use.' Since 1992, Barry & Associates has provided facts about database products and their use in advanced applications. They particularly focus on database product comparison and selection by providing publications and services that accelerate the decision-making process..."]

[January 30, 2001] "XML Structures for Existing Databases. Eleven rules for moving a relational database to XML." By Kevin Williams and nine other database developers (Professional XML Databases authors). From IBM developerWorks, XML library. January 2001. ['Learn how to convert an existing database into XML data in XML Structures for Existing Databases, a preview chapter from Wrox's Professional XML Databases.'] "This book chapter, excerpted from the just-published Wrox Press book Professional XML Databases, offers clear, authoritative guidance for how to deal with an existing database that you need to move to XML, from modeling the tables and keys to dealing with orphaned elements. The chapter provides an overview of the issues involved and details 11 rules for creating XML data structures for data in a relational database. The article includes suggestions for creating data structures that can be processed rapidly. Used with the permission of the publisher. In this chapter, we will examine some approaches for taking an existing relational database and moving it to XML. With much of our business data stored in relational databases, there are going to be a number of reasons why we might want to expose that data as XML: (1) Sharing business data with other systems. (2) Interoperability with incompatible systems. (3) Exposing legacy data to applications that use XML. (4) Business-to-business transactions. (5) Object persistence using XML. (6) Content syndication. Relational databases are a mature technology, which, as they have evolved, have enabled users to model complex relationships between data that they need to store. In this chapter, we will see how to model some of the complex data structures that are stored in relational databases in XML documents. To do this, we will be looking at some database structures, and then creating content models using XML DTDs. We will also show some sample content for the data in XML to illustrate this.
In the process, we will come up with a set of guidelines that will prove helpful when creating XML models for relational data... Summary: In this chapter, we've seen some guidelines for the creation of XML structures to hold data from existing relational databases. We've seen that this isn't an exact science, and that many of the decisions we will make while creating XML structures will entirely depend on the kinds of information we wish to represent in our documents. If there's one point in particular we should come away with from this chapter, it's that we need to try to represent relationships in our XML documents with containment as much as possible. XML is designed around the concept of containment -- the DOM and XSLT treat XML documents as trees, while SAX and SAX-based parsers treat them as a sequence of branch begin and end events and leaf events. The more pointing relationships we use, the more complicated the navigation of your document will be, and the more of a performance hit our processor will take -- especially if we are using SAX or a SAX-based parser. We must bear in mind as we create these structures that there are usually many XML structures that may be used to represent the same relational database data. The techniques described in this chapter should allow us to optimize our documents for rapid processing and minimum document size. Using the techniques discussed in this chapter, and the next, we should be able to easily move information between our relational database and XML documents. Here are the eleven rules we have defined for the development of XML structures from relational database structures..."
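The chapter's advice to prefer containment over pointing relationships can be made concrete with a small Python sketch. The two documents below are invented for illustration and are not taken from the book; they encode the same customer-and-orders data, once by nesting and once by ID reference.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: the same facts modeled two ways.
CONTAINED = """
<customers>
  <customer name="Ada"><order total="40"/><order total="60"/></customer>
</customers>
"""

POINTING = """
<data>
  <customer id="c1" name="Ada"/>
  <order customer="c1" total="40"/>
  <order customer="c1" total="60"/>
</data>
"""

def totals_by_containment(xml_text):
    """Orders are children of their customer, so navigation is direct."""
    root = ET.fromstring(xml_text)
    return {c.get("name"): sum(int(o.get("total")) for o in c.findall("order"))
            for c in root.findall("customer")}

def totals_by_pointing(xml_text):
    """Orders point at customers by ID, so a lookup table must be built first."""
    root = ET.fromstring(xml_text)
    names = {c.get("id"): c.get("name") for c in root.findall("customer")}
    totals = {name: 0 for name in names.values()}
    for order in root.findall("order"):
        totals[names[order.get("customer")]] += int(order.get("total"))
    return totals

print(totals_by_containment(CONTAINED))
print(totals_by_pointing(POINTING))
```

The pointing version needs the extra ID-to-name index before any order can be attributed, which is the additional navigation cost the summary warns about; a streaming (SAX-style) consumer would have to hold that index in memory as it goes.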

[November 04, 2000] "XML Enters the DBMS Arena." By Edmund X. DeJesus. In ComputerWorld (October 30, 2000). "XML is emerging as the format of choice for a variety of types of data, especially documents. With its ability to tag different fields, XML makes searching simpler and more dynamic, turning enterprise documents from recycling fodder into data mining gold. Because XML content is liberated from presentation format -- which independent style sheets specify -- XML enables the extensive reuse of material. This allows enterprises to turn the same content into press releases, white papers, brochures, presentations and Web pages. For enterprises trying to meld incompatible systems, XML can serve as a common transport technology for moving data around in a system-neutral format. In addition, XML can handle all kinds of data, including text, images and sound - and is user-extensible to handle anything special. Clearly, XML is coming into its own and seems destined to become the lingua franca of data online and off-line. The problem until now has been how to manage the XML-tagged data. One promising solution is to use databases to store, retrieve and manipulate XML. The idea is to place the XML-tagged data in a framework where searching, analysis, updating and output can proceed in a more manageable, systematic and well-understood environment. Databases have the merit that users are familiar with them and their behavior, so taming XML with a database context seems natural. However, there are XML databases and there are XML databases. Purists would contend that only databases that store XML in its native format deserve the label 'XML database.' Others contend that if you can store and retrieve XML from it, and it's a database, then it's an XML database, regardless of how the data is stored. We'll sidestep these religious battles and consider both types. If the XML isn't stored internally as XML, we'll call that an 'XML-enabled database.'
If the XML is actually stored as XML internally, we'll call it a 'native XML database.' There are a number of reasons to use existing database types, and existing database products, to store XML even if it isn't in its native form. First, ordinary relational and object-oriented databases are well known, while native XML databases are new. Second, as a result of familiarity with relational and object-oriented databases, users understand their behavior, especially with regard to performance. There is a reluctance to move to a native XML database whose characteristics - especially scalability - haven't been tested. Finally, relational and object-oriented databases are safe choices in the corporate mind. It's the old 'nobody ever got fired for buying X' rationale. You don't necessarily want to bet the enterprise on a native XML database when you don't have to. Luckily, you don't have to. There are XML-enabled databases that handle XML fine and that are based on tried-and-true relational or object-oriented models. These databases typically accept XML, parse it into chunks that fit the database schema and store it as usual. To retrieve XML, the chunks are pieced back together again..."
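The accept, parse, store and reassemble cycle described for XML-enabled databases can be sketched in a few lines of Python with the standard `sqlite3` and `xml.etree.ElementTree` modules. This is only an illustration under assumed names: the order document, the `orders` and `items` tables, and the `shred` and `reassemble` helpers are hypothetical and do not reflect any vendor's mapping.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical input document.
ORDER_XML = '<order id="17"><item sku="A1" qty="2"/><item sku="B7" qty="1"/></order>'

def shred(conn, xml_text):
    """Parse an order document and store its pieces as relational rows."""
    order = ET.fromstring(xml_text)
    order_id = int(order.get("id"))
    conn.execute("INSERT INTO orders (id) VALUES (?)", (order_id,))
    for pos, item in enumerate(order.findall("item")):
        # 'pos' records sibling order, which rows do not keep by themselves.
        conn.execute(
            "INSERT INTO items (order_id, pos, sku, qty) VALUES (?, ?, ?, ?)",
            (order_id, pos, item.get("sku"), int(item.get("qty"))),
        )
    return order_id

def reassemble(conn, order_id):
    """Piece the stored rows back together into an XML string."""
    order = ET.Element("order", {"id": str(order_id)})
    rows = conn.execute(
        "SELECT sku, qty FROM items WHERE order_id = ? ORDER BY pos", (order_id,)
    )
    for sku, qty in rows:
        ET.SubElement(order, "item", {"sku": sku, "qty": str(qty)})
    return ET.tostring(order, encoding="unicode")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY)")
conn.execute("CREATE TABLE items (order_id INTEGER, pos INTEGER, sku TEXT, qty INTEGER)")
oid = shred(conn, ORDER_XML)
rebuilt = reassemble(conn, oid)
print(rebuilt)
```

Note that the round trip preserves the data but not necessarily the exact bytes of the original document (whitespace, attribute order and empty-element style may differ), which matters more for document-centric content than for data-centric content.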

"XML representation of a relational database." By Bert Bos (W3C). 1997/07/11. "A relational database consists of a set of tables, where each table is a set of records. A record in turn is a set of fields and each field is a pair field-name/field-value. All records in a particular table have the same number of fields with the same field-names. This article describes an application of (a simple subset of) XML that can be used to represent such a database. The relational data-model also defines certain constraints on the tables and defines operations on them. We are not concerned with the constraints and operations here. In other words, we are not trying to create a query language or a data-definition language, just a language that captures the data in a database or in a particular view of the database. Several such languages are possible, of course, and it not hard to come up with alternative and equally valid ones as the one described below..."

Software: Projects, Frameworks, Packages, Products

[November 16, 2005] "IBM Previews Next-Generation DB2 "Viper" Database. Beta Testing Program Announced at XML 2005 Conference." - "In a keynote address at the XML 2005 Conference, Bob Picciano, vice president of database servers for IBM, outlined the company's plans for DB2 Viper -- the industry's first database designed with both native XML data management and relational data capability. Picciano also announced that Viper is entering open testing and evaluation by qualified customers, developers and partners. Scheduled for release in 2006, DB2 Viper is expected to lead the way to a new era in data management by creating unparalleled opportunities for users to extract value from their business information... Viper is expected to be the only database product able to seamlessly manage both conventional relational data and XML data without requiring the XML data to be reformatted or placed into a large object within the database. This breakthrough will enable customers to increase the availability, speed and versatility of their information, while dramatically reducing administrative costs associated with existing data management techniques. It also will significantly reduce the complexity and time a typical developer spends creating applications able to access both relational data and XML repositories... In addition to breaking new ground with its native XML capability, DB2 Viper also will be the first database to support all three common methods of database partitioning at the same time -- a major innovation in improving data management and information availability. By simultaneously handling range partitioning, multi-dimensional clustering and hashing, Viper will enable organizations to arrange and order their information in the way that best suits their individual business requirements and demands. Viper's native XML technology also will provide XQuery support.
XQuery is a powerful emerging industry standard language that extends XPath and is specially-designed for processing XML data. Applications can use XQuery, standard SQL or both to retrieve documents from either or both underlying storage formats..."

[July 14, 2003] Sleepycat Software Releases Berkeley DB XML Native XML Database. An announcement from Sleepycat Software Inc. describes the general availability of Berkeley DB XML, a data management solution based upon Berkeley DB. Berkeley DB XML "is designed for professional software developers who need a transactional, recoverable data manager for native XML and semi-structured data that is fast, cost-effective, and flexible. The software was validated during a 12-month beta program by more than 5,000 companies, including BEA Systems, Boeing, EDS, Leadscope and TELOS. It provides a high-performance, extremely reliable embedded database engine that stores and manages XML data. Berkeley DB XML stores and retrieves native XML documents, so no conversion to relational or object-oriented models is required. It combines XML and non-XML data in a single database, with flexible indexing features that give developers control over query performance and the ability to tune retrieval speed. The Berkeley DB XML software supports XPath 1.0 and other W3C standards for XML and XML Namespaces, accepting UTF-8 encoded documents and XPath expressions. It inherits advanced database features from Berkeley DB, including concurrent access, transactions, recovery, and replication." The source code is available for free download, testing, evaluation and development.

[November 13, 2002] "eXcelon Announces Release of XIS Lite. New Version of Native XML Database Provides Key Benefits of Full-Scale Offering at Lower Cost Point of Entry." - "eXcelon Corporation today announced the availability of XIS Lite, a fully functional version of the company's native XML database management system, eXtensible Information Server (XIS). XIS Lite is designed to enable customers with less extensive data requirements to achieve the benefits of dynamic, extensible and highly reliable data management at a lower cost of ownership. eXcelon developed XIS Lite for companies that are using XML business documents to build the foundation for web services applications and standards-based integration networks where resilience to change is required. With XIS Lite, companies can now fully and cost effectively harness the extensibility and flexibility of XML to create, audit and continuously change applications that store and manipulate limited amounts of XML data. XIS Lite Features: (1) A fully functional version of eXcelon's eXtensible Information Server (XIS), providing extensible, schema-agnostic XML storage and document and subdocument data manipulation support (e.g., updates, inserts, XPath/XQuery, XSLT transformations); (2) Limited only by database size (maximum of 500 MB total storage) and high-availability options; (3) Intended for deployment in production-quality applications; (4) Comparatively low cost to deploy and maintain; (5) Supported by Stylus Studio, eXcelon's award-winning XML/XSLT Integrated Development Environment... XIS Lite provides additional ROI to developers and architects who deploy applications on XIS Lite by providing them with the ability to re-deploy those applications on full-scale XIS without requiring modifications to application-level code. This migration path allows companies to efficiently scale their XML data management solution as their web services mature and their data storage needs grow. 
As with the full-scale XIS, XIS Lite also facilitates XML-based interoperability across .NET and J2EE environments, providing support for XML business documents as the foundation of composite business applications that integrate systems based on both J2EE and .NET architectures..." See also the XIS datasheet.

[September 30, 2002] "eXcelon Announces Latest Release of eXtensible Information Server (XIS). New Version of Native XML Database Enables Application Interoperability across .NET and J2EE Environments, Adds XQuery and Verity Text Searching Support." - "eXcelon Corporation today announced the availability of eXtensible Information Server (XIS) 3.12. XIS is a native XML database that provides extensible, reliable and recoverable management of in flight XML business documents that enable long-running business transactions connecting multiple systems and people. XIS 3.12 facilitates XML-based interoperability across .NET and J2EE environments while delivering faster, standards-based access to data in XML business documents with full XQuery support and Verity full-text search. XIS is the industry's only XML database that allows XML business documents to be dynamically extended while providing granular access to elements of information contained in the documents. This functionality lets customers create, audit and continuously change transaction gateway and web service applications in response to changing business conditions... With this release, XIS now manages XML business documents in the .NET environment, to complement existing support for J2EE environments. XIS promotes heterogeneous interoperability between the industry's two leading software platforms; companies can now use XML business documents as the foundation of composite business applications that integrate systems based on both J2EE and .NET architectures... XIS 3.12 provides improved access to critical data and business process metadata contained in XML business documents by embracing all of the XQuery use cases as specified by the World Wide Web Consortium (W3C) in April 2002 and adding support for full text searching using the Verity search engine. 
XQuery is a standards-based XML query language that provides developers with an intelligent facility for rapidly accessing, querying and auditing XML documents managed in an XML data store.. The Verity add-on allows users to quickly access the abundance of valuable corporate data residing in XML documents by providing full-text indexing and high performance querying of XML documents stored in XIS using the industry's leading full-text search engine...eXcelon Corporation is a leading provider of data management software designed to accelerate the performance of distributed applications built using XML and Java and deployed on market leading software platforms. eXcelon's core products, XIS, Javlin and ObjectStore, deliver enhanced levels of speed, flexibility, scalability and run-time availability to Web Services and distributed applications, which can result in substantial cost savings and significant competitive advantage for large and mid-sized enterprises. For corporate developers and software engineers who need to manage XML in a distributed environment, eXcelon Corporation's eXtensible Information Server (XIS) is a native XML database management system that provides development organizations with the speed and throughput necessary to manage the extreme flexibility of XML..."

[June 24, 2002] IBM Clio Tool Supports Mapping Between Relational Data and XML Schemas. Clio is a Computer Science Research project at IBM's Almaden Research Lab. Its developers are designing methods to specify the transformation of legacy data to make it fit for new uses. Clio addresses the challenge of "merging and coalescing data from multiple and diverse sources into different data formats. In particular, it addresses schema matching (the process of matching elements of a source schema with elements of a target schema) and schema mapping (the process of creating a query that maps between two disparate schemas), which lie at the heart of data integration systems. Clio is a tool for generating mappings (queries) between relational and XML Schemas. The user is presented with the structure and constraints of two schemas and is asked to draw correspondences between the parts of the schemas that represent the same real world entity. Correspondences can also be inferred by Clio and verified by the user. Given the two schemas and the set of correspondences between them, Clio can generate the (SQL, XSLT, or XQuery) queries that drive the translation of data conforming to the first (source) schema to data conforming to the second (target) schema." [Full context]

[June 18, 2002] "X-Hive Corporation Launches Version 3.0 of Its Native XML Database." - "X-Hive Corporation today announced the release of version 3.0 of X-Hive/DB, the native XML database designed for software developers who need to process and store XML data in their applications. X-Hive/DB supports all major XML standards including XML 1.0, XQuery, XPath, XPointer, XLink, XSL, XUpdate and DOM. It also offers a transaction mechanism, versioning with branching, BLOB storage and various indexing methods, as well as support for J2EE and WebDAV. Next to overall performance improvements the following features have been added or enhanced in X-Hive/DB 3.0: (1) Support for XQuery the standard query language for XML data. (2) Support for DOM Level 3 Abstract Schema enables on-the-fly schema validation. (3) Support for DOM Level 3 Load & Save for loading XML documents into DOM and saving DOM trees as XML source. (4) New indexing method: element name indexes speeds up queries with element names. (5) Improved XPath/XPointer API makes it easier to specify and execute an XPath/XPointer query. (6) Improved administrator client with new functions and updated GUI. (7) Simplified database setup with default configuration based on best practices. X-Hive/DB 3.0 is available as of today on Linux, Solaris and Windows. A free 30-day evaluation license can be obtained from the company website." See previously: "X-Hive/DB 3.0 Technology Preview Adds XQuery, XPath, and DOM Level 3 Support."

[June 05, 2002] Berkeley DB XML. Sleepycat Software has announced that "its innovative, embedded open source XML data management engine, Berkeley DB XML, is scheduled for release in the fourth quarter of 2002. Berkeley DB XML is a programmatic toolkit that specializes in the storage and retrieval of XML documents. Documents are stored in collections and queried using XPath. Berkeley DB XML will provide the first truly embedded, high-performance XML database system. Berkeley DB XML is built as a module on top of Berkeley DB, the world's most widely deployed embedded database engine, with hundreds of millions of copies in use worldwide and thousands of copies downloaded from Sleepycat's Web site daily. Berkeley DB provides application developers with robust, scalable, transactional data management services for their applications. Berkeley DB XML offers the same fast, reliable database management services that users have come to expect from Berkeley DB, coupled with native storage and query services for XML data. The combination of Berkeley DB's embedded architecture and the new XML layer allows developers to create cutting-edge Web-based applications and services with unmatched reliability, scalability and performance. The key components of the system are: (1) the XML Storage Manager, which writes native XML data to Berkeley DB for storage; (2) the XPath Query Processor, which uses the XPath 1.0 specification to parse, plan, and optimize XPath queries, and which searches the repository for matching documents; and (3) the XML Indexer, which provides a number of XML indexing strategies to support efficient expression evaluation. Sleepycat has established the Berkeley DB XML web site at http://www.sleepycat.com/xml/index.html for the user community and marketplace customers. This site includes information on early availability programs, topical news and an opt-in discussion forum..." 
[From the website and the 2002-06-03 announcement "Sleepycat Software Announces Development of First Embedded, Industrial Strength XML Database Engine."]
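
The storage model described above (documents kept in collections and selected with XPath) can be illustrated with a minimal Python sketch. It does not use the Berkeley DB XML API: the `collection` dictionary, the `query` helper, and the sample documents are hypothetical, and the standard library's ElementTree supports only a subset of XPath 1.0.

```python
import xml.etree.ElementTree as ET

# Hypothetical in-memory "collection": document name -> XML text.
collection = {
    "order1.xml": "<order id='1'><item sku='A7'>bolt</item></order>",
    "order2.xml": "<order id='2'><item sku='B2'>nut</item><item sku='A7'>bolt</item></order>",
    "note.xml": "<note>no items here</note>",
}


def query(collection, path):
    """Return the names of documents in which `path` selects at least one node."""
    hits = []
    for name, text in collection.items():
        root = ET.fromstring(text)
        if root.findall(path):  # ElementTree's limited XPath subset
            hits.append(name)
    return sorted(hits)


# Documents that contain an item with SKU A7.
matching = query(collection, ".//item[@sku='A7']")
```

This sketch reparses and scans every document on each query. The XML Indexer mentioned in the announcement exists to avoid exactly that kind of full scan.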

[May 29, 2002] Oracle XQuery Prototype and Oracle9i Database Release 2 with SQLX and XMLType Support. A communiqué from Steve Muench reports on two XML-related announcements from Oracle. (1) In March 2002, Oracle released a Java XQuery prototype which includes a Java API to XQuery (JXQI) and a command-line interface. This technical preview implementation of the W3C XQuery language with Oracle specific extensions features support "focusing on the 'R' (Relational Data) and the 'XMP' (Experiences and Exemplars) XQuery use cases; it also features an experimental JDBC-style Java API for XQuery as well as a sql() function for using XQuery over SQL query results." Oracle's goal ultimately is to "provide both a SQL-flavored and an XQuery-based query syntax for XML content in Oracle leveraging the same underlying database engine via appropriate query rewriting." (2) Oracle has also announced Oracle9i Release 2, offering significant new "native database support for XML. The new Oracle9i Database Release 2 provides a high-performance, native XML storage and retrieval technology available within Oracle9i Release 2; it fully absorbs the W3C XML data model into the Oracle9i Database, and provides new standard access methods for navigating and querying XML." Enhanced support includes XMLType and related native XML data-management features as well as XML Repository and XML-based content-management features. [Full context]

[March 21, 2002] X-Hive/DB 3.0 Technology Preview Adds XQuery, XPath, and DOM Level 3 Support. A communiqué from Irsan Widarto (X-Hive Corporation) announces the release of a Technology Preview for X-Hive/DB 3.0. The new release adds support for XQuery 1.0, XPath 2.0, DOM Level 3 Load and Save, and DOM Level 3 Abstract Schemas. X-Hive/DB also supports XML 1.0 with Namespaces, DOM Level 2, XPath, XPointer, XLink, XSL-T, XSL-FO, and XUpdate. Using native XML database technology, X-Hive/DB stores XML documents in parsed form, eliminating translations between XML document structure and the database schema. X-Hive/DB includes an XLink-compliant 'intelligent-linking engine'. "With X-Hive/DB's XLink engine you are able to offer support for bi-directional links, link-bases and link management. In conjunction with the XPointer implementation, X-Hive offers a pure XML based link Processor. Within X-Hive/DB documents can be stored as versionable documents. X-Hive/DB offers linear versioning with support for branching. X-Hive/DB calculates differences between documents and stores the calculated delta efficiently in an XML format. Branching allows you to concurrently maintain different threads of versioned documents." Earlier in 2002, the company added X-Hive/DB support for WebDAV, J2EE, and XUpdate. X-Hive's WebDAV implementation "allows end users to access, create, and manage X-Hive/DB collections and documents directly from WebDAV-compliant desktop applications." [Full context]

[February 28, 2002] "Apache Xindice XML database 1.0rc2 Released." Announcement posted by Kimbro Staken. "The Apache Xindice team is pleased to announce the release of Apache Xindice 1.0 release candidate 2. Full source code is available under the terms of the Apache Software License and downloads are available from http://xml.apache.org/xindice. Apache Xindice is a native XML database. As such it has basically one purpose, easy management of large quantities of XML data. It is not intended as a competitor for relational databases and is primarily targeted at new application development where XML plays a significant role. The server is currently suitable for medium volume XML storage applications. It supports XPath for queries and XML:DB XUpdate for XML updates. An implementation of the XML:DB XML database API is provided for Java developers and access from other languages is enabled through the download of an XML-RPC plugin. Apache Xindice was formerly known as dbXML. The dbXML source code was donated to the Apache Software Foundation in December 2001. The 1.0 release of Xindice represents the conclusion of the work undertaken by the dbXML project and the official commencement of new development on the Xindice code base. The development team has added two new members and it is expected we'll add several more in the coming weeks. Future development will focus on improved performance, ACID properties, better standards support and better integration with other Apache projects..."

[August 24, 2001] Toronto XML Server (ToX) Provides Repository for Real and Virtual XML Documents. ToX (The Toronto XML Engine) is a research project of the Database Group in the Department of Computer Science at the University of Toronto. The Toronto XML Server is "a repository for XML data and metadata, which supports real and virtual XML documents. Real documents are stored as files or mapped into relational or object databases, depending on their structuredness; indices are defined according to the storage method used. Virtual documents can be remote documents, defined as arbitrary WebOQL queries, or views, defined as queries over documents registered in the system. The system catalog contains metadata for the documents, especially their schemata, used for query processing and optimization. Queries can range over both the catalog and the documents, and multiple query languages are supported." [Full context]

[June 22, 2001] Extensible Information Server (XIS). Re: XML databases. "eXcelon Corporation provides the Extensible Information Server (XIS), a component of the eXcelon XML Platform. XIS is a native XML database and was the first one to market in September of 1998. Our new version 3.0 has been tested with millions of documents and hundreds of gigabytes of data and is a fully transactional database supporting incremental updates and capable of performing over 1200 XPath queries per second. We also have a highly scalable and fast XSLT transformation engine built right into the server. We have over 300 customers including Global 2000 corporations like NTT DoCoMo, SwissRe Life and Health, and Siemens. We also have award-winning XML development tools including our latest offering, Stylus Studio 3.0." Contact: Christopher Parkerson.

"eXist is an Open Source native XML database with pluggable storage backends and support for fulltext search. XML is either stored in the internal, native XML-DB or an external RDBMS. The search engine has been designed to provide fast XPath queries, using indexes for all element, text and attribute nodes. eXist is lightweight and well suited for large document collections. The server is accessible through easy to use HTTP and XML-RPC interfaces... As of version 0.5, there are two different backends... For both backends, the basic model employed for storing and retrieving XML documents is the same. There's no direct link between the XPath processor and - for example - the SQL statements used to retrieve data in the relational backend. The XPath processor does his job only through calls to the storage backend and it does not matter how this backend is implemented. During parsing, a DOM-Tree is build from SAX-events. The hierarchy of DOM-Nodes is then mapped into a system of unique identifiers, which are used to generate the index structure. Documents may be large, although some issues like updating docs may be better solved with many smaller documents. The eXist server may be accessed through either HTTP requests or an XML-RPC interface (Perl and Python examples are provided). eXist integrates well with Cocoon2. It is easily possible to retrieve documents dynamically out of the database instead of the file system and transform them using Cocoon2."

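The eXist description above says that node hierarchies built from SAX events are mapped to unique identifiers, which then drive the index. A minimal Python sketch of that idea follows. The numbering scheme (sequential ids in document order) and the names `IndexingHandler`, `build_index`, and `children_named` are assumptions made for illustration; the quoted text does not specify eXist's actual scheme.

```python
import xml.sax
from collections import defaultdict


class IndexingHandler(xml.sax.ContentHandler):
    """Assign an id to each element as SAX events arrive and index ids by name."""

    def __init__(self):
        super().__init__()
        self.next_id = 1
        self.stack = []                    # ids of currently open elements
        self.parent = {}                   # node id -> parent id (None for the root)
        self.by_name = defaultdict(list)   # element name -> node ids

    def startElement(self, name, attrs):
        node_id = self.next_id
        self.next_id += 1
        self.parent[node_id] = self.stack[-1] if self.stack else None
        self.by_name[name].append(node_id)
        self.stack.append(node_id)

    def endElement(self, name):
        self.stack.pop()


def build_index(xml_text):
    handler = IndexingHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler


def children_named(index, parent_name, child_name):
    """Answer a path step like parent_name/child_name from the index alone."""
    parents = set(index.by_name[parent_name])
    return [n for n in index.by_name[child_name] if index.parent[n] in parents]
```

For example, `children_named(build_index(doc), "act", "scene")` resolves the step `act/scene` by joining two id lists, without walking the document tree again.
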
[June 22, 2001] XML Agent. "...isn't open source, but it can be used freely for non-commercial purposes. It was built in Delphi, if that intrigues you. It's rather experimental, being a prototype for our full scale xml database product. (Unfortunately, the full scale product is not yet quite ready for release outside of Japan -- trademark issues and other legal stuff.)... If our XML Agent looks useful, go ahead and download it and experiment with it." Contact: Joel Rees

[February 20, 2001] "X-Hive Corporation is an innovator in XML database technology. Its premier product, X-Hive/DB, enables developers of XML applications to store, search and retrieve XML documents in a fast and scalable manner. X-Hive/DB supports open standards to ensure ease of programming and effortless integration. X-Hive/DB's ability to instantly locate and retrieve the smallest element within very large documents or collections of documents, sets it apart from alternative XML storage solutions and makes it the ideal database for publishers and developers of content management systems. X-Hive Corporation is an active member of the World Wide Web Consortium (W3C) and is based in Rotterdam, The Netherlands." See the November 2000 product announcement. [From Frans van Gils, 20 Feb 2001]

[November 17, 2000] IDOOX XDB: XML Database. Miloslav Nic announced the pre-release publication of XDB: XML Database. "XDB is an XML document repository providing structured storage of XML data, at present using an RDBMS (Relational Database Management System) mapping over PostgreSQL. As the first step, our plan is to develop a lightweight XML persistent storage engine on top of a relational database backend to come up with a UI and API in short time and replace it by our native XML storage system in the second step to satisfy complex XML processing requirements. XDB intention is to offer a fast, reliable and scalable XML database framework with powerful querying techniques according to W3C standards (XPath, XML Query) and standard XML processing APIs (SAX, DOM)... the main purpose of XDB is to provide native storage of XML data. RDBMS is not the target, but just temporal method which will be replaced by dedicated storage within couple of months. Principal features: (1) Ability to store and process large collections of XML documents; (2) Stores any well-formed document; (3) Provides SAX interface; (4) RDBMS mapping of XML documents; (5) Access via XPath based query language; (6) Independence on database system." See also the associated white paper.

[November 06, 2000] DBDOM "bridging XML and relational databases... ITER is a small international group of software developers developing cutting edge XML database solutions. We are currently working on DBDOM, a product that will allow using any relational database as an XML application server. FAQs are available." Description: "DBDOM is an implementation of DOM in SQL, on a relational database platform. DBDOM turns an RDBMS into an XML application server. [What is the difference between DBDOM and Sun's, Microsoft's, IBM's or any number of Open Source implementations of DOM?] Standard DOM implementations keep DOM trees in RAM and use text files for persistent storage. DBDOM uses an RDBMS for both. There's nothing wrong with flat files, as long as they are small and infrequently updated by a small number of users. [But] There are five major problems with flat files, one of which is solved brilliantly by XML, and four by RDBMSs. With XML, there's no need to invent a file format for each application, and no need to write a parser for this format. The four other problems are: (1) Size/cache. As the size of the document (or the number of related documents) increases, you must implement some sort of cache, because you won't be able to keep large DOM trees in RAM and your users won't wait for a SAX parser to read the file from BOF to EOF each time. Fragmentation could be an alternative to caching, but here too, much custom code needs to be written that would keep these fragments in order. (2) Concurrent access. If you have more than one user making changes to your document(s), some mechanism (software or convention) must exist that would ensure that only one user makes changes to the document at any given moment. (3) Transactional integrity. In a networked environment, you need to ensure that any request is either processed fully, or not at all, i.e. you must be able to roll back a transaction. (4) Security and administration. 
An administrator's control over who can read, write and update what parts of an XML document are very limited. DBDOM brings together the best of the two worlds... DBDOM defines a fixed database schema with a predefined set of tables that represent XML structures and stored procedures that represent methods defined in the DOM API. DBDOM stored procedures can be accessed directly through a connection to the database, but to simplify things, we added a package of Java adapters that implements org.w3c.dom. This package encapsulates JDBC and takes care of all SQL interaction with the database." Contact: K. Ari Krupnikov.
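
The DBDOM approach described above (a fixed set of tables representing XML structures, with DOM-like operations executed in the database) can be sketched in Python with SQLite. The single `node` table, its columns, and the `store` and `child_names` helpers are hypothetical simplifications, not DBDOM's actual schema or stored procedures. The final step illustrates the transactional-integrity point by rolling back a failed update.

```python
import sqlite3
import xml.etree.ElementTree as ET


def store(conn, xml_text):
    """Shred a document into one row per element; return the root's row id."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS node ("
        " id INTEGER PRIMARY KEY, parent INTEGER, pos INTEGER,"
        " name TEXT, text TEXT)"
    )

    def insert(elem, parent, pos):
        cur = conn.execute(
            "INSERT INTO node (parent, pos, name, text) VALUES (?, ?, ?, ?)",
            (parent, pos, elem.tag, (elem.text or "").strip()),
        )
        node_id = cur.lastrowid
        for i, child in enumerate(elem):
            insert(child, node_id, i)
        return node_id

    return insert(ET.fromstring(xml_text), None, 0)


def child_names(conn, node_id):
    """Rough analogue of listing a DOM node's children in order, done in SQL."""
    rows = conn.execute(
        "SELECT name FROM node WHERE parent = ? ORDER BY pos", (node_id,)
    )
    return [r[0] for r in rows]


conn = sqlite3.connect(":memory:")
root_id = store(conn, "<book><title>XML</title><author>Ann</author></book>")
conn.commit()
names = child_names(conn, root_id)

# Transactional integrity: a failed update leaves the stored document untouched.
try:
    with conn:  # commits on success, rolls back if the body raises
        conn.execute("DELETE FROM node")
        raise RuntimeError("simulated failure mid-update")
except RuntimeError:
    pass
remaining = conn.execute("SELECT COUNT(*) FROM node").fetchone()[0]
```

Attributes, text nodes between child elements, and namespaces are left out to keep the sketch short; a fuller schema would need rows or tables for each of them.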

[November 06, 2000] Kimbro Staken recently announced the release of dbXML Core Edition Version 0.3. The dbXML Core Edition is a "data management system designed specifically for collections of XML documents and is easily embedded into existing applications, highly configurable, and openly extensible. The source code has been released under the GNU Lesser General Public License and is available online at the dbXML Group's Core Edition web site. This release updates the dbXML distribution adding many new features. (1) Initial Compressed DOM implementation. (2) Basic indexing system. (3) Server side auto-linking of database resources. (4) Experimental support for XPath querying. (5) XML Schema Compiler. (6) SOAP Support -- All dbXML XMLObjects, Procedures and stored documents are automatically exposed by the server as SOAP services. (7) Command line administration tools. (8) Better documentation and examples. The dbXML Core Edition is available for download from the website at http://www.dbxmlgroup.com/."

[October 09, 2000] See preceding entry for update. XML:DB Standards Initiative for XML Databases. A standards initiative for XML databases was announced by Kimbro Staken (Chief Technology Officer, dbXML Group L.L.C) on the SourceForge DBXML list. From the announcement: "SMB GmbH, dbXML Group L.L.C and The OpenHealth Care Group have joined together to create the XML:DB initiative. Our goal is to develop open standards for XML databases along with open source implementations of those standards. Our first project will be the development of an XML update language. It is our goal to fast track the development of this update language and a reference implementation leveraging the open source development model. All implementations will be licensed under the Apache open source license." The announcement was made on behalf of XML:DB, "an industry initiative chartered with the development of open specifications for the XML database industry. Currently all XML database vendors are forced to develop their own proprietary mechanism for managing the data stored by their product. We are concerned that, without some initiative to bring these efforts together, this will lead to considerable confusion and duplication among users and, that as a result, the opportunities that XML databases offer to the market will not be maximized. Standards will facilitate the growth of a knowledgeable work force comfortable with the use of XML database products and the tools associated with them. Current database workers assume the existence of standards for RDBMS products and will therefore expect the same to be available for XML databases. More information about XML:DB can be found on our Web site http://www.xmldb.org/. The W3C has been the primary force behind the development of XML standards and is currently in the process of specifying a standard for XML query. 
The XML:DB initiative is not a replacement for the efforts of the W3C; however, it is our feeling that the development of standards for XML databases falls outside the current charter for the W3C. In particular, the first task for XML:DB will be the development of a standard XML update language. The current specification for XML Query states that update languages will be considered in a future version of the XML Query standard. This presents a serious problem for XML database companies who require an update language that is available today." From the web site: "XML:DB is also supported by a growing list of organizations with interest in XML and XML databases. XML:DB provides a community for collaborative development of specifications for XML databases and data manipulation technologies. Along with each specification an open source reference implementation will be developed to validate the ideas put forth in the specification and to more rapidly drive acceptance of the specification in real products. XML:DB's long term goals are: (1) Development of standardized technologies for managing the data in XML Databases; (2) Contribution of reference implementations of those technologies under an Open Source License; (3) Evangelism of XML database products and technologies to raise the visibility of XML databases in the marketplace. Membership in XML:DB is free and all interested parties are invited and encouraged to participate."

DataMirror Corp.'s XML/Database application. DB/XML Vision. Also: DB/XML Transform. Among the white papers: "XML and Tree-Structured Query Technology."

RAX (Record API for XML) is "a simple, record-oriented API for XML. Provides a simple, efficient interface for processing the sort of XML often generated from databases."

The Castor Project. "Castor is the shortest path between Java objects, XML documents, SQL tables and LDAP directories. It provides Java to XML binding, Java to SQL/LDAP persistence, and then some more." See also the PPT presentation (Spring 2000).

Oracle XML Development. See especially the white papers.

PENN Database Research Group. Project summary: "One of the most exciting developments in database research has been the convergence of ideas from the document and database communities. In order to represent data with loosely defined or irregular structure, the semistructured data model has emerged as a dynamically typed data model that allows a "schema-less" description format in which the data is less constrained than is usual in database work. At the same time the document community has developed XML as a format in which more structure is added to documents in order to simplify and standardize the transmission of data via documents. It turns out that these two representations are essentially identical! The Penn database group has had a long-standing involvement in semistructured data, especially in the development of query languages, structure description, constraints, and type systems. It has also been part of the development of the XML-QL proposal and has worked on other aspects of XML." See the group's research projects and publications list.

Ozone Database Project. "The Ozone Database Project is an open initiative for the creation of an open source, Java based, object-oriented database management system. You are welcome to join the project by writing code, discussing ideas, or just using the software and giving us feedback... ozone is a fully featured, object-oriented database management system completely implemented in Java and distributed under an open source license. The ozone project aims to evolve a database system that allows developers to build pure object-oriented, pure Java database applications. Just program your Java objects and let them run in a transactional database environment. ozone includes a fully W3C compliant DOM implementation that allows you to store XML data. You can use any XML tool to provide and access these data. Support classes for Apache Xerces-J and Xalan-J are included. Besides the native API, ozone provides a ODMG 3.0 interface. Although not fully ODMG compliant it helps you to port applications to/from ozone... Although there are a lot of databases, we have had some reasons to develop ozone. It owes its existence to various reasons, the most important of which is that we needed a tool with which to model and implement complex database applications, but we did not have access to a suitable package to adapt to our needs. We also did not want to have to convert to new, package-specific methods. Because of this, we decided to develop one ourselves from scratch, according to the following requirements: (1) Natural, easy integration of database development with the main software effort; (2) Persistent objects freely available on the network; (3) Particularly powerful multiuser support; (4) Speed adequate to the task."

"XML-DBMS: Middleware for Transferring Data between XML Documents and Relational Databases." "XML-DBMS is middleware for transferring data between XML documents and relational databases. It views the XML document as a tree of objects in which element types are generally viewed as classes and attributes and PCDATA as properties of those classes. It then uses an object-relational mapping to map these objects to the database. An XML-based mapping language is used to define the view and map it to the database. XML-DBMS is available both as a set of Java packages and as a PERL module. XML-DBMS, along with its source code, is freely available for use in both commercial and non-commercial settings." See also the mapping language DTD [cache].

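The mapping that XML-DBMS describes above (element types viewed as classes, attributes and PCDATA as properties, then mapped to tables) can be sketched in Python. The `MAPPING` dictionary below is a hypothetical stand-in for the project's XML-based mapping language, and the table, column, and element names are invented for the example.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical mapping: a class-like element type -> table, its properties -> columns.
MAPPING = {
    "element": "order",
    "table": "orders",
    "columns": {
        "order_no": ("attribute", "number"),    # column filled from an attribute
        "customer": ("pcdata", "customer"),     # column filled from a PCDATA-only child
        "order_date": ("pcdata", "date"),
    },
}


def xml_to_rows(xml_text, mapping):
    """Turn each mapped element into a dict of column name -> value."""
    root = ET.fromstring(xml_text)
    rows = []
    for elem in root.iter(mapping["element"]):
        row = {}
        for column, (kind, name) in mapping["columns"].items():
            row[column] = elem.get(name) if kind == "attribute" else elem.findtext(name)
        rows.append(row)
    return rows


def load(conn, xml_text, mapping):
    """Create the mapped table if needed and insert one row per mapped element."""
    cols = list(mapping["columns"])
    conn.execute(f"CREATE TABLE IF NOT EXISTS {mapping['table']} ({', '.join(cols)})")
    placeholders = ", ".join(f":{c}" for c in cols)
    conn.executemany(
        f"INSERT INTO {mapping['table']} VALUES ({placeholders})",
        xml_to_rows(xml_text, mapping),
    )


SAMPLE = """<orders>
  <order number='1001'><customer>Acme</customer><date>2000-11-06</date></order>
  <order number='1002'><customer>Initech</customer><date>2000-11-17</date></order>
</orders>"""

conn = sqlite3.connect(":memory:")
load(conn, SAMPLE, MAPPING)
rows = conn.execute(
    "SELECT order_no, customer, order_date FROM orders ORDER BY order_no"
).fetchall()
```

Only the database-bound direction is shown. Transferring data back from tables to XML would read the same mapping in reverse.
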
SODA2 - An XML Semistructured Database System. "SODA2 (for Semistructured Object DAtabase, Version 2) is a client-server, semistructured database system which is tailor-made for managing XML information. Query processing and optimization are implemented in and executed by clients while the server is responsible for storing and retrieving objects; handling transactions, object locks, garbage collection, database backups and recovery. Object access management policies and transaction models can be changed to fit the needs of a specific application without affecting the application code. Online database backup is supported without stopping the client working with the database. A lazy object conversion approach is used for versioning. Different clients can simultaneously work with different versions of DTD."

DB2XML - Transforming relational databases into XML documents. "Written in Java, DB2XML provides three main functions: (1) Transforming the results of database queries or complete databases into XML documents or into HTML documents using XSLT stylesheets. (2) Providing attributes describing characteristics of the data (i.e. meta data). (3) Easy integration of XSLT stylesheet processors."
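
The first two functions listed above, turning query results into XML and attaching descriptive attributes, can be sketched in Python with SQLite. This is not DB2XML's output format: the element names `resultset` and `row`, the `type` and `columns` attributes, and the sample table are assumptions, and column names are assumed to be valid XML element names.

```python
import sqlite3
import xml.etree.ElementTree as ET


def query_to_xml(conn, sql, root_tag="resultset", row_tag="row"):
    """Run `sql` and return an element tree with one child element per column value."""
    cur = conn.execute(sql)
    columns = [d[0] for d in cur.description]
    root = ET.Element(root_tag, {"columns": str(len(columns))})
    for record in cur:
        row = ET.SubElement(root, row_tag)
        for name, value in zip(columns, record):
            # Metadata attribute: the Python type of the value returned by the driver.
            field = ET.SubElement(row, name, {"type": type(value).__name__})
            field.text = "" if value is None else str(value)
    return root


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE part (name TEXT, stock INTEGER)")
conn.executemany("INSERT INTO part VALUES (?, ?)", [("nut", 12), ("bolt", 40)])

doc = query_to_xml(conn, "SELECT name, stock FROM part ORDER BY name")
xml_text = ET.tostring(doc, encoding="unicode")
```

The resulting tree could then be handed to an XSLT processor to produce HTML, which corresponds to the third function in the list.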

[March 07, 2001] "Ipedo Inc. Announces the Ipedo XML Database. Industry's First Dynamic XML Database Accelerates Performance of Web and Wireless Applications." - "Ipedo, Inc., a leading provider of dynamic content delivery acceleration products, today announced beta availability of its new Ipedo XML Database. Designed to speed data delivery and transformation in Web and wireless applications, the Ipedo XML Database combines advanced XML query processing with a high-speed native XML database engine. The all-Java server includes advanced XSLT and XPath processing features. 'The Ipedo XML database offers several integration and performance advantages over conventional relational databases,' said Nick Zhang, CEO and co-founder of Ipedo. 'It enables faster Web download time, allowing instant customization of Web portals, and reducing the cost of redesigning large Web sites, while easily integrating with companies existing systems.' The Ipedo XML Database will be available in the second quarter of 2001. Ipedo develops a range of high-performance dynamic content delivery products to accelerate Internet and wireless applications. Based on its Active Edge performance technology, Ipedo's product line offers caching for a range of user profile and XML data. Ipedo's products can be used to provide rapid personalization, instant delivery, and scalable data access to very large user populations in ASPs, ISPs, Web portals, B2B exchanges, wireless services and next-generation Internet telephony."

[October 26, 2000] "XUpdate: XML database Update Language." By Andreas Laux and Lars Martin. Working Draft, Reference 2000-09-14. "This is an XML:DB Working Draft for review by all interested parties. It is a draft document and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Working Drafts as reference material or to cite them as other than 'work in progress'. . . This specification defines the syntax and semantics of XUpdate, which is a language for updating XML documents. XUpdate is designed to be used independently of any kind of implementation. An update in the XUpdate language is expressed as a well-formed XML document. XUpdate makes extensively use of the expression language defined by XPath for selecting elements for updating and for conditional processing. XUpdate is a pure descriptive language which is designed with references to the definition of XSL Transformations." The XML DTD for XUpdate is provided in the text of the specification. XUpdate is a project of 'XML:DB the Standards Initiative for XML Databases'.
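To make the idea of an update "expressed as a well-formed XML document" concrete, here is a minimal sketch of a toy interpreter for two XUpdate operations, `xupdate:update` and `xupdate:remove`. It is not a conforming implementation. The function name `apply_modifications` and the sample address data are hypothetical. The `select` expressions are evaluated with the limited XPath subset of Python's `xml.etree.ElementTree`, relative to the document root, whereas XUpdate itself relies on full XPath expressions.

```python
import xml.etree.ElementTree as ET

XU = "{http://www.xmldb.org/xupdate}"  # XUpdate namespace in Clark notation

# Hypothetical document to be updated.
DOC = """<addresses>
  <address id="1"><town>Leipzig</town></address>
  <address id="2"><town>Berlin</town></address>
</addresses>"""

# The update itself is a well-formed XML document.
MODS = """<xupdate:modifications version="1.0"
    xmlns:xupdate="http://www.xmldb.org/xupdate">
  <xupdate:update select="address[@id='1']/town">Dresden</xupdate:update>
  <xupdate:remove select="address[@id='2']"/>
</xupdate:modifications>"""


def apply_modifications(root, mods):
    """Apply xupdate:update and xupdate:remove operations to root in place."""
    for op in mods:
        # Simplification: select is an ElementTree path relative to root.
        targets = root.findall(op.get("select"))
        if op.tag == XU + "update":
            for node in targets:
                # Replace the node's content with the text of the operation.
                for child in list(node):
                    node.remove(child)
                node.text = op.text
        elif op.tag == XU + "remove":
            # ElementTree keeps no parent pointers, so build a child -> parent map.
            parents = {child: parent for parent in root.iter() for child in parent}
            for node in targets:
                parents[node].remove(node)
        else:
            raise ValueError("unsupported operation: " + op.tag)
    return root


root = ET.fromstring(DOC)
apply_modifications(root, ET.fromstring(MODS))
print(ET.tostring(root, encoding="unicode"))
```

Because the modifications are ordinary XML, they can be generated, validated, and stored with the same tools as the documents they change.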

[October 03, 2000] "Software AG Inc. Announces the Tamino XML Platform for E-Business; Offers Industry's First Comprehensive Suite of Products for XML-enabled Data Transactions and Applications." - "Software AG has introduced the industry's first complete suite of XML-enabled products for building enterprise-scale applications capable of handling complex and transaction-intensive processes. Software AG's new Tamino XML platform, which includes a native XML database, an integration broker and development tools, leverages the performance and scalability benefits of native XML technology to optimize a company's internal E-Business communication and raise the value of existing IT investments. Software AG's new Tamino XML platform consists of the following components: (1) a native XML database, (2) products that allow for the seamless integration of data, existing applications and electronic business processes (X-Node and X-Bridge) and (3) a set of development tools for XML applications (X-Studio). The Tamino XML database is the only native XML database management system available on the market today. A native XML system runs faster, more reliably and with less administration than legacy database architectures that use slow XML adapters. Considered the 'engine' of the Tamino platform, the native XML database was specifically designed to store, publish and retrieve XML elements directly in their original data structures ensuring outstanding performance. The database uses a standards-based query mechanism that also allows for sophisticated and powerful text retrieval. Furthermore, the native XML database allows systems administrators to control the whole system via one location-independent browser interface. The X-Node and X-Bridge components of the Tamino XML platform address the integration of traditional data sources into XML and the transformation and routing of XML documents between applications.
X-Node provides users with a single server view of business data residing in both the Tamino XML database and other databases, offering a real-time conversion of externally-stored data into XML. X-Bridge serves as a central communication hub for enterprise-level XML-based information exchange by analyzing and routing the content and structure of XML documents according to user-defined rules. X-Bridge supports applications built in Java and can interface with established and emerging e-business standards like Biztalk, Business Object Documents of the OAG (Open Applications Group) and SOAP (Simple Object Access Protocol). Software AG's Tamino XML platform also incorporates X-Studio, an application development tool set specifically geared to the requirements of programmers creating XML-centered applications. Tamino X-Studio allows for rapid development of scalable XML-based applications and for building XML-related documents or XSL stylesheets. Tamino X-Studio is a Java-based development environment for building applications that support heterogeneous IT systems. Software AG's Tamino XML platform is available immediately. For developers interested in evaluating the platform, Software AG will soon offer a starter kit, which allows for a time-limited evaluation of the components listed above. All Tamino XML platform products are compatible with the W3C recommendation, XML 1.0, and support multilingual documents by conforming to the Unicode standard for internationalization. Tamino protects data independence by following an open database management system philosophy, providing interfaces such as DCOM, ODBC and JDBC."

XDBM from Bowerbird Computing. "XDBM is an XML Database Manager. That is, it is an embedded database that you can include in your own software. You can also make your software use less memory with XDBM by only loading XML elements when you need them. It is not an enormous database engine, it allows individual programmes to more easily use the XML files they produce. XDBM is a lightweight database that will give you all of the advantages of searching through your XML files without the enormous size of database engines. XDBM only deals with individual XML files. It doesn't cram many XML files into a single binary file like a database engine will. This means that your customers will be able to examine and manage the XML files produced by hand if need be. Or, more likely, your customers will be able to easily write scripts to manage the files in any way that suits them. With XDBM you get: (1) Easy searching that will not only make your programmers' lives easier but also make your software much more efficient. (2) No parsing so that you will be able to open an XML file much faster. (3) On demand loading of elements so that your software will use a lot less memory by not loading every element right from the start..."
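The memory benefit of not loading every element up front can be illustrated with a streaming search over a single XML file. The sketch below is not XDBM's API and does not reproduce its "no parsing" approach: it swaps in the incremental parser `iterparse` from Python's standard `xml.etree.ElementTree` module, which still parses the file but lets non-matching records be discarded as they are read. The function name `find_first` and the sample catalog data are hypothetical.

```python
import io
import xml.etree.ElementTree as ET


def find_first(source, tag, attr, value):
    """Stream through source and return the first matching element, or None."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == tag:
            if elem.get(attr) == value:
                return elem  # stop reading as soon as a match is complete
            elem.clear()  # discard the content of non-matching records
    return None


# Hypothetical sample file, held in memory for the example.
SAMPLE = b"""<catalog>
  <item sku="a1"><name>Widget</name></item>
  <item sku="b2"><name>Gadget</name></item>
  <item sku="c3"><name>Gizmo</name></item>
</catalog>"""

hit = find_first(io.BytesIO(SAMPLE), "item", "sku", "b2")
print(hit.find("name").text)
```

The search stops at the first match, so records after it are never read, and records before it are emptied once they have been checked.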

