Intelligent Enterprise

[This local archive copy is from the official and canonical URL,; please refer to the canonical source document if possible.]


May 11, 1999, Volume 2 - Number 7

XML: The Missing Link for B2B E-Commerce

At least one of your competitors is evaluating XML right now to streamline its partner relationships beyond the wildest dreams of EDI proponents. What are you waiting for?

By David Ritter

“Applications as we know them are dead,” says Andy Roberts, vice president of technology at Bow Street Software. “They’re being replaced by Web services.” And although SAP AG and Siebel Systems Inc. probably wouldn’t state it in exactly the same terms, these major application vendors are rapidly adapting their enterprise solutions for use over the Web. But the potential for the open exchange of information directly among applications, a critical requirement for business-to-business e-commerce, is largely untapped. Bow Street and many others are betting that the Extensible Markup Language (XML) will overtake electronic data interchange (EDI) as the enabling technology that drives the transition to a more open form of electronic business.

What It Is

XML is a simpler, more elegant descendant of the much larger and more cumbersome Standard Generalized Markup Language (SGML, ISO 8879). SGML has been used since the 1970s to format and classify documents on a huge scale, predominantly in the automotive, aviation, and semiconductor industries.

Markup languages are based on human-readable text and contain two basic types of information: textual, presumably useful to someone for something, and “tags” that describe, label, format, or otherwise relate to the text. The markup language provides the syntax rules for putting the tags into the document. In SGML and its descendants, the tags are enclosed in angle brackets:

<EXAMPLE> Here’s some example text. </EXAMPLE>

Tags enclose the text they describe. The end marker has the same name as the opening tag, with an initial slash character (/) added to distinguish it. It should be apparent to any reader, human or machine, that the sentence above is an example. This simple combination of data and metadata is the essence of XML.
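
To make the split between data and metadata concrete, here is a minimal sketch that reads the EXAMPLE element with Python's standard-library parser. The choice of Python and the variable names are assumptions made for illustration; nothing in XML ties it to a particular language.

```python
import xml.etree.ElementTree as ET

# The one-line document from the text above.
document = "<EXAMPLE> Here's some example text. </EXAMPLE>"

element = ET.fromstring(document)
tag_name = element.tag        # the metadata: what the text is
text = element.text.strip()   # the data: the text itself

print(f"{tag_name}: {text}")
```

The program never needs to guess what the sentence is; the tag says so.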

Who defines the tags? With XML, that’s the good part; anyone can define new tags. In fact, you can just invent them as you go. You don’t have to define the XML document structure up front; XML is ultimately flexible. But for business applications, it’s important to enforce rules and recognize bad data. Document Type Definitions (DTDs) and XML parsers provide the basis for ensuring information integrity in XML documents. A DTD defines the XML tags you should use in a particular class of documents. It’s essentially a vocabulary and simple grammar for the tags to describe information. For example, a DTD entry for the EXAMPLE element might look like this:

<!ELEMENT EXAMPLE (#PCDATA)>

!ELEMENT is recognized by XML as a tag definition. The element name is followed by a description of the type of data that’s allowed to appear within the element. In this case, #PCDATA indicates that any string of characters is allowed as part of an EXAMPLE element. DTDs can specify the rules for groups of related elements, nested structures, repeating and optional sets, and other requirements. (See the sidebar “The Play’s the Thing.”)

An XML parser is similar to the grammar-checking software in your word processor. The parser can determine whether an XML document has proper syntax. Just as the grammar checker verifies the proper order and relationships of nouns, verbs, and other parts of speech, XML elements must appear within the structure described by the DTD. Of course, my grammar checker does nothing to ensure that what I’m writing makes any sense. Similarly, the XML parser is ignorant of semantics—the meaning of the information in the document. What does it mean for a sentence to be an EXAMPLE? This interpretation depends entirely on the reader’s intelligence and perspective or the interpreting application.
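
The grammar-checker comparison can be sketched in a few lines. One caveat: the parser in Python's standard library checks only that a document is well formed (tags balanced and properly nested); it does not validate against a DTD, which would take a validating parser. The helper name is_well_formed and the sample strings are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text: str) -> bool:
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

good = "<EXAMPLE>Here's some example text.</EXAMPLE>"
bad = "<EXAMPLE>Here's some example text.</EXAMPEL>"   # end tag misspelled
nonsense = "<EXAMPLE>Colorless green ideas sleep furiously.</EXAMPLE>"

for label, text in (("good", good), ("bad", bad), ("nonsense", nonsense)):
    print(label, is_well_formed(text))
```

The parser rejects the misspelled end tag but happily accepts the nonsense sentence: syntax is its business, meaning is not.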

SGML is also the parent of Hypertext Markup Language (HTML), the page-formatting language that sparked the growth of the Web. HTML has specific tags for particular formatting features, such as titles, headings, or columns. HTML tag interpretations are hard-coded into browser software, making HTML inflexible and difficult to extend without endless debate among browser developers. Here’s a key distinction: In XML, the tags describe what the information is; in HTML, the tags describe how to display the information. Understanding the content’s nature lets XML documents serve multiple purposes.

Archaeologists tell us that more lowly and generalized species survive more often than complex, specialized animals. Given its openness and flexibility, XML will probably engulf and devour HTML. In future Web applications, HTML may simply be one specific DTD for describing page formatting, and HTML content will be contained within more general XML documents. HTML as we know it today becomes a dangling branch on the evolutionary language tree.

Down to Business

It’s a short jump from our simple example to a more meaningful business application. For any paper form or application screen used in your business, constructing an XML data representation is straightforward. Here’s an order with XML encoding placed by a bookstore to a distributor. This particular format is derived from an existing EDI standard:

	<?xml version="1.0" encoding="ISO-8859-1" standalone="no"?>
	<?xml-stylesheet href="edi-lite.xsl"
	<Title>EDItEUR Lite-EDI Book
	<Order-Line Reference-No="0528835">
	<Author-Title>Bryan, Martin/SGML and HTML Explained</Author-Title>
	<Order-Line Reference-No="0528836">
	<Author-Title>Light, Richard/Presenting

This order contains two line items. The publishers’ most common stock-keeping unit tag, an ISBN number, identifies each item. As in HTML, individual tags can carry parameters (attributes), such as the “Reference-No” attribute associated with the Order-Line tag. Note the nesting of the ISBN, Author-Title, and Quantity tags as “children” within the Order-Line “parent.” It’s not rocket science, and that’s the good news.
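
As a rough illustration of how an application might read such an order, the sketch below parses a small, well-formed order and pulls out each line item. The Book-Order root element name, the placeholder ISBN values, the quantities, and the completed second title are all assumptions made for this example; they are not taken from the EDItEUR format or from the order shown above.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed order. ISBNs and quantities are placeholders.
ORDER_XML = """<Book-Order>
  <Order-Line Reference-No="0528835">
    <ISBN>PLACEHOLDER-ISBN-1</ISBN>
    <Author-Title>Bryan, Martin/SGML and HTML Explained</Author-Title>
    <Quantity>1</Quantity>
  </Order-Line>
  <Order-Line Reference-No="0528836">
    <ISBN>PLACEHOLDER-ISBN-2</ISBN>
    <Author-Title>Light, Richard/Presenting XML</Author-Title>
    <Quantity>2</Quantity>
  </Order-Line>
</Book-Order>"""

root = ET.fromstring(ORDER_XML)

# Each Order-Line parent carries an attribute and three child elements.
order_lines = [
    {
        "reference": line.get("Reference-No"),
        "isbn": line.findtext("ISBN"),
        "author_title": line.findtext("Author-Title"),
        "quantity": int(line.findtext("Quantity")),
    }
    for line in root.findall("Order-Line")
]

total_copies = sum(item["quantity"] for item in order_lines)
print(len(order_lines), "line items,", total_copies, "copies")
```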

Why It’s Important

Why does XML have the potential to reshape information management? There are a few primary reasons:

Simplicity. The uncomplicated structure of the language allows low-cost development and deployment.

Combining data and metadata. Similar to the way object-oriented languages such as C++ let you combine logic and data, XML lets you combine structure and data. This lets the same data set serve a variety of clients, from structured database searches to graphical formatting in Web pages.

Basic agreement. Older standards such as SGML and EDI are “brittle”; any disagreement in format causes the entire communication process to break down. In contrast, XML applications ignore unrecognized elements benignly. This seemingly small distinction is one of XML’s greatest strengths. XML lets you create layered standards where many participants may agree on a core set of elements without restricting participants’ ability to add their own custom extensions.

Wide adoption. The next generation of the major Web browsers (most notably, version 5 of both Microsoft Internet Explorer and Netscape Communications Corp.’s Navigator) has robust, built-in XML support. Major and minor software vendors are announcing XML support daily, most notably Microsoft, with its e-commerce strategy based on XML. Many vertical industries have begun initiatives to create standards for data exchange based on XML.
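
The “basic agreement” point above is easy to demonstrate. In this sketch, a receiving application reads only the core elements it knows about, so a partner's custom extension passes through harmlessly. The Gift-Wrap element, the field values, and the function name read_core_fields are hypothetical.

```python
import xml.etree.ElementTree as ET

# The core vocabulary that every participant is assumed to agree on.
CORE_FIELDS = ("ISBN", "Author-Title", "Quantity")

def read_core_fields(order_line_xml: str) -> dict:
    """Extract the agreed core elements and ignore anything else."""
    line = ET.fromstring(order_line_xml)
    return {name: line.findtext(name) for name in CORE_FIELDS}

plain = (
    "<Order-Line><ISBN>X1</ISBN>"
    "<Author-Title>A/B</Author-Title><Quantity>1</Quantity></Order-Line>"
)
# Same order line, plus a custom element this receiver has never seen.
extended = (
    "<Order-Line><ISBN>X1</ISBN><Gift-Wrap>yes</Gift-Wrap>"
    "<Author-Title>A/B</Author-Title><Quantity>1</Quantity></Order-Line>"
)

print(read_core_fields(plain) == read_core_fields(extended))
```

A fixed-position format would typically choke on the extra field; here the receiver simply does not ask for it.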

Table 1. Layers being built on top of basic XML.
Standard: Description
Namespaces: Allows the same document to use tags with the same name from different DTD vocabularies.
XLink: Powerful hypertext capabilities, such as links to collections, automatic link traversal, and the construction of composite documents.
Extensible Style Sheets (XSL): General output formatting for XML documents. Based on the Document Style Semantics and Specification Language, XSL encompasses the functionality of the earlier Cascading Style Sheet (CSS) HTML standard.
Resource Description Framework (RDF): Structures information about the content of Web sites, so that search engines, agents, and filtering programs can more effectively find (or avoid) specific information.
XML-Data: Schema definition in XML, bypassing the need for a separate DTD syntax. Probably most useful for representing relational database structures in XML documents.
Document Object Model (DOM): Allows the contents of an XML document to be manipulated by program code, such as Java or JavaScript. Says one industry observer, “XML and the DOM finally give Java something to do.”
XML Metadata Interchange (XMI): Allows the exchange of software development repository information, especially object definitions. Sponsored by the Object Management Group (OMG).
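
To illustrate the Namespaces entry in Table 1, here is a minimal sketch in which one document uses a Title tag from two different vocabularies. The example.com namespace URIs, the prefixes, and the sample values are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Two vocabularies both define "Title"; the namespace prefixes keep them apart.
DOC = """<order xmlns:book="http://example.com/book"
       xmlns:person="http://example.com/person">
  <book:Title>SGML and HTML Explained</book:Title>
  <person:Title>Dr.</person:Title>
</order>"""

# Prefix-to-URI mapping used for lookups (hypothetical URIs).
NS = {
    "book": "http://example.com/book",
    "person": "http://example.com/person",
}

root = ET.fromstring(DOC)
book_title = root.findtext("book:Title", namespaces=NS)
person_title = root.findtext("person:Title", namespaces=NS)

print(book_title, "/", person_title)
```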

The Next Layer

Even with all these advantages, basic XML isn’t enough. As we’ve discussed, it’s the semantics of the message that matter. XML immediately begs for the next layer of standards. General (or horizontal) capabilities apply broadly to information management or presentation. These standards are independent of specific industries or application categories. Table 1 summarizes the key general layers currently being built on top of basic XML. (Extensive information on each is available through the Web resources listed at Intelligent Enterprise Online.) These horizontal standards give Web technologies much broader application. For example, Web sites that incorporate resource description framework (RDF) information will be much more readily searchable. The document object model will improve the user interface experience in the browser by making Web content more dynamic. The XML Metadata Interchange (XMI) specification will facilitate creating common management tools for software repositories, much as the Simple Network Management Protocol (SNMP) allowed for the creation of common network administration tools.
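
As a small illustration of what the document object model makes possible, the sketch below loads an XML fragment, changes a value, and adds a new element, all from program code. It uses Python's standard-library minidom rather than Java or JavaScript, and the Note element and its content are hypothetical.

```python
from xml.dom.minidom import parseString

dom = parseString("<Order-Line><Quantity>1</Quantity></Order-Line>")

# Change the text inside an existing element.
quantity = dom.getElementsByTagName("Quantity")[0]
quantity.firstChild.data = "5"

# Create a new element and attach it to the document tree.
note = dom.createElement("Note")
note.appendChild(dom.createTextNode("rush"))
dom.documentElement.appendChild(note)

updated = dom.documentElement.toxml()
print(updated)
```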

In contrast, specific (or vertical) standards are emerging in individual industry segments. These new formats move data among applications directly. The transfer may occur between two systems within a company, or across company boundaries, creating new levels of automation and openness.

Vertical exchange formats challenge the role of traditional EDI. Many companies haven’t implemented EDI because of its high cost and complexity. But in enterprises that already use EDI, IT will only adopt the new technology gradually as it replaces legacy systems. It’s not likely that established EDI users will convert to XML without a compelling reason. In fact, migrating EDI traffic from expensive, value-added networks (VANs) to cheaper Internet connections will prolong the lives of many legacy EDI systems.

Who’s Driving?

Flexibility and openness are great. Just look at what they’ve done for Europe since the end of the Cold War. Germany is reunited, and Poland’s economy is growing. (But let’s not talk about Yugoslavia.)

Similarly, XML has taken the lid off data exchange standards. Ponderous standards bodies such as ANSI (X12) and the United Nations (EDIFACT) controlled EDI. XML is a free-for-all. First-mover advantage is there for the taking. Will the results be a glorious reunification or bloody civil war? It’s too soon to fully answer this question, but the early signs are fairly promising.

There are three major camps participating in the development of standards based on XML:

Standards bodies. Led by the World Wide Web Consortium (W3C) and the Open Applications Group (OAG), these groups try to find the path that’s best for everyone. W3C has done an outstanding job of navigating this complex path in the past, although major vendors continue to confuse the process by introducing competing proposals. Industry representatives populate these bodies, and each delegation has its own particular ax to grind. XML’s core itself is so generic that there isn’t much to disagree about. When the discussion turns to more advanced features such as presentation styles and multimedia integration, the conversation frequently heats up.

Software vendors. Products move faster than standards. Software companies large and small see the strategic value of controlling data standards. Microsoft, past master of de facto standards, is very active in this area with proposals such as BizTalk for business transactions and XML-Data for database schema representation. Some vendors work cooperatively with the standards bodies. Others simply develop solutions and deploy them in the hopes that a critical mass of adoption will ensue.

Industry groups. Companies are beginning to see XML as an opportunity to realize some of the promise of supply-chain automation. They’re banding together into consortia and publishing their own standards. Prominent examples include the HL7 Kona initiative in healthcare, RosettaNet in the IT industry, and ECM Data for electronic components. (See a Web resource table in this feature’s online posting at www.

Who’s likely to control particular XML-based standards? For the horizontal extensions, such as style sheets or database schemas, the standards bodies are best suited to achieving common ground. Because these standards are so critical to software developers and to the Web’s health, the W3C will remain the prominent decision-making body.

Industry groups will likely determine the direction of core business-to-business data exchange. Where there’s a single dominant player, an individual company may be able to set and control the standards. Software vendors acknowledge that their ability to drive the definition of the more vertical standards is limited. “The independent vendors are the least likely to establish standards,” admits Andy Roberts of Bow Street. “Success is driven by business issues, not technical issues.” Still, major ERP vendors such as SAP AG may drive the adoption of new formats as their XML strategies become more developed. SAP, J.D. Edwards and Co., and other enterprise vendors have announced support for Microsoft’s BizTalk initiative. Catalog integration and purchasing automation standards proposed by other enterprise application companies such as Ariba Technologies Inc. are also gaining acceptance by vendors and customers. The tricky part is in translating the rosy promises of the recent press releases into the actual deployment of real applications.

How’s It Being Used?

Although the technology is new and still rapidly evolving, XML is finding a home in several different types of enterprise applications. These examples show that XML solutions are practical today, but generally require the assistance of a skilled systems integrator. The body of available tools and services is expanding daily.

Corporate portals. Companies such as DataChannel Inc. and ArborText Inc. are using these basic tools to create powerful intranets that consolidate and navigate massive legacy knowledge bases. Corporate information is often spread over many separate repositories. Tagging and transforming this information into a common format provides centralized access and control. Vendors such as Autonomy Inc. provide tools that automate document cataloging, categorization, and tagging by analyzing textual content and generating the appropriate metadata.

Application integration. Now that you’ve installed SAP, how are you going to populate the new central database from your legacy systems? Then, how is the data going to get out of SAP and onto your Web site? Integration providers such as OnDisplay and WebMethods Inc. use XML to solve these migration problems. In the process, you need to frequently process, filter, and transform the data according to a set of business rules. OnDisplay says it’s performed more than 70 such legacy adaptations and developed a toolkit of almost 200 reusable data transformation rules in the process.

Meta searching. Transformation technologies such as Junglee (recently acquired from its creators) use XML to transform a site’s information into a virtual relational database, letting you search and comparison shop across multiple sites and vendors. These Web-based tools are the precursors of autonomous agents.

Multi-vendor catalogs. Many large organizations are moving all their purchasing online. For this to work, their suppliers need to be online, too. Software vendors such as Commerce One Inc. and Ariba are offering solutions in the form of purchasing automation software and integrated access to suppliers’ catalog data. Commerce One recently purchased Veo Systems Inc., whose Common Business Library defines XML components for basic business entities and transactions, providing XML equivalents to many common formats defined in ANSI X12 EDI. Ariba has announced Commerce XML (cXML), a set of lightweight DTDs for catalog information and purchasing transactions. The Commerce XML standard also includes recommended request and response process flows for purchasing. This new, specialty branch of enterprise computing is moving rapidly to the forefront of many IT agendas. For example, Ariba recently snared a contract to automate all the purchasing for the state of California. Commerce One counts MCI and Pacific Gas & Electric among its customers.
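
The application integration scenario above (process, filter, and transform legacy data according to business rules) might look something like the following sketch. It is not any vendor's toolkit: the legacy export layout, the element names, and the three rules are invented for illustration.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical export from a legacy system, with typical inconsistencies.
LEGACY_EXPORT = """part_no,descr,qty
 a-100 ,widget,5
B-200,gadget,0
"""

# Transformation rules: (legacy column, XML element name, cleanup function).
RULES = [
    ("part_no", "Part-Number", lambda value: value.strip().upper()),
    ("descr", "Description", lambda value: value.strip().title()),
    ("qty", "Quantity", lambda value: str(int(value))),
]

def to_xml(csv_text: str) -> str:
    """Filter and transform legacy rows into a single XML catalog string."""
    catalog = ET.Element("Catalog")
    for row in csv.DictReader(io.StringIO(csv_text)):
        if int(row["qty"]) == 0:   # filter rule: skip out-of-stock items
            continue
        item = ET.SubElement(catalog, "Item")
        for column, tag, transform in RULES:
            ET.SubElement(item, tag).text = transform(row[column])
    return ET.tostring(catalog, encoding="unicode")

catalog_xml = to_xml(LEGACY_EXPORT)
print(catalog_xml)
```

Keeping the rules in a table rather than in code paths is one way such rules become reusable across many legacy adaptations.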

What to Do

Depending on your perspective, XML is either revolutionary or simply evolving. New technology companies are rapidly embracing XML as a means of managing information. But upstart software vendors and system integrators won’t be able to drive this standard into use within enterprises by themselves. XML will reach critical mass as it becomes accepted and supported by mainstream developers, large corporations, and major ERP vendors. Over time, XML will become part of the Web’s standard infrastructure.

The early XML leaders emphasize that companies should take a long-term view of the technology. They see tremendous benefits from standard, interactive data, but acknowledge that development and wide adoption will take time. But today, someone in your industry is probably hard at work on a DTD to describe the products you build or sell. Your competitors are taming their internal knowledge management dragons. Companies are getting timely, accurate data out of their SAP databases and onto their Web sites. What are you doing while all this is happening?

The Play’s the Thing

Here’s an example of XML encoding, using Shakespeare’s The Tempest. This example uses the following document type definition (DTD):

	<!-- DTD for Shakespeare J. Bosak -->
	<!-- A play consists of some required elements, such as a title
	and folio mark (FM), and other optional elements, such as a
	prologue and epilogue. At the heart of the play are one or
	more ACTs-->
	<!ELEMENT FM    (P+)>

	<!-- An ACT is primarily one or more SCENEs. Each SCENE
	contains SPEECHes, STAGEDIRections, and so on-->

A selection from the encoded play follows:

	<?xml version="1.0"?>
	<!DOCTYPE PLAY SYSTEM "play.dtd">

	<TITLE>The Tempest</TITLE>
	<FM><P>Credit to Jon Bosak for XML encoding.</P></FM>

	<TITLE>Dramatis Personae</TITLE>
	<PERSONA>PROSPERO, the right Duke of Milan.
	<PERSONA>ANTONIO, his brother, the usurping Duke of
	<PERSONA>FERDINAND, son to the King of Naples.
	<PERSONA>GONZALO, an honest old Counsellor.
	<SCNDESCR>SCENE A ship at Sea: an island.</SCNDESCR>
	<SCENE><TITLE>SCENE I. On a ship at sea: a tempestuous
	   noise of thunder and lightning heard.</TITLE>
	<STAGEDIR>Enter a Master and a Boatswain</STAGEDIR>
	<SPEAKER>Master</SPEAKER> <LINE>Boatswain!</LINE>

	   <LINE>Here, master:
	   what cheer?</LINE>
	<LINE>Good, speak to the mariners: fall to't, yarely,</LINE>
	<LINE>or we run ourselves aground: bestir, bestir.</LINE>
	<STAGEDIR>Enter Mariners</STAGEDIR>

What sort of applications could you build to take advantage of the XML encoding? Here are some examples that demonstrate how XML can let you use the same data to serve different customers:

•A formatting application could print the play as a script for use by performers. A slightly different formatter could print the play for easy reading in book format.

•A program that incorporates voice synthesis could read various parts of the play so that performers could rehearse their parts without the need for other human players.

•The holodeck of the Starship Enterprise could perform the entire play using artificially constructed, three-dimensional characters.

I know which application I’d like to write.
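
Short of a holodeck, the first two applications are within easy reach. The sketch below prints a scene as a performer's script and, separately, pulls out one speaker's lines for rehearsal. It runs on a reduced, well-formed fragment of the scene; the SPEECH wrapper elements and the Boatswain speaker tag are assumed here to stand in for markup missing from the selection above, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# A reduced, well-formed fragment of the opening scene.
SCENE_XML = """<SCENE><TITLE>SCENE I. On a ship at sea</TITLE>
<STAGEDIR>Enter a Master and a Boatswain</STAGEDIR>
<SPEECH><SPEAKER>Master</SPEAKER><LINE>Boatswain!</LINE></SPEECH>
<SPEECH><SPEAKER>Boatswain</SPEAKER><LINE>Here, master: what cheer?</LINE></SPEECH>
</SCENE>"""

def as_script(scene: ET.Element) -> str:
    """Format a scene for performers: bracketed directions, named lines."""
    out = []
    for node in scene:
        if node.tag == "TITLE":
            out.append(node.text.upper())
        elif node.tag == "STAGEDIR":
            out.append(f"[{node.text}]")
        elif node.tag == "SPEECH":
            speaker = node.findtext("SPEAKER").upper()
            for line in node.findall("LINE"):
                out.append(f"{speaker}: {line.text}")
    return "\n".join(out)

def lines_for(scene: ET.Element, speaker: str) -> list:
    """Collect one performer's lines, for example to rehearse against."""
    return [
        line.text
        for speech in scene.iter("SPEECH")
        if speech.findtext("SPEAKER") == speaker
        for line in speech.findall("LINE")
    ]

scene = ET.fromstring(SCENE_XML)
script = as_script(scene)
print(script)
print(lines_for(scene, "Master"))
```

Both functions read the same encoded scene; only the presentation differs, which is the point of tagging what the content is rather than how it looks.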




Contributing editor David Ritter is a senior IT specialist with the Boston Consulting Group. He has 18 years of software industry experience, most recently as the VP of engineering at Firefly Network. He has also been a director of engineering for Oracle, where he designed and developed Oracle’s OLAP client products. You can contact him via email at


Copyright © 1999 Miller Freeman Inc. ALL RIGHTS RESERVED
No Reproduction without permission