[This local archive copy mirrored from: http://www.strategies.com/randomxml.htm; see the canonical version of the document.]

Random Thoughts. . .

On information productivity

Regular if infrequent musings on subjects of interest to the publishing community from the folks at Information Strategies.

XML: What Will It REALLY Mean?

By Barry Schaeffer

© Copyright November 15, 1997 by Barry Schaeffer. All rights reserved.


XML is well on its way to becoming a household word, catching even the eye of Time Magazine. To be so notorious before it is fully released, much less integrated with available software and service offerings, XML must be a truly momentous development. While much has been written and said about it, XML's true impact on publishing is still only sketchily understood. Conventional wisdom holds that this developing text standard will have considerable impact on the technology of content model design, browser architecture, "parserless" operation and what have you. While these things are true, they leave most people still wondering just how much of a true sea change XML will mean in the developing electronic information world. This paper attempts a partial answer to that wonderment.


Is XML itself radically new and different? The answer, with no intent to denigrate the standard's importance, is no. XML, by its creators' design, is a linear extension of the concept of SGML. It carefully avoids adding anything significant to the earlier standard, aiming instead at achieving value by removing or simplifying several characteristics that have kept SGML from fully participating in the growing world of electronic communication embodied most clearly in the World Wide Web. By avoiding uncharted conceptual and technical ground, XML's creators have been able to formulate and publish their product as a fledgling standard in a couple of years. This contrasts markedly with the 7 to 10 years, according to the National Institute of Standards and Technology, normally required for new standards. (Remember that work on SGML, accepted as an ISO standard in 1986, was begun in 1969.)


If XML itself isn't radically new and different, one might wonder how it can possibly have a major impact on the electronic publishing community. While XML will facilitate some changes in the technical structure of publishing resources, it doesn't make possible any completely new functions. Indeed, XML doesn't require that much of anything change.

XML does, however, enable major changes that go far beyond technical issues, rearranging the very cultural relationships among providers, managers and consumers of information.

While merely enabling change, at any level, may sound less than earth-shaking, remember that, in today's highly competitive marketplace of ideas, what can be done, will be done. . . usually by one's competitors. Moreover, what can be done will first be requested and then demanded by one's consumers. So what XML makes possible will soon become the baseline for success. The trick is to stay ahead of the possibilities embodied in the standard and, conversely, to avoid becoming mired in its more readily apparent but less important characteristics.

Here are some major ways in which XML changes the playing field for electronic information:

XML allows a direct connection between the author and the final consumer of information.

Since the growth of the industrial society in the nineteenth century, those who knew something and those who wished to know and profit from it needed intermediaries to organize and deliver the knowledge to its intended recipients. Whether itinerant craftsmen, manuscript copiers, typographers and printers, or software tools and operators, these "middlemen" have made the path of knowledge transfer tortuous and expensive. Indeed, communication itself has usually been framed more by the limitations of the capture and delivery media than by the needs of the intellectual process. Gutenberg's revered invention of movable type, for example, was captive to the earlier invention of the "codex" or edge-sewn book to replace scrolls. Printing, with its "rectangular" approach to locating and using information, was an accommodation of knowledge to paper manufacture and the binder's art, not the other way around.

Through all of this, the ideal remained constant: direct, immediate and fluid communication between the expert and the consumer. One-to-one communication re-established this condition when the telephone replaced the telegraph. But since the time, long ago, when experts and consumers could no longer face one another over the problem, one-to-many communication has been struggling. Technology has made things faster and more elegant but not much more direct. Even the Web and HTML, although a boon, perpetuated the disconnected nature of the knowledge provider-consumer relationship. No matter what can be captured by the expert author, it has to be strained through the narrow opening provided by the HTML tag set before it can be of value to its consumer. What can't be expressed in HTML can't be delivered or used, and even usable data often needs radical surgery to fit the HTML medium. The pernicious nature of this situation is evident in the lengths to which publishers and vendors have gone to extend or circumvent the limits of HTML.

XML, at least functionally, allows communication to shed this baggage, delivering to the consumer the very same data originally recorded by the expert author, presented in the way the author would have envisioned it. No longer must the communication process be captive to the ability of the intermediary, either human or technological. No longer must the cost of knowledge transfer be inflated by a whole host of processing steps related to neither the expertise of the author nor the needs of the consumer. No longer must the desires and whims of publishing technology upstage the real communication goals of publishing. Knowledge transfer can be, at once, better, faster and less expensive,

…if we understand our new options and move to leverage them.
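As a rough illustration of the point above, the following minimal sketch assumes a hypothetical author-defined vocabulary (the element names procedure, title, warning and step are invented for this example and come from no published standard). It suggests how a consumer's software might read the author's own distinctions directly, rather than guessing at them from generic HTML headings and paragraphs.

```python
import xml.etree.ElementTree as ET

# Hypothetical author-defined markup: every element name here is an
# illustrative assumption, not part of any real document type.
AUTHOR_XML = """
<procedure id="p1">
  <title>Replace the filter</title>
  <warning>Disconnect power first.</warning>
  <step>Open the housing.</step>
  <step>Swap the filter.</step>
</procedure>
"""


def extract(xml_text):
    """Return the author's own categories as a plain dictionary."""
    root = ET.fromstring(xml_text.strip())
    return {
        "title": root.findtext("title"),
        "warnings": [w.text for w in root.findall("warning")],
        "steps": [s.text for s in root.findall("step")],
    }


record = extract(AUTHOR_XML)
print(record["title"])
for number, step in enumerate(record["steps"], start=1):
    print(number, step)
```

In an HTML rendering of the same material, the warning and the steps would likely all arrive as undifferentiated paragraphs or list items; here the distinction the author recorded survives to the point of use.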

Because it enables but does not require change, XML creates entirely new modes of competition among would-be communicators:

Although all developments require some change, many can be successfully incorporated with relatively little effort. New and faster processor chips, for example, need only be designed into computers and, voilà!, they are part of our reality. Other developments, and XML is one of these, present us with subtle challenges that we may choose to embrace, ignore or, worst of all, fail even to notice. If we fail to see and respond fully to the meaning of these developments, the negative results are not as immediately apparent but may be, in the end, much more serious.

With XML, for example, we can rethink and restructure the nature of our intellectual and communication processes, but nothing guarantees that we will do so. It would (and will) be possible to merely graft XML capability onto the current electronic publishing model, reaping marginal benefits in a few technical and craft-related areas but missing completely the opportunity to radically reshape the nature and effectiveness of provider-consumer interaction. By definition, those following such a path will get a "quick hit" technologically but may find themselves, over time, at a disadvantage against their more prescient competitors. By the time these disadvantages begin to show, they may be masked by the details of cost, market share, customer demands and, lamentably, the desperate self-delusion that is characteristic of losers. It is up to us to think carefully about the New World that XML shows us and to act on it in such a way that we gain not the minimum from it but the maximum. Failing that effort, we are likely to learn all too soon who was more successful at it.

If they understand what is changing, publishers can make decisions that will place their efforts in line to reap the benefits made possible by the new standard.

For example, there is a school of thought that XML is merely an "interchange" standard into which our core data, however structured, should be converted for delivery to the Web, etc. Authors will still use their existing tools to capture knowledge, allowing the production process to do the necessary conversion to XML for dissemination. While not without some merit, this approach fails completely to improve the critical relationships among what the author knows, what the consumer gets and what is required to make the process work. Communication, while marginally better, is still dependent on someone to convert the original knowledge into a deliverable resource and to keep the entire effort in harmony. This approach risks taking the ten-percent improvement and leaving the other ninety percent unrealized. Those who do it will learn their folly only when their market begins to slip away because someone else took the ninety percent. As evidence of this phenomenon, consider the publishing industry that thought, only a few short years ago, that fielding CD-ROM products required only a simple machine conversion of their existing data files created for paper publishing. Technically, they were correct, but they soon found that the intended users saw the result as different but not better, and rejected the effort. Today, most publishers understand that developing an electronic product is a different and more demanding effort…but not all survived the lesson.


XML challenges knowledge providers and consumers to participate actively in its integration into the publishing world.

Put another way, XML is too important to be left to the crowd that developed it. Having come from the depths of the SGML catacombs, it must become the child of the masses, taking its future from what sheds light on human communication, not what fits with the mindset of its creators, however brilliant they may be.


XML can be the critical element of the most important advance in the way we communicate since the page…or it can be a ho-hum change in the technology of the World Wide Web. Which will be true is up to everyone with a stake in communication. XML blazes the right-of-way but doesn't, in itself, build the track or buy the rolling stock. If, however, we can rediscover and embrace the ideal of affordable, direct, immediate and fluid communication between expert and consumer, and demand the technology required for it to happen, today's technological community can and will deliver the tools and techniques to get there. None of this will happen overnight. Long after pages could be widely distributed, society still hadn't decided that reading was a skill fit for the masses. The dividends of perseverance, today as back then, far outweigh the cost.