[Archive copy mirrored from the URL: http://www.zdnet.com/macweek/seybold97/1001/standards.html; see this canonical version of the document.]

Seybold Seminars San Francisco 97 Show Daily


September 30

SGML gains recognition, popularity in XML subset

Standards focus of session

By Erik Sherman

SGML, or Standard Generalized Markup Language, has been an esoteric realm since its introduction in 1986. SGML certainly has been put to good use by corporations in industries such as aerospace, automotive, pharmaceuticals and commercial publishing to create and maintain long technical documents that are impossible to manage in more typical desktop publishing programs. Yet for all the utility in larger businesses, the standard has received relatively little attention, especially compared with its Web-weaving cousin, HTML.

How things have changed. Today, for the first time in the history of its San Francisco show, Seybold is featuring a full-day seminar on the topic -- "SGML/XML Knowledge Day." Specialized vendors are exhibiting in the SGML Open Consortium Pavilion (Booth #3109 N), and a number of them plan to make product announcements. Microsoft Corp. (Booth #1233 S) and Netscape Communications Corp. are also at the show and are expected to demonstrate significant support for XML (eXtensible Markup Language) -- a new standard that is a subset of SGML.

Smaller is bigger
The XML subset has much to do with SGML's newfound popularity because of its potential applications, not only for printed documents but also as an adjunct to HTML.

"The biggest thing that's happening is XML," said Frank Gilbane, a director at CAP Ventures Inc., a Norwell, Mass.-based research firm. "It's designed to solve the main limitations of HTML without the complexities that a full SGML would carry with it."

HTML describes how text data will be presented on a Web page. By its nature, HTML acts like an abstracted typographic language: It indicates whether text should be displayed as a headline, article text and so on, but makes no distinction about the nature of a document's content. HTML cannot distinguish between, say, a chapter and a section that belongs within the chapter. This has little impact on simple Web sites. Complex sites, however, become much more difficult to manage because a company has no way to examine their content by type or structure.

"Right now it's hard to do any document management [in HTML] because the links are hard coded," said Robin Tomlin, executive director of the SGML Open Consortium, an SGML vendor association. "When you have [support for document structure], you can have some validations of the information, so you know that all the information you should have there exists."
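The kind of validation Tomlin describes can be sketched in a few lines. The example below is only an illustration, not any vendor's tool: the element names (manual, chapter, section, title) are invented for the purpose, and the check uses Python's standard xml.etree.ElementTree parser to report any chapter or section that lacks a title.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the element names (manual, chapter, section, title)
# are invented for this illustration and do not come from the article.
SOURCE = """
<manual>
  <chapter>
    <title>Installation</title>
    <section><title>Requirements</title><para>Text.</para></section>
  </chapter>
  <chapter>
    <section><title>Orphaned section</title><para>Text.</para></section>
  </chapter>
</manual>
"""


def missing_titles(xml_text):
    """Describe each chapter or section that has no <title> child."""
    root = ET.fromstring(xml_text)
    problems = []
    for number, chapter in enumerate(root.findall("chapter"), start=1):
        if chapter.find("title") is None:
            problems.append(f"chapter {number}: no title")
        sections = chapter.findall("section")
        for sec_number, section in enumerate(sections, start=1):
            if section.find("title") is None:
                problems.append(
                    f"chapter {number}, section {sec_number}: no title"
                )
    return problems


report = missing_titles(SOURCE)
print(report)
```

Because the markup names what each piece of content is, the check can ask a structural question ("does every chapter have a title?") that presentation-only markup gives no reliable way to pose.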

XML is like a cross between SGML and HTML. It has the flexibility developers need to define types of content and how they are related while being less complex than full-blown SGML. Since for all practical purposes it encompasses HTML, existing Web content can remain as it currently stands. Because it has most of the power of SGML, a single tagged text document could conceivably drive not only display of Web sites, but production of paper documents.

"XML is not so format-driven, but it looks that way. Once you have information identified like that, then you can do more manipulation. It's not just dumb data," Tomlin said.

"In very few cases would you see anyone creating HTML documents, then using that source data to produce paper or some other output form of the document," said Mike Maziarka, director of Parlance product management for XyVision Inc. (Booth #3314 N). Such cross production should be easily achieved with XML.

Standard variations
Even though there is a good deal of commonality between SGML and XML, the two are not the same. XML, for example, does not require the use of a document type definition (DTD), which defines a document's entire structure. While XML allows the use of a DTD, users can also define new document types on the fly.
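As a rough illustration of what working without a DTD looks like, the sketch below parses a document whose tags were made up on the spot. The tag names (recipe, name, serves) and helper functions are hypothetical; the parser is Python's standard xml.etree.ElementTree, which checks only that the tags nest and close properly and consults no DTD.

```python
import xml.etree.ElementTree as ET

# Hypothetical tags invented for this example; no DTD declares them anywhere.
RECIPE = "<recipe><name>Crab cakes</name><serves>4</serves></recipe>"
BROKEN = "<recipe><name>Crab cakes</recipe>"  # <name> is never closed


def is_well_formed(xml_text):
    """Return True if the text parses as XML; no DTD is consulted."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


def element_names(xml_text):
    """List every element name in document order, starting with the root."""
    return [element.tag for element in ET.fromstring(xml_text).iter()]


print(is_well_formed(RECIPE))
print(is_well_formed(BROKEN))
print(element_names(RECIPE))
```

The first document is accepted even though nothing defines a "recipe" type in advance; the second is rejected purely because its tags do not nest correctly.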

"The impact of XML on companies creating content is that the process of collecting and/or converting information into XML is much quicker and less expensive than for SGML. The fact that XML is not predefined allows the information to be more easily repurposed," said Gary Palmer, director of R&D for ActiveSystems Inc. (Booth #3202 N).

Similarly, XML and HTML are different, but only to the creator of a document. Users retrieving a page would not notice any difference, because an organization using XML, or even SGML, to publish information on the Web could translate the document into HTML, which the browser would then display.
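A translation step of this kind might look something like the sketch below. It is a minimal illustration under stated assumptions, not a description of any product at the show: the source element names (article, headline, byline, para) and the mapping to HTML tags are invented for the example, and the code relies only on Python's standard library.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical source document; the element names are assumptions.
ARTICLE = """<article>
  <headline>Quarterly report</headline>
  <byline>Staff writer</byline>
  <para>Sales rose in the third quarter.</para>
  <para>Browsers &amp; viewers are coming.</para>
</article>"""

# Assumed mapping from content-oriented tags to presentation-oriented HTML.
TAG_MAP = {"headline": "h1", "byline": "address", "para": "p"}


def to_html(xml_text):
    """Translate each child of the root element into an HTML element."""
    root = ET.fromstring(xml_text)
    lines = ["<html><body>"]
    for child in root:
        html_tag = TAG_MAP.get(child.tag, "div")  # unknown tags fall back to div
        text = escape(child.text or "")  # re-escape &, < and > for HTML
        lines.append(f"<{html_tag}>{text}</{html_tag}>")
    lines.append("</body></html>")
    return "\n".join(lines)


page = to_html(ARTICLE)
print(page)
```

The reader's browser sees only ordinary HTML, while the publisher keeps the structured source and could, in principle, write a second mapping for paper or another output form.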

The multiple standards tracks have split the market. The authoring market has shaken out, and what once was an entire category of high-end authoring tools has shrunk to a handful of offerings.

For XML to catch on, it needs wide vendor acceptance, especially if browsers are to support the standard. And that seems to be on the horizon, as both Netscape and Microsoft have signed on to back the SGML extravaganza at Seybold San Francisco.

"For them to be talking on SGML Knowledge Day about XML is a real leap," Tomlin said. "Typically, SGML has been a real niche standard." According to sources, the major vendor support could be impressive, with Microsoft conceivably creating an entire XML marketing group for its activities.

Gilbane said he also expects major support from the two browser companies to drive use of the standard. He noted that Microsoft is backing the use of XML to describe such Web-related activities as channels for push publishing.

Support for XML is coming from the niche vendors as well as the Internet giants. "A lot of the vendors are coming out with new XML browsers and viewers," Tomlin said. "And because XML allows you to manipulate data on the Web, I think you're also going to see data-management tools that support XML on the Web."

As an example, ArborText Inc. (Booth #3203 N) is announcing a major upgrade to its authoring system. XyVision (Booth #3314 N) is not planning any product releases with XML support for the show; it is still formulating its plans and expects to make an announcement in the future. ActiveSystems (Booth #3202 N) will demonstrate how to use its existing products to integrate SGML and XML. A number of vendors in the SGML Open Consortium are exhibiting as part of an SGML pavilion at the show (Booth #3109 N). Several companies are also offering presentations on SGML in a theater area near the pavilion.

Users follow
Even with the strength of XML, real gains for users will come with the newest tools from vendors. Tools to manage and troubleshoot content will be a necessary step for many users, but not all, said CAP Ventures' Gilbane. He noted that the Wall Street Journal Interactive version is actually based on SGML.

"They don't just want to publish stuff on the Web," Gilbane said. "They want to build a repository of this electronic information." By doing the additional work of creating an SGML repository and building tools to extract HTML content, a publisher keeps more publishing options open.

One problem with user implementation has been unnecessary complication. "If you design your application well enough, you don't really need a complex tool," Gilbane said. "Sometimes you can't make it that simple, but a lot more people could than do."

"SGML/XML Knowledge Day," Wednesday, 10:30 a.m.-5:30 p.m., Center for the Arts: Forum.

Copyright © 1997 Mac Publishing LLC. All rights reserved.