[This local archive copy mirrored from the canonical site: http://www.infoworld.com/cgi-bin/displayStory.pl?/interviews/971209sterken.htm; links may not have complete integrity, so use the canonical document at this URL if possible.]

December 9, 1997

ArborText's Sterken on XML's importance

By Michael Vizard
InfoWorld Electric

As Microsoft moves to embrace XML, one of its major partners in this area is ArborText, which supplies publishing tools that work across both print and electronic mediums. ArborText CEO Jim Sterken outlined the importance of XML and ArborText's overall strategy for embracing this new Web technology to InfoWorld Executive News Editor Michael Vizard.

InfoWorld: What impact is XML going to have in terms of Web publishing?

Sterken: XML is really the next step for the Web. But it's broader than just the Web because it provides a path that works for printed output, which our customers all still view as being important as well. XML enables us to overcome some of the real technology problems that are showing up on the Web today. Things like vastly too many hits when searching, to the point where some people just give up practically when it comes to trying to find something from scratch.

The Web is too slow, too static and too limited in support of workgroup collaboration. Another issue is that it's too difficult to build non-document applications, which is really where a lot of the potential is on the Web. Harking back to its roots in SGML, XML enables building broader solutions that are more scalable.

InfoWorld: Can you give an example of what that would mean?

Sterken: The kind of things that are on the Web now, a human can look at and read just fine. But if you want a machine to look at the same Web page, there are just words. And short of cracking the artificial intelligence problem, which is a few decades off yet, there's nothing for having machines work with the information. And that's the kind of thing you need in order to be able to put a system in place that's going to really creatively show information in different customized ways.
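
[Editorial illustration, not part of the interview.] The point about machines seeing "just words" can be sketched in a few lines of Python. The example below assumes a small piece of marked-up text and shows a program picking out specific parts without understanding the prose. The element and attribute names (`manual`, `procedure`, `audience`, `title`, `step`) are invented for this sketch and do not come from ArborText or any published document type.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: every element name here is made up for illustration.
SOURCE = """
<manual>
  <procedure audience="novice">
    <title>Replace the filter</title>
    <step>Switch off the power.</step>
    <step>Slide the old filter out.</step>
  </procedure>
  <procedure audience="expert">
    <title>Recalibrate the sensor</title>
    <step>Open the service menu.</step>
  </procedure>
</manual>
"""

def titles_for(xml_text, audience):
    """Return the titles of procedures marked for the given audience."""
    root = ET.fromstring(xml_text)
    return [
        proc.findtext("title")
        for proc in root.findall("procedure")
        if proc.get("audience") == audience
    ]

def step_count(xml_text):
    """Count every <step> element anywhere in the document."""
    root = ET.fromstring(xml_text)
    return len(list(root.iter("step")))

print(titles_for(SOURCE, "novice"))
print(step_count(SOURCE))
```

Because the audience is recorded in the markup rather than implied by the wording, the same lookup could drive the kind of customized display described above.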

InfoWorld: What's going to drive the push towards XML?

Sterken: XML is also getting a boost in many areas related to its use as just a way for organizing data. It's also getting a lot of attention related to electronic commerce as a way that can be used to just encapsulate the information that has to be done for various transactions. For example, if I want to do banking transactions, there's been an XML-based application defined that can be used for that purpose.

InfoWorld: How will people learn to use XML?

Sterken: The basic idea with XML is that it's just an extension of what people do with word processors now. With word processors, you're interested in just typing out paragraphs. With XML the idea is to give more information about what you're typing and break it up into some pieces.

InfoWorld: As a provider of general purpose tools for publishing, how has the Web changed the nature of what ArborText does as a tools supplier?

Sterken: What customers are facing is that in addition to just printing information on paper, they also want to make it available on the Web. The thing that the IT managers are facing is that there's an overwhelming constraint on the Web in that if information is more than like a week or so old, it's ancient. That wasn't a constraint that they faced before.

So now they've got to have a whole different system in place in order to be able to support this new kind of thing where users are just going to want to be able to look at it, do a search, find the kind of information they were interested in and then display it in the way they want to look at it. If they're an expert, they'll want to see an expert viewpoint; if they're a novice, they're just going to want to see some basic information.

What's changed in all of this is we're not talking about documents anymore. What we're talking about is a more general base of data of document-level information, but just data that people can search and display sort of in the same way that they might search and display information in a relational database.

InfoWorld: Is it viable for a corporate IT organization to have one tool to publish on the Web, to publish to print and to publish say to CD-ROM?

Sterken: Certainly! That's the whole promise with this technology and that's the single reason why there is so much excitement with customers in general because you're publishing out of a database. With relational databases, you can have the same relational data and then make it available a thousand different ways with a thousand different reports that subset the information in different ways and present it differently. What XML promises is to be able to do the same thing with text--and that isn't done now.
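
[Editorial illustration, not part of the interview.] The comparison with database reports can be sketched as one source chunk rendered two ways. The Python below is a minimal sketch under assumed names: the `topic`, `title` and `para` elements and both rendering functions are hypothetical and are not ArborText's product or method.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical source chunk; element names are illustrative only.
SOURCE = """
<topic>
  <title>Backup schedule</title>
  <para>Run a full backup every Sunday.</para>
  <para>Run incremental backups on weekdays.</para>
</topic>
"""

def to_html(xml_text):
    """Render the topic as a small HTML fragment for a browser."""
    root = ET.fromstring(xml_text)
    parts = ["<h1>" + escape(root.findtext("title")) + "</h1>"]
    parts += ["<p>" + escape(p.text) + "</p>" for p in root.findall("para")]
    return "\n".join(parts)

def to_plain_text(xml_text):
    """Render the same topic as underlined plain text, e.g. for print."""
    root = ET.fromstring(xml_text)
    title = root.findtext("title")
    lines = [title, "=" * len(title)]
    lines += [p.text for p in root.findall("para")]
    return "\n".join(lines)

print(to_html(SOURCE))
print(to_plain_text(SOURCE))
```

Neither output format is stored in the source; each is derived on demand, which is the sense in which the text is treated like data in a database.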

InfoWorld: A lot of sites today are basically channeling all their content updates through one person, the Webmaster. Does ArborText make it easy enough for a group of people to work in conjunction to update a Web site?

Sterken: Yes. We work in concert with various companies offering repository technology such as Documentum and Filenet. The way it works is you forget about long documents and instead what you work on is just small chunks of information. A screen or two of information then goes into this repository and it can be searched and called out for use in a wide variety of places.

InfoWorld: What's the relationship between ArborText and Microsoft?

Sterken: We did a joint demo with Microsoft at Seybold. The main excitement for us is that we have been rapidly developing expertise for creating databases using reusable pieces of information that can be demand-published in a wide variety of ways. The thing that was lacking was a good way to get that information directly out onto the Internet with the common browsers. We've got customers now that are taking the data in SGML or XML and then filtering it out into HTML. But if you do that, you lose a lot of the content value, so you can't do as much out on the client side.

But with Microsoft and Netscape both adding support for XML into their browsers, it offers a tremendous advance in terms of being able to ship rich data that actually has information beyond a few keywords that talks about how it's organized so that a Java-enabled application running on top of the browsers is going to be able to do great things with it.

InfoWorld: What about Netscape and Lotus, are you guys talking to them at all?

Sterken: Early stages. With Netscape especially, a whole lot of our customers are using Navigator. We're really encouraged by what we've been hearing from Netscape, but we're not as far along with them.

InfoWorld: It seems there are slightly different flavors of XML starting to be pushed by Microsoft and Netscape. Are these two efforts compatible?

Sterken: I think they will be compatible and certainly we're going to be working with everybody we can to make them be compatible. Netscape has been working on RDF, which is more of a superstructure above XML. It's a way that you have a document or a piece of data expressed in XML, but you also have a bunch of data about it. It's sort of like a generalization that allows you to provide additional information that can be of help when looking at the information and searching for it. So I think it's complementary. It's an extra area where we'll need to make some advances anyway. So I consider that as part of a family of XML standards.

InfoWorld: A lot of people have been talking about how XML will be the core file format on intranets going forward, but that for the public presentation of that data it will be rendered in HTML. Does that make sense though if the browsers already support XML?

Sterken: I think what we're talking about here is short term and long term. Longer term, the plan with XML is there's an associated style sheet language. We co-authored that with Microsoft and Intel and submitted it to the W3C in the past few months. That's a language which is XML-enabled; sort of an extension to the cascading style sheets. And that's one that the browsers will be supporting. I suspect that we're six months out yet before we see that support, but once they support that, that will provide a way that browsers can just directly display the XML data.

InfoWorld: At Seybold, there seemed to be some concern that XML would eliminate the need for SGML. Is that true?

Sterken: I guess I look at it the other way. I look at it in terms of what's being used by customers. SGML is practically synonymous with XML in terms of the kind of capabilities they provide. So it may turn out that two or three years from now everybody refers to what they're doing in the area of keeping the structured information as XML. With SGML, there's a small layer of additional things that I think people by and large will want to use over and above what's in XML right now. But it's really a small layer and we're going to be working actively in the W3C to have those additional things added to some future version of XML.

InfoWorld: Is there going to be some sort of ratification of an XML standard any time soon? And should people wait for this standard, or are people going to move forward and then the standard will be pretty much just ratifying what's already been done?

Sterken: Let me just give you the timeline on the standard. XML is broken up into three different parts. There's XML, which is the main thing. That's pointed toward the end of this year as the time frame for submitting something for final approval to the W3C. And then related to it, there's XSL, which I talked about before, which is the style sheet. That's probably more like the middle of next year. And then there's another one, XLL, which is related to linking by offering some significant extensions to linking beyond what you can do right now with HTML on the World Wide Web. That's due in the second or third quarter. But we already support XML as it's conceived right now. So when it comes to the idea of creating structured information, we're ready to go right now.

InfoWorld: What's the enhanced linking functionality that you're talking about?

Sterken: It's a variety of features derived from HyTime, which is an SGML-related standard that has a lot of extra technology defined for doing linking. It's for things as simple as being able to click on something and then be presented with a link to three different places.
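
[Editorial illustration, not part of the interview.] The "link to three different places" idea can be sketched as a single link element that carries several destinations, from which a viewer builds a menu. In the Python below, the `multilink` and `target` elements and the file names are invented for this sketch; they are not taken from the XLL draft or from HyTime.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: <multilink> and <target> are invented for this sketch
# and are not the syntax of the XLL draft or of HyTime.
SOURCE = """
<para>See the <multilink label="pump assembly">
  <target title="Parts list" href="parts.xml"/>
  <target title="Exploded diagram" href="diagram.xml"/>
  <target title="Service bulletin" href="bulletin.xml"/>
</multilink> for details.</para>
"""

def link_choices(xml_text):
    """Return (title, href) pairs a viewer could offer when the link is clicked."""
    root = ET.fromstring(xml_text)
    link = root.find("multilink")
    return [(t.get("title"), t.get("href")) for t in link.findall("target")]

for title, href in link_choices(SOURCE):
    print(title, "->", href)
```

An HTML anchor of the period names exactly one destination, so a choice like this would have to be built by hand on a separate page.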

InfoWorld: So if you were launching your own intranet project today with an eye towards deploying it sometime in the latter half of 1998, would you start writing that in XML?

Sterken: I certainly would.

InfoWorld: Any major changes in store for ArborText?

Sterken: Right now we're working with Fortune 2000 companies predominantly, but as we gain experience and package our solutions, we want to go down significantly lower into the companies that are more the $50 to $100 million size. That's our target over time.

InfoWorld: So what's your biggest worry?

Sterken: There are a lot of things to think about in terms of just getting all the infrastructure in place. I guess the main thing for us is that the current basis of just HTML-based display of information is just not scaling up; it's running out of steam all over the place. The main thing that keeps me awake at night is just making sure that we stay at the front of the XML-based initiative.

Copyright © 1997 InfoWorld Media Group Inc.

InfoWorld Electric is a member of IDG.net
