[Archive copy mirrored from: http://www.webmethods.com/news/stories/turning.html]


Building Blocks

Turning the Web Into a Data Source

By Leslie Marable

What if the Web could expose its information to any application in any way? That is the idea behind webMethods Inc. of Fairfax, Va., which has defined an eXtensible Markup Language (XML) vocabulary that can address any HTML document or set of documents directly as a data source.

WebMethods' Web Interface Definition Language (WIDL) is at the heart of the webMethods Web Automation Toolkit, a Java class creator that "reads" a document's structure, transforms it into its constituent objects, and stores these objects in a repository for use in any Java application.

These classes can be called from existing business applications to get data from a Web site, which lets the developer bypass the browser and treat the Web as a data source. Packaged as Java Beans, the classes could also be incorporated into COM- or CORBA-based applications through the Java-ActiveX bridge or other interobject communication schemes. Beans support is coming in a future release, according to Charles Allen, vice president of business development at webMethods.

WIDL consists of six XML-compliant HTML extenders that define a universal schema for HTML documents based on the Document Object Model (DOM) as it is being defined by the World Wide Web Consortium (W3C).

Currently, the Automation Toolkit parses HTML 3.2 as specified by the W3C, plus the extensions supported by the Microsoft and Netscape 3.x browsers (what will be known as DOM level 0).

By dealing with HTML documents as a set of abstract objects, the Automation Toolkit can extract data from any place in any document at an extremely detailed level. For instance, if a news story is defined as a headline, a byline, and body copy organized as paragraphs, an application designed to monitor the Associated Press wire could enumerate, manipulate, and extract the specific paragraphs containing the word Zaire in any story by a specific reporter. It could then pull the whole story, a single paragraph, or any other element into a local application.
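The wire-monitoring scenario can be sketched in a few lines of Java. This is only an illustration of the idea, not the webMethods API: the Story and StoryFilter classes, the method name paragraphsMentioning, and the sample stories are all hypothetical, and the sketch assumes the document has already been turned into headline, byline, and paragraph objects. Matching is a simple case-insensitive substring test.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical object model for a parsed news story; not the webMethods API.
final class Story {
    final String headline;
    final String byline;
    final List<String> paragraphs;

    Story(String headline, String byline, List<String> paragraphs) {
        this.headline = headline;
        this.byline = byline;
        this.paragraphs = paragraphs;
    }
}

final class StoryFilter {
    // Returns every paragraph that mentions `keyword` (case-insensitive
    // substring match) in stories whose byline equals `reporter`.
    static List<String> paragraphsMentioning(List<Story> stories, String reporter, String keyword) {
        String needle = keyword.toLowerCase(Locale.ROOT);
        List<String> hits = new ArrayList<>();
        for (Story story : stories) {
            if (!story.byline.equals(reporter)) {
                continue; // skip stories by other reporters
            }
            for (String paragraph : story.paragraphs) {
                if (paragraph.toLowerCase(Locale.ROOT).contains(needle)) {
                    hits.add(paragraph);
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Invented sample data standing in for stories parsed from a wire feed.
        List<Story> wire = Arrays.asList(
            new Story("Talks resume", "A. Reporter", Arrays.asList(
                "Negotiators met again on Monday.",
                "Officials in Zaire declined to comment.")),
            new Story("Markets steady", "B. Writer", Arrays.asList(
                "Traders watched events in Zaire closely.")));

        for (String paragraph : paragraphsMentioning(wire, "A. Reporter", "Zaire")) {
            System.out.println(paragraph);
        }
    }
}
```

Once the page is addressed as objects rather than as markup, the filter is ordinary application code: it never touches tags or layout, which is the point of treating the document as a data source.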

The company added native support for the Secure Sockets Layer 3.0 de facto standard earlier this month, enabling secure transactions with any Web-based source from nonbrowser applications.

Web developers working for webMethods clients, including DHL Worldwide Express, Texas Instruments, and the U.S. Postal Service, are building applications that automate the exchange of data with suppliers, business partners, and customers.

Mark Lussier, the lead software engineer for DHL's customer automation division, said he used the technology to build a single-source package tracking facility. "In the case of [package] tracking, the Web results that a user sees are different depending on the current situation of the package. If it's been delivered, there's one response, if the package is in transit, there's a different response," he said. "The webMethods Toolkit allows us to handle all those conditions with one piece of code, so we didn't have to write code for each variation of the page."

Ultimately, Lussier said, the Toolkit saved him time and saved DHL money, because Netscape priced its proposed solution for the same task at $27,000. "I did the same thing in two hours for one-tenth the cost."

A 30-day evaluation version of the complete Web Automation Toolkit can be downloaded from the company's Web site. A beta version of Web Automation Toolkit 2.0 is also available. Individual developer licenses for the Toolkit begin at $995. Server run-time licenses begin at $2,495. Server run-time licenses with the SSL plug-in begin at $4,995.

The SSL plug-in requires the Phaos Technology Corp. SSL libraries and an encryption license from RSA Data Security Inc.

Reprinted from Web Week, Volume 3, Issue 11, April 21, 1997 © Mecklermedia Corp. All rights reserved.