The OASIS Cover Pages: The Online Resource for Markup Language Technologies
Created: May 10, 2001.

University of Washington Tukwila Data Integration System.

The University of Washington Database Research Group is developing the Tukwila system, which "uses adaptive query processing techniques to efficiently deal with processing heterogeneous, XML-based data from across the Internet. The data integration system depends upon a mediated schema to represent a particular application domain and data sources are mapped as views over the mediated schema. The user asks a query over the mediated schema and the data integration system reformulates this into a query over the data sources and executes it. The system then intelligently processes the query, reading data across the network and responding to data source sizes, network conditions, and other factors. The Tukwila data integration system is designed to scale up to the amounts of data transmissible across intranets and the Internet (tens to hundreds of MBs), with large numbers of data sources. The Tukwila data integration system is designed to support adaptivity at its core using a two-pronged approach. A highly efficient query reformulation algorithm, MiniCon, maps the input query from the mediated schema to the data sources. Next, interleaved planning and execution with partial optimization are used to allow Tukwila to process the reformulated plan, quickly recovering if decisions were based on inaccurate estimates. The system provides integrated support for efficient processing of XML data, based on the x-scan operator. X-scan efficiently processes non-materialized XML data as it is being received by the data integration system; it matches regular path expression patterns from the query, returning results in pipelined fashion as the data streams across the network. XML provides a common encoding for data from many different sources; combined with standardization of schemas (DTDs) across certain domains, it greatly reduces the needs for wrappers and even query reformulation.
The latest versions of Tukwila are built around an adaptive query processing architecture for XML, and can seamlessly combine XML and relational data into new XML content."
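
The streaming behavior attributed to x-scan above can be illustrated with a minimal Python sketch. It is not Tukwila's implementation: it handles only a fixed child path such as db/book/title rather than full regular path expressions, and the function name stream_matches, the chunked input, and the sample document are assumptions made for illustration. It uses the standard library's XMLPullParser to consume XML in arbitrary network-sized chunks and yields each match as soon as its closing tag arrives.

```python
from xml.etree.ElementTree import XMLPullParser


def stream_matches(chunks, path):
    """Yield the text of every element whose tag path from the root equals `path`.

    `chunks` is any iterable of XML text fragments (hypothetical stand-in for
    data arriving over the network). Results are produced incrementally,
    without waiting for the whole document.
    """
    target = path.strip("/").split("/")
    parser = XMLPullParser(events=("start", "end"))
    stack = []  # tags of the currently open elements, root first
    for chunk in chunks:
        parser.feed(chunk)
        for event, elem in parser.read_events():
            if event == "start":
                stack.append(elem.tag)
            else:  # "end": the element is complete
                if stack == target:
                    yield (elem.text or "").strip()
                stack.pop()
                elem.clear()  # drop the element's content once it has been seen
    parser.close()


if __name__ == "__main__":
    # The document is split at arbitrary points, even inside a tag name.
    fragments = [
        "<db><book><ti",
        "tle>A</title></book><book><title>B</title>",
        "</book></db>",
    ]
    for title in stream_matches(fragments, "db/book/title"):
        print(title)
```

Because stream_matches is a generator, a downstream operator can start consuming matches while later fragments are still in transit, which is the pipelined property the quoted description emphasizes.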

"The Tukwila data integration system introduces a number of new techniques for query reformulation, optimization, and execution. Query processing in data integration occurs over network-bound, autonomous data sources ranging from conventional databases on the LAN or intranet to web-based sources across the Internet. [High volume data] requires extensions to traditional optimization and execution techniques for three reasons: there is an absence of quality statistics about the data, data transfer rates are unpredictable and bursty, and slow or unavailable data sources can often be replaced by overlapping or mirrored sources; additional challenges are posed when we wish to integrate XML data... During execution, Tukwila uses adaptive query operators such as the double pipelined hash join, which produces answers quickly, and the dynamic collector, which robustly and efficiently computes unions across overlapping data sources... The Tukwila query processing components are designed to be self-contained modules that can be swapped out as needed. Each of the main components (reformulator, optimizer, execution engine, and wrappers) are separate code modules, each optionally in a different language and on a different platform. A sockets-based communication interface with a standardized request model allow[s] us to interchange parts."
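
The idea behind a double pipelined hash join can be sketched in a few lines of Python. This is a simplified, in-memory illustration under stated assumptions, not Tukwila's operator: it ignores memory overflow handling, and the function name, the ("L"/"R", row) arrival format, and the sample rows are hypothetical. Each input keeps its own hash table; every arriving row is inserted into its side's table and immediately probes the other side's table, so answers appear as soon as both matching rows have arrived, regardless of which source is slower.

```python
from collections import defaultdict


def double_pipelined_hash_join(arrivals, left_key, right_key):
    """Join two inputs whose rows arrive interleaved, emitting matches eagerly.

    `arrivals` yields (side, row) pairs in arrival order, where side is
    "L" or "R" and row is a dict. Output pairs are (left_row, right_row).
    """
    left_table = defaultdict(list)   # join key -> left rows seen so far
    right_table = defaultdict(list)  # join key -> right rows seen so far
    for side, row in arrivals:
        if side == "L":
            key = row[left_key]
            left_table[key].append(row)
            # Probe the opposite table with the newly arrived row.
            for match in right_table.get(key, ()):
                yield (row, match)
        else:
            key = row[right_key]
            right_table[key].append(row)
            for match in left_table.get(key, ()):
                yield (match, row)


if __name__ == "__main__":
    arrivals = [
        ("L", {"id": 1, "name": "a"}),
        ("R", {"pid": 1, "price": 10}),  # matches the first left row at once
        ("R", {"pid": 2, "price": 5}),   # no partner yet, waits in its table
        ("L", {"id": 2, "name": "b"}),   # arrives later and finds the waiting row
    ]
    for left, right in double_pipelined_hash_join(arrivals, "id", "pid"):
        print(left, right)
```

Unlike a conventional hash join, neither input has to be read completely before the first result is produced, which fits the bursty, unpredictable transfer rates the passage describes.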



Document URI: http://xml.coverpages.org/ni2001-05-10-b.html
Robin Cover, Editor: robin@oasis-open.org