The OASIS Cover Pages: The Online Resource for Markup Language Technologies
Last modified: November 16, 2000
Architecture and Tools for Linguistic Analysis Systems (ATLAS)

[November 16, 2000] "ATLAS is a recent initiative involving NIST, LDC and MITRE. ATLAS addresses an array of application needs spanning corpus construction, evaluation infrastructure, and multi-modal visualization. The principal goal of ATLAS is to provide powerful abstractions over annotation tools and formats in order to maximize flexibility and extensibility. Our approach has been to isolate and abstract over the physical and logical levels of annotation tools and formats, leaving application- and domain-specific issues to the side. ATLAS Level 0, also known as Annotation Graphs, provides a data model, interchange format, and application programming interfaces for working with linear signals (such as text and audio) indexed by intervals. ATLAS Level 1 is a generalized model, suitable for annotating signals of essentially arbitrary dimensionality with annotations having essentially arbitrary structure. An early application of ATLAS Level 1 is OCR annotation, where textual images are indexed using bounding boxes. Resources for ATLAS Level 1 are available from the main ATLAS Website."
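
As a rough illustration of the Level 0 idea described above (annotations that index intervals of a linear signal), the following Python sketch labels spans of a text by character offsets. It is not the ATLAS or Annotation Graphs API; the names `IntervalAnnotation` and `overlapping`, and the labels used, are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntervalAnnotation:
    """A label attached to the half-open interval [start, end) of a linear signal."""
    start: int  # offset into the signal (characters here; could be seconds for audio)
    end: int
    label: str


def overlapping(annotations, start, end):
    """Return the annotations whose interval intersects [start, end)."""
    return [a for a in annotations if a.start < end and start < a.end]


signal = "ATLAS addresses corpus construction"
annotations = [
    IntervalAnnotation(0, 5, "NAME"),      # "ATLAS"
    IntervalAnnotation(6, 15, "VERB"),     # "addresses"
    IntervalAnnotation(16, 35, "OBJECT"),  # "corpus construction"
]

# Query a window of the signal and show which annotated spans it touches.
hits = overlapping(annotations, 4, 8)
print([(signal[a.start:a.end], a.label) for a in hits])
```

The same structure would carry over to audio by treating `start` and `end` as time offsets rather than character positions.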

One of the project goals is to develop "abstract annotation tools/formats to maximize flexibility and extensibility via separation of abstract physical and logical levels from application-specific levels; for the abstract physical level (persistent representation) for long-term storage, exchange, and pipelining: XML-based ATLAS Interchange Format (AIF)." [As of 4/6/00, a prototype XML-based ATLAS Interchange Format (AIF) has been defined.]

"ATLAS Interchange Format, or AIF, is intended to be a flexible and extensible file format that will facilitate widespread exchange and reuse of annotation data. AIF is a direct representation of the ATLAS model, which employs an extremely general notion of annotation. The philosophy behind ATLAS was not to make any assumptions about the annotation schemes or signals that people would use. We were less interested in standardizing a format for annotations than in providing people with a way to define their own. In this respect, the point of view is one that would allow us to describe annotations in a very abstract way. ATLAS can thus be considered as a 'meta-annotation scheme' since its goal is to provide a way to describe any annotation that one might be interested in creating. An ATLAS annotation picks out a region of primary data, such as an extent of (possibly structured) text, or a region of n-dimensional signal data, and associates structured information with it. The associated information could be as simple as an orthographic string or a link to some region in another signal, or it could be a deeply nested structure. An ATLAS Annotation element specifies a Region plus some associated Content. There are many different kinds of region, many different ways to specify them, and many different units from which such specifications can be built. Instead of attempting to enumerate these and deciding how each would be represented in XML, we opted for a general, extensible approach whereby regions are built up recursively out of other regions, and out of Anchors. The same approach was taken for the Content element. Instead of enumerating all the conceivable kinds of coding and markup, and their legal elements, attributes, and values, we employ nested 'feature structures' (the Feature element) and permit essentially arbitrary structures to be built up using the Feature and Parameter elements.
This generality comes with a price: the DTD cannot guarantee that regions are geometrically well-formed or that the associated content conforms to a coding manual. These important needs are addressed with domain-specific validation tools. We will be providing such tools for several domains, and the ATLAS architecture (and AIF API) will support others who wish to provide these tools for other domains. A public beta is available for review. We developed AIF as an XML application using two DTDs: one containing the core elements (standard) and one containing the metadata element declaration, which is not entirely part of ATLAS since we are committed to adopting the standard that will be defined by the ISLE effort. If you have any questions regarding AIF, you can find below a list of contacts to whom your comments and questions can be addressed. We have also developed an XML Schema version. So far, this version is merely a conversion of the standard DTD. We plan to continue development of the Schema version in the future to allow greater flexibility."
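
The structure described in the quotation (an Annotation holding a Region built recursively from other regions and Anchors, plus Content built from nested Feature and Parameter elements) can be sketched in Python with the standard `xml.etree.ElementTree` module. This is only an illustration under stated assumptions: the element names Annotation, Region, Content, Anchor, Feature, and Parameter come from the text above, but every attribute (`type`, `unit`, `name`), the bounding-box layout, and the helper functions are hypothetical and do not follow the actual AIF DTD. The final function shows the kind of domain-specific check that, per the quotation, a DTD cannot express.

```python
import xml.etree.ElementTree as ET


def anchor(parent, unit, value):
    """Append a hypothetical Anchor: a single coordinate in some unit."""
    el = ET.SubElement(parent, "Anchor", unit=unit)
    el.text = str(value)
    return el


def bounding_box(parent, left, top, right, bottom):
    """Build a box Region recursively out of two corner Regions made of Anchors."""
    box = ET.SubElement(parent, "Region", type="box")
    for name, (x, y) in (("top-left", (left, top)), ("bottom-right", (right, bottom))):
        corner = ET.SubElement(box, "Region", type=name)
        anchor(corner, "pixel-x", x)
        anchor(corner, "pixel-y", y)
    return box


def box_is_well_formed(box):
    """Domain-specific geometric check: the corners must be correctly ordered."""
    corners = {r.get("type"): [int(a.text) for a in r.findall("Anchor")]
               for r in box.findall("Region")}
    (left, top), (right, bottom) = corners["top-left"], corners["bottom-right"]
    return left < right and top < bottom


annotation = ET.Element("Annotation")
box = bounding_box(annotation, left=10, top=20, right=110, bottom=45)

# Content as a nested feature structure: a Feature holding Parameters and Features.
content = ET.SubElement(annotation, "Content")
word = ET.SubElement(content, "Feature", name="word")
ET.SubElement(word, "Parameter", name="orthography").text = "ATLAS"
ocr = ET.SubElement(word, "Feature", name="ocr")
ET.SubElement(ocr, "Parameter", name="confidence").text = "0.93"

print(ET.tostring(annotation, encoding="unicode"))
print("box well-formed:", box_is_well_formed(box))
```

Keeping the geometric check outside the document structure mirrors the division of labor the quotation describes: the generic format stays permissive, and each domain supplies its own validator.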


Robin Cover, Editor