The OASIS Cover Pages: The Online Resource for Markup Language Technologies
Last modified: November 16, 2000
ACE Pilot Format DTDs

[November 16, 2000] Provisional. ACE Pilot Format (APF) and associated DTDs [07-January-2000 communiqué]: "The ACE Pilot Format is a form of XML stand-off annotation that is the target of our joint annotation efforts, and any way you can generate documents conforming to this format (DTD) is fine. ... The ACE Pilot Format (APF) has been designed cooperatively by NIST (George Doddington, John Garofolo and Jon Fiscus), and MITRE (John Henderson, Benjamin Wellner and David Day). This format is the result of a very time-constrained effort to produce something appropriate and useful for the ACE Pilot evaluation as early as possible. Thus, we anticipate further work on the issue of appropriate data/annotation encoding standards between now and May, most likely coming from the ATLAS effort. In the meantime, we want to move forward with the current approach.

The annotation is realized in the form of XML stand-off annotation, which means that the file as a whole conforms to XML encoding standards, and the raw data (or 'signal') being annotated resides in a separate file. The annotations 'point' to portions of the signal via indices. Because we anticipate three different kinds of signals (text, speech, OCR), we have provided three different kinds of indices, though others can be proposed and adopted. As you can see from the DTD, these three are charspan (for text), timespan (for speech) and pixelboundingbox (for images).

The usual operation of Alembic Workbench is to load a plain text or SGML-annotated file and produce additional SGML/XML annotations embedded directly in the same file, since this has been the de facto annotation interchange standard to date. Since the ACE Pilot Format (APF) for annotation takes the form of a stand-off annotation file, we have modified the Workbench to include a separate conversion utility that extracts the information saved in the file being edited and places it in an appropriately formatted file. This conversion utility is called sgm2apf. It can be called directly from within the Workbench application and is also available as a standalone program.

We do not yet have a conversion utility for taking an APF file and the associated source file and creating a file suitable for the Workbench to visualize and edit, but we will be writing this over the next week or so..."
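The stand-off idea described in the communiqué can be illustrated with a short Python sketch. It is not the APF DTD and not the sgm2apf utility: the inline tags (PERSON, ORG), the element names `annotations` and `annotation`, the `start`/`end` attributes on `charspan`, and the end-exclusive offset convention are all assumptions made for illustration. Only the name `charspan` comes from the text above. The sketch strips inline tags to recover the signal, records character offsets in a separate XML document, and then resolves those offsets back against the signal.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical inline-annotated input; the tag names are illustrative only.
INLINE = "<PERSON>George Doddington</PERSON> works at <ORG>NIST</ORG>."

TAG = re.compile(r"<(/?)(\w+)[^>]*>")


def inline_to_standoff(inline):
    """Return (signal, spans): the tag-free text plus (label, start, end)
    tuples. Offsets index into the signal; end is exclusive (an assumption)."""
    parts, spans, open_stack = [], [], []
    pos = 0       # read position in the inline text
    out_len = 0   # length of the signal emitted so far
    for m in TAG.finditer(inline):
        text = inline[pos:m.start()]
        parts.append(text)
        out_len += len(text)
        closing, name = m.groups()
        if closing:
            label, start = open_stack.pop()
            spans.append((label, start, out_len))
        else:
            open_stack.append((name, out_len))
        pos = m.end()
    parts.append(inline[pos:])
    return "".join(parts), spans


def spans_to_xml(spans):
    """Serialize spans as a stand-off XML string (hypothetical layout)."""
    root = ET.Element("annotations")
    for label, start, end in spans:
        ann = ET.SubElement(root, "annotation", type=label)
        ET.SubElement(ann, "charspan", start=str(start), end=str(end))
    return ET.tostring(root, encoding="unicode")


def resolve(xml_text, signal):
    """Follow each charspan back into the separately stored signal."""
    resolved = []
    for ann in ET.fromstring(xml_text).iter("annotation"):
        span = ann.find("charspan")
        start, end = int(span.get("start")), int(span.get("end"))
        resolved.append((ann.get("type"), signal[start:end]))
    return resolved


signal, spans = inline_to_standoff(INLINE)
standoff_xml = spans_to_xml(spans)
print(signal)
print(standoff_xml)
print(resolve(standoff_xml, signal))
```

The signal and the annotation file are kept apart, so the same signal could carry several independent annotation layers. A speech or image signal would use a different index type (timespan or pixelboundingbox) in place of the character offsets shown here.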

Document URI: http://xml.coverpages.org/acePilot.html
Robin Cover, Editor: robin@oasis-open.org