Overview
The TimeML Project web site hosted at Brandeis University is funded by the Advanced Research and Development Activity (ARDA). TimeML Working Group Members include [2004-01]: Branimir Boguraev, José Castaño, Rob Gaizauskas, Bob Ingria, Graham Katz, Bob Knippen, Jessica Littman, Inderjeet Mani, James Pustejovsky, Antonio Sanfilippo, Andrew See, Andrea Setzer, Roser Saurí, Beth Sundheim, and Svetlana Symonenko.
The goal of the TimeML effort is "to develop a robust specification language for events and temporal expressions in natural language. TimeML is designed to address four problems in event and temporal expression markup: (1) Time stamping of events (identifying an event and anchoring it in time); (2) Ordering events with respect to one another (lexical versus discourse properties of ordering); (3) Reasoning with contextually underspecified temporal expressions (temporal functions such as 'last week' and 'two weeks before'); (4) Reasoning about the persistence of events (how long does an event or the outcome of an event last)..."
January 2004 description: TimeML is "a specification language for events and temporal expressions, which was developed in the context of a six-month workshop, TERQAS (www.time2002.org), funded under the auspices of the AQUAINT program. The ARDA-funded program AQUAINT is a multi-project effort [...] the recognition of events and their temporal anchorings. In this paper, we report on an AQUAINT project to create a specification language for event and temporal expressions in text. Events in articles are naturally anchored in time within the narrative of a text. For this reason, temporally grounded events are the very foundation from which we reason about how the world changes. Without a robust ability to identify and extract events and their temporal anchoring from a text, the real 'aboutness' of the article can be missed. Moreover, since entities and their properties change over time, a database of assertions about entities will be incomplete or incorrect if it does not capture how these properties are temporally updated. To this end, event recognition drives basic inferences from text...
"What is novel in this language, TimeML, we believe, is the integration of three efforts in the semantic annotation of text: TimeML systematically anchors event predicates to a broad range of temporally denotating expressions; it provides a language for ordering event expressions in text relative to one another, both intrasententially and in discourse; and it provides a semantics for underspecified temporal expressions, thereby allowing for a delayed interpretation. Significant efforts have been launched to annotate the temporal information in large textual corpora, according to the specification of TimeML described above. The result is a gold standard corpus of 300 articles, known as TIMEBANK, which has been completed and will be released early in 2004 for general use. We are also working towards integrating TimeML with the DAML-Time language, for providing an explicit interpretation of the markup described in this paper. It is hoped that this effort will provide a platform on which to build a multi-lingual, multi-domain standard for the representation of events and temporal expressions..." [from The Specification Language TimeML]
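To make the idea of anchoring an event to a temporal expression concrete, here is a minimal Python sketch. It is an illustration only: the tag and attribute names (EVENT, TIMEX3, SIGNAL, MAKEINSTANCE, TLINK) follow the general shape of TimeML 1.1 markup as best recalled and should be checked against the specification, while the sample sentence, the identifiers, and the `anchor_events` function are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment in the general style of TimeML 1.1 markup.
# The sentence, IDs, and attribute values are invented for illustration;
# consult the TimeML specification for the authoritative tag set.
SAMPLE = """<TimeML>
John <EVENT eid="e1" class="OCCURRENCE">left</EVENT>
<SIGNAL sid="s1">on</SIGNAL>
<TIMEX3 tid="t1" type="DATE" value="2004-01-05">Monday</TIMEX3>.
<MAKEINSTANCE eiid="ei1" eventID="e1" tense="PAST" aspect="NONE"/>
<TLINK eventInstanceID="ei1" relatedToTime="t1" signalID="s1" relType="IS_INCLUDED"/>
</TimeML>"""


def anchor_events(xml_text):
    """Return (event text, relation type, time value) for each event-to-time link."""
    root = ET.fromstring(xml_text)
    # Map event IDs to the marked-up event word.
    events = {e.get("eid"): (e.text or "").strip() for e in root.iter("EVENT")}
    # Map temporal expression IDs to their normalized values.
    times = {t.get("tid"): t.get("value") for t in root.iter("TIMEX3")}
    # Map event instance IDs back to the event they instantiate.
    instances = {m.get("eiid"): m.get("eventID") for m in root.iter("MAKEINSTANCE")}

    anchors = []
    for link in root.iter("TLINK"):
        event_id = instances.get(link.get("eventInstanceID"))
        time_id = link.get("relatedToTime")
        # Keep only links whose both ends resolve to known annotations.
        if event_id in events and time_id in times:
            anchors.append((events[event_id], link.get("relType"), times[time_id]))
    return anchors


if __name__ == "__main__":
    for word, relation, value in anchor_events(SAMPLE):
        print(word, relation, value)
```

The sketch covers only the first of the three capabilities named in the quotation (anchoring an event to a time). Ordering events relative to one another and delayed interpretation of underspecified expressions would need additional link handling that is not shown here.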
There is a "preliminary release of the TimeBank corpus — a set of 186 news report documents annotated with the 1.1 version of the TimeML standard for temporal annotation... These documents were annotated during the creation of the TimeML standard and the Tango TimeML Graphical Organizer tool. They constitute both a test domain for development and a proof of concept..."
About the Advanced Research and Development Activity (ARDA)
"The Advanced Research and Development Activity (ARDA) is an Intelligence Community (IC) center for conducting advanced research and development related to information technology (IT) — information stored, transmitted, or manipulated by electronic means. ARDA sponsors high risk, high payoff research designed to produce new technology to address some of the most important and challenging IT problems faced by the intelligence community. The research is currently organized into five technology thrusts, Information Exploitation, Quantum Information Science, Global Infosystems Access, Novel Intelligence from Massive Data and Advanced Information Assurance... ARDA defines Information Exploitation as the process of extracting, synthesizing, and/or presenting relevant information from vast repositories of raw and structured data. Data includes multiple media and genre types in all the human languages and that also contains geospatial and abstract data. More specifically, Information Exploitation provides the core functionality to access information necessary for an analytic process, especially in the Intelligence Community. At a minimum, Information Exploitation includes: Content Data Transformation, Content Data Mark-up, Information Retrieval, Information Discovery, Analytic Knowledge-Bases, Information Understanding, Assessment and Interpretation, Synthesis and Fusion, and Presentation and Visualization. ARDA's Information Exploitation programs are attempting to significantly advance the state of the art in some of these areas with the expectation that advanced analytic tools will emerge..." [from the home page and overview]
Principal URLs
- TimeML web site. Maintained by James Pustejovsky (Professor of Computer Science, Department of Computer Science, Volen Center for Complex Systems, Brandeis University)
- TimeML Documents
- The Specification Language TimeML. By James Pustejovsky, Robert Ingria, Roser Saurí, José Castaño, and Jessica Littman (Brandeis University); Rob Gaizauskas and Andrea Setzer (University of Sheffield); Graham Katz (University of Osnabrück); Inderjeet Mani (Georgetown University). 15 pages. [cache]
- TimeML Specification Version 1.1. Edited by Bob Ingria and James Pustejovsky. April 19, 2004. [cache]
- TimeML Annotation Guidelines. Version 1.1, April 2, 2004.
- TimeML Schema. Version 1.1, April 2, 2004. [cache]
- TimeML XML Schema (xsd) [cache]
- TimeML XML DTD [cache]
- TimeBank. Corpus of news report documents annotated with the 1.1 version of the TimeML standard for temporal annotation.
- TimeML Documentation
- TANGO Documentation
- Specification for TimeML 1.1: Draft 1. June 19, 2003. [cache]
- Earlier: TimeML Specification: Draft 2. Release Date: May 14, 2002. By Bob Ingria and James Pustejovsky. Early v2.2 BNF for TimeML. [cache]
Articles, Papers, News
"TimeML: Robust Specification of Event and Temporal Expressions in Text." By James Pustejovsky, José Castaño, Robert Ingria, Roser Saurí, Robert Gaizauskas, Andrea Setzer, and Graham Katz. Paper presented at IWCS-5 (Fifth International Workshop on Computational Semantics), 2003. "In this paper we provide a description of TimeML, a rich specification language for event and temporal expressions in natural language text, developed in the context of the AQUAINT program on Question Answering Systems. Unlike most previous work on event annotation, TimeML captures three distinct phenomena in temporal markup: (1) it systematically anchors event predicates to a broad range of temporally denotating expressions; (2) it orders event expressions in text relative to one another, both intrasententially and in discourse; and (3) it allows for a delayed (underspecified) interpretation of partially determined temporal expressions. We demonstrate the expressiveness of TimeML for a broad range of syntactic and semantic contexts, including aspectual predication, modal subordination, and an initial treatment of lexical and constructional causation in text." [alt URL, cache]
"Temporal Information in Newswire Articles: an Annotation Scheme and Corpus Study". PhD Thesis by Andrea Setzer. Supervised by Dr. R. Gaizauskas. University of Sheffield, 2001. "Many natural language processing applications, such as information extraction, question answering, topic detection and tracking, would benefit significantly from the ability to accurately position reported events in time, either relatively with respect to other events or absolutely with respect to calendrical time. Nevertheless, relatively little work has been done to date on the extraction of temporal information from text, indicating the difficulty of the task. The Message Understanding Conferences (MUCs) addressed the problem in a limited way which only very recently has been extended. To significantly aid the applications mentioned above, it is necessary to get to a point where we can automatically extract the events and order them in time. The basis on which this goal can be achieved is an annotation scheme which identifies the events in a text as well as the temporal information necessary to order the events in time. I have developed an annotation scheme which I believe is a very good starting point for this ambitious goal. My thesis describes the conceptual framework and temporal ontology I have defined and the annotation scheme I have developed on this basis. To aid the application of the scheme to text, I have developed a graphical annotation tool which also comprises an interactive component supporting the gathering of information about the temporal relation in the text. The annotation scheme and the tool have been validated through the construction of a trial corpus during a pilot study. In this study, a group of annotators was supplied with my temporal annotation guidelines and asked to apply the annotation scheme to a trial corpus.
In particular I was interested in answers to questions like: How unambiguous and comprehensive are our temporal guidelines? How much genuine disagreement is there about temporal relations in text? The results of the pilot study are analysed, problems and avenues of improvement are identified as well as the answers to the above questions, insofar as I have been able to determine them..." See related research in the author's publication list. [abstract from the research report]