Abstract
TimeBank is the only reference corpus for TimeML, an expressive language for annotating complex temporal information. It is a rich resource for a broad range of research into various aspects of the expression of time and temporally related events. This paper traces the development of TimeBank from its initial—and somewhat noisy—version (1.1) to a substantially revised release (1.2), now available via the Linguistic Data Consortium. The development path is motivated by the encouraging empirical results of TimeML-compliant annotators developed on the basis of TimeBank 1.1, and is informed by a detailed study of the characteristics of that initial release, which guides a clean-up process turning TimeBank 1.2 into a consistent and robust community resource.
Notes
TimeBank (Version 1.2) is distributed by the Linguistic Data Consortium; see http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2006T08.
Temporal and Event Recognition for QA Systems; http://www.timeml.org/site/terqas.
At the time of writing, the specification of TimeML is undergoing revision, with respect to the makeinstance tag in particular. While alternative mechanisms have been proposed to replace the tag and express its semantics, we describe makeinstance here as it was used in the annotation of TimeBank.
timex2 and timex3 differ substantially in their treatment of event anchoring and sets of times. In particular, relational time expressions (e.g., 2 days before departure) are a single timex2; under TimeML analysis, the same expression would be annotated as a collection of related timex3, signal and event tags, with an additional link anchoring the event. Sets of times (e.g., every day) would also get different analyses. This impacts both the boundaries of annotation spans, and attributes of the covering annotations (tags). Overall, timex3 is not a straightforward extension of timex2, as its analysis of a temporal expression is designed to interact with all the other TimeML components.
Strictly speaking, GUTime targets the timex2 tag, most recently popularized by the Time Expression Recognition and Normalization (TERN) program; see http://www.timex2.mitre.org/tern.htm. As far as extent and normalized value of the temporal expression are concerned, timex3 and timex2 are not that dissimilar.
A token may initiate a span of tokens which belong to a given category, or it may fall inside such a category span, or it may not belong to any category. Thus, with respect to a given category x, a token would be tagged with one of begins_x, insideof_x, or outside tags. This kind of encoding models category assignment to token sequences as an individual token tagging task; see (Boguraev and Ando 2005b).
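The encoding described in this note can be sketched as follows. This is a minimal illustration only, not the implementation of Boguraev and Ando (2005b): the function name `encode_spans`, the end-exclusive span convention, and the example sentence are all assumptions introduced here.

```python
def encode_spans(tokens, spans, category):
    """Tag each token as begins_<category>, insideof_<category>, or outside.

    `spans` is a list of (start, end) token indices, end exclusive,
    assumed not to overlap (a hypothetical convention for this sketch).
    """
    tags = ["outside"] * len(tokens)
    for start, end in spans:
        # The first token of a span opens the category.
        tags[start] = f"begins_{category}"
        # Every following token of the same span falls inside it.
        for i in range(start + 1, end):
            tags[i] = f"insideof_{category}"
    return tags


# Hypothetical example: one temporal expression covering tokens 2..4.
tokens = ["He", "left", "two", "days", "ago", "."]
tags = encode_spans(tokens, [(2, 5)], "timex3")
for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")
```

With the span boundaries reduced to per-token labels in this way, any sequence tagger that assigns one label per token can be trained to recover the spans.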
Following the release of TimeBank 1.1, a dedicated effort focused on developing a custom annotation tool. TANGO (Pustejovsky et al. 2003c) specifically addresses the challenges of producing XML-compliant and internally consistent markup for ‘dense’ annotation tasks—of which TimeML is a particularly good example.
Translingual Information Detection, Extraction, and Summarization; http://www-nlpir.nist.gov/tides.
The TimeML working groups included people involved in TIDES and STAG.
TimeML ANnotation Graphical Organizer; http://www.timeml.org/site/tango; a workshop following TERQAS, focusing on developing annotation infrastructure for TimeML.
The most recent TimeML specifications and annotation guidelines are available at http://www.timeml.org.
One other change has been made to the TimeML specification since the completion of TimeBank 1.2; namely, the removal of the makeinstance tag. All the attributes associated with this tag (i.e., tense, aspect, modality, polarity) have been moved to the event tag itself. (See also Footnote 3.)
The annotators were all novices and received only one to two hours of training in TimeML annotation (see Sect. 5.1).
Technically, each annotator’s data should be considered both as the key and as the response, and recall and precision should be computed in both directions. However, with only two annotators, one direction suffices: precision in one direction equals recall in the other.
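Why one direction is enough with two annotators can be checked with a small sketch. The representation of an annotation as a `(start, end, tag)` tuple, the exact-match criterion, and the data below are assumptions made for illustration; they are not taken from the TimeBank evaluation itself.

```python
def precision_recall(key, response):
    """Score `response` annotations against `key` annotations by exact match."""
    key, response = set(key), set(response)
    matched = len(key & response)
    precision = matched / len(response) if response else 0.0
    recall = matched / len(key) if key else 0.0
    return precision, recall


# Hypothetical annotations from two annotators: (start, end, tag).
annotator_a = {(0, 1, "EVENT"), (3, 5, "TIMEX3"), (7, 8, "EVENT")}
annotator_b = {(0, 1, "EVENT"), (3, 5, "TIMEX3")}

# Direction 1: A is the key, B is the response.
p_ab, r_ab = precision_recall(annotator_a, annotator_b)
# Direction 2: B is the key, A is the response.
p_ba, r_ba = precision_recall(annotator_b, annotator_a)

print(p_ab, r_ab)
print(p_ba, r_ba)
```

Swapping key and response only swaps the two denominators, so the second direction carries no new information.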
The Kappa coefficient adjusts for the number of agreements that would have occurred by chance and is defined by $\kappa = (p_o - p_e)/(1 - p_e)$, where $p_o$ is the observed probability of agreement and $p_e$ the probability of agreement expected by chance. The Kappa coefficient, however, is not well suited for annotation tasks that cannot be construed as pure classification tasks and is therefore not used to measure agreement on whether links were introduced by both annotators. See also (Hirschman et al. 1998).
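For a pure classification task, the definition above can be computed as in the sketch below. The note does not say how the expected agreement is estimated; this sketch assumes the common two-annotator estimate based on each annotator's own label distribution (Cohen's variant), and the function name and labels are hypothetical.

```python
from collections import Counter


def kappa(labels_a, labels_b):
    """Kappa for two annotators labelling the same items.

    Assumption: expected agreement p_e is estimated from the product of
    the two annotators' marginal label distributions (Cohen's variant).
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement: chance that both pick the same label independently.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (p_o - p_e) / (1 - p_e)


# Hypothetical relation-type labels assigned to the same four links.
a = ["BEFORE", "BEFORE", "AFTER", "AFTER"]
b = ["BEFORE", "BEFORE", "AFTER", "BEFORE"]
print(kappa(a, b))  # p_o = 0.75, p_e = 0.5, so kappa = 0.5
```

The computation presupposes that both annotators labelled the same set of items, which is why it does not carry over to the question of whether a link was introduced at all.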
While certain conclusions can be drawn from the fact that TimeBank 1.1 IAA scores for link identification and typing are low, not a lot should rest on the actual figures: these were inexperienced annotators, whose IAA scores on timex3’s and events were about 10 points lower than those of their experienced counterparts.
One could say that each event–event pair, or event–timex3 pair, that has no temporal link defined between them by way of a tlink or slink tag, is in fact evidence of a missing link. Counting missing links this way is clearly impractical, given that the number of possible links is quadratic in the number of events and times in a text. Here, the number of missing links is calculated by finding events that are not temporally linked to any other event or time.
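The counting rule in this note can be sketched as follows. The identifiers, the representation of a link as a `(source, target)` pair, and the toy data are assumptions for illustration, not the procedure actually run over TimeBank.

```python
def unlinked_events(event_ids, links):
    """Return events that take part in no link at all.

    `links` is an iterable of (source_id, target_id) pairs, assumed to be
    read off the tlink and slink tags of one document.
    """
    linked = set()
    for source, target in links:
        linked.add(source)
        linked.add(target)
    return [e for e in event_ids if e not in linked]


# Hypothetical document with four events and one time expression.
events = ["e1", "e2", "e3", "e4"]
times = ["t1"]
links = [("e1", "t1"), ("e2", "e1")]

missing = unlinked_events(events, links)

# For comparison: the number of candidate pairs grows quadratically.
n = len(events) + len(times)
candidate_pairs = n * (n - 1) // 2

print(missing, candidate_pairs)
```

The measure is linear in the size of the document and flags only events left entirely outside the temporal structure, at the cost of ignoring pairs whose members are each linked elsewhere.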
The difference in signal counts between the two corpora is due to the reasons explained above in Sect. 5.2.
Abbreviations
- TimeML: A Markup Language for Time
- timex: Time Expression
- LDC: Linguistic Data Consortium
- IE: Information Extraction
- IAA: Inter-Annotator Agreement
References
Allen, J. (1983). Maintaining knowledge about temporal intervals. Communications of the ACM, 26(11), 832–843.
Boguraev, B., & Ando, R. K. (2005a). TimeBank-driven TimeML analysis. In: G. Katz, J. Pustejovsky, & F. Schilder (Eds.), International Workshop on Annotating, Extracting, and Reasoning with Time. Dagstuhl, Germany.
Boguraev, B., & Ando, R. K. (2005b). TimeML-compliant text analysis for temporal reasoning. In: Nineteenth International Joint Conference on Artificial Intelligence (IJCAI-05). Edinburgh, Scotland.
Ferro, L. (2001). TIDES: Instruction manual for the annotation of temporal expressions. Technical Report MTR 01W0000046V01, The MITRE Corporation.
Fikes, R., Jenkins, J., & Frank, G. (2003). JTP: A system architecture and component library for hybrid reasoning. Technical Report KSL-03-01, Knowledge Systems Laboratory, Stanford University.
Gaizauskas, R., Harkema, H., Hepple, M., & Setzer, A. (2006). Task-oriented extraction of temporal information: The case of clinical narratives. In: A. Montanari, J. Pustejovsky, & P. Revesz (Eds.), TIME 2006: International Symposium on Temporal Representation and Reasoning. Budapest, Hungary.
Han, B., & Lavie, A. (2004). A framework for resolution of time in natural language. TALIP Special Issue on Spatial and Temporal Information Processing, 3(1), 11–35.
Hirschman, L., Robinson, P., Burger, J., & Vilain, M. (1998). Automatic coreference: The role of annotated training data. In: AAAI 1998 Spring Symposium on Applying Machine Learning to Discourse Processing. Stanford, USA, pp. 1419–1422.
Hobbs, J., & Pan, F. (2004). An ontology of time for the semantic web. TALIP Special Issue on Spatial and Temporal Information Processing, 3(1), 66–85.
Hobbs, J., & Pustejovsky, J. (2004). Annotating and reasoning about time and events. In: AAAI Spring Symposium on Logical Formalizations of Commonsense Reasoning. Stanford, CA.
Lee, K., Pustejovsky, J., & Boguraev, B. (2006). Towards an international standard for annotating temporal information. In: Third International Conference on Terminology, Standardization and Technology Transfer. Beijing, China.
Mani, I., Wellner, B., Verhagen, M., Lee, C. M., & Pustejovsky, J. (2006). Machine learning of temporal relations. In: Proceedings of the 44th Annual Meeting of the Association for Computational Linguistics. Sydney, Australia.
Pustejovsky, J., Castaño, J., Ingria, R., Saurí, R., Gaizauskas, R., Setzer, A., Katz, G., & Radev, D. (2003a). TimeML: Robust specification of event and temporal expressions in text. In: AAAI Spring Symposium on New Directions in Question-Answering (Working Papers). Stanford, CA, pp. 28–34.
Pustejovsky, J., Hanks, P., Saurí, R., See, A., Gaizauskas, R., Setzer, A., Radev, D., Sundheim, B., Day, D., Ferro, L., & Lazo, M. (2003b). The TimeBank corpus. In: T. McEnery (Ed.), Corpus Linguistics (pp. 647–656). Lancaster.
Pustejovsky, J., Knippen, R., Littman, J., & Saurí, R. (2005). Temporal and event information in natural language text. Language Resources and Evaluation, 39(2–3), 123–164.
Pustejovsky, J., Mani, I., Bélanger, L., Boguraev, B., Knippen, B., Littman, J., Rumshisky, A., See, A., Symonenko, S., Guilder, J. V., Guilder, L. V., Verhagen, M., & Ingria, R. (2003c). Graphical Annotation Kit for TimeML. Technical report, TANGO (TimeML ANnotation Graphical Organizer) Workshop Version 1.4, <http://www.timeml.org/tango> [date of citation: 2005-06-20].
Saurí, R., Littman, J., Knippen, B., Gaizauskas, R., Setzer, A., & Pustejovsky, J. (2005). TimeML Annotation Guidelines, Version 1.2.1. Technical report, TERQAS Workshop/Linguistic Data Consortium. <http://www.timeml.org/site/publications/timeMLdocs/AnnGuide_1.2.1.pdf> [date of citation: 2006-07-16].
Setzer, A. (2001). Temporal information in newswire articles: An annotation scheme and corpus study. Ph.D. thesis, University of Sheffield, Sheffield, UK.
Verhagen, M. (2005). Temporal closure in an annotation environment. Language Resources and Evaluation, 39(2–3), 211–241.
Verhagen, M., Mani, I., Saurí, R., Littman, J., Knippen, R., Jang, S. B., Rumshisky, A., Phillips, J., & Pustejovsky, J. (2005). Automating temporal annotation with TARSQI. In: 43rd Annual Meeting of the Association for Computational Linguistics (ACL-05). Ann Arbor, Michigan, Poster/Demo.
Acknowledgements
This work was supported in part by the ARDA NIMD and AQUAINT programs, PNWD-SW-6059 and NBCHC040027-MOD-0003.
Cite this article
Boguraev, B., Pustejovsky, J., Ando, R. et al. TimeBank evolution as a community resource for TimeML parsing. Lang Resources & Evaluation 41, 91–115 (2007). https://doi.org/10.1007/s10579-007-9018-8
DOI: https://doi.org/10.1007/s10579-007-9018-8