TimeBank evolution as a community resource for TimeML parsing

Boguraev, Branimir; Pustejovsky, James; Ando, Rie; Verhagen, Marc

doi:10.1007/s10579-007-9018-8

Original Paper
Published: 14 September 2007

Volume 41, pages 91–115, (2007)
Language Resources and Evaluation

Branimir Boguraev¹,
James Pustejovsky²,
Rie Ando¹ &
…
Marc Verhagen²

440 Accesses
27 Citations

Abstract

TimeBank is the only reference corpus for TimeML, an expressive language for annotating complex temporal information. It is a rich resource for a broad range of research into various aspects of the expression of time and temporally related events. This paper traces the development of TimeBank from its initial—and somewhat noisy—version (1.1) to a substantially revised release (1.2), now available via the Linguistic Data Consortium. The development path is motivated by the encouraging empirical results of TimeML-compliant annotators developed on the basis of TimeBank 1.1, and is informed by a detailed study of the characteristics of that initial release, which guides a clean-up process turning TimeBank 1.2 into a consistent and robust community resource.

Notes

TimeBank (Version 1.2) is distributed by the Linguistic Data Consortium; see http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2006T08.
Temporal and Event Recognition for QA Systems; http://www.timeml.org/site/terqas.
At the time of writing, the specification of TimeML is undergoing revision, with respect to the makeinstance tag in particular. While alternative mechanisms have been proposed to express the tag’s semantics, we describe makeinstance here as it was used in the annotation of TimeBank.
timex2 and timex3 differ substantially in their treatment of event anchoring and sets of times. In particular, relational time expressions (e.g., 2 days before departure) are a single timex2; under TimeML analysis, the same expression would be annotated as a collection of related timex3, signal and event tags, with an additional link anchoring the event. Sets of times (e.g., every day) would also get different analyses. This impacts both the boundaries of annotation spans, and attributes of the covering annotations (tags). Overall, timex3 is not a straightforward extension of timex2, as its analysis of a temporal expression is designed to interact with all the other TimeML components.
Strictly speaking, GUTime targets the timex2 tag, most recently popularized by the Time Expression Recognition and Normalization (TERN) program; see http://www.timex2.mitre.org/tern.htm. As far as extent and normalized value of the temporal expression are concerned, timex3 and timex2 are not that dissimilar.
http://www.cis.upenn.edu/~treebank.
http://www.cnts.ua.ac.be/conll2003/ner.
A token may initiate a span of tokens which belong to a given category, or it may fall inside such a category span, or it may not belong to any category. Thus, with respect to a given category x, a token is tagged with one of the begins_x, insideof_x, or outside tags. This kind of encoding models category assignment to token sequences as an individual token tagging task; see (Boguraev and Ando 2005b).
Following the release of TimeBank 1.1, a dedicated effort focused on developing a custom annotation tool. TANGO (Pustejovsky et al. 2003c) specifically addresses the challenges of producing XML-compliant and internally consistent markup for ‘dense’ annotation tasks—of which TimeML is a particularly good example.
Translingual Information Detection, Extraction, and Summarization; http://www-nlpir.nist.gov/tides.
The TimeML working groups included people involved in TIDES and STAG.
TimeML ANnotation Graphical Organizer; http://www.timeml.org/site/tango; a workshop following TERQAS, focusing on developing annotation infrastructure for TimeML.
The most recent TimeML specifications and annotation guidelines are available at http://www.timeml.org.
One other change has been made to the TimeML specification since the completion of TimeBank 1.2; namely, the removal of the makeinstance tag. All the attributes associated with this tag (i.e., tense, aspect, modality, polarity) have been moved to the event tag itself. (See also Footnote 3.)
The annotators were all novices and received only one to two hours of training of TimeML annotation (see Sect. 5.1).
Technically, each annotator’s data should be considered both as the key and as the response, and recall and precision should be computed in both directions. However, with only two annotators only one direction is needed, since precision in one direction equals recall in the other.
The Kappa coefficient adjusts for the number of agreements that would have occurred by chance and is defined by $\kappa = (p_o - p_e)/(1 - p_e)$, where $p_o$ is the observed probability and $p_e$ the expected probability. The Kappa coefficient, however, is not well suited for annotation tasks that cannot be construed as pure classification tasks and is therefore not used to measure agreement on whether links were introduced by both annotators. See also (Hirschman et al. 1998).
While certain conclusions can be drawn from the fact that TimeBank 1.1 IAA scores for link identification and typing are low, not a lot should rest on the actual figures: these were inexperienced annotators, whose IAA scores on timex3’s and events were about 10 points lower than those of their experienced counterparts.
One could say that each event–event pair, or event–timex3 pair, that has no temporal link defined between them by way of a tlink or slink tag, is in fact evidence of a missing link. This is clearly impractical, given that the number of possible links is quadratic in the number of events and times in a text. Here, the number of missing links is calculated by finding events that are not temporally linked to any other event or time.
The difference in signal counts between the two corpora is due to reasons explained above in (5.2).
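The begins_x / insideof_x / outside token encoding described in the notes above can be illustrated with a minimal Python sketch. The function name `encode_spans`, the span format (start, exclusive end, category), and the example sentence are assumptions made for illustration; they are not taken from the paper or from any TimeML tool.

```python
from typing import List, Tuple


def encode_spans(n_tokens: int, spans: List[Tuple[int, int, str]]) -> List[str]:
    """Tag each token as begins_x, insideof_x, or outside.

    Each span is (start, end, category) with `end` exclusive. Spans are
    assumed not to overlap; that is a simplification of this sketch.
    """
    tags = ["outside"] * n_tokens
    for start, end, category in spans:
        tags[start] = f"begins_{category}"
        for i in range(start + 1, end):
            tags[i] = f"insideof_{category}"
    return tags


# Hypothetical example: one single-token event and one three-token time expression.
tokens = ["He", "left", "two", "days", "ago", "."]
spans = [(1, 2, "event"), (2, 5, "timex3")]
tags = encode_spans(len(tokens), spans)
for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")
```

Once spans are flattened this way, recognizing annotation extents reduces to assigning one label per token, which is the framing the note attributes to Boguraev and Ando (2005b).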
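The agreement measures mentioned in the notes above (precision and recall between two annotators, and the Kappa coefficient) can be sketched as follows. This is an illustrative sketch only: the function names and the toy data are hypothetical, and estimating the expected agreement $p_e$ from each annotator's label frequencies (as in Cohen's kappa) is an assumption, since the notes do not say how $p_e$ was computed.

```python
from collections import Counter


def precision_recall(key: set, response: set):
    """Precision and recall of `response` measured against `key`."""
    matched = len(key & response)
    precision = matched / len(response) if response else 0.0
    recall = matched / len(key) if key else 0.0
    return precision, recall


def kappa(labels_a, labels_b):
    """(p_o - p_e) / (1 - p_e) for two annotators labelling the same items.

    p_e is estimated from each annotator's label distribution
    (an assumption of this sketch).
    """
    n = len(labels_a)
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    return (p_o - p_e) / (1 - p_e)


# Hypothetical annotated spans (start, end) from two annotators.
spans_a = {(0, 1), (3, 5), (7, 8)}
spans_b = {(0, 1), (3, 5)}
p_ab, r_ab = precision_recall(key=spans_a, response=spans_b)
p_ba, r_ba = precision_recall(key=spans_b, response=spans_a)
print(p_ab, r_ab, p_ba, r_ba)

# Hypothetical relation types assigned to four links both annotators created.
types_a = ["BEFORE", "AFTER", "BEFORE", "BEFORE"]
types_b = ["BEFORE", "AFTER", "AFTER", "BEFORE"]
k = kappa(types_a, types_b)
print(k)
```

Swapping key and response swaps precision and recall, which is why a single direction suffices with two annotators. Kappa is applied here only to the typing of links that both annotators created, in line with the note's caveat that it does not suit the question of whether a link was introduced at all.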
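The missing-link count described in the notes above (events not temporally linked to any other event or time) can be sketched as below. The identifiers and the representation of links as (source, target) pairs are hypothetical; the sketch makes no claim about how the authors implemented the count.

```python
def unlinked_events(event_ids, links):
    """Return the events that take part in no link at all.

    `links` is an iterable of (source_id, target_id) pairs standing in
    for tlink/slink annotations (an assumed, simplified representation).
    """
    linked = set()
    for source, target in links:
        linked.add(source)
        linked.add(target)
    return [e for e in event_ids if e not in linked]


def possible_pairs(n_items: int) -> int:
    """Number of unordered pairs among n events and times: n(n-1)/2."""
    return n_items * (n_items - 1) // 2


# Hypothetical document with three events and one time expression.
events = ["e1", "e2", "e3"]
links = [("e1", "t1"), ("e1", "e2")]
missing = unlinked_events(events, links)
print(missing)
print(possible_pairs(len(events) + 1))
```

Counting unlinked events grows linearly with document size, whereas checking every candidate pair grows quadratically, which is the practical concern the note raises.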

Abbreviations

TimeML: A Markup Language for Time
timex: Time Expression
LDC: Linguistic Data Consortium
IE: Information Extraction
IAA: Inter-Annotator Agreement

References

Allen, J. (1983). Maintaining knowledge about temporal intervals. Communications of the ACM, 26(11), 832–843.
Boguraev, B., & Ando, R. K. (2005a). TimeBank-driven TimeML analysis. In: G. Katz, J. Pustejovsky, & F. Schilder (Eds.), International Workshop on Annotating, Extracting, and Reasoning with Time. Dagstuhl, Germany.
Boguraev, B., & Ando, R. K. (2005b). TimeML-compliant text analysis for temporal reasoning. In: Nineteenth International Joint Conference on Artificial Intelligence (IJCAI-05). Edinburgh, Scotland.
Ferro, L. (2001). TIDES: Instruction manual for the annotation of temporal expressions. Technical Report MTR 01W0000046V01, The MITRE Corporation.
Fikes, R., Jenkins, J., & Frank, G. (2003). JTP: A system architecture and component library for hybrid reasoning. Technical Report KSL-03-01, Knowledge Systems Laboratory, Stanford University.
Gaizauskas, R., Harkema, H., Hepple, M., & Setzer, A. (2006). Task-oriented extraction of temporal information: The case of clinical narratives. In: A. Montanari, J. Pustejovsky, & P. Revesz (Eds.), TIME 2006: International Symposium on Temporal Representation and Reasoning. Budapest, Hungary.
Han, B., & Lavie, A. (2004). A framework for resolution of time in natural language. TALIP Special Issue on Spatial and Temporal Information Processing, 3(1), 11–35.
Hirschman, L., Robinson, P., Burger, J., & Vilain, M. (1998). Automatic coreference: The role of annotated training data. In: AAAI 1998 Spring Symposium on Applying Machine Learning to Discourse Processing. Stanford, USA, pp. 1419–1422.
Hobbs, J., & Pan, F. (2004). An ontology of time for the semantic web. TALIP Special Issue on Spatial and Temporal Information Processing, 3(1), 66–85.
Hobbs, J., & Pustejovsky, J. (2004). Annotating and reasoning about time and events. In: AAAI Spring Symposium on Logical Formalizations of Commonsense Reasoning. Stanford, CA.
Lee, K., Pustejovsky, J., & Boguraev, B. (2006). Towards an international standard for annotating temporal information. In: Third International Conference on Terminology, Standardization and Technology Transfer. Beijing, China.
Mani, I., Wellner, B., Verhagen, M., Lee, C. M., & Pustejovsky, J. (2006). Machine Learning of Temporal Relations. In: Proceedings of the 44th Annual Meeting of the Association for Computational Linguistics. Sydney, Australia.
Pustejovsky, J., Castaño, J., Ingria, R., Saurí, R., Gaizauskas, R., Setzer, A., Katz, G., & Radev, D. (2003a). TimeML: Robust specification of event and temporal expressions in text. In: AAAI Spring Symposium on New Directions in Question-Answering (Working Papers). Stanford, CA, pp. 28–34.
Pustejovsky, J., Hanks, P., Saurí, R., See, A., Gaizauskas, R., Setzer, A., Radev, D., Sundheim, B., Day, D., Ferro, L., & Lazo, M. (2003b). The TimeBank corpus. In: T. McEnery (Ed.), Corpus Linguistics (pp. 647–656). Lancaster.
Pustejovsky, J., Knippen, R., Littman, J., & Saurí, R. (2005). Temporal and event information in natural language text. Language Resources and Evaluation, 39(2–3), 123–164.
Pustejovsky, J., Mani, I., Bélanger, L., Boguraev, B., Knippen, B., Littman, J., Rumshisky, A., See, A., Symonenko, S., Guilder, J. V., Guilder, L. V., Verhagen, M., & Ingria, R. (2003c). Graphical Annotation Kit for TimeML. Technical report, TANGO (TimeML ANnotation Graphical Organizer) Workshop Version 1.4, <http://www.timeml.org/tango> [date of citation: 2005-06-20].
Saurí, R., Littman, J., Knippen, B., Gaizauskas, R., Setzer, A., & Pustejovsky, J. (2005). TimeML Annotation Guidelines, Version 1.2.1. Technical report, TERQAS Workshop/Linguistic Data Consortium. <http://www.timeml.org/site/publications/timeMLdocs/AnnGuide_1.2.1.pdf> [date of citation: 2006-07-16].
Setzer, A. (2001). Temporal information in newswire articles: An annotation scheme and corpus study. Ph.D. thesis, University of Sheffield, Sheffield, UK.
Verhagen, M. (2005). Temporal closure in an annotation environment. Language Resources and Evaluation, 39(2–3), 211–241.
Verhagen, M., Mani, I., Saurí, R., Littman, J., Knippen, R., Jang, S. B., Rumshisky, A., Phillips, J., & Pustejovsky, J. (2005). Automating Temporal Annotation with TARSQI. In: 43rd Annual Meeting of the Association for Computational Linguistics (ACL-05). Ann Arbor, Michigan, Poster/Demo.

Acknowledgements

This work was supported in part by the ARDA NIMD and AQUAINT programs, PNWD-SW-6059 and NBCHC040027-MOD-0003.

Author information

Authors and Affiliations

IBM T.J. Watson Research Center, Hawthorne, NY, 10532, USA
Branimir Boguraev & Rie Ando
Brandeis University, Waltham, MA 02454, USA
James Pustejovsky & Marc Verhagen

Authors

Branimir Boguraev
James Pustejovsky
Rie Ando
Marc Verhagen

Corresponding author

Correspondence to Branimir Boguraev.

About this article

Cite this article

Boguraev, B., Pustejovsky, J., Ando, R. et al. TimeBank evolution as a community resource for TimeML parsing. Lang Resources & Evaluation 41, 91–115 (2007). https://doi.org/10.1007/s10579-007-9018-8

Received: 13 September 2006
Accepted: 30 March 2007
Published: 14 September 2007
Issue Date: February 2007
DOI: https://doi.org/10.1007/s10579-007-9018-8
