This paper presents the annotation guidelines and specifications developed for the creation of the Italian TimeBank, a language resource composed of two corpora manually annotated with temporal and event information. In particular, the adaptation of the TimeML scheme to Italian is described, and special attention is given to the methodology used to produce the annotation specifications, which are essential for creating good-quality annotated resources and for justifying the annotated items. The reliability of the It-TimeML guidelines and specifications is evaluated on the basis of the inter-coder agreement measured during the annotation of the two corpora.
Abstract: We describe the TempEval-3 task which is currently in preparation for the SemEval-2013 evaluation exercise. The aim of TempEval is to advance research on temporal information processing. TempEval-3 follows on from previous TempEval events, incorporating: a three-part task structure covering event, temporal expression and temporal relation extraction; a larger dataset; and single overall task quality scores.
This paper reports a critical analysis of the ISO TimeML standard, in the light of several experiences with temporal annotation of spoken French. It shows that the standard suffers from weaknesses that should be corrected to fit a larger variety of needs in NLP and in corpus linguistics. We propose some improvements to the standard before it is revised by the ISO Committee in 2017. These modifications mainly concern: (1) enrichments of well-identified features of the standard: temporal function of TIMEX time expressions, additional types for TLINK temporal relations; (2) deeper modifications concerning the units or features annotated: clarification of the distinction between time and tense for EVENT units, coherence of representation between temporal signals (the SIGNAL unit) and TIMEX modifiers (the MOD feature); (3) a recommendation to perform temporal annotation on top of a syntactic (rather than lexical) layer (temporal annotation on a treebank).
Temporal Information Processing is a subfield of Natural Language Processing, valuable in many tasks such as Question Answering and Summarization. It is a broad field, ranging from classical theories of time and language to current computational approaches to Temporal Information Extraction. This latter trend consists of the automatic extraction of events and temporal expressions. Such issues have attracted great attention, especially with the development of annotated corpora and annotation schemes, mainly TimeBank and TimeML. In this paper, we give a survey of Temporal Information Extraction from natural language texts.
This paper describes a system to extract events and time information from football match reports generated through minute-by-minute reporting. We describe a method that uses regular expressions to find the events and divides them into different types to determine in which order they occurred. In addition, our system detects time expressions and we present a way to structure the collected data using XML.
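The regular-expression approach described above can be illustrated with a minimal sketch. The abstract does not give the report format, the event types, or the XML schema, so everything here is an assumption made for illustration: the line format (`<minute>' <text>`), the trigger patterns in `EVENT_PATTERNS`, and the `match`/`event` element names are all hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Assumed line format (hypothetical): "<minute>' <text>", with optional
# stoppage time, e.g. "45+2' Yellow card for Jones".
LINE_RE = re.compile(
    r"^(?P<minute>\d{1,3})(?:\+(?P<extra>\d{1,2}))?'\s+(?P<text>.+)$"
)

# Hypothetical event types and trigger patterns; the first match wins.
EVENT_PATTERNS = [
    ("goal", re.compile(r"\bgoal\b", re.IGNORECASE)),
    ("yellow_card", re.compile(r"\byellow card\b", re.IGNORECASE)),
    ("red_card", re.compile(r"\bred card\b", re.IGNORECASE)),
    ("substitution", re.compile(r"\bsubstitut(?:ion|ed)\b", re.IGNORECASE)),
]


def extract_events(report):
    """Return typed events found in a report, ordered by match time."""
    events = []
    for line in report.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # line carries no minute stamp
        text = match.group("text")
        for event_type, pattern in EVENT_PATTERNS:
            if pattern.search(text):
                events.append({
                    "type": event_type,
                    "minute": int(match.group("minute")),
                    "extra": int(match.group("extra") or 0),
                    "text": text,
                })
                break
    # Order events by regular minute, then by stoppage-time minute.
    events.sort(key=lambda e: (e["minute"], e["extra"]))
    return events


def to_xml(events):
    """Serialize events into a small, made-up XML structure."""
    root = ET.Element("match")
    for event in events:
        element = ET.SubElement(
            root,
            "event",
            type=event["type"],
            minute=str(event["minute"]),
            extra=str(event["extra"]),
        )
        element.text = event["text"]
    return ET.tostring(root, encoding="unicode")


report = """45+2' Yellow card for Jones
12' Goal by Smith
Kick-off at the stadium
78' Substitution: Lee replaces Kim"""

events = extract_events(report)
xml_output = to_xml(events)
```

Sorting on the pair (minute, stoppage time) is one simple way to recover the order of events when report lines are not already chronological; the system in the paper may determine ordering differently.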
Temporal information extraction is a popular and interesting research field in the area of Natural Language Processing (NLP), with applications such as summarization, question answering (QA) and information extraction. In this paper, we report the extraction of events and the identification of temporal relations between event and time and between event and document creation time (DCT) within the TimeML framework. Our long-term plan is to build a temporal structure that can be used in applications such as question answering, textual entailment and summarization. We propose a voted approach for (i) event extraction, (ii) event-DCT relation identification and (iii) event-time relation identification from text under the TempEval-2 framework. The contributions of this work are twofold. First, features are extracted from the training corpus and used to train Conditional Random Field (CRF) and Support Vector Machine (SVM) classifiers. Second, we propose a voted approach for event extraction, event-DCT and event-time relation identification that combines these supervised classifiers. In total we generate 20 models, 10 each with CRF and SVM, by varying the available features and/or feature templates. All 20 models are then combined into a final system by defining an appropriate voting scheme.
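The abstract does not specify the voting scheme, so the following is only a sketch of one common choice, assumed here for illustration: per-token majority voting, with optional per-model weights. The function name `majority_vote`, the label names, and the tie-breaking rule (the earliest model in the list wins) are hypothetical and not taken from the paper.

```python
from collections import Counter


def majority_vote(predictions, weights=None):
    """Combine label sequences from several models by (weighted) voting.

    predictions: list of label sequences, one per model, all of equal length.
    weights: optional list with one weight per model (defaults to 1.0 each).
    Ties are broken in favour of the earliest model in `predictions`.
    """
    if not predictions:
        raise ValueError("at least one model prediction is required")
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all models must label the same number of tokens")
    if weights is None:
        weights = [1.0] * len(predictions)
    if len(weights) != len(predictions):
        raise ValueError("one weight per model is required")

    voted = []
    for i in range(length):
        # Accumulate the total weight behind each candidate label.
        scores = Counter()
        for model_labels, weight in zip(predictions, weights):
            scores[model_labels[i]] += weight
        best = max(scores.values())
        # Pick the first model's label that reaches the best score.
        for model_labels in predictions:
            if scores[model_labels[i]] == best:
                voted.append(model_labels[i])
                break
    return voted


# Toy outputs from three hypothetical models over three tokens.
crf_a = ["B-EVENT", "O", "O"]
svm_a = ["B-EVENT", "B-EVENT", "O"]
svm_b = ["O", "B-EVENT", "O"]

unweighted = majority_vote([crf_a, svm_a, svm_b])
weighted = majority_vote([crf_a, svm_a, svm_b], weights=[3.0, 1.0, 1.0])
```

With equal weights each token takes the label most models agree on; weights could instead reflect, for example, each model's accuracy on held-out data, which is again an assumption rather than the paper's stated method.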
This study presents a methodology for integrating the temporal information of a historical corpus into a Geographic Information System (GIS), with the purpose of analyzing and visualizing the textual information. The selected corpus is composed of business letters of the Castilian merchant Simón Ruiz (1553-1597), studied in the context of the DynCoopNet Project (Dynamic Complexity of Cooperation-Based Self-Organizing Commercial Networks in the First Global Age), which aims to analyze the dynamic cooperation procedures of social networks. The integration of the historical corpus into a GIS involved the following phases: (1) recognition and normalization of temporal expressions and events in 16th-century Castilian following the TimeML annotation guidelines and (2) storage of the tagged expressions in a Geodatabase. Implementing this process in a GIS would later allow temporal queries and dynamic visualization of historical events, and thus supports the recognition of patterns of human activity and behaviour over time.