KPWr (Polish Corpus of Wroclaw University of Technology, Polish: Korpus Języka Polskiego Politechniki Wrocławskiej) is a corpus of written and spoken documents available under a Creative Commons license. The texts are divided into 15 subcorpora (blogs, science, stenographic recordings, etc.). The documents are annotated at the level of chunks and selected predicate-argument relations, named entities, relations between named entities, anaphora relations, word senses, events, temporal expressions, spatial relations between entities, keywords, and semantic roles within nominal and adjectival phrases.
Towards an event annotated corpus of Polish
The paper presents a typology of events built on the basis of the TimeML specification adapted to the Polish language. Some changes were introduced to the definitions of the event categories, and a motivation for event categorisation was formulated. The event annotation task is presented on two levels: the ontology level (language independent) and text mentions (language dependent). The various types of event mentions in Polish text are discussed. A procedure for annotating event mentions in Polish texts is presented and evaluated. In the evaluation, a randomly selected set of documents from the Corpus of Wrocław University of Technology (called KPWr) was annotated by two linguists and the inter-annotator agreement was calculated. The evaluation was done in two iterations. After the first evaluation we revised and improved the annotation procedure. The second evaluation showed a significant improvement in the agreement between annotators. The current work w...
Temporal Expressions in Polish Corpus KPWr
This article presents the results of recent research on the interpretation of Polish expressions that refer to time. These expressions are a source of information about when something happens, how often something occurs, or how long something lasts. Temporal information, which can be extracted from text automatically, plays a significant role in many information extraction systems, such as question answering, discourse analysis, event recognition and many more. We prepared PLIMEX, a broad description of Polish temporal expressions with annotation guidelines, based on state-of-the-art solutions for English, mainly the TimeML specification. We also adapted the solution to capture the local semantics of temporal expressions, called LTIMEX. The temporal description also supports further event identification and extends the event description model, focusing on anchoring events in time, ordering events, and reasoning about the persistence of events. We prepar...
Temporal expression annotation guidelines describing the process of manually annotating documents in the Polish Corpus of Wroclaw University of Technology (KPWr).
This article introduces the problem of recognising and normalising temporal expressions in the Polish language. We describe what temporal information is and present the TimeML specification, adapted to Polish, as a model for describing temporal expressions. We present classes of temporal expressions, guidelines for annotating and normalising these expressions, and our approach to corpus annotation and temporal expression recognition. The key aspects of the work are the description of the features used for recognition and the use of a method for selecting and creating feature templates for a Conditional Random Fields model. We present the experiments and the conclusions drawn from them.
ABSTRACT: The aim of the article is to discuss issues related to research on sign languages in general and Polish Sign Language in particular. The development path of Deaf Studies is analysed, both worldwide, especially in the United States, and in Poland. Deaf Studies is a relatively young scientific field in Poland, but it is developing very quickly: from the first works discussing the topic in various ways, through the organisation of scientific meetings, to the establishment of research units and associations connected with sign language. The paper presents examples illustrating the individual milestones in the development of this research. The analysis also aims to show the research prospects facing Polish Sign Language. KEYWORDS: sign language, Polish Sign Language, PJM, Deaf Studies
RESEARCH ON POLISH SIGN LANGUAGE AND OTHER SIGN LANGUAGES
ABSTRACT: The aim of this article is to introduce issues connected with research on sign languages in general and Polish Sign Language in particular. The article analyzes the development of Deaf Studies worldwide, especially in the USA, as well as in Poland. Deaf Studies is a relatively young scientific discipline in Poland, but it is developing rapidly: from the first papers presenting this topic in various ways, through the organization of conferences and symposia, to the foundation of research units and associations connected with sign language. The goal is to demonstrate the research opportunities that Polish Sign Language offers. Every milestone is illustrated with numerous examples of articles and researchers.
This article presents the results of recent research on the interpretation of Polish expressions that refer to time. These expressions are a source of information about when something happens, how often something occurs, or how long something lasts. Temporal information, which can be extracted from text automatically, plays a significant role in many information extraction systems, such as question answering, discourse analysis, event recognition and many more. We prepared PLIMEX, a broad description of Polish temporal expressions with annotation guidelines, based on state-of-the-art solutions for English, mainly the TimeML specification. We also adapted the solution to capture the local semantics of temporal expressions, called LTIMEX. The temporal description also supports further event identification and extends the event description model, focusing on anchoring events in time, ordering events, and reasoning about the persistence of events. We prepared a specification designed to address these issues, and we annotated all documents in the Polish Corpus of Wroclaw University of Technology (KPWr) using our annotation guidelines.
The paper presents a typology of events built on the basis of the TimeML specification adapted to the Polish language. Some changes were introduced to the definitions of the event categories, and a motivation for event categorisation was formulated. The event annotation task is presented on two levels: the ontology level (language independent) and text mentions (language dependent). The various types of event mentions in Polish text are discussed. A procedure for annotating event mentions in Polish texts is presented and evaluated. In the evaluation, a randomly selected set of documents from the Corpus of Wrocław University of Technology (called KPWr) was annotated by two linguists and the inter-annotator agreement was calculated. The evaluation was done in two iterations. After the first evaluation we revised and improved the annotation procedure. The second evaluation showed a significant improvement in the agreement between annotators. The current work focused on the annotation and categorisation of event mentions in text. Future work will focus on describing events with a set of attributes, arguments and relations.
Papers by Tomasz Bernaś