Rossana Cunha is a PhD Student in Linguistics and Applied Linguistics at the Federal University of Minas Gerais (UFMG) and a researcher at the Laboratory for Experimentation in Translation (LETRA/UFMG). She holds a BSc in Computer Science (Federal University of Para), a BA in English and an MA in Translation Studies (Federal University of Santa Catarina). She is also a WiNLP 2019 workshop chair (ACL 2019). Her main areas of interest and research include translation technologies, computational linguistics, usability, cognitive ergonomics, translator education and software engineering. Supervisors: Fábio Alves and Adriana Pagano
While Natural Language Processing (NLP) models have gained substantial attention, only in recent years has research opened new paths for tackling Human-Computer Design (HCD) from the perspective of natural language. We focus on developing a human-centered corpus, more specifically, a persona-based corpus in a particular healthcare domain (diabetes mellitus self-care). In order to follow an HCD approach, we created personas to model interpersonal interaction (expert and non-expert users) in that specific domain. We show that an HCD approach benefits language generation from different perspectives, from machines to humans, contributing new directions for low-resource contexts (languages other than English and sensitive domains) where the need to promote effective communication is essential.
Master's dissertation - Universidade Federal de Santa Catarina, Centro de Comunicação e Expressão, Programa de Pós-Graduação em Estudos de Tradução, Florianópolis, 2016. The main objective of this work is to evaluate a corpus-based translation system, called COPA-TRAD, from the perspective of the user (researcher, translator, or student in the translation field), considering usability and cognitive ergonomics characteristics. The intention is to understand how users interact with the software under investigation, given the growing complexity and diversity of corpus-based translation technologies and the little attention paid to recommendations from the field of human-computer interaction (HCI). The research was divided into distinct stages: first, informal conversations with the research participants, followed by the application of a usability questionnaire. In addition, a heuristic evaluation was conducted, as well as an ergonomic inspection by means of lists of ...
Proceedings of the Conference Recent Advances in Natural Language Processing - Deep Learning for Natural Language Processing Methods and Applications, 2021
The continuous development of translation technologies has fundamentally changed the way users in this area interact with computers. The need to understand and measure how a vast number of resources and software applications can affect users and other stakeholders has led to recommendations related to human-computer interaction (HCI), presented as guidelines and best practices. Unfortunately, limited attention is still paid to usability and ergonomics when translation tools are developed, be it during the design, implementation, or deployment phases. Meanwhile, corpus-based translation tools have grown in complexity and diversity; however, this evolution does not yet take HCI recommendations into consideration. The goal of this study is to bridge the gap between corpus-based tools, ergonomics, and usability by presenting the results of a user-oriented methodology. With this in mind, a corpus analysis software package called COPA-TRAD was used as the basis for applying existing methods from the usability and ergonomics area. The study comprised three main stages: (i) a usability questionnaire administered to participants from this field; (ii) a heuristic analysis performed by five usability experts; and (iii) an ergonomics checklist inspection to analyze general elements. The results indicated that, despite the concern with providing a “user-friendly” interface, the analyzed system had not made use of known usability and ergonomics methods, only the guidelines of the third-party software used as part of COPA-TRAD. The study points out directions in which a corpus-based tool can be adapted to user needs and further indicates some important criteria that require improvement.
After the necessary changes are applied, a complementary analysis needs to be carried out to verify whether the identified issues were properly addressed. We believe translation technology should be concerned with building adequate interfaces, allowing humans to interact effectively with the tools' data and facilitating the process of retrieving information.
Data-to-text Natural Language Generation (NLG) is the computational process of generating natural language in the form of text or voice from non-linguistic data. A core micro-planning task within NLG is referring expression generation (REG), which aims to automatically generate noun phrases to refer to entities mentioned as discourse unfolds. A limitation of novel REG models is not being able to generate referring expressions to entities not encountered during the training process. To solve this problem, we propose two extensions to NeuralREG, a state-of-the-art encoder-decoder REG model. The first is a copy mechanism, whereas the second consists of representing the gender and type of the referent as inputs to the model. Drawing on the results of automatic and human evaluation as well as an ablation study using the WebNLG corpus, we contend that our proposal contributes to the generation of more meaningful referring expressions to unseen entities than the original system and related...
This demo paper introduces DaMata, a robot-journalist covering deforestation in the Brazilian Amazon. The robot-journalist is based on a pipeline architecture of Natural Language Generation, which yields multilingual daily and monthly reports based on the public data provided by DETER, a real-time deforestation satellite monitor developed and maintained by the Brazilian National Institute for Space Research (INPE). DaMata automatically generates reports in Brazilian Portuguese and English and publishes them on the Twitter platform. Corpus and code are publicly available.
Investigating the post-editing effort associated with machine-translated metaphors: a process-driven analysis, 2019
This paper reports on a study that analyses the impact of two different machine translation (MT) outputs on the cognitive effort required to post-edit machine-translated metaphors by means of eye tracking and think-aloud protocols. We hypothesise that the statistical MT output would have a positive effect on reducing cognitive effort. In order to test this hypothesis, a post-editing experiment was conducted with two different groups of participants. Each experimental group had two post-editing tasks using the language pair English into Brazilian Portuguese. On Task 1 (T1), participants were asked to post-edit a Google machine-translated output whereas on Task 2 (T2) the same participants were assigned to post-edit a Systran machine-translated output. Data collection was conducted under the experimental paradigm of data triangulation in translation process research. Data analysis focuses on eye tracking data related to fixation duration and pupil dilation as well as think-aloud protocols. This analysis shows that the cognitive effort required to post-edit the pure statistical MT output might be lower in comparison to the hybrid output when conventional metaphors are machine translated.
The Journal of Specialised Translation (JoSTrans), 2019
With the advent of the internet and continuous technological advances, corpora have become essential to the growth of Corpus-Based Translation Studies (CTS), as well as to the development of information systems and techniques that make use of them. This paper presents a brief review of corpus-based web systems for the English-Portuguese language pair, from the perspective of their application to translation teaching, research, and practice. To this end, we situate these systems technologically through (i) a brief theoretical contextualization of the use of corpora, (ii) their key features, and (iii) their best-known applications. We then summarize the free web-based tools COMPARA (2000), CorTrad (2009), COPA-TRAD (2011), OPUS-CORPUS (2012), and VVV (2013). Next, we list the most common uses and benefits of systems for compiling, analyzing, classifying, and exploring corpora. Finally, the analysis reveals the current state of CTS through a synthesis of the technological apparatus available in the area. We hope this discussion will encourage research on corpus-based systems, given the constant technological evolution and the variety of applications that can benefit from the use of corpora, whether in a practical or a professional context.
The aim of this paper is to present a user-centered comparative analysis of two computer-assisted translation (CAT) tools available online: Google Translator Toolkit and Wordfast Anywhere. The central research method combines an exploratory evaluation with the application of a usability and cognitive ergonomics checklist. Initial results show that evaluation methods of this kind can be low-cost and easily accessible, requiring only wider dissemination of how to apply them to translation tools.
Translation is a profession closely tied to technology, and for this reason most of today's translators work with a variety of tools, services, and programs, such as word processors, e-mail, and electronic dictionaries. In this paper, we argue that although translation and technology are strongly related, little research in Corpus-based Translation Studies analyzes and evaluates translation software. We analyze the corpus-based information system COPA-TRAD with respect to ergonomics and software usability, so that those involved in the area can have access to a more familiar system for translation research, teaching, and practice. Because the subject is still little explored, we intend to provide Translation Studies, and more specifically those familiar with Corpus-based Translation Studies, with features and characteristics that can lead to further studies on this subject, resulting in possible improvements to and/or development of translation tools.
Usability and ergonomics have evolved to facilitate the development of applications that offer their users a better experience. The translation field has followed the same trend with regard to technological progress, whether through new tools made available in the sector or through updates to existing systems. Despite the growing number of applications available, less priority is given to factors related to how easily users can work with these tools. The aim of this workshop is to present usability and cognitive ergonomics methods and techniques that can be applied to Translation Studies. It gives a brief theoretical overview of the field of human-computer interaction (HCI), along with resources that allow participants to evaluate interfaces in the translation field. The workshop is thus expected to give users in this area a more inquiring outlook, through the use of usability principles and an understanding of how they can be applied to identify the main interface problems.
Traditional teaching approaches have been little affected by new practices and at the same time offer no room for improvement. Even today, formal training in pedagogy still conveys conservative beliefs about knowledge, colleagueship, and advocacy. On the other hand, cognitive theory and social constructivism suggest that learning and development are a social process, in which every student can become the center of his/her own process and move towards becoming an independent learner. Within this context, teachers are mediators, promoting and encouraging critical thinking through problem-solving tasks, reflection and analysis, and creativity and interpretation. Students build their own meanings and knowledge, which are linked to their socio-cultural context.
Within this theoretical framework, we would like to question current teaching practices by proposing the use of corpora in the classroom, aiming to redefine roles of the subjects that are part of the teaching/learning process as a new form of interaction mediated by new technologies. The use of information technology (IT) tools encourages proactivity and facilitates group work, as well as promotes greater awareness on the part of the students while a new strategy is built. We believe this approach is more appropriate for these times of globalization and virtual work. As a guiding motivation, we expect to generate changes in attitudes to traditional roles, responsibilities, and beliefs.
The Sketch Engine software will be used to support activities in the classroom. A general questionnaire for teachers of English as an additional language is applied once the tasks are completed. This inquiry addresses possibilities and beliefs regarding the use of corpus systems in our everyday practice and is analyzed so as to describe teachers' reactions to our proposal. Once the new technologies are introduced in the language classroom, students become aware of the potential of such systems and the advantages of choosing and leading their own learning process. Is it possible to rethink formal training in pedagogy, while including virtual learning and a different share of responsibility in the process?
Further research is required, but self-assessment by the students indicated that the approach is appropriate, particularly for those preparing for international exams and planning to apply to foreign universities.
User experience is a topic commonly addressed in Information Systems, whether because of the needs identified during the development of specialist tools and applications or because of its importance for human-computer interaction. However, although this concern is shared by many, it is reflected in only a small number of studies in Translation Studies. That number is even smaller for research related to Corpus-Based Translation Studies (CTS). The main objective of this work is to analyze and evaluate a system from the perspective of the user (researcher, translator, or student in the translation field), addressing usability and ergonomics characteristics. The starting point of the investigation is a corpus-based translation system called COPA-TRAD, developed at the Federal University of Santa Catarina (UFSC), which had as one of its main concerns offering users an easy-to-use tool. In this context, we investigate the extent to which usability and ergonomics criteria were used in the evaluated system. To that end, a set of methods was selected to collect information on the level of user satisfaction as well as users' attitudes toward the system. Preliminary results show a tendency toward concern for the user, but without prior application of ergonomics and usability evaluations, either during development or after completion of the software. We believe that the methodological contribution will foster better development of systems and tools, as well as promote further discussion of the topic.
The continuous development of translation technologies has fundamentally changed the way users o... more The continuous development of translation technologies has fundamentally changed the way users of this area interact with computers. The need for understanding and measuring how a vast number of resources and software applications can impact users and other stakeholders had led to recommendations related to human-computer interaction (HCI), presented as guidelines and best practices. Unfortunately, when developing translation tools limited attention is still paid to usability and ergonomics, be it during the design, implementation or deployment phases. Meanwhile, the level of complexity of corpus-based translation tools has increased in difficulties and diversity, however, this evolution does not take into consideration HCI recommendations yet. The goal of this study is to bridge this gap between corpus-based tools, ergonomics, and usability, by presenting the results of a user-oriented methodology. With this in mind, a corpus analysis software, called COPA-TRAD, was used as the basis for applying some existing methods within usability and ergonomics area. The proposed study was composed of three main stages: (i) usability questionnaire – administered to participants of this knowledge area; (ii) heuristics analysis – performed by five usability experts; and (iii) ergonomics checklist inspection, to analyze general elements. The results indicated that despite the concern of providing a “user-friendly” interface, the analyzed system had not made use of known usability and ergonomics methods, just guidelines of the third-party software used as part of COPA-TRAD. The study points out directions on which a corpus-based tool can be adapted to user needs and further indicate some important criteria that require improvement. 
After applying the necessary changes, a complementary analysis needs to be carried out to verify if those identified issues were accurately adjusted. We believe translation technology should concern with building adequate interfaces, allowing humans to interact effectively with tools data and facilitating the process of retrieving information.
While Natural Language Processing (NLP) models have gained substantial attention, only in recent ... more While Natural Language Processing (NLP) models have gained substantial attention, only in recent years has research opened new paths for tackling Human-Computer Design (HCD) from the perspective of natural language. We focus on developing a human-centered corpus, more specifically, a persona-based corpus in a particular healthcare domain (diabetes mellitus self-care). In order to follow an HCD approach, we created personas to model interpersonal interaction (expert and non-expert users) in that specific domain. We show that an HCD approach benefits language generation from different perspectives, from machines to humans-contributing with new directions for low-resource contexts (languages other than English and sensitive domains) where the need to promote effective communication is essential.
Dissertação (mestrado) - Universidade Federal de Santa Catarina, Centro de Comunicação e Expressã... more Dissertação (mestrado) - Universidade Federal de Santa Catarina, Centro de Comunicação e Expressão, Programa de Pós-Graduação em Estudos de Tradução, Florianópolis, 2016.Este trabalho tem por objetivo principal avaliar um sistema de tradução com base em corpus, denominado COPA-TRAD, sob a perspectiva do usuário (pesquisador, tradutor, estudante ? da área de tradução), considerando características de usabilidade e ergonomia cognitiva. A intenção é compreender como se dá a interação dos usuários com o software investigado, visto o crescimento em nível de complexidade e diversidade das tecnologias de tradução com base em corpus, e a pouca atenção empregada às recomendações da área de interação humano-computador (IHC). A pesquisa foi dividida em etapas distintas: primeiramente as conversas informais com os participantes da pesquisa, e seguidas pela aplicação de um questionário de usabilidade. Ademais, foram conduzidas a avaliação heurística; a inspeção ergonômica por meio de listas de v...
Com o advento da internet e dos constantes avanços tecnológicos, os corpora se tornaram essenciai... more Com o advento da internet e dos constantes avanços tecnológicos, os corpora se tornaram essenciais no crescimento dos Estudos da Tradução Baseados em Corpus (ETBC), assim como no desenvolvimento de sistemas de informação e técnicas que fazem uso destes. Este artigo apresenta uma breve revisão de sistemas web baseados em corpus no par linguístico inglês-português, a partir de uma perspectiva de aplicação ao ensino, à pesquisa e à prática tradutória. Para tanto, buscamos proporcionar uma significação no âmbito tecnológico por meio de (i) uma breve contextualização teórica sobre o uso de corpora, (ii) as suas principais características e (iii) as aplicações mais conhecidas. Posteriormente, apresenta-se uma síntese das ferramentas web gratuitas: COMPARA (2000), CorTrad (2009), COPA-TRAD (2011), OPUS-CORPUS (2012) e VVV (2013). Em seguida, elencamos os usos e benefícios mais comuns de sistemas de compilação, análise, classificação e exploração de corpora. Por fim, a análise revela o mome...
This paper reports on a study that analyses the impact of two different machine translation (MT) ... more This paper reports on a study that analyses the impact of two different machine translation (MT) outputs on the cognitive effort required to post-edit machine-translated metaphors by means of eye tracking and think-aloud protocols. We hypothesise that the statistical MT output would have a positive effect on reducing cognitive effort. In order to test this hypothesis, a post-editing experiment was conducted with two different groups of participants. Each experimental group had two post-editing tasks using the language pair English into Brazilian Portuguese. On Task 1 (T1), participants were asked to postedit a Google machine-translated output whereas on Task 2 (T2) the same participants were assigned to post-edit a Systran machine translated output. Data collection was conducted under the experimental paradigm of data triangulation in translation process research. Data analysis focuses on eye tracking data related to fixation duration and pupil dilation as well as think-aloud protoc...
Proceedings of the Conference Recent Advances in Natural Language Processing - Deep Learning for Natural Language Processing Methods and Applications, 2021
The continuous development of translation technologies has fundamentally changed the way users of... more The continuous development of translation technologies has fundamentally changed the way users of this area interact with computers. The need for understanding and measuring how a vast number of resources and software applications can impact users and other stakeholders had led to recommendations related to human-computer interaction (HCI), presented as guidelines and best practices. Unfortunately, when developing translation tools limited attention is still paid to usability and ergonomics, be it during the design, implementation or deployment phases. Meanwhile, the level of complexity of corpus-based translation tools has increased in difficulties and diversity, however, this evolution does not take into consideration HCI recommendations yet. The goal of this study is to bridge this gap between corpus-based tools, ergonomics, and usability, by presenting the results of a user-oriented methodology. With this in mind, a corpus analysis software, called COPA-TRAD, was used as the basis for applying some existing methods within usability and ergonomics area. The proposed study was composed of three main stages: (i) usability questionnaire – administered to participants of this knowledge area; (ii) heuristics analysis – performed by five usability experts; and (iii) ergonomics checklist inspection, to analyze general elements. The results indicated that despite the concern of providing a “user-friendly” interface, the analyzed system had not made use of known usability and ergonomics methods, just guidelines of the third-party software used as part of COPA-TRAD. The study points out directions on which a corpus-based tool can be adapted to user needs and further indicate some important criteria that require improvement. 
After applying the necessary changes, a complementary analysis needs to be carried out to verify if those identified issues were accurately adjusted. We believe translation technology should concern with building adequate interfaces, allowing humans to interact effectively with tools data and facilitating the process of retrieving information.
Data-to-text Natural Language Generation (NLG) is the computational process of generating natural... more Data-to-text Natural Language Generation (NLG) is the computational process of generating natural language in the form of text or voice from non-linguistic data. A core micro-planning task within NLG is referring expression generation (REG), which aims to automatically generate noun phrases to refer to entities mentioned as discourse unfolds. A limitation of novel REG models is not being able to generate referring expressions to entities not encountered during the training process. To solve this problem, we propose two extensions to NeuralREG, a state-of-the-art encoder-decoder REG model. The first is a copy mechanism, whereas the second consists of representing the gender and type of the referent as inputs to the model. Drawing on the results of automatic and human evaluation as well as an ablation study using the WebNLG corpus, we contend that our proposal contributes to the generation of more meaningful referring expressions to unseen entities than the original system and related...
This demo paper introduces DaMata, a robot-journalist covering deforestation in the Brazilian Amazon. The robot-journalist is based on a pipeline architecture of Natural Language Generation, which yields multilingual daily and monthly reports based on the public data provided by DETER, a real-time deforestation satellite monitor developed and maintained by the Brazilian National Institute for Space Research (INPE). DaMata automatically generates reports in Brazilian Portuguese and English and publishes them on the Twitter platform. Corpus and code are publicly available.
Investigating the post-editing effort associated with machine-translated metaphors: a process-driven analysis, 2019
This paper reports on a study that analyses the impact of two different machine translation (MT) outputs on the cognitive effort required to post-edit machine-translated metaphors by means of eye tracking and think-aloud protocols. We hypothesise that the statistical MT output would have a positive effect on reducing cognitive effort. In order to test this hypothesis, a post-editing experiment was conducted with two different groups of participants. Each experimental group had two post-editing tasks using the language pair English into Brazilian Portuguese. On Task 1 (T1), participants were asked to post-edit a Google machine-translated output, whereas on Task 2 (T2) the same participants were assigned to post-edit a Systran machine-translated output. Data collection was conducted under the experimental paradigm of data triangulation in translation process research. Data analysis focuses on eye tracking data related to fixation duration and pupil dilation as well as think-aloud protocols. This analysis shows that the cognitive effort required to post-edit the pure statistical MT output might be lower in comparison to the hybrid output when conventional metaphors are machine translated.
The Journal of Specialised Translation (JoSTrans), 2019
With the advent of the Internet and continuous technological advances, corpora have become essential to the growth of Corpus-Based Translation Studies (CTS), as well as to the development of information systems and techniques that make use of them. This paper presents a brief review of corpus-based web systems for the English-Portuguese language pair, from the perspective of their application in translation teaching, research and practice. To this end, we aim to situate the topic technologically through (i) a brief theoretical contextualization of the use of corpora, (ii) their key features and (iii) the best-known applications. Afterwards, a summary of free web-based tools is presented: COMPARA (2000), CorTrad (2009), COPA-TRAD (2011), OPUS-CORPUS (2012) and VVV (2013). Next, we list the most common uses and benefits of systems for compiling, analyzing, classifying, and exploring corpora. Finally, the analysis reveals the current state of CTS through a synthesis of the technological apparatus in the area. In this way, we aim to encourage research on corpus-based systems, given the constant technological evolution and the variety of applications that can benefit from the use of corpora, whether in the practical or the professional context.
The aim of this paper is to present a user-centered comparative analysis of two CAT tools available online: Google Translator Toolkit and Wordfast Anywhere. As the central research method, we applied an exploratory evaluation and a usability and cognitive ergonomics checklist. Initial results show that evaluation methods of this kind can be low-cost and easily accessible, requiring only wider dissemination of how to apply them to translation tools.
Translation is a profession closely connected to technology, and for this reason most of today's translators are in contact with a variety of tools, services and programs, such as word processors, e-mail and electronic dictionaries, among others. In this paper, we argue that while translation and technology have a strong relationship, there is little research in the Corpus-Based Translation Studies area devoted to analyzing and evaluating translation software. The corpus-based information system called COPA-TRAD is analyzed in terms of ergonomics and software usability, so that those involved in the area can have access to a more familiar system that can be used for translation research, teaching and practice. Given the nature of this still little-explored subject, we intend to provide the Translation Studies area, and more specifically those familiar with Corpus-Based Translation Studies, with features and characteristics that can lead to further studies on the subject, resulting in possible improvements to and/or development of translation-based tools.
Usability and ergonomics have evolved to facilitate the development of applications that bring a better experience to their users. The translation field has followed the same trend with regard to technological advances, whether through the release of new tools in the sector or through updates to existing systems. Despite the growing number of applications available, factors related to the ease of use of these tools receive lower priority. The aim of this workshop is to present usability and cognitive ergonomics methods and techniques that can be applied to Translation Studies. A brief theoretical overview of the field of human-computer interaction (HCI) will be given, along with resources that allow participants to evaluate translation interfaces. We thus hope to give users in this field a more inquiring perspective through the use of usability principles and an understanding of how they can be applied to identify the main interface problems.
Traditional teaching approaches show very little influence from new practices and at the same time offer no room for improvement. Still today, formal training in pedagogy purports conservative beliefs about knowledge, colleagueship, and advocacy. On the other hand, cognitive theory and social constructivism suggest that learning and development are a social process, in which each and every student can become the center of his/her own process on the way to becoming an independent learner. Within this context, teachers are mediators, promoting and encouraging critical thinking through problem-solving tasks, reflection and analysis, and creativity and interpretation. Students build their meanings and knowledge, which are linked to their socio-cultural context.
Within this theoretical framework, we would like to question current teaching practices by proposing the use of corpora in the classroom, aiming to redefine roles of the subjects that are part of the teaching/learning process as a new form of interaction mediated by new technologies. The use of information technology (IT) tools encourages proactivity and facilitates group work, as well as promotes greater awareness on the part of the students while a new strategy is built. We believe this approach is more appropriate for these times of globalization and virtual work. As a guiding motivation, we expect to generate changes in attitudes to traditional roles, responsibilities, and beliefs.
The Sketch Engine software will be used to support activities in the classroom. At the same time, a general questionnaire for teachers of English as an additional language is applied once the tasks are completed. This inquiry is based on possibilities and beliefs about the use of corpus systems in our everyday practice and is analyzed so as to describe teachers’ reactions to our proposal. Once the new technologies are introduced in the language classroom, students become aware of the potential of such systems and the advantages of choosing and leading their own learning process. Is it possible to rethink formal training in pedagogy while including virtual learning and a different share of responsibility in the process?
Further research is required, but self-assessment by the students showed the approach to be appropriate, particularly for those preparing for international exams and planning to apply to foreign universities.
User experience is a topic commonly addressed in Information Systems, whether because of needs identified during the development of specialist tools and applications or because of its importance to human-computer interaction. However, despite being a widely shared concern, it is reflected in only a small number of studies in Translation Studies. That number is even smaller for research related to Corpus-Based Translation Studies (CTS). The main objective of this work is to analyze and evaluate a system from the perspective of the user (researcher, translator, or student in the translation field), addressing usability and ergonomics characteristics. The starting point of the investigation is a corpus-based translation system called COPA-TRAD, developed at the Federal University of Santa Catarina (UFSC), which had as one of its main concerns offering an easy-to-use tool to its users. In this context, we investigate the extent to which usability and ergonomics criteria were applied in the evaluated system. To this end, a set of methods was selected to collect information on the level of user satisfaction as well as users' attitudes toward the system. Preliminary results show a tendency toward concern for the user, but without prior ergonomics and usability evaluations, either during development or after completion of the software. We believe the methodological contribution will foster better development of systems and/or tools, as well as promote further discussion of the topic.
Usability and ergonomics evolved with the emergence of Web 2.0, when new technologies and tools appeared to facilitate the development of applications that bring a better experience to their users. The translation field has followed the same trend with regard to technological advances, whether through the release of new tools in the sector or through updates to existing systems. Despite the growing number of applications available, factors related to the ease of use of these tools receive lower priority. This happens because of financial constraints, lack of knowledge, or because usability is not a priority in the process. The aim of this work is to present a user-centered analysis carried out on translation support tools. We thus hope to give users of translation applications a more inquiring perspective. The starting point is a translation support tool that is available online and serves as a model for future analyses. The main research method is the application of usability and ergonomics checklists. The purpose is to identify requirements needed in translation support tools and, consequently, to increase their degree of usability (simple and intuitive use) for their users. Initial results show that employing some methods that take usability and ergonomics principles into account can be low-cost and easily accessible, requiring only wider dissemination of this information. We believe that by providing such knowledge to users in the translation field, as well as to students, researchers and language teachers, this need will become a primary requirement during the development of systems and applications in the area.