Authors:
Aluizio Haendchen Filho (1); Filipe Sateles Porto de Lima (2); Hércules Antônio do Prado (2); Edilson Ferneda (2); Adson Marques da Silva Esteves (1) and Rudimar L. S. Dazzi (1)
Affiliations:
(1) Laboratory of Technological Innovation in Education (LITE), University of the Itajai Valley (UNIVALI), Itajai, Brazil
(2) Catholic University of Brasilia, Graduate Program in Governance, Technology and Innovation, Brasilia, Brazil
Keyword(s):
Textual Cohesion, Automated Essay Grading, Machine Learning, Text Classification.
Abstract:
To contribute to studies on the evaluation of textual cohesion in Brazilian Portuguese, this paper presents a machine learning approach to the automated scoring of textual cohesion, following the evaluation model adopted in Brazil. The purpose of that evaluation is to verify the mastery of skills and abilities of students who have completed high school. A total of 91 features based on TAACO (Tool for the Automatic Analysis of Cohesion) were adopted, drawn from feature groups such as lexical diversity, connectives, readability indexes, and overlap between sentences and paragraphs. Beyond the features specifically related to textual cohesion, others were defined to capture general aspects of the text. The efficiency of a classification model based on Support Vector Machines was measured. The paper also demonstrates that normalization and class balancing techniques are essential to improving results with the small dataset available for this task.
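
As a rough illustration of the kind of pipeline the abstract describes, the sketch below computes three simple cohesion-style features (type-token ratio for lexical diversity, connective incidence, and adjacent-sentence word overlap), then applies min-max normalization and random oversampling. It is not the authors' implementation and does not reproduce TAACO's 91 features. The connective list, the function names, the choice of min-max scaling, and the choice of random oversampling are all assumptions made for this example; the abstract does not say which normalization or balancing techniques were used.

```python
import random
import re

# Hypothetical, deliberately tiny connective list (not the paper's list).
CONNECTIVES = {"mas", "porém", "portanto", "porque", "entretanto", "contudo"}


def tokenize(text):
    """Return lowercase word tokens, keeping accented letters."""
    return re.findall(r"[^\W\d_]+", text.lower())


def cohesion_features(text):
    """Return [type-token ratio, connective incidence, adjacent-sentence overlap]."""
    tokens = tokenize(text)
    if not tokens:
        return [0.0, 0.0, 0.0]
    # Lexical diversity: distinct words divided by total words.
    ttr = len(set(tokens)) / len(tokens)
    # Connective incidence: share of tokens that are listed connectives.
    connective_rate = sum(t in CONNECTIVES for t in tokens) / len(tokens)
    # Sentence overlap: mean Jaccard similarity of adjacent sentences.
    raw_sentences = re.split(r"[.!?]+", text)
    sentences = [s for s in (set(tokenize(r)) for r in raw_sentences) if s]
    pairs = list(zip(sentences, sentences[1:]))
    overlap = (
        sum(len(a & b) / len(a | b) for a, b in pairs) / len(pairs)
        if pairs else 0.0
    )
    return [ttr, connective_rate, overlap]


def min_max_normalize(rows):
    """Scale each feature column to [0, 1]; constant columns become 0.0.

    In a real evaluation the minima and maxima would be taken from the
    training split only and then reused on the test split.
    """
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def random_oversample(rows, labels, seed=0):
    """Duplicate random minority-class rows until all classes are equal in size."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        extra = [rng.choice(members) for _ in range(target - len(members))]
        out_rows.extend(members + extra)
        out_labels.extend([label] * target)
    return out_rows, out_labels
```

The normalized and balanced feature vectors would then be passed to a classifier such as a Support Vector Machine. Oversampling should be applied to the training portion only, so that duplicated essays never appear in the data used for evaluation.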