DOI: 10.1145/3386392.3397605
research-article

An Analysis of Transfer Learning Methods for Multilingual Readability Assessment

Published: 13 July 2020

Abstract

Recent advances in readability assessment have led to the introduction of multilingual strategies that can predict the reading level of a text regardless of its language. These strategies, however, tend to be limited to operating in different languages rather than taking explicit advantage of the multilingual corpora they use. In this manuscript, we discuss the results of an in-depth empirical analysis we conducted to assess the language transfer capabilities of four readability assessment strategies with increasing multilingual power. The results show that transfer learning is a valid option for improving the performance of readability assessment, particularly for typologically similar languages and when training corpora are scarce.

Supplementary Material

VTT File (3386392.3397605.vtt)
MP4 File (3386392.3397605.mp4)
Supplemental Video

Cited By

  • (2023) Unsupervised Readability Assessment via Learning from Weak Readability Signals. Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, 1324-1334. DOI: 10.1145/3539618.3591695. Online publication date: 19-Jul-2023
  • (2023) Where a Little Change Makes a Big Difference: A Preliminary Exploration of Children’s Queries. Advances in Information Retrieval, 522-533. DOI: 10.1007/978-3-031-28238-6_43. Online publication date: 17-Mar-2023
  • (2022) Supercalifragilisticexpialidocious: Why Using the “Right” Readability Formula in Children’s Web Search Matters. Advances in Information Retrieval, 3-18. DOI: 10.1007/978-3-030-99736-6_1. Online publication date: 5-Apr-2022

Published In

UMAP '20 Adjunct: Adjunct Publication of the 28th ACM Conference on User Modeling, Adaptation and Personalization
July 2020
395 pages
ISBN: 9781450379502
DOI: 10.1145/3386392

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. multilingual
  2. personalization
  3. readability assessment
  4. text analysis

Qualifiers

  • Research-article

Conference

UMAP '20

Acceptance Rates

Overall Acceptance Rate 162 of 633 submissions, 26%

Article Metrics

  • Downloads (Last 12 months): 16
  • Downloads (Last 6 weeks): 0
Reflects downloads up to 09 Jan 2025
