Rongtao Huang
2020
NUT-RC: Noisy User-generated Text-oriented Reading Comprehension
Rongtao Huang | Bowei Zou | Yu Hong | Wei Zhang | AiTi Aw | Guodong Zhou
Proceedings of the 28th International Conference on Computational Linguistics
Reading comprehension (RC) on social media such as Twitter is a critical and challenging task because the text is noisy and informal yet informative. Most existing RC models are developed on formal datasets such as news articles and Wikipedia documents, which severely limits their performance when they are applied directly to noisy, informal social media text. Moreover, these models focus on a single type of RC, either extractive or generative, and ignore the integration of the two. To address these challenges, we propose a noisy user-generated text-oriented RC model. In particular, we first introduce a set of text normalizers to transform noisy, informal text into formal text. Then, we integrate the extractive and the generative RC models through a multi-task learning mechanism and an answer selection module. Experimental results on TweetQA demonstrate that our NUT-RC model significantly outperforms the state-of-the-art social media-oriented RC models.
基于多任务学习的生成式阅读理解(Generative Reading Comprehension via Multi-task Learning)
Jin Qian (钱锦) | Rongtao Huang (黄荣涛) | Bowei Zou (邹博伟) | Yu Hong (洪宇)
Proceedings of the 19th Chinese National Conference on Computational Linguistics
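Generative reading comprehension is a novel and highly challenging research direction in machine reading comprehension. Compared with mainstream extractive reading comprehension, a generative model is no longer restricted to extracting the answer from the passage; instead, it combines the question and the passage to generate a natural and complete statement as the answer. However, existing generative reading comprehension models lack an understanding of the answer's boundary information in the passage and of question-type information. To address these problems, this paper proposes a generative reading comprehension model based on multi-task learning. During training, the model treats answer generation as the main task and answer extraction and question classification as auxiliary tasks, jointly learning and optimizing the parameters of the model's encoding layer; during testing, it loads the model's encoding layer and decodes to generate the answer. Experimental results show that the answer extraction and question classification models effectively improve the performance of the generative reading comprehension model.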