LFED: A Literary Fiction Evaluation Dataset for Large Language Models

Yu, Linhao; Liu, Qun; Xiong, Deyi

arXiv:2405.10166 (cs)

[Submitted on 16 May 2024]

Abstract: The rapid evolution of large language models (LLMs) has created a need for comprehensive assessments of their performance across various dimensions. In this paper, we propose LFED, a Literary Fiction Evaluation Dataset, which aims to evaluate the capability of LLMs in long fiction comprehension and reasoning. We collect 95 works of literary fiction that were either originally written in Chinese or translated into Chinese, covering a wide range of topics across several centuries. We define a question taxonomy with 8 question categories to guide the creation of 1,304 questions. Additionally, we conduct an in-depth analysis to ascertain how specific attributes of literary fiction (e.g., novel type, number of characters, year of publication) affect LLM performance in evaluations. Through a series of experiments with various state-of-the-art LLMs, we demonstrate that these models face considerable challenges in answering questions about literary fiction, with ChatGPT reaching only 57.08% under the zero-shot setting. The dataset will be publicly available at this https URL

Subjects: Computation and Language (cs.CL); Performance (cs.PF)
Cite as: arXiv:2405.10166 [cs.CL]
    (or arXiv:2405.10166v1 [cs.CL] for this version)
    https://doi.org/10.48550/arXiv.2405.10166

Submission history

From: Linhao Yu
[v1] Thu, 16 May 2024 15:02:24 UTC (1,822 KB)