ResearchArena: Benchmarking LLMs' Ability to Collect and Organize Information as Research Agents

Kang, Hao; Xiong, Chenyan


arXiv:2406.10291 (cs)

[Submitted on 13 Jun 2024]

Abstract: Large language models (LLMs) have exhibited remarkable performance across various tasks in natural language processing. Nevertheless, challenges still arise when these tasks demand domain-specific expertise and advanced analytical skills, such as conducting research surveys on a designated topic. In this research, we develop ResearchArena, a benchmark that measures LLM agents' ability to conduct academic surveys, an initial step of the academic research process. Specifically, we deconstruct the surveying process into three stages: 1) information discovery: locating relevant papers, 2) information selection: assessing papers' importance to the topic, and 3) information organization: organizing papers into meaningful structures. In particular, we establish an offline environment comprising 12.0M full-text academic papers and 7.9K survey papers, which evaluates agents' ability to locate supporting materials for composing the survey on a topic, rank the located papers based on their impact, and organize these into a hierarchical knowledge mind-map. With this benchmark, we conduct preliminary evaluations of existing techniques and find that all LLM-based methods underperform basic keyword-based retrieval techniques, highlighting substantial opportunities for future research.
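
As a rough illustration of the three stages named in the abstract (discovery, selection, organization), the toy sketch below may help fix the idea. It is not the benchmark's code or any method from the paper: the corpus, the function names `discover`, `select`, and `organize`, the use of title-term overlap for discovery, the use of citation counts as a stand-in for impact, and the `section` labels are all assumptions made for this example.

```python
# Toy corpus: every title, citation count, and section label is hypothetical.
PAPERS = [
    {"id": "p1", "title": "dense retrieval with transformers", "citations": 120, "section": "Retrieval"},
    {"id": "p2", "title": "keyword search and term weighting", "citations": 300, "section": "Retrieval"},
    {"id": "p3", "title": "protein folding with deep learning", "citations": 900, "section": "Biology"},
    {"id": "p4", "title": "ranking documents for web search", "citations": 50, "section": "Ranking"},
]


def discover(topic, papers):
    """Stage 1 (discovery): keep papers whose title shares a term with the topic."""
    terms = set(topic.lower().split())
    return [p for p in papers if terms & set(p["title"].lower().split())]


def select(papers):
    """Stage 2 (selection): order papers by citation count (assumed proxy for impact)."""
    return sorted(papers, key=lambda p: p["citations"], reverse=True)


def organize(topic, papers):
    """Stage 3 (organization): build a two-level hierarchy rooted at the topic."""
    tree = {}
    for p in papers:
        tree.setdefault(p["section"], []).append(p["id"])
    return {topic: tree}


topic = "retrieval search"
found = discover(topic, PAPERS)       # keyword-style matching on titles
ranked = select(found)                # most-cited first
mind_map = organize(topic, ranked)    # nested dict as a minimal "mind-map"
print(mind_map)
```

Keyword overlap is used for discovery here only because the abstract mentions keyword-based retrieval as a baseline; a real system would work over full text and a far larger corpus.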

Subjects:	Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Information Retrieval (cs.IR)
Cite as:	arXiv:2406.10291 [cs.AI]
	(or arXiv:2406.10291v1 [cs.AI] for this version)
	https://doi.org/10.48550/arXiv.2406.10291

Submission history

From: Hao Kang
[v1] Thu, 13 Jun 2024 03:26:30 UTC (218 KB)
