MoZIP: A Multilingual Benchmark to Evaluate Large Language Models in Intellectual Property

Ni, Shiwen; Tan, Minghuan; Bai, Yuelin; Niu, Fuqiang; Yang, Min; Zhang, Bowen; Xu, Ruifeng; Chen, Xiaojun; Li, Chengming; Hu, Xiping; Li, Ye; Fan, Jianping

arXiv:2402.16389 (cs)

[Submitted on 26 Feb 2024]

Abstract: Large language models (LLMs) have demonstrated impressive performance in various natural language processing (NLP) tasks. However, there is limited understanding of how well LLMs perform in specific domains (e.g., the intellectual property (IP) domain). In this paper, we contribute a new benchmark, the first Multilingual-oriented quiZ on Intellectual Property (MoZIP), for the evaluation of LLMs in the IP domain. The MoZIP benchmark includes three challenging tasks: IP multiple-choice quiz (IPQuiz), IP question answering (IPQA), and patent matching (PatentMatch). In addition, we develop a new IP-oriented multilingual large language model (called MoZi), a BLOOMZ-based model trained with supervised fine-tuning on multilingual IP-related text data. We evaluate our proposed MoZi model and four well-known LLMs (i.e., BLOOMZ, BELLE, ChatGLM and ChatGPT) on the MoZIP benchmark. Experimental results demonstrate that MoZi outperforms BLOOMZ, BELLE and ChatGLM by a noticeable margin, while scoring lower than ChatGPT. Notably, the performance of current LLMs on the MoZIP benchmark has much room for improvement, and even the most powerful model, ChatGPT, does not reach the passing level. Our source code, data, and models are available at \url{this https URL}.

Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2402.16389 [cs.CL]
	(or arXiv:2402.16389v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2402.16389
Journal reference:	LREC-COLING 2024

Submission history

From: Shiwen Ni
[v1] Mon, 26 Feb 2024 08:27:50 UTC (2,307 KB)
