ValueDCG: Measuring Comprehensive Human Value Understanding Ability of Language Models

Zhang, Zhaowei; Bai, Fengshuo; Gao, Jun; Yang, Yaodong

arXiv:2310.00378 (cs)

[Submitted on 30 Sep 2023 (v1), last revised 17 Jun 2024 (this version, v4)]

Abstract: Personal values are a crucial factor behind human decision-making. Because Large Language Models (LLMs) have been shown to significantly influence human decisions, it is essential that they accurately understand human values in order to be safe. However, evaluating their grasp of these values is difficult because values are intricate and adaptable. We argue that truly understanding values requires an LLM to both "know what" and "know why". To this end, we present a comprehensive evaluation metric, ValueDCG (Value Discriminator-Critique Gap), which quantitatively assesses these two aspects with an engineering implementation. We assess four representative LLMs and provide compelling evidence that the growth rates of LLMs' "know what" and "know why" capabilities do not align with increases in parameter count, so that the models' capacity to understand human values declines as the number of parameters grows. This may further suggest that LLMs craft plausible explanations based on the provided context without truly understanding the underlying values, indicating potential risks.

Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computers and Society (cs.CY)
ACM classes:	I.2.m; K.4.m
Cite as:	arXiv:2310.00378 [cs.CL]
	(or arXiv:2310.00378v4 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2310.00378

Submission history

From: Zhaowei Zhang
[v1] Sat, 30 Sep 2023 13:47:55 UTC (1,311 KB)
[v2] Sat, 7 Oct 2023 09:18:51 UTC (1,311 KB)
[v3] Thu, 19 Oct 2023 03:18:58 UTC (1,303 KB)
[v4] Mon, 17 Jun 2024 07:58:00 UTC (373 KB)
