Beyond Human Norms: Unveiling Unique Values of Large Language Models through Interdisciplinary Approaches

Biedma, Pablo; Yi, Xiaoyuan; Huang, Linus; Sun, Maosong; Xie, Xing

arXiv:2404.12744 (cs)

[Submitted on 19 Apr 2024 (v1), last revised 10 May 2024 (this version, v2)]

Abstract: Recent advancements in Large Language Models (LLMs) have revolutionized the AI field but also pose potential safety and ethical risks. Deciphering the values embedded in LLMs is therefore crucial for assessing and mitigating these risks. Despite extensive investigation into LLMs' values, previous studies rely heavily on human-oriented value systems from the social sciences. A natural question then arises: do LLMs possess unique values beyond those of humans? To address it, this work proposes a novel framework, ValueLex, to reconstruct LLMs' unique value system from scratch, leveraging psychological methodologies from human personality/value research. Based on the Lexical Hypothesis, ValueLex introduces a generative approach to elicit diverse values from 30+ LLMs, synthesizing a taxonomy that culminates in a comprehensive value framework via factor analysis and semantic clustering. We identify three core value dimensions, Competence, Character, and Integrity, each with specific subdimensions, revealing that LLMs possess a structured, albeit non-human, value system. Based on this system, we further develop tailored projective tests to evaluate and analyze the value inclinations of LLMs across different model sizes, training methods, and data sources. Our framework fosters an interdisciplinary paradigm for understanding LLMs, paving the way for future AI alignment and regulation.

Comments:	16 pages, work in progress
Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2404.12744 [cs.CL]
	(or arXiv:2404.12744v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2404.12744

Submission history

From: Xiaoyuan Yi
[v1] Fri, 19 Apr 2024 09:44:51 UTC (10,794 KB)
[v2] Fri, 10 May 2024 06:09:02 UTC (10,794 KB)
