Non-discrimination Criteria for Generative Language Models

Sterlie, Sara; Weng, Nina; Feragen, Aasa

arXiv:2403.08564 (cs)

[Submitted on 13 Mar 2024]

Abstract: In recent years, generative AI, such as large language models, has undergone rapid development. As these models become increasingly available to the public, concerns arise that they perpetuate and amplify harmful biases in applications. Gender stereotypes can be harmful and limiting for the individuals they target, whether they consist of misrepresentation or discrimination. Recognizing gender bias as a pervasive societal construct, this paper studies how to uncover and quantify the presence of gender biases in generative language models. In particular, we derive generative AI analogues of three well-known non-discrimination criteria from classification, namely independence, separation and sufficiency. To demonstrate these criteria in action, we design prompts for each of the criteria with a focus on occupational gender stereotypes, specifically using the medical test to introduce the ground truth in the generative AI context. Our results address the presence of occupational gender bias within such conversational language models.
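
For orientation, the three criteria named in the abstract are commonly stated for binary classifiers as follows: independence compares P(R=1 | A) across groups, separation compares P(R=1 | Y, A), and sufficiency compares P(Y=1 | R, A), where A is the protected attribute, Y the true label and R the prediction. The sketch below illustrates only these classification forms on invented toy data; it is not the paper's generative analogues, and the names `rate`, `criteria_gaps` and `records` are hypothetical.

```python
def rate(records, event, given):
    """Empirical P(event | given); assumes at least one record satisfies `given`."""
    matching = [rec for rec in records if given(rec)]
    return sum(1 for rec in matching if event(rec)) / len(matching)


def criteria_gaps(records, groups=("f", "m")):
    """Largest between-group gap for each criterion.

    records: list of (a, y, r) tuples with group a, true label y in {0, 1}
    and prediction r in {0, 1}. A gap of 0.0 means the criterion holds
    exactly on this sample.
    """
    g0, g1 = groups

    def gap(event, cond):
        p0 = rate(records, event, lambda rec: rec[0] == g0 and cond(rec))
        p1 = rate(records, event, lambda rec: rec[0] == g1 and cond(rec))
        return abs(p0 - p1)

    predicted_pos = lambda rec: rec[2] == 1
    actual_pos = lambda rec: rec[1] == 1
    return {
        # Independence: P(R=1 | A) is the same for both groups.
        "independence": gap(predicted_pos, lambda rec: True),
        # Separation: P(R=1 | Y=y, A) is the same for both groups, for each y.
        "separation": max(
            gap(predicted_pos, lambda rec, y=y: rec[1] == y) for y in (0, 1)
        ),
        # Sufficiency: P(Y=1 | R=r, A) is the same for both groups, for each r.
        "sufficiency": max(
            gap(actual_pos, lambda rec, r=r: rec[2] == r) for r in (0, 1)
        ),
    }


# Invented toy data: a perfect predictor (r == y) with different base rates.
records = [
    ("f", 1, 1), ("f", 1, 1), ("f", 1, 1), ("f", 0, 0),
    ("m", 1, 1), ("m", 0, 0), ("m", 0, 0), ("m", 0, 0),
]
gaps = criteria_gaps(records)
print(gaps)
```

On this toy sample the predictor satisfies separation and sufficiency but not independence, because the two groups have different base rates of the true label. That is one reason the three criteria are usually treated as distinct requirements.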

Comments:	14 pages, 5 figures. Submitted to ACM Conference on Fairness, Accountability, and Transparency (ACM FAccT 2024)
Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Human-Computer Interaction (cs.HC)
Cite as:	arXiv:2403.08564 [cs.CL]
	(or arXiv:2403.08564v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2403.08564

Submission history

From: Sara Sterlie
[v1] Wed, 13 Mar 2024 14:19:08 UTC (2,800 KB)
